[GitHub] spark pull request #17721: [SPARK-20013][SQL]merge renameTable to alterTable...

2017-04-21 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17721

[SPARK-20013][SQL]merge renameTable to alterTable in ExternalCatalog

## What changes were proposed in this pull request?

There are several reasons to merge renameTable into alterTable in ExternalCatalog:

- In Hive, we rename a table through alterTable.
- Currently, when we create or rename a managed table, we have to compute its
defaultTablePath in ExternalCatalog, so the defaultTablePath logic is duplicated
in the two subclasses HiveExternalCatalog and InMemoryCatalog; additionally,
there is a defaultTablePath in SessionCatalog, so we currently have three copies
of the defaultTablePath logic in three classes. We had better unify them in
SessionCatalog.
- To unify them, we need to move some logic from ExternalCatalog to
SessionCatalog, and renameTable is one such piece.

However, renameTable is limited by its simple parameter list:
```
  def renameTable(db: String, oldName: String, newName: String): Unit
```
so even if we move the defaultTablePath logic into SessionCatalog, we cannot
pass the new path to renameTable.

Therefore we can merge renameTable into alterTable and perform the rename
through alterTable, as sketched below.
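For illustration, a minimal sketch (not the actual patch) of how renameTable
could be expressed on top of alterTable once defaultTablePath lives in
SessionCatalog. It assumes the code sits inside SessionCatalog, next to the
existing externalCatalog field and the defaultTablePath helper discussed above:

```
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.CatalogTableType

// Illustrative only: the rename becomes an alterTable call, so the new
// location computed in SessionCatalog travels inside the table metadata.
def renameTable(db: String, oldName: String, newName: String): Unit = {
  val table = externalCatalog.getTable(db, oldName)
  // Managed tables get a fresh default location from SessionCatalog's own
  // defaultTablePath; external tables keep the location they already have.
  val newLocation = table.tableType match {
    case CatalogTableType.MANAGED =>
      Some(defaultTablePath(TableIdentifier(newName, Some(db))))
    case _ => table.storage.locationUri
  }
  externalCatalog.alterTable(
    table.copy(
      identifier = TableIdentifier(newName, Some(db)),
      storage = table.storage.copy(locationUri = newLocation)))
}
```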


## How was this patch tested?

Deleted the tests in ExternalCatalogSuite that already exist in
SessionCatalogSuite, and moved the other tests from ExternalCatalogSuite that
do not yet exist in SessionCatalogSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark mergeAlterRenameTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17721.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17721


commit 2ea645c6efef36ad2833d8eced0150d4062b1efd
Author: windpiger <song...@outlook.com>
Date:   2017-04-21T14:52:30Z

[SPARK-20013][SQL]merge renameTable to alterTable in ExternalCatalog







[GitHub] spark issue #17354: [SPARK-20024] [SQL] [test-maven] SessionCatalog reset ne...

2017-03-21 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17354
  
late LGTM~





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-20 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
@gatorsmile  the sbt tests succeeded, and the mvn `create temporary view using` test passed





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
mvn, retest this please





[GitHub] spark issue #17338: [SPARK-19990][SQL]create a temp file for file in test.ja...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
test sbt, retest this please





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
add #17290 again, retest this please





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
@gatorsmile  I have merged this PR with master and reverted #17290, and am 
waiting for Jenkins to run the tests





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
ok, let me do it~





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
@gatorsmile all of the failed tests can be found in the [maven test 
result](https://spark-tests.appspot.com/builds/spark-master-test-maven-hadoop-2.7/2799)





[GitHub] spark issue #17344: [SPARK-19990][TEST][test-maven][WIP] Use the database af...

2017-03-19 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17344
  
LGTM, but why does the maven test have this problem?





[GitHub] spark issue #17341: [SPARK-20013][SQL]add a newTablePath parameter for renam...

2017-03-18 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17341
  
cc @cloud-fan @gatorsmile





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-18 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
Yes, it is.
The `jar:file://` URL will be re-resolved by `new Path` later and will throw 
the exception described in the JIRA.
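For reference, a minimal sketch of that failure; the jar path below is
hypothetical (only the `jar:file:` scheme plus the `!/` separator matters),
and it assumes hadoop-common on the classpath:

```
import org.apache.hadoop.fs.Path

object JarPathRepro {
  def main(args: Array[String]): Unit = {
    // Hypothetical URL of the shape getResource returns for a file in a jar.
    val url = "jar:file:/tmp/spark-sql-tests.jar!/test-data/cars.csv"
    // Hadoop's Path treats "jar" as the scheme and the remainder as the path,
    // which is not a valid absolute URI, so the constructor throws
    // (an IllegalArgumentException wrapping a URISyntaxException).
    new Path(url)
  }
}
```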







[GitHub] spark pull request #17341: [SPARK-20013][SQL]add a newTablePath parameter fo...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17341#discussion_r106782193
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -732,40 +696,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 assert(!exists(db.locationUri))
   }
 
-  test("create/drop/rename table should create/delete/rename the 
directory") {
--- End diff --

move to `SessionCatalogSuite`





[GitHub] spark pull request #17341: [SPARK-20013][SQL]add a newTablePath parameter fo...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17341#discussion_r106782178
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,30 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
--- End diff --

This test already exists in 
[SessionCatalogSuite](https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala#L393)





[GitHub] spark pull request #17341: [SPARK-20013][SQL]add a newTablePath parameter fo...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17341#discussion_r106782179
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,30 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("unknown_db", "unknown_table", "unknown_table")
-}
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "unknown_table", "unknown_table")
-}
-  }
-
-  test("rename table when destination table already exists") {
--- End diff --

move to `SessionCatalogSuite`





[GitHub] spark pull request #17341: [SPARK-20013][SQL]add a newTablePath parameter fo...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17341#discussion_r106782161
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,30 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
--- End diff --

This test already exists in 
[SessionCatalogSuite](https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala#L361)





[GitHub] spark pull request #17341: [SPARK-20013][SQL]add a newTablePath parameter fo...

2017-03-18 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17341

[SPARK-20013][SQL]add a newTablePath parameter for renameTable in 
ExternalCatalog

## What changes were proposed in this pull request?

Currently, when we create or rename a managed table, we have to compute its
defaultTablePath in ExternalCatalog, so the defaultTablePath logic is duplicated
in the two subclasses HiveExternalCatalog and InMemoryCatalog; additionally,
there is a defaultTablePath in SessionCatalog, so we currently have three copies
of the defaultTablePath logic in three classes. We had better unify them in
SessionCatalog.

To unify them, we need to move some logic from ExternalCatalog to 
SessionCatalog, and renameTable is one such piece.

However, renameTable is limited by its simple parameter list:
```
  def renameTable(db: String, oldName: String, newName: String): Unit
```
so even if we move the defaultTablePath logic into SessionCatalog, we cannot
pass the new path to renameTable.

Therefore we can add a newTablePath parameter to renameTable in ExternalCatalog,
as sketched below.
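For illustration, a minimal sketch of the proposed signature; the parameter
name follows the PR title, while the `Option[URI]` type is an assumption
(external tables keep their existing location, so they would pass None):

```
import java.net.URI

// Illustrative signature only: for a managed table, SessionCatalog computes
// the new default location (via its defaultTablePath) and passes it down here.
def renameTable(
    db: String,
    oldName: String,
    newName: String,
    newTablePath: Option[URI]): Unit
```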
## How was this patch tested?

Deleted the tests in ExternalCatalogSuite that already exist in
SessionCatalogSuite, and moved the other tests from ExternalCatalogSuite that
do not yet exist in SessionCatalogSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark modifyRenameTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17341


commit 80110e5783ab38d18e40ced06c3d926e6017b03f
Author: windpiger <song...@outlook.com>
Date:   2017-03-18T14:33:49Z

[SPARK-20013][SQL]add a newTablePath parameter for renameTable in 
ExternalCatalog







[GitHub] spark pull request #17340: [SPARK-20013][SQL][WIP] merge renameTable to alte...

2017-03-18 Thread windpiger
Github user windpiger closed the pull request at:

https://github.com/apache/spark/pull/17340





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780117
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -732,40 +676,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 assert(!exists(db.locationUri))
   }
 
-  test("create/drop/rename table should create/delete/rename the 
directory") {
--- End diff --

move to `SessionCatalogSuite`





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780097
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,50 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
--- End diff --

This test already exists in 
[SessionCatalogSuite](https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala#L393)





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780090
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,50 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
--- End diff --

This test already exists in 
[SessionCatalogSuite](https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala#L361)





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780038
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,50 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("unknown_db", "unknown_table", "unknown_table")
-}
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "unknown_table", "unknown_table")
-}
-  }
-
-  test("rename table when destination table already exists") {
--- End diff --

move to `SessionCatalogSuite`





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780033
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,50 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("unknown_db", "unknown_table", "unknown_table")
-}
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "unknown_table", "unknown_table")
-}
-  }
-
-  test("rename table when destination table already exists") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "tbl1", "tbl2")
-}
-  }
-
-  test("alter table") {
-val catalog = newBasicCatalog()
-val tbl1 = catalog.getTable("db2", "tbl1")
-catalog.alterTable(tbl1.copy(properties = Map("toh" -> "frem")))
-val newTbl1 = catalog.getTable("db2", "tbl1")
-assert(!tbl1.properties.contains("toh"))
-assert(newTbl1.properties.size == tbl1.properties.size + 1)
-assert(newTbl1.properties.get("toh") == Some("frem"))
-  }
-
-  test("alter table when database/table does not exist") {
--- End diff --

This test already exists in SessionCatalogSuite





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17340#discussion_r106780030
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -196,50 +196,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 catalog.dropTable("db2", "unknown_table", ignoreIfNotExists = true, 
purge = false)
   }
 
-  test("rename table") {
-val catalog = newBasicCatalog()
-assert(catalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-catalog.renameTable("db2", "tbl1", "tblone")
-assert(catalog.listTables("db2").toSet == Set("tblone", "tbl2"))
-  }
-
-  test("rename table when database/table does not exist") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("unknown_db", "unknown_table", "unknown_table")
-}
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "unknown_table", "unknown_table")
-}
-  }
-
-  test("rename table when destination table already exists") {
-val catalog = newBasicCatalog()
-intercept[AnalysisException] {
-  catalog.renameTable("db2", "tbl1", "tbl2")
-}
-  }
-
-  test("alter table") {
--- End diff --

This test already exists in SessionCatalogSuite





[GitHub] spark pull request #17340: [SPARK-20013][SQL] merge renameTable to alterTabl...

2017-03-18 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17340

[SPARK-20013][SQL] merge renameTable to alterTable in ExternalCatalog

## What changes were proposed in this pull request?


## How was this patch tested?
Deleted the tests in ExternalCatalogSuite that already exist in
SessionCatalogSuite, and moved the other tests from ExternalCatalogSuite that
do not yet exist in SessionCatalogSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark renameTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17340


commit 70f2365bec31fa3c3a53eebe73991871cdd77e56
Author: windpiger <song...@outlook.com>
Date:   2017-03-09T08:03:19Z

use alterTable to implement renameTabe

commit e8b9ea36c586ce1af63ee7eb3dbc052455cfcdd0
Author: windpiger <song...@outlook.com>
Date:   2017-03-18T10:17:33Z

Merge branch 'master' into renameTable

commit 26c6c7b05d1c7020cbc59784ce7ea4864c9c2b2b
Author: windpiger <song...@outlook.com>
Date:   2017-03-18T11:53:58Z

merge with master and move some tests to SessionCatalogSuite







[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-18 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
According to the Jenkins log, the test case `create temporary view using` in 
`DDLSuite` passed.
This PR resolves the test failure reported in 
[SPARK-19990](https://issues.apache.org/jira/browse/SPARK-19990).

There are some other test failures that are not related to this PR; I will 
take a look at them.
@kayousterhout @cloud-fan 





[GitHub] spark pull request #17338: [SPARK-19990][SQL][test-maven]create a temp file ...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17338#discussion_r106778719
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -700,20 +701,33 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 
   test("create temporary view using") {
 val csvFile =
-  
Thread.currentThread().getContextClassLoader.getResource("test-data/cars.csv").toString
-withView("testview") {
-  sql(s"CREATE OR REPLACE TEMPORARY VIEW testview (c1 String, c2 
String)  USING " +
-"org.apache.spark.sql.execution.datasources.csv.CSVFileFormat  " +
-s"OPTIONS (PATH '$csvFile')")
+  
Thread.currentThread().getContextClassLoader.getResourceAsStream("test-data/cars.csv")
--- End diff --

`getResourceAsStream` is used to read resource files from inside a jar, so it 
is fine here.
The point is the return value, not the relative path `test-data/cars.csv`.





[GitHub] spark issue #17338: [SPARK-19990][SQL][test-maven]create a temp file for fil...

2017-03-18 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17338
  
retest this please





[GitHub] spark pull request #17338: [SPARK-19990][SQL][maven]create a temp file for f...

2017-03-18 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17338#discussion_r106777091
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -700,20 +701,33 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 
   test("create temporary view using") {
 val csvFile =
-  
Thread.currentThread().getContextClassLoader.getResource("test-data/cars.csv").toString
-withView("testview") {
-  sql(s"CREATE OR REPLACE TEMPORARY VIEW testview (c1 String, c2 
String)  USING " +
-"org.apache.spark.sql.execution.datasources.csv.CSVFileFormat  " +
-s"OPTIONS (PATH '$csvFile')")
+  
Thread.currentThread().getContextClassLoader.getResourceAsStream("test-data/cars.csv")
--- End diff --

The point is that the return value of `getResource.toString` is 
`jar:file:/xxx.jar!/test-data/cars.csv`. If we use this path to set the table 
location, as the following code in the test case does, it causes an exception 
when it is used to construct `new Path()` in 
[here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L344).

So I copy this file from the resource inside the jar to a temp file, whose 
path is `file:/tmppath/cars.csv`, and use this normal path to set the table 
location; that works for the code that follows.
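For illustration, a minimal sketch of that copy step using only standard JDK
APIs; the helper name is made up, and this is not the exact code in the PR:

```
import java.nio.file.{Files, Path => JPath, StandardCopyOption}

// Copy a classpath resource (possibly packed inside a jar) to a plain temp
// file, so the resulting path has a normal file: scheme that new Path() parses.
def copyResourceToTempFile(resource: String): JPath = {
  val in = Thread.currentThread().getContextClassLoader.getResourceAsStream(resource)
  require(in != null, s"resource not found: $resource")
  val tmp = Files.createTempFile("spark-test-", "-" + resource.replace('/', '-'))
  try {
    Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
  } finally {
    in.close()
  }
  tmp
}

// Usage in the test could then look like:
//   s"OPTIONS (PATH '${copyResourceToTempFile("test-data/cars.csv").toUri}')"
```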





[GitHub] spark pull request #17338: [SPARK-19990][SQL]create a temp file for file in ...

2017-03-17 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17338

[SPARK-19990][SQL]create a temp file for a file in test.jar's resources when 
running tests across different modules

## What changes were proposed in this pull request?

After we merged `HiveDDLSuite` and `DDLSuite` in 
[SPARK-19235](https://issues.apache.org/jira/browse/SPARK-19235), we have two 
subclasses of `DDLSuite`: `HiveCatalogedDDLSuite` and `InMemoryCatalogDDLSuite`.

`DDLSuite` lives in the `sql/core` module, while `HiveCatalogedDDLSuite` lives 
in the `sql/hive` module. When we run `HiveCatalogedDDLSuite`, it also runs the 
tests of its parent class `DDLSuite`, and some of those test cases fail because 
they get and use a test file path from the `sql/core` module's resources.

The test file path obtained this way starts with `jar:`, like 
"jar:file:/home/jenkins/workspace/spark-master-test-maven-hadoop-2.6/sql/core/target/spark-sql_2.11-2.2.0-SNAPSHOT-tests.jar!/test-data/cars.csv",
 and such a path fails when it is passed to `new Path()` in datasource.scala.

This PR fixes that by copying the file from the resource to a temp dir.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark fixtestfailemvn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17338.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17338


commit 64c6b223840307506468430ed5b50d24bf76218d
Author: windpiger <song...@outlook.com>
Date:   2017-03-18T04:07:25Z

[SPARK-19990][SQL]create a temp file for file in test.jar's resource when 
run test accross different modules

commit 09b5e4065d23fbc8ddc64bc1dc92557648d03818
Author: windpiger <song...@outlook.com>
Date:   2017-03-18T04:17:15Z

fix a comment







[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106570185
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -27,41 +27,67 @@ import 
org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{Range, SubqueryAlias, 
View}
 
+class InMemorySessionCatalogSuite extends SessionCatalogSuite {
+  protected val utils = new CatalogTestUtils {
+override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
+override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
+override val defaultProvider: String = "parquet"
+override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
+  }
+}
+
 /**
- * Tests for [[SessionCatalog]] that assume that [[InMemoryCatalog]] is 
correctly implemented.
+ * Tests for [[SessionCatalog]]
  *
  * Note: many of the methods here are very similar to the ones in 
[[ExternalCatalogSuite]].
  * This is because [[SessionCatalog]] and [[ExternalCatalog]] share many 
similar method
  * signatures but do not extend a common parent. This is largely by design 
but
  * unfortunately leads to very similar test code in two places.
  */
-class SessionCatalogSuite extends PlanTest {
-  private val utils = new CatalogTestUtils {
-override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
-override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
-override val defaultProvider: String = "parquet"
-override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
-  }
+abstract class SessionCatalogSuite extends PlanTest {
+  protected val utils: CatalogTestUtils
+
+  protected val isHiveExternalCatalog = false
 
   import utils._
 
+  private def withBasicCatalog(f: SessionCatalog => Unit): Unit = {
+val catalog = new SessionCatalog(newBasicCatalog())
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
+try {
+  f(catalog)
+} finally {
+  catalog.reset()
+}
+  }
+
+  private def withEmptyCatalog(f: SessionCatalog => Unit): Unit = {
+val catalog = new SessionCatalog(newEmptyCatalog())
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
--- End diff --

All the test cases in this suite create the default database with 
`ignoreIfExists = true`, so having the default database already exist here 
does not affect the test cases.





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106569991
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -27,41 +27,67 @@ import 
org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{Range, SubqueryAlias, 
View}
 
+class InMemorySessionCatalogSuite extends SessionCatalogSuite {
+  protected val utils = new CatalogTestUtils {
+override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
+override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
+override val defaultProvider: String = "parquet"
+override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
+  }
+}
+
 /**
- * Tests for [[SessionCatalog]] that assume that [[InMemoryCatalog]] is 
correctly implemented.
+ * Tests for [[SessionCatalog]]
  *
  * Note: many of the methods here are very similar to the ones in 
[[ExternalCatalogSuite]].
  * This is because [[SessionCatalog]] and [[ExternalCatalog]] share many 
similar method
  * signatures but do not extend a common parent. This is largely by design 
but
  * unfortunately leads to very similar test code in two places.
  */
-class SessionCatalogSuite extends PlanTest {
-  private val utils = new CatalogTestUtils {
-override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
-override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
-override val defaultProvider: String = "parquet"
-override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
-  }
+abstract class SessionCatalogSuite extends PlanTest {
+  protected val utils: CatalogTestUtils
+
+  protected val isHiveExternalCatalog = false
 
   import utils._
 
+  private def withBasicCatalog(f: SessionCatalog => Unit): Unit = {
+val catalog = new SessionCatalog(newBasicCatalog())
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
+try {
+  f(catalog)
+} finally {
+  catalog.reset()
+}
+  }
+
+  private def withEmptyCatalog(f: SessionCatalog => Unit): Unit = {
+val catalog = new SessionCatalog(newEmptyCatalog())
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
+try {
+  f(catalog)
+} finally {
+  catalog.reset()
+}
+  }
   // 
--
   // Databases
   // 
--
 
   test("basic create and list databases") {
-val catalog = new SessionCatalog(newEmptyCatalog())
-catalog.createDatabase(newDb("default"), ignoreIfExists = true)
-assert(catalog.databaseExists("default"))
-assert(!catalog.databaseExists("testing"))
-assert(!catalog.databaseExists("testing2"))
-catalog.createDatabase(newDb("testing"), ignoreIfExists = false)
-assert(catalog.databaseExists("testing"))
-assert(catalog.listDatabases().toSet == Set("default", "testing"))
-catalog.createDatabase(newDb("testing2"), ignoreIfExists = false)
-assert(catalog.listDatabases().toSet == Set("default", "testing", 
"testing2"))
-assert(catalog.databaseExists("testing2"))
-assert(!catalog.databaseExists("does_not_exist"))
+withEmptyCatalog { catalog =>
+  catalog.createDatabase(newDb("default"), ignoreIfExists = true)
--- End diff --

Here `ignoreIfExists = true`, so I think it is fine that the default database 
has already been created.





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106569958
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -27,41 +27,67 @@ import 
org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{Range, SubqueryAlias, 
View}
 
+class InMemorySessionCatalogSuite extends SessionCatalogSuite {
+  protected val utils = new CatalogTestUtils {
+override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
+override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
+override val defaultProvider: String = "parquet"
+override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
+  }
+}
+
 /**
- * Tests for [[SessionCatalog]] that assume that [[InMemoryCatalog]] is 
correctly implemented.
+ * Tests for [[SessionCatalog]]
  *
  * Note: many of the methods here are very similar to the ones in 
[[ExternalCatalogSuite]].
  * This is because [[SessionCatalog]] and [[ExternalCatalog]] share many 
similar method
  * signatures but do not extend a common parent. This is largely by design 
but
  * unfortunately leads to very similar test code in two places.
  */
-class SessionCatalogSuite extends PlanTest {
-  private val utils = new CatalogTestUtils {
-override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
-override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
-override val defaultProvider: String = "parquet"
-override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
-  }
+abstract class SessionCatalogSuite extends PlanTest {
+  protected val utils: CatalogTestUtils
+
+  protected val isHiveExternalCatalog = false
 
   import utils._
 
+  private def withBasicCatalog(f: SessionCatalog => Unit): Unit = {
+val catalog = new SessionCatalog(newBasicCatalog())
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
--- End diff --

Here `ignoreIfExists = true`, so I think it is fine that the default database 
has already been created.





[GitHub] spark pull request #17317: [SPARK-19329][SQL][BRANCH-2.1]Reading from or wri...

2017-03-16 Thread windpiger
Github user windpiger closed the pull request at:

https://github.com/apache/spark/pull/17317





[GitHub] spark issue #17317: [SPARK-19329][SQL][BRANCH-2.1]Reading from or writing to...

2017-03-16 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17317
  
cc @gatorsmile 





[GitHub] spark pull request #17317: [SPARK-19329][SQL][BRANCH-2.1]Reading from or wri...

2017-03-16 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17317

[SPARK-19329][SQL][BRANCH-2.1]Reading from or writing to a datasource table 
with a non pre-existing location should succeed

## What changes were proposed in this pull request?

This is a backport PR of https://github.com/apache/spark/pull/16672 into 
branch-2.1.

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark backport-insertnotexists

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17317.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17317









[GitHub] spark issue #16672: [SPARK-19329][SQL]Reading from or writing to a datasourc...

2017-03-16 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16672
  
ok thanks, it's my pleasure~


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17287: [SPARK-19945][SQL]add test suite for SessionCatalog with...

2017-03-16 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17287
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106342082
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -1270,6 +1376,7 @@ class SessionCatalogSuite extends PlanTest {
   }
 
   assert(cause.getMessage.contains("Undefined function: 
'undefined_fn'"))
+  catalog.reset()
--- End diff --

Yes, you are right; let me add a try/catch.





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106336735
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -76,468 +102,500 @@ class SessionCatalogSuite extends PlanTest {
   }
 
   test("create databases using invalid names") {
-val catalog = new SessionCatalog(newEmptyCatalog())
-testInvalidName(name => catalog.createDatabase(newDb(name), 
ignoreIfExists = true))
+withEmptyCatalog { catalog =>
+  testInvalidName(
+name => catalog.createDatabase(newDb(name), ignoreIfExists = true))
+}
   }
 
   test("get database when a database exists") {
-val catalog = new SessionCatalog(newBasicCatalog())
-val db1 = catalog.getDatabaseMetadata("db1")
-assert(db1.name == "db1")
-assert(db1.description.contains("db1"))
+withBasicCatalog { catalog =>
+  val db1 = catalog.getDatabaseMetadata("db1")
+  assert(db1.name == "db1")
+  assert(db1.description.contains("db1"))
+}
   }
 
   test("get database should throw exception when the database does not 
exist") {
-val catalog = new SessionCatalog(newBasicCatalog())
-intercept[NoSuchDatabaseException] {
-  catalog.getDatabaseMetadata("db_that_does_not_exist")
+withBasicCatalog { catalog =>
+  intercept[NoSuchDatabaseException] {
+catalog.getDatabaseMetadata("db_that_does_not_exist")
+  }
 }
   }
 
   test("list databases without pattern") {
-val catalog = new SessionCatalog(newBasicCatalog())
-assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", 
"db3"))
+withBasicCatalog { catalog =>
+  assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", 
"db3"))
+}
   }
 
   test("list databases with pattern") {
-val catalog = new SessionCatalog(newBasicCatalog())
-assert(catalog.listDatabases("db").toSet == Set.empty)
-assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", "db3"))
-assert(catalog.listDatabases("*1").toSet == Set("db1"))
-assert(catalog.listDatabases("db2").toSet == Set("db2"))
+withBasicCatalog { catalog =>
+  assert(catalog.listDatabases("db").toSet == Set.empty)
+  assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", 
"db3"))
+  assert(catalog.listDatabases("*1").toSet == Set("db1"))
+  assert(catalog.listDatabases("db2").toSet == Set("db2"))
+}
   }
 
   test("drop database") {
-val catalog = new SessionCatalog(newBasicCatalog())
-catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = false)
-assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+withBasicCatalog { catalog =>
+  catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = 
false)
+  assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+}
   }
 
   test("drop database when the database is not empty") {
 // Throw exception if there are functions left
-val externalCatalog1 = newBasicCatalog()
-val sessionCatalog1 = new SessionCatalog(externalCatalog1)
-externalCatalog1.dropTable("db2", "tbl1", ignoreIfNotExists = false, 
purge = false)
-externalCatalog1.dropTable("db2", "tbl2", ignoreIfNotExists = false, 
purge = false)
-intercept[AnalysisException] {
-  sessionCatalog1.dropDatabase("db2", ignoreIfNotExists = false, 
cascade = false)
+withBasicCatalog { catalog =>
+  catalog.externalCatalog.dropTable("db2", "tbl1", ignoreIfNotExists = 
false, purge = false)
+  catalog.externalCatalog.dropTable("db2", "tbl2", ignoreIfNotExists = 
false, purge = false)
+  intercept[AnalysisException] {
+catalog.dropDatabase("db2", ignoreIfNotExists = false, cascade = 
false)
+  }
 }
-
-// Throw exception if there are tables left
-val externalCatalog2 = newBasicCatalog()
-val sessionCatalog2 = new SessionCatalog(externalCatalog2)
-externalCatalog2.dropFunction("db2", 

[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106336045
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -999,257 +1083,279 @@ class SessionCatalogSuite extends PlanTest {
   expectedParts: CatalogTablePartition*): Boolean = {
 // ExternalCatalog may set a default location for partitions, here we 
ignore the partition
 // location when comparing them.
-actualParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet ==
-  expectedParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet
+val actualPartsNormalize = actualParts.map(p =>
+  p.copy(parameters = Map.empty, storage = p.storage.copy(
+properties = Map.empty, locationUri = None, serde = None))).toSet
+
+val expectedPartsNormalize = expectedParts.map(p =>
+p.copy(parameters = Map.empty, storage = p.storage.copy(
+  properties = Map.empty, locationUri = None, serde = None))).toSet
+
+actualPartsNormalize == expectedPartsNormalize
+//actualParts.map(p =>
+//  p.copy(storage = p.storage.copy(
+//properties = Map.empty, locationUri = None))).toSet ==
+//  expectedParts.map(p =>
+//p.copy(storage = p.storage.copy(properties = Map.empty, 
locationUri = None))).toSet
--- End diff --

sorry, let me remove it
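
For comparison, the cleaned-up version with the commented-out block dropped would look roughly like this (a sketch; `samePartitionsNormalized` is an illustrative name, and the normalized fields are taken from the diff above):

```scala
// ExternalCatalog implementations may fill in parameters, storage
// properties, the location, and the serde differently, so blank them
// out on both sides before comparing the partition sets.
private def samePartitionsNormalized(
    actualParts: Seq[CatalogTablePartition],
    expectedParts: Seq[CatalogTablePartition]): Boolean = {
  def normalize(p: CatalogTablePartition): CatalogTablePartition = p.copy(
    parameters = Map.empty,
    storage = p.storage.copy(properties = Map.empty, locationUri = None, serde = None))
  actualParts.map(normalize).toSet == expectedParts.map(normalize).toSet
}
```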





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106335967
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -1270,6 +1376,7 @@ class SessionCatalogSuite extends PlanTest {
   }
 
   assert(cause.getMessage.contains("Undefined function: 
'undefined_fn'"))
+  catalog.reset()
--- End diff --

here the `SessionCatalog` is instantiated with a different `conf` parameter.
In `withBasicCatalog`, the conf is just left at its default. A rough sketch of the pattern:
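
A sketch of what such a test does instead of going through the shared helper (assuming this code base's `SimpleCatalystConf` and `SimpleFunctionRegistry`; the exact constructor arguments are an assumption):

```scala
// Build a catalog with a non-default conf; the withBasicCatalog helper
// always uses the default conf, so it cannot be reused here, and the
// test must perform the reset() cleanup itself.
val conf = SimpleCatalystConf(caseSensitiveAnalysis = false)
val catalog = new SessionCatalog(newBasicCatalog(), new SimpleFunctionRegistry, conf)
try {
  // ... assertions that depend on this specific conf ...
} finally {
  catalog.reset()
}
```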





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106335778
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -999,257 +1083,279 @@ class SessionCatalogSuite extends PlanTest {
   expectedParts: CatalogTablePartition*): Boolean = {
 // ExternalCatalog may set a default location for partitions, here we 
ignore the partition
 // location when comparing them.
-actualParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet ==
-  expectedParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet
+val actualPartsNormalize = actualParts.map(p =>
--- End diff --

Yes, it is~





[GitHub] spark issue #17287: [SPARK-19945][SQL]add test suite for SessionCatalog with...

2017-03-15 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17287
  
@gatorsmile @cloud-fan I have fixed some of the review comments, could you help to continue reviewing this?





[GitHub] spark issue #17305: [SPARK-19961][SQL][minor]unify an error msg when drop data...

2017-03-15 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17305
  
yes, it is a trivial JIRA~





[GitHub] spark issue #17305: [SPARK-19961][SQL][minor]unify an error msg when drop data...

2017-03-15 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17305
  
cc @cloud-fan 





[GitHub] spark pull request #17305: [SPARK-19961][SQL][minor]unify an error msg when dr...

2017-03-15 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17305

[SPARK-19961][SQL][minor]unify an error msg when dropping a database for HiveExternalCatalog and InMemoryCatalog


## What changes were proposed in this pull request?

unify the exception error msg for dropDatabase when the database still has some tables.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark unifyErromsg

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17305.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17305


commit e25c3fb33fd5cf7f07fa927d075a6ae06a4c3df9
Author: windpiger <song...@outlook.com>
Date:   2017-03-15T14:55:17Z

[SPARK-19961][SQL]unify an error msg when dropping a database for HiveExternalCatalog and InMemoryCatalog







[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-15 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106145746
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -27,41 +27,83 @@ import 
org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{Range, SubqueryAlias, 
View}
 
+class InMemorySessionCatalogSuite extends SessionCatalogSuite {
+  protected val utils = new CatalogTestUtils {
+override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
+override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
+override val defaultProvider: String = "parquet"
+override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
+  }
+}
+
 /**
- * Tests for [[SessionCatalog]] that assume that [[InMemoryCatalog]] is 
correctly implemented.
+ * Tests for [[SessionCatalog]]
  *
  * Note: many of the methods here are very similar to the ones in 
[[ExternalCatalogSuite]].
  * This is because [[SessionCatalog]] and [[ExternalCatalog]] share many 
similar method
  * signatures but do not extend a common parent. This is largely by design 
but
  * unfortunately leads to very similar test code in two places.
  */
-class SessionCatalogSuite extends PlanTest {
-  private val utils = new CatalogTestUtils {
-override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
-override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
-override val defaultProvider: String = "parquet"
-override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
-  }
+abstract class SessionCatalogSuite extends PlanTest {
+  protected val utils: CatalogTestUtils
+
+  protected val isHiveExternalCatalog = false
 
   import utils._
 
+  val BASIC = "basic"
+  val EMPTY = "empty"
+
+  private def withSessionCatalog(
--- End diff --

ok~ thanks~





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-14 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r106065119
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -76,468 +118,499 @@ class SessionCatalogSuite extends PlanTest {
   }
 
   test("create databases using invalid names") {
-val catalog = new SessionCatalog(newEmptyCatalog())
-testInvalidName(name => catalog.createDatabase(newDb(name), 
ignoreIfExists = true))
+withSessionCatalog(EMPTY) { catalog =>
+  testInvalidName(
+name => catalog.createDatabase(newDb(name), ignoreIfExists = true))
+}
   }
 
   test("get database when a database exists") {
-val catalog = new SessionCatalog(newBasicCatalog())
-val db1 = catalog.getDatabaseMetadata("db1")
-assert(db1.name == "db1")
-assert(db1.description.contains("db1"))
+withSessionCatalog() { catalog =>
+  val db1 = catalog.getDatabaseMetadata("db1")
+  assert(db1.name == "db1")
+  assert(db1.description.contains("db1"))
+}
   }
 
   test("get database should throw exception when the database does not 
exist") {
-val catalog = new SessionCatalog(newBasicCatalog())
-intercept[NoSuchDatabaseException] {
-  catalog.getDatabaseMetadata("db_that_does_not_exist")
+withSessionCatalog() { catalog =>
+  intercept[NoSuchDatabaseException] {
+catalog.getDatabaseMetadata("db_that_does_not_exist")
+  }
 }
   }
 
   test("list databases without pattern") {
-val catalog = new SessionCatalog(newBasicCatalog())
-assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", 
"db3"))
+withSessionCatalog() { catalog =>
+  assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", 
"db3"))
+}
   }
 
   test("list databases with pattern") {
-val catalog = new SessionCatalog(newBasicCatalog())
-assert(catalog.listDatabases("db").toSet == Set.empty)
-assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", "db3"))
-assert(catalog.listDatabases("*1").toSet == Set("db1"))
-assert(catalog.listDatabases("db2").toSet == Set("db2"))
+withSessionCatalog() { catalog =>
+  assert(catalog.listDatabases("db").toSet == Set.empty)
+  assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", 
"db3"))
+  assert(catalog.listDatabases("*1").toSet == Set("db1"))
+  assert(catalog.listDatabases("db2").toSet == Set("db2"))
+}
   }
 
   test("drop database") {
-val catalog = new SessionCatalog(newBasicCatalog())
-catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = false)
-assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+withSessionCatalog() { catalog =>
+  catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = 
false)
+  assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+}
   }
 
   test("drop database when the database is not empty") {
 // Throw exception if there are functions left
-val externalCatalog1 = newBasicCatalog()
-val sessionCatalog1 = new SessionCatalog(externalCatalog1)
-externalCatalog1.dropTable("db2", "tbl1", ignoreIfNotExists = false, 
purge = false)
-externalCatalog1.dropTable("db2", "tbl2", ignoreIfNotExists = false, 
purge = false)
-intercept[AnalysisException] {
-  sessionCatalog1.dropDatabase("db2", ignoreIfNotExists = false, 
cascade = false)
+withSessionCatalogAndExternal() { (catalog, externalCatalog) =>
+  externalCatalog.dropTable("db2", "tbl1", ignoreIfNotExists = false, 
purge = false)
+  externalCatalog.dropTable("db2", "tbl2", ignoreIfNotExists = false, 
purge = false)
+  intercept[AnalysisException] {
+catalog.dropDatabase("db2", ignoreIfNotExists = false, cascade = 
false)
+  }
 }
-
-// Throw exception if there are tables left
-val externalCatalog2 = newBasicCatalog()
-val sessionCatalog2 = new SessionCatalog(externalCatal

[GitHub] spark issue #17272: [SPARK-19724][SQL]create a managed table with an existed...

2017-03-14 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17272
  
retest this please





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-14 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17272#discussion_r105889902
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1892,7 +1892,7 @@ def test_list_tables(self):
 self.assertEquals(spark.catalog.listTables(), [])
 self.assertEquals(spark.catalog.listTables("some_db"), [])
 spark.createDataFrame([(1, 1)]).createOrReplaceTempView("temp_tab")
-spark.sql("CREATE TABLE tab1 (name STRING, age INT) USING parquet")
+spark.sql("CREATE TABLE tbl1 (name STRING, age INT) USING parquet")
--- End diff --

the python test failed:
```
the 
location('file:/home/jenkins/workspace/SparkPullRequestBuilder/spark-warehouse/tab1')
 of table('`default`.`tab1`') already exists.;
```
the location of tab1 was not deleted. I tried to find out which test case forgot to delete it (searched all test cases containing `tab1` and ran them), but they all delete the location of tab1 correctly, so here we change the table name to work around it.






[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-14 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r105860834
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -999,257 +1094,279 @@ class SessionCatalogSuite extends PlanTest {
   expectedParts: CatalogTablePartition*): Boolean = {
 // ExternalCatalog may set a default location for partitions, here we 
ignore the partition
 // location when comparing them.
-actualParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet ==
-  expectedParts.map(p => p.copy(storage = p.storage.copy(locationUri = 
None))).toSet
+val actualPartsNormalize = actualParts.map(p =>
--- End diff --

this is the expected part.

https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala#L857

here we add some normalization logic





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-14 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17287#discussion_r105858890
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
 ---
@@ -27,41 +27,83 @@ import 
org.apache.spark.sql.catalyst.parser.CatalystSqlParser
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{Range, SubqueryAlias, 
View}
 
+class InMemorySessionCatalogSuite extends SessionCatalogSuite {
+  protected val utils = new CatalogTestUtils {
+override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
+override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
+override val defaultProvider: String = "parquet"
+override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
+  }
+}
+
 /**
- * Tests for [[SessionCatalog]] that assume that [[InMemoryCatalog]] is 
correctly implemented.
+ * Tests for [[SessionCatalog]]
  *
  * Note: many of the methods here are very similar to the ones in 
[[ExternalCatalogSuite]].
  * This is because [[SessionCatalog]] and [[ExternalCatalog]] share many 
similar method
  * signatures but do not extend a common parent. This is largely by design 
but
  * unfortunately leads to very similar test code in two places.
  */
-class SessionCatalogSuite extends PlanTest {
-  private val utils = new CatalogTestUtils {
-override val tableInputFormat: String = 
"com.fruit.eyephone.CameraInputFormat"
-override val tableOutputFormat: String = 
"com.fruit.eyephone.CameraOutputFormat"
-override val defaultProvider: String = "parquet"
-override def newEmptyCatalog(): ExternalCatalog = new InMemoryCatalog
-  }
+abstract class SessionCatalogSuite extends PlanTest {
+  protected val utils: CatalogTestUtils
+
+  protected val isHiveExternalCatalog = false
 
   import utils._
 
+  val BASIC = "basic"
+  val EMPTY = "empty"
+
+  private def withSessionCatalog(
+  catalogTpe: String = BASIC)(f: SessionCatalog => Unit): Unit = {
+val catalog = catalogTpe match {
+  case BASIC =>
+new SessionCatalog(newBasicCatalog())
+  case EMPTY =>
+new SessionCatalog(newEmptyCatalog())
+}
+catalog.createDatabase(newDb("default"), ignoreIfExists = true)
+try {
+  f(catalog)
+} finally {
+  catalog.reset()
--- End diff --

reset the catalog





[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...

2017-03-14 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17287

[SPARK-19945][SQL]add test suite for SessionCatalog with HiveExternalCatalog


## What changes were proposed in this pull request?

Currently `SessionCatalogSuite` only covers `InMemoryCatalog`; there is no suite for `HiveExternalCatalog`. And some DDL functions are not suitable to test in `ExternalCatalogSuite`, because their logic is not fully implemented in `ExternalCatalog`; these DDL functions are fully implemented in `SessionCatalog`, so it is better to test them in `SessionCatalogSuite`.

So we should add a test suite for `SessionCatalog` with `HiveExternalCatalog`.

## How was this patch tested?
add `HiveExternalSessionCatalogSuite`
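
A minimal sketch of what the Hive-side suite could look like, mirroring the `InMemorySessionCatalogSuite` shape discussed in this thread (the suite name comes from this description; the catalog wiring below is an assumption):

```scala
// Assumed wiring: reuse TestHive's shared HiveExternalCatalog instead of
// constructing a fresh metastore per suite.
class HiveExternalSessionCatalogSuite extends SessionCatalogSuite with TestHiveSingleton {
  protected val utils = new CatalogTestUtils {
    override val tableInputFormat: String = "org.apache.hadoop.mapred.SequenceFileInputFormat"
    override val tableOutputFormat: String = "org.apache.hadoop.mapred.SequenceFileOutputFormat"
    override val defaultProvider: String = "hive"
    override def newEmptyCatalog(): ExternalCatalog =
      spark.sharedState.externalCatalog
  }

  override protected val isHiveExternalCatalog = true
}
```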

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark sessioncatalogsuit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17287.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17287


commit 44684461c86d2810ffdfae23485d74fba3801e12
Author: windpiger <song...@outlook.com>
Date:   2017-03-14T09:11:57Z

[SPARK-19945][SQL]add test suite for SessionCatalog with HiveExternalCatalog







[GitHub] spark issue #17272: [SPARK-19724][SQL]create a managed table with an existed...

2017-03-14 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17272
  
retest this please





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-14 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17272#discussion_r105834140
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1901,7 +1901,7 @@ def test_list_tables(self):
 self.assertEquals(len(tables), 2)
 self.assertEquals(len(tablesSomeDb), 2)
 self.assertEquals(tables[0], Table(
-name="tab1",
+name="t1",
--- End diff --

the python test failed:
```
the 
location('file:/home/jenkins/workspace/SparkPullRequestBuilder/spark-warehouse/tab1')
 of table('`default`.`tab1`') already exists.;
```
the location of tab1 was not deleted. I tried to find out which test case forgot to delete it (searched all test cases containing `tab1` and ran them), but they all delete the location of tab1 correctly, so here we change the table name to work around it.






[GitHub] spark issue #17272: [SPARK-19724][SQL]create a managed table with an existed...

2017-03-13 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17272
  
retest this please





[GitHub] spark issue #17272: [SPARK-19724][SQL]create a managed table with an existed...

2017-03-13 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17272
  
retest this please





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-13 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17272#discussion_r105601636
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
 ---
@@ -155,6 +154,8 @@ case class CreateDataSourceTableAsSelectCommand(
   } else {
 table.storage.locationUri
   }
+
+  sparkSession.sessionState.catalog.checkTableOrPathExists(table, 
ignoreIfExists = false)
--- End diff --

because the following `saveDataIntoTable` will create the table path and store data in it, we should not check the path in the subsequent `createTable`; we check it here, before `saveDataIntoTable` (a rough sketch follows)
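
A rough sketch of what such a check can look like (the method name comes from the diff; the body, `hadoopConf`, and the exact error message are assumptions):

```scala
// Fail fast, before saveDataIntoTable writes anything: either an existing
// table name or a non-empty pre-existing location should abort the CTAS.
def checkTableOrPathExists(table: CatalogTable, ignoreIfExists: Boolean): Unit = {
  if (!ignoreIfExists) {
    if (tableExists(table.identifier)) {
      throw new TableAlreadyExistsException(
        db = table.identifier.database.getOrElse(getCurrentDatabase),
        table = table.identifier.table)
    }
    table.storage.locationUri.foreach { loc =>
      val path = new Path(loc)
      val fs = path.getFileSystem(hadoopConf)
      if (fs.exists(path) && fs.listStatus(path).nonEmpty) {
        throw new AnalysisException(
          s"the location('$loc') of table('${table.identifier}') already exists.")
      }
    }
  }
}
```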





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-13 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17272#discussion_r105601455
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -164,15 +164,6 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 assert(actual.tableType === CatalogTableType.EXTERNAL)
   }
 
-  test("create table when the table already exists") {
--- End diff --

since `InMemoryCatalog` and `HiveExternalCatalog` do not check whether the table exists, the test case is moved to `SessionCatalogSuite` and `HiveExternalCatalogSuite`





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-13 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17272#discussion_r105601271
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala
 ---
@@ -187,38 +187,32 @@ class InMemoryCatalog(
 val db = tableDefinition.identifier.database.get
 requireDbExists(db)
 val table = tableDefinition.identifier.table
-if (tableExists(db, table)) {
--- End diff --

we already check whether the table exists in `SessionCatalog.createTable` (a simplified sketch follows)
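
A simplified sketch of that check (the real method also normalizes names and handles temporary views; details here are assumed):

```scala
// SessionCatalog.createTable, reduced to the existence check: fail here
// once, instead of duplicating the check in every ExternalCatalog.
def createTable(tableDefinition: CatalogTable, ignoreIfExists: Boolean): Unit = {
  val db = tableDefinition.identifier.database.getOrElse(getCurrentDatabase)
  val table = tableDefinition.identifier.table
  requireDbExists(db)
  if (tableExists(TableIdentifier(table, Some(db))) && !ignoreIfExists) {
    throw new TableAlreadyExistsException(db = db, table = table)
  }
  externalCatalog.createTable(tableDefinition, ignoreIfExists)
}
```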





[GitHub] spark pull request #17057: [SPARK-19724][SQL][WIP]create a managed table wit...

2017-03-13 Thread windpiger
Github user windpiger closed the pull request at:

https://github.com/apache/spark/pull/17057





[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-03-13 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17272

[SPARK-19724][SQL]create a managed table with an existing default table path should throw an exception

## What changes were proposed in this pull request?

This JIRA is follow-up work after [SPARK-19583](https://issues.apache.org/jira/browse/SPARK-19583)

As we discussed in that [PR](https://github.com/apache/spark/pull/16938)

The following DDL for a managed table with an existing default location should throw an exception:
```
CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ...
CREATE TABLE ... (PARTITIONED BY ...)
```
Currently some situations are not consistent with the above logic:

1. CREATE TABLE ... (PARTITIONED BY ...) succeeds with an existing default location
situation: both hive/datasource (with HiveExternalCatalog/InMemoryCatalog)

2. CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ...
situation: a hive table succeeds with an existing default location

This PR makes the above two situations consistent with that logic: both should throw an exception when the default location already exists.

## How was this patch tested?
unit test added

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark managedTableLocationExists

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17272


commit a0ba4190a96e2698e9c485b0ccbb7298a360f1dd
Author: windpiger <song...@outlook.com>
Date:   2017-03-13T07:24:18Z

[SPARK-19724][SQL]create a managed table with an existing default table path should throw an exception







[GitHub] spark issue #15918: [SPARK-18122][SQL][WIP]Fallback to Kryo for unsupported ...

2017-03-12 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/15918
  
this change of `schemaFor` will affect the default behavior.
For example:
[udf in functions.scala](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L3177)
if the input type is not supported and falls back to kryo with BinaryType, `inputTypes` used to be `Nil`, which is fine for the following expression's type check; but after the kryo fallback the type check fails (BinaryType is not equal to the return type). So I kept the logic that throws an exception in `schemaFor` for un-nested unsupported types, while complex nested types (including `Map` etc.) fall back to kryo.

Even if I use `schemaForDefaultBinaryType` instead of `schemaFor` in `deserializerFor`, I think it is ok for this serde situation. But I am not sure whether the change in `schemaFor` would affect other logic like the `udf` described above.





[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-03-12 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17001
  
@gatorsmile thanks for your suggestion~





[GitHub] spark pull request #17211: [SPARK-19869][SQL]move table related ddl from ddl...

2017-03-12 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17211#discussion_r105576214
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -167,599 +153,6 @@ case class DescribeDatabaseCommand(
   }
 }
--- End diff --

ok~ thanks. Another problem: tables.scala is now 1594 lines, which is too large. Should we split it to reduce the size of the file?





[GitHub] spark issue #17254: [SPARK-19917][SQL]qualified partition path stored in cat...

2017-03-11 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17254
  
cc @cloud-fan @gatorsmile





[GitHub] spark issue #17254: [SPARK-19917][SQL]qualified partition path stored in cat...

2017-03-11 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17254
  
retest this please





[GitHub] spark pull request #17055: [SPARK-19723][SQL]create datasource table with an...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17055#discussion_r105526116
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -2016,48 +2016,79 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 }
   }
 
+  test("create datasource table with a non-existing location") {
+withTable("t", "t1") {
+  withTempPath { dir =>
+spark.sql(s"CREATE TABLE t(a int, b int) USING parquet LOCATION 
'$dir'")
+
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(table.location == makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t SELECT 1, 2")
+assert(dir.exists())
+
+checkAnswer(spark.table("t"), Row(1, 2))
+  }
+  // partition table
+  withTempPath { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t1(a int, b int) USING parquet PARTITIONED 
BY(a) LOCATION '$dir'
+   """.stripMargin)
+
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+assert(table.location == makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t1 PARTITION(a=1) SELECT 2")
+
+val partDir = new File(dir, "a=1")
+assert(partDir.exists())
+
+checkAnswer(spark.table("t1"), Row(2, 1))
+  }
+}
+  }
+
   Seq(true, false).foreach { shouldDelete =>
-val tcName = if (shouldDelete) "non-existent" else "existed"
+val tcName = if (shouldDelete) "non-existing" else "existed"
 test(s"CTAS for external data source table with a $tcName location") {
   withTable("t", "t1") {
-withTempDir {
-  dir =>
-if (shouldDelete) {
-  dir.delete()
-}
-spark.sql(
-  s"""
- |CREATE TABLE t
- |USING parquet
- |LOCATION '$dir'
- |AS SELECT 3 as a, 4 as b, 1 as c, 2 as d
-   """.stripMargin)
-val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
-assert(table.location == 
makeQualifiedPath(dir.getAbsolutePath))
+withTempDir { dir =>
+  if (shouldDelete) {
+dir.delete()
+  }
--- End diff --

ok thanks~~





[GitHub] spark pull request #17254: [SPARK-19917][SQL]qualified partition path stored...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17254#discussion_r105526016
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -2180,6 +2181,13 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 
   withTempDir { dir =>
 assert(!dir.getAbsolutePath.startsWith("file:/"))
+spark.sql(s"ALTER TABLE t SET LOCATION '$dir'")
--- End diff --

ALTER TABLE SET LOCATION should also be qualified





[GitHub] spark pull request #17254: [SPARK-19917][SQL]qualified partition path stored...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17254#discussion_r105526010
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1357,13 +1360,11 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
   "PARTITION (a='2', b='6') LOCATION 'paris' PARTITION (a='3', b='7')")
 assert(catalog.listPartitions(tableIdent).map(_.spec).toSet == 
Set(part1, part2, part3))
 assert(catalog.getPartition(tableIdent, 
part1).storage.locationUri.isDefined)
-val partitionLocation = if (isUsingHiveMetastore) {
-  val tableLocation = 
catalog.getTableMetadata(tableIdent).storage.locationUri
-  assert(tableLocation.isDefined)
-  makeQualifiedPath(new Path(tableLocation.get.toString, 
"paris").toString)
-} else {
-  new URI("paris")
--- End diff --

ALTER TABLE ADD PARTITION LOCATION: a relative location will be qualified against the table location as its parent path






[GitHub] spark pull request #17254: [SPARK-19917][SQL]qualified partition path stored...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17254#discussion_r105526000
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1201,7 +1202,9 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 verifyLocation(new URI("/swanky/steak/place"))
 // set table partition location without explicitly specifying database
 sql("ALTER TABLE tab1 PARTITION (a='1', b='2') SET LOCATION 'vienna'")
-verifyLocation(new URI("vienna"), Some(partSpec))
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("tab1"))
--- End diff --

a relative location will be qualified against the table location as its parent path





[GitHub] spark pull request #17254: [SPARK-19917][SQL]qualified partition path stored...

2017-03-10 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17254

[SPARK-19917][SQL]qualified partition path stored in catalog

## What changes were proposed in this pull request?

the partition path should be qualified before it is stored in the catalog. There are several scenarios:
1. ALTER TABLE t PARTITION(b=1) SET LOCATION '/path/x'
   qualified: file:/path/x
2. ALTER TABLE t PARTITION(b=1) SET LOCATION 'x'
   qualified: file:/tablelocation/x
3. ALTER TABLE t ADD PARTITION(b=1) LOCATION '/path/x'
   qualified: file:/path/x
4. ALTER TABLE t ADD PARTITION(b=1) LOCATION 'x'
   qualified: file:/tablelocation/x

Currently only ALTER TABLE t ADD PARTITION(b=1) LOCATION for Hive serde tables has the expected qualified path; we should make the other scenarios consistent with it (a rough sketch of the rule is shown below).
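
A minimal sketch of the qualification rule above (illustrative names; not the PR's exact code):

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Absolute paths get a filesystem scheme; relative paths are resolved
// against the table location as the parent directory.
def qualifyPartitionLocation(
    location: String,
    tableLocation: URI,
    hadoopConf: Configuration): URI = {
  val path = new Path(location)
  if (path.isAbsolute) {
    path.getFileSystem(hadoopConf).makeQualified(path).toUri  // '/path/x' -> file:/path/x
  } else {
    new Path(new Path(tableLocation), path).toUri             // 'x' -> file:/tablelocation/x
  }
}
```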

Another change is for alter table location.

## How was this patch tested?
modify existing TestCases


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark qualifiedPartitionPath

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17254.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17254


commit 51dcf4092345e173a63e088e779fa44c4fc963b1
Author: windpiger <song...@outlook.com>
Date:   2017-03-11T04:39:02Z

[SPARK-19917][SQL]qualified partition path stored in catalog

commit 996a84d8d1ece4d311240e71a1a9be47828621f8
Author: windpiger <song...@outlook.com>
Date:   2017-03-11T04:42:32Z

remove empty line







[GitHub] spark pull request #17055: [SPARK-19723][SQL]create datasource table with an...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17055#discussion_r105520260
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala 
---
@@ -1663,95 +1663,127 @@ class HiveDDLSuite
 }
   }
 
+  test("create hive table with a non-existing location") {
+withTable("t", "t1") {
+  withTempPath { dir =>
+spark.sql(s"CREATE TABLE t(a int, b int) USING hive LOCATION 
'$dir'")
+
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(table.location == makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t SELECT 1, 2")
+assert(dir.exists())
+
+checkAnswer(spark.table("t"), Row(1, 2))
+  }
+  // partition table
+  withTempPath { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t1(a int, b int)
+ |USING hive
+ |PARTITIONED BY(a)
+ |LOCATION '$dir'
+   """.stripMargin)
+
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+assert(table.location == makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t1 PARTITION(a=1) SELECT 2")
+
+val partDir = new File(dir, "a=1")
+assert(partDir.exists())
+
+checkAnswer(spark.table("t1"), Row(2, 1))
+  }
+}
+  }
+
   Seq(true, false).foreach { shouldDelete =>
-val tcName = if (shouldDelete) "non-existent" else "existed"
+val tcName = if (shouldDelete) "non-existing" else "existed"
 test(s"CTAS for external data source table with a $tcName location") {
--- End diff --

oh, it is. After we also qualify the location path for InMemoryCatalog, the code for comparing the two locations is the same as for HiveExternalCatalog. thanks~





[GitHub] spark pull request #17055: [SPARK-19723][SQL]create datasource table with an...

2017-03-10 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17055#discussion_r105519542
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -2016,48 +2016,79 @@ abstract class DDLSuite extends QueryTest with 
SQLTestUtils {
 }
   }
 
+  test("create datasource table with a non-existing location") {
+withTable("t", "t1") {
+  withTempPath { dir =>
+spark.sql(s"CREATE TABLE t(a int, b int) USING parquet LOCATION 
'$dir'")
+
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(table.location == makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t SELECT 1, 2")
+assert(dir.exists())
+
+checkAnswer(spark.table("t"), Row(1, 2))
+  }
+  // partition table
+  withTempPath { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t1(a int, b int) USING parquet PARTITIONED 
BY(a) LOCATION '$dir'
+   """.stripMargin)
--- End diff --

this line exceeds 100 characters, so the code style check failed...





[GitHub] spark issue #17055: [SPARK-19723][SQL]create datasource table with a non-ex...

2017-03-09 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17055
  
OK thanks a lot~





[GitHub] spark issue #17055: [SPARK-19723][SQL]create datasource table with a non-ex...

2017-03-09 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17055
  
@gatorsmile I have merged with master; if it is ok, could you help to merge it?





[GitHub] spark issue #17095: [SPARK-19763][SQL]qualified external datasource table lo...

2017-03-09 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17095
  
retest this please





[GitHub] spark pull request #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in ...

2017-03-09 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16592#discussion_r105111257
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
@@ -306,6 +306,11 @@ private[sql] trait SQLTestUtils
 val fs = hadoopPath.getFileSystem(spark.sessionState.newHadoopConf())
 fs.makeQualified(hadoopPath).toUri
   }
+
+  def makeQualifiedPath(path: Path): URI = {
--- End diff --

I have fixed the above two issues in
https://github.com/apache/spark/pull/17095





[GitHub] spark issue #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuit...

2017-03-08 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16592
  
great~





[GitHub] spark issue #17211: [SPARK-19869][SQL]move table related ddl from ddl.scala ...

2017-03-08 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17211
  
cc @cloud-fan





[GitHub] spark pull request #17211: [SPARK-19869][SQL]move table related ddl from ddl...

2017-03-08 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17211

[SPARK-19869][SQL]move table related ddl from ddl.scala to tables.scala

## What changes were proposed in this pull request?

move table related ddl from ddl.scala to tables.scala

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark moveToTables

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17211


commit 6f0e24a61bab73532cd50d71a065c1bbf348024c
Author: windpiger <song...@outlook.com>
Date:   2017-03-08T14:24:02Z

[SPARK-19869][SQL]move table related ddl from ddl.scala to tables.scala

commit c31ae7fa495ac6068446def66cb8d1128c1448c6
Author: windpiger <song...@outlook.com>
Date:   2017-03-08T14:25:59Z

remove a todo comment







[GitHub] spark issue #17055: [SPARK-19723][SQL]create datasource table with a non-ex...

2017-03-08 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17055
  
@gatorsmile could this PR be merged?





[GitHub] spark pull request #17204: [SPARK-19864][SQL]provide a makeQualifiedPath fun...

2017-03-07 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17204

[SPARK-19864][SQL]provide a makeQualifiedPath function to optimize some code


## What changes were proposed in this pull request?

Currently there are many places where paths are made qualified inline; it is 
better to provide a single helper function for this, which makes the code simpler.
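
A minimal sketch of what such a helper could look like, assuming only Hadoop's `Path`/`FileSystem` API; the name `makeQualifiedPath` comes from the title, but the signature and body here are illustrative, not necessarily the patch's implementation:

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Qualify a possibly scheme-less path against its filesystem, e.g.
// "spark-warehouse/db" -> "file:/current/dir/spark-warehouse/db".
def makeQualifiedPath(path: String, hadoopConf: Configuration): URI = {
  val hadoopPath = new Path(path)
  val fs = hadoopPath.getFileSystem(hadoopConf)
  hadoopPath.makeQualified(fs.getUri, fs.getWorkingDirectory).toUri
}
```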

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark addQualifiledPathUtil

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17204.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17204


commit 7125a1bce9e013e3b45a02a87c2850cf41812348
Author: windpiger <song...@outlook.com>
Date:   2017-03-08T04:20:53Z

[SPARK-19864][SQL]provide a makeQualifiedPath function to optimize some code







[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...

2017-03-07 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17001
  
retest this please





[GitHub] spark pull request #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PART...

2017-03-07 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17176#discussion_r104660965
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -159,36 +159,11 @@ class HadoopTableReader(
 def verifyPartitionPath(
 partitionToDeserializer: Map[HivePartition, Class[_ <: 
Deserializer]]):
 Map[HivePartition, Class[_ <: Deserializer]] = {
-  if (!sparkSession.sessionState.conf.verifyPartitionPath) {
--- End diff --

After https://github.com/apache/spark/pull/17187, reading a Hive table that 
does not use `stored by` no longer goes through `HiveTableScanExec`.

This function also has a bug when the partition path is a custom path:
1. it still filters every partition path in the parameter 
`partitionToDeserializer`, and
2. it scans paths that do not belong to the table. For example, if the custom 
path is `/root/a` and the partitionSpec is `b=1/c=2`, `getPathPatternByPath` 
causes `/` to be scanned.
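
A simplified illustration of that second point (a sketch only, not the actual Spark code): `getPathPatternByPath` walks up one directory per partition column and replaces each level with `*`, which climbs above the table root when the location is a custom path.

```scala
import org.apache.hadoop.fs.Path

// Strip `parNum` trailing directories and replace them with "*" globs.
def getPathPatternByPath(parNum: Int, tempPath: Path): String = {
  var path = tempPath
  for (_ <- 1 to parNum) path = path.getParent
  path.toString.stripSuffix("/") + "/*" * parNum
}

// Normal layout, two partition columns under the table root:
getPathPatternByPath(2, new Path("/demo/data/year=1/month=2")) // "/demo/data/*/*"

// Custom location with spec b=1/c=2: the glob becomes "/*/*",
// i.e. the whole filesystem root gets scanned.
getPathPatternByPath(2, new Path("/root/a"))                   // "/*/*"
```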





[GitHub] spark pull request #17187: [SPARK-19847][SQL] port hive read to FileFormat A...

2017-03-07 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17187#discussion_r104643185
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -170,6 +169,40 @@ object HiveAnalysis extends Rule[LogicalPlan] {
   }
 }
 
+class ReadHiveDataSource(session: SparkSession) extends Rule[LogicalPlan] {
+  import 
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE
+
+  override def apply(plan: LogicalPlan): LogicalPlan = plan 
resolveOperators {
+case c: CatalogRelation if DDLUtils.isHiveTable(c.tableMeta) &&
+// Hive tables with storage handler with be handled in a different 
way, in `HiveTableScans`.
--- End diff --

with be -> will be





[GitHub] spark issue #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PARTITION_P...

2017-03-06 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17176
  
Why did Jenkins fail...





[GitHub] spark pull request #17097: [SPARK-19765][SQL] UNCACHE TABLE should un-cache ...

2017-03-06 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17097#discussion_r104570109
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala ---
@@ -88,46 +90,81 @@ class CacheManager extends Logging {
   query: Dataset[_],
   tableName: Option[String] = None,
   storageLevel: StorageLevel = MEMORY_AND_DISK): Unit = writeLock {
-val planToCache = query.queryExecution.analyzed
+val planToCache = query.logicalPlan
 if (lookupCachedData(planToCache).nonEmpty) {
   logWarning("Asked to cache already cached data.")
 } else {
   val sparkSession = query.sparkSession
-  cachedData +=
-CachedData(
-  planToCache,
-  InMemoryRelation(
-sparkSession.sessionState.conf.useCompression,
-sparkSession.sessionState.conf.columnBatchSize,
-storageLevel,
-
sparkSession.sessionState.executePlan(planToCache).executedPlan,
-tableName))
+  cachedData.add(CachedData(
+planToCache,
+InMemoryRelation(
+  sparkSession.sessionState.conf.useCompression,
+  sparkSession.sessionState.conf.columnBatchSize,
+  storageLevel,
+  sparkSession.sessionState.executePlan(planToCache).executedPlan,
+  tableName)))
 }
   }
 
   /**
-   * Tries to remove the data for the given [[Dataset]] from the cache.
-   * No operation, if it's already uncached.
+   * Un-cache all the cache entries that refer to the given plan.
+   */
+  def uncacheQuery(query: Dataset[_], blocking: Boolean = true): Unit = 
writeLock {
--- End diff --

There is another scenario: after altering a datasource table's location, we 
cannot uncache the table.
[JIRA](https://issues.apache.org/jira/browse/SPARK-19845)





[GitHub] spark issue #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PARTITION_P...

2017-03-06 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17176
  
retest this please





[GitHub] spark pull request #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PART...

2017-03-06 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17176#discussion_r104400143
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -159,36 +159,37 @@ class HadoopTableReader(
 def verifyPartitionPath(
 partitionToDeserializer: Map[HivePartition, Class[_ <: 
Deserializer]]):
 Map[HivePartition, Class[_ <: Deserializer]] = {
-  if (!sparkSession.sessionState.conf.verifyPartitionPath) {
-partitionToDeserializer
-  } else {
-var existPathSet = collection.mutable.Set[String]()
-var pathPatternSet = collection.mutable.Set[String]()
-partitionToDeserializer.filter {
-  case (partition, partDeserializer) =>
-def updateExistPathSetByPathPattern(pathPatternStr: String) {
-  val pathPattern = new Path(pathPatternStr)
-  val fs = pathPattern.getFileSystem(hadoopConf)
-  val matches = fs.globStatus(pathPattern)
-  matches.foreach(fileStatus => existPathSet += 
fileStatus.getPath.toString)
-}
-// convert  /demo/data/year/month/day  to  /demo/data/*/*/*/
-def getPathPatternByPath(parNum: Int, tempPath: Path): String 
= {
+  var existPathSet = collection.mutable.Set[String]()
+  var pathPatternSet = collection.mutable.Set[String]()
+  partitionToDeserializer.filter {
+case (partition, partDeserializer) =>
+  def updateExistPathSetByPathPattern(pathPatternStr: String) {
+val pathPattern = new Path(pathPatternStr)
+val fs = pathPattern.getFileSystem(hadoopConf)
+val matches = fs.globStatus(pathPattern)
+matches.foreach(fileStatus => existPathSet += 
fileStatus.getPath.toString)
+  }
+  // convert  /demo/data/year/month/day  to  /demo/data/*/*/*/
+  def getPathPatternByPath(parNum: Int, tempPath: Path, 
partitionName: String): String = {
+// if the partition path does not end with partition name, we 
should not
--- End diff --

If the partition location has been altered to another location, we should not 
apply this pattern; otherwise we will list files under the pattern that do not 
belong to the partition.





[GitHub] spark pull request #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PART...

2017-03-06 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17176

[SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PARTITION_PATH, always return 
empty when the location does not exist

## What changes were proposed in this pull request?

In SPARK-5068, we introduced a SQLConf `spark.sql.hive.verifyPartitionPath`; 
when it is set to true, tasks do not fail when a partition location does not 
exist in the filesystem.

This situation should always return an empty result instead of failing the 
task, so here we remove the conf.

## How was this patch tested?
modify a test case

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark removeHiveVerfiyPath

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17176


commit 95aa9317b228220961074c04df06e1d08d2d8556
Author: windpiger <song...@outlook.com>
Date:   2017-03-06T09:16:05Z

[SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PARTITION_PATH, always return 
empty when the location does not exist







[GitHub] spark issue #17173: [SPARK-19832][SQL]DynamicPartitionWriteTask get partitio...

2017-03-06 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17173
  
retest this please





[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-03-06 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17149#discussion_r104361311
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -146,7 +147,7 @@ class DDLSuite extends QueryTest with SharedSQLContext 
with BeforeAndAfterEach {
 try {
   sql(s"CREATE DATABASE $dbName")
   val db1 = catalog.getDatabaseMetadata(dbName)
-  val expectedLocation = 
makeQualifiedPath(s"spark-warehouse/$dbName.db")
+  val expectedLocation = new 
Path(makeQualifiedPath(s"spark-warehouse/$dbName.db")).toUri
--- End diff --

OK, maybe we should implement them all? `Option[String]` also appears in many 
places, and it is more complicated.





[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-03-06 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17149#discussion_r104361050
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala
 ---
@@ -162,6 +164,28 @@ object CatalogUtils {
 BucketSpec(numBuckets, normalizedBucketCols, normalizedSortCols)
   }
 
+  /**
+   * Convert URI to String.
+   * Since URI.toString does not decode for the uri string, we need to use
+   * Path(uri).toString to decode it.
+   * @param uri the URI of the path
+   * @return the String of the path
+   */
+  def URIToString(uri: Option[URI]): Option[String] = {
+uri.map(new Path(_).toString)
--- End diff --

What about a `StringToURI`? It still imports `Path`.
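
For reference, a sketch of the counterpart being discussed, assuming it mirrors the `URIToString` in the diff above (the exact name and body are illustrative):

```scala
import java.net.URI

import org.apache.hadoop.fs.Path

// Mirrors URIToString: go through Path so encoding stays consistent,
// which is why Path would still be imported.
def StringToURI(str: Option[String]): Option[URI] = {
  str.map(new Path(_).toUri)
}
```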





[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-03-06 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17149#discussion_r104359591
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -146,7 +147,7 @@ class DDLSuite extends QueryTest with SharedSQLContext 
with BeforeAndAfterEach {
 try {
   sql(s"CREATE DATABASE $dbName")
   val db1 = catalog.getDatabaseMetadata(dbName)
-  val expectedLocation = 
makeQualifiedPath(s"spark-warehouse/$dbName.db")
+  val expectedLocation = new 
Path(makeQualifiedPath(s"spark-warehouse/$dbName.db")).toUri
--- End diff --

I just implemented this function with an `Option[String]`. For a plain 
`String` without an `Option`, it is already simple; do we still need to 
implement one?





[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-03-05 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17149#discussion_r104358279
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -883,7 +885,7 @@ abstract class CatalogTestUtils {
 
   def newFunc(): CatalogFunction = newFunc("funcName")
 
-  def newUriForDatabase(): String = 
Utils.createTempDir().toURI.toString.stripSuffix("/")
+  def newUriForDatabase(): URI = new 
URI(Utils.createTempDir().toURI.toString.stripSuffix("/"))
--- End diff --

@cloud-fan after trying to use `Path` for the comparison, I think `stripSuffix` 
is the simpler way here.

Two `Path`s built with the `String` constructor are equal even when one has a 
trailing `/` and the other does not.
```
scala> val x = new Path("/ab/c/")
x: org.apache.hadoop.fs.Path = /ab/c

scala> val y = new Path("/ab/c")
y: org.apache.hadoop.fs.Path = /ab/c

scala> x == y
res0: Boolean = true
```

Two `Path`s built with the `URI` constructor are not equal when one has a 
trailing `/` and the other does not.
```
scala> val x =new URI("/a/b/c/")
x: java.net.URI = /a/b/c/

scala> val y =new URI("/a/b/c")
y: java.net.URI = /a/b/c

scala> x == y
res1: Boolean = false

scala> val x1 =new Path(x)
x1: org.apache.hadoop.fs.Path = /a/b/c/

scala> val y1 =new Path(y)
y1: org.apache.hadoop.fs.Path = /a/b/c

scala> x1 == y1
res2: Boolean = false
```

So only a `Path` built from a `String` ignores the trailing `/`. If we have a 
`URI` in hand and want to compare it with another `URI`, we first have to 
convert both to `String` and build a `Path` from each string; only after these 
two steps does the comparison ignore the trailing `/`, as the sketch below shows.
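
For example (a sketch; `samePath` is an illustrative name):

```scala
import java.net.URI

import org.apache.hadoop.fs.Path

// Round-trip each URI through String so Path's String constructor
// normalizes the trailing "/" before comparing.
def samePath(x: URI, y: URI): Boolean =
  new Path(x.toString) == new Path(y.toString)

samePath(new URI("/a/b/c/"), new URI("/a/b/c")) // true
```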

But I think that is more complicated. Here we normalize the URI with 
`stripSuffix` at the origin and then compare the two URIs directly, which is 
simpler.

Must we convert to `Path` to compare?






[GitHub] spark pull request #17173: [SPARK-19832][SQL]DynamicPartitionWriteTask get p...

2017-03-05 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/17173

[SPARK-19832][SQL]DynamicPartitionWriteTask get partitionPath should escape 
the partition name

## What changes were proposed in this pull request?

Currently in DynamicPartitionWriteTask, when we build the partitionPath of a 
partition, we only escape the partition value, not the partition name.

This causes problems when the partition name contains special characters, for 
example:
1) If the partition name contains '%' etc., two partition paths are created in 
the filesystem: an escaped path like '/path/a%25b=1' and an unescaped path like 
'/path/a%b=1'. The inserted data is stored under the unescaped path, while 
SHOW PARTITIONS returns the escaped name 'a%25b=1', so the two are 
inconsistent. I think the data should be stored under the escaped path in the 
filesystem, which is also what Hive 2.0.0 does.

2) If the partition name contains ':', `new Path("/path", "a:b")` throws an 
exception, because a colon in a relative path is illegal:

```
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
path in absolute URI: a:b
  at org.apache.hadoop.fs.Path.initialize(Path.java:205)
  at org.apache.hadoop.fs.Path.<init>(Path.java:171)
  at org.apache.hadoop.fs.Path.<init>(Path.java:88)
  ... 48 elided
Caused by: java.net.URISyntaxException: Relative path in absolute URI: a:b
  at java.net.URI.checkPath(URI.java:1823)
  at java.net.URI.<init>(URI.java:745)
  at org.apache.hadoop.fs.Path.initialize(Path.java:202)
  ... 50 more
```
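
A minimal sketch of the intended behavior, assuming `ExternalCatalogUtils.escapePathName` escapes the way Hive does; `partitionPathFragment` is an illustrative helper name, not the patch's code:

```scala
import org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils.escapePathName

// Build one "<name>=<value>" path segment, escaping both sides, so that
// name "a%b" with value "1" yields "a%25b=1" and "a:b" yields "a%3Ab=1".
def partitionPathFragment(name: String, value: String): String =
  s"${escapePathName(name)}=${escapePathName(value)}"
```
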
## How was this patch tested?
unit test added

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark 
fixDatasourceSpecialCharPartitionName

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17173.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17173


commit deb63bd9f3586b7f9c2e203f8486d6a7eb49bc72
Author: windpiger <song...@outlook.com>
Date:   2017-03-06T06:57:25Z

[SPARK-19832][SQL]DynamicPartitionWriteTask get partitionPath should escape 
the partition name







[GitHub] spark pull request #16910: [SPARK-19575][SQL]Reading from or writing to a hi...

2017-03-05 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16910#discussion_r104309883
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala 
---
@@ -1588,6 +1596,147 @@ class HiveDDLSuite
 }
   }
 
+  test("insert data to a hive serde table which has a non-existing 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a string, b int)
+ |USING hive
+ |LOCATION '$dir'
+   """.stripMargin)
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(new Path(table.location) == 
makeQualifiedPath(dir.getAbsolutePath))
+
+val tableLocFile = new File(new URI(table.location))
+tableLocFile.delete()
+assert(!tableLocFile.exists())
+spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
+assert(tableLocFile.exists())
+checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
+
+Utils.deleteRecursively(dir)
+assert(!tableLocFile.exists())
+spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
+assert(tableLocFile.exists())
+checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
+
+val newDirFile = new File(dir, "x")
+val newDirPath = newDirFile.getAbsolutePath
+spark.sql(s"ALTER TABLE t SET LOCATION '$newDirPath'")
+spark.sessionState.catalog.refreshTable(TableIdentifier("t"))
+
+val table1 = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(table1.location == newDirPath)
+assert(!newDirFile.exists())
+
+spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
+checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
+assert(newDirFile.exists())
+  }
+}
+  }
+
+  test("insert into a hive serde table with non-existing partition 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a int, b int, c int, d int)
+ |USING hive
+ |PARTITIONED BY(a, b)
+ |LOCATION "$dir"
+   """.stripMargin)
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(new Path(table.location) == 
makeQualifiedPath(dir.getAbsolutePath))
+
+spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 3, 4")
+checkAnswer(spark.table("t"), Row(3, 4, 1, 2) :: Nil)
+
+val partLoc = new File(dir, "a=1")
+Utils.deleteRecursively(partLoc)
+assert(!partLoc.exists())
+// insert overwrite into a partition which location has been 
deleted.
+spark.sql("INSERT OVERWRITE TABLE t PARTITION(a=1, b=2) SELECT 7, 
8")
+assert(partLoc.exists())
+checkAnswer(spark.table("t"), Row(7, 8, 1, 2) :: Nil)
+
+val newDirFile = new File(dir, "x")
+val newDirPath = newDirFile.getAbsolutePath
+spark.sql(s"ALTER TABLE t PARTITION(a=1, b=2) SET LOCATION 
'$newDirPath'")
+assert(!newDirFile.exists())
+
+// insert into a partition which location does not exists.
+spark.sql("INSERT INTO TABLE t PARTITION(a=1, b=2) SELECT 9, 10")
+assert(newDirFile.exists())
+checkAnswer(spark.table("t"), Row(9, 10, 1, 2) :: Nil)
+  }
+}
+  }
+
+  test("read data from a hive serde table which has a non-existing 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a string, b int)
+ |USING hive
+ |LOCATION "$dir"
+   """.stripMargin)
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+assert(new Path(table.location) == 
makeQualifiedPath(dir.getAbsolutePath))
+
+dir.delete()
+checkAnswer(spark.table("t"), Nil)
+
+val newDirFile = new File(dir, "x")
+val newDirPath = newDirFile.getAbsolutePath
+spark.sql(s"ALTER TABLE t SET LO

[GitHub] spark pull request #16910: [SPARK-19575][SQL]Reading from or writing to a hi...

2017-03-04 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16910#discussion_r104285450
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala 
---
@@ -1588,6 +1590,153 @@ class HiveDDLSuite
 }
   }
 
+  test("insert data to a hive serde table which has a non-existing 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a string, b int)
+ |USING hive
+ |LOCATION '$dir'
--- End diff --

ok thanks~




