[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-11-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r236005686
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2370,4 +2370,17 @@ class HiveDDLSuite
   ))
 }
   }
+
+  test("SPARK-25464 create a database with a non empty location") {
--- End diff --

Do we have a test case to check "create a database with an empty location"?
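
For illustration, a sketch of what such a test could look like (name, helpers, and assertion assumed, mirroring the non-empty-location test in the diff above; `withTempDir` and `sql` are the usual suite helpers):

```scala
test("SPARK-25464 create a database with an empty location") {
  withTempDir { tmpDir =>
    // tmpDir was just created and is empty, so creation should succeed
    val dbName = "dbwithemptylocation"
    try {
      sql(s"CREATE DATABASE $dbName LOCATION '${tmpDir.toURI.getPath}'")
      assert(spark.sessionState.catalog.databaseExists(dbName))
    } finally {
      sql(s"DROP DATABASE IF EXISTS $dbName")
    }
  }
}
```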


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-19 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226673513
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,14 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (!externalCatalog.databaseExists(dbName) && fs.exists(dbPath)
+  && fs.listStatus(dbPath).nonEmpty) {
--- End diff --

As far as I know, this is the only way to check for the empty condition. Also, I have moved this logic to HiveExternalCatalog.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226655438
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,14 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (!externalCatalog.databaseExists(dbName) && fs.exists(dbPath)
+  && fs.listStatus(dbPath).nonEmpty) {
--- End diff --

If the behavior needs to be consistent across all the external catalog implementations (it should be), then I think putting it here is reasonable.

But listing files might be too expensive (e.g. several hours). Is there a better way to check if a directory is empty?
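
For illustration, one cheaper probe (a sketch using the standard Hadoop `FileSystem` API; `listLocatedStatus` returns a lazy `RemoteIterator`, so `hasNext` only has to fetch the first batch of entries rather than materialize the whole listing):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Probe for a single child entry instead of listing everything;
// on HDFS this costs at most one batched RPC, however large the dir.
def isNonEmptyDir(dbPath: Path, hadoopConf: Configuration): Boolean = {
  val fs = dbPath.getFileSystem(hadoopConf)
  fs.exists(dbPath) && fs.listLocatedStatus(dbPath).hasNext
}
```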


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226624218
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,14 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (!externalCatalog.databaseExists(dbName) && fs.exists(dbPath)
+  && fs.listStatus(dbPath).nonEmpty) {
--- End diff --

Oh, btw, I noticed that @gatorsmile is asking to do the file-existence check in the external catalog, not in the session catalog. Could you move it there?
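
For reference, a rough sketch of what the relocated check could look like in HiveExternalCatalog (signature from Spark's ExternalCatalog; the surrounding wiring is assumed, and the body reuses the logic from the diff above):

```scala
// Sketch only: run the SPARK-25464 check before delegating to the
// Hive client, so the behavior lives in the external catalog itself.
override def createDatabase(
    dbDefinition: CatalogDatabase,
    ignoreIfExists: Boolean): Unit = withClient {
  val dbPath = new Path(dbDefinition.locationUri)
  val fs = dbPath.getFileSystem(hadoopConf)
  if (fs.exists(dbPath) && fs.listStatus(dbPath).nonEmpty) {
    throw new AnalysisException(
      s"Cannot create database at location $dbPath because the path is not empty.")
  }
  client.createDatabase(dbDefinition, ignoreIfExists)
}
```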


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-19 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226548140
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,14 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (!externalCatalog.databaseExists(dbName) && fs.exists(dbPath)
+  && fs.listStatus(dbPath).nonEmpty) {
--- End diff --

Listing files is expensive, but the CREATE DATABASE command is not frequent. The same point is mentioned here:
https://github.com/apache/spark/pull/22466#discussion_r220795973


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536773
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -840,12 +840,19 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 
   test("create table in default db") {
-val catalog = spark.sessionState.catalog
-val tableIdent1 = TableIdentifier("tab1", None)
-createTable(catalog, tableIdent1)
-val expectedTableIdent = tableIdent1.copy(database = Some("default"))
-val expectedTable = generateTable(catalog, expectedTableIdent)
-checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+var tablePath: Option[URI] = None
--- End diff --

`var tablePath: URI = null`
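
That is, a sketch of the suggested shape (a fragment; `expectedTable` and the cleanup mirror the diff's try/finally):

```scala
var tablePath: URI = null
try {
  // ... create the table and capture its location ...
  tablePath = expectedTable.location
} finally {
  // External table, so its path has to be deleted explicitly; guard on
  // null in case the test failed before the location was captured.
  if (tablePath != null) {
    Utils.deleteRecursively(new File(tablePath))
  }
}
```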


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536735
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,14 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (!externalCatalog.databaseExists(dbName) && fs.exists(dbPath)
+  && fs.listStatus(dbPath).nonEmpty) {
--- End diff --

Do we necessarily have to list the files? It's potentially expensive.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536626
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -840,12 +840,19 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 
   test("create table in default db") {
-val catalog = spark.sessionState.catalog
-val tableIdent1 = TableIdentifier("tab1", None)
-createTable(catalog, tableIdent1)
-val expectedTableIdent = tableIdent1.copy(database = Some("default"))
-val expectedTable = generateTable(catalog, expectedTableIdent)
-checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+var tablePath: Option[URI] = None
+try {
+  val catalog = spark.sessionState.catalog
+  val tableIdent1 = TableIdentifier("tab1", None)
+  createTable(catalog, tableIdent1)
+  val expectedTableIdent = tableIdent1.copy(database = Some("default"))
+  val expectedTable = generateTable(catalog, expectedTableIdent)
+  tablePath = Some(expectedTable.location)
+  checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+} finally {
+  // This is external table,so it is required to deleted the path
--- End diff --

@HyukjinKwon Should the first one be `e,s` -> `e, s`?


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536585
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2370,4 +2370,17 @@ class HiveDDLSuite
   ))
 }
   }
+
+  test("SPARK-25464 create Database with non empty location") {
+val dbName = "dbwithcustomlocation"
+withTempDir { tmpDir =>
+  val parentDir = tmpDir.getParent
+  val expectedMsg = s"Cannot create database at location $parentDir 
because the path is not " +
+s"empty."
--- End diff --

leading `s` can be removed.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536555
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2370,4 +2370,17 @@ class HiveDDLSuite
   ))
 }
   }
+
+  test("SPARK-25464 create Database with non empty location") {
--- End diff --

`create a database with a non-empty location`


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536442
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -840,12 +840,19 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 
   test("create table in default db") {
-val catalog = spark.sessionState.catalog
-val tableIdent1 = TableIdentifier("tab1", None)
-createTable(catalog, tableIdent1)
-val expectedTableIdent = tableIdent1.copy(database = Some("default"))
-val expectedTable = generateTable(catalog, expectedTableIdent)
-checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+var tablePath: Option[URI] = None
+try {
+  val catalog = spark.sessionState.catalog
+  val tableIdent1 = TableIdentifier("tab1", None)
+  createTable(catalog, tableIdent1)
+  val expectedTableIdent = tableIdent1.copy(database = Some("default"))
+  val expectedTable = generateTable(catalog, expectedTableIdent)
+  tablePath = Some(expectedTable.location)
+  checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+} finally {
+  // This is external table,so it is required to deleted the path
--- End diff --

`this is an external table`


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536456
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -840,12 +840,19 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 
   test("create table in default db") {
-val catalog = spark.sessionState.catalog
-val tableIdent1 = TableIdentifier("tab1", None)
-createTable(catalog, tableIdent1)
-val expectedTableIdent = tableIdent1.copy(database = Some("default"))
-val expectedTable = generateTable(catalog, expectedTableIdent)
-checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+var tablePath: Option[URI] = None
+try {
+  val catalog = spark.sessionState.catalog
+  val tableIdent1 = TableIdentifier("tab1", None)
+  createTable(catalog, tableIdent1)
+  val expectedTableIdent = tableIdent1.copy(database = Some("default"))
+  val expectedTable = generateTable(catalog, expectedTableIdent)
+  tablePath = Some(expectedTable.location)
+  checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+} finally {
+  // This is external table,so it is required to deleted the path
--- End diff --

`it is required to delete`


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226536304
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -840,12 +840,19 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 
   test("create table in default db") {
-val catalog = spark.sessionState.catalog
-val tableIdent1 = TableIdentifier("tab1", None)
-createTable(catalog, tableIdent1)
-val expectedTableIdent = tableIdent1.copy(database = Some("default"))
-val expectedTable = generateTable(catalog, expectedTableIdent)
-checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+var tablePath: Option[URI] = None
+try {
+  val catalog = spark.sessionState.catalog
+  val tableIdent1 = TableIdentifier("tab1", None)
+  createTable(catalog, tableIdent1)
+  val expectedTableIdent = tableIdent1.copy(database = Some("default"))
+  val expectedTable = generateTable(catalog, expectedTableIdent)
+  tablePath = Some(expectedTable.location)
+  checkCatalogTables(expectedTable, catalog.getTableMetadata(tableIdent1))
+} finally {
+  // This is external table,so it is required to deleted the path
--- End diff --

tiny nit: `e,` -> `e ,`


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226245229
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2993,6 +2990,7 @@ def test_current_database(self):
 AnalysisException,
 "does_not_exist",
 lambda: spark.catalog.setCurrentDatabase("does_not_exist"))
+spark.sql("DROP DATABASE some_db")
--- End diff --

Submitted #22762.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-18 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r226236226
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2993,6 +2990,7 @@ def test_current_database(self):
 AnalysisException,
 "does_not_exist",
 lambda: spark.catalog.setCurrentDatabase("does_not_exist"))
+spark.sql("DROP DATABASE some_db")
--- End diff --

We still need to clean up after each test. Otherwise, if a test fails before dropping `some_db`, it will cause other tests to fail.
The current cleanup doesn't seem good enough. Let me refine it.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-15 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r225396925
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2993,6 +2990,7 @@ def test_current_database(self):
 AnalysisException,
 "does_not_exist",
 lambda: spark.catalog.setCurrentDatabase("does_not_exist"))
+spark.sql("DROP DATABASE some_db")
--- End diff --

The create and drop should be part of the test case; if there is any exception, the test case will fail. So there is no need to put the drop in a finally block.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-11 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r224667371
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2993,6 +2990,7 @@ def test_current_database(self):
 AnalysisException,
 "does_not_exist",
 lambda: spark.catalog.setCurrentDatabase("does_not_exist"))
+spark.sql("DROP DATABASE some_db")
--- End diff --

Should we surround this with try-finally?


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-11 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r224666263
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -350,9 +350,6 @@ def test_sqlcontext_reuses_sparksession(self):
 def tearDown(self):
--- End diff --

Can we remove this method now?


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223570144
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,16 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (fs.exists(dbPath)) {
+  val fileStatus = fs.listStatus(dbPath)
+  if (fileStatus.nonEmpty) {
--- End diff --

`if (fs.exists(dbPath) && fs.listStatus(dbPath).nonEmpty) {`


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223569270
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -407,6 +407,7 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
 
   test("create a managed table with the existing non-empty directory") {
 withTable("tab1") {
+  Utils.createDirectory(spark.sessionState.conf.warehousePath)
--- End diff --

If we do not delete the directory for each test case, this line is not 
needed. 


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223569182
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -53,6 +53,7 @@ class HiveCatalogedDDLSuite extends DDLSuite with TestHiveSingleton with BeforeA
   // drop all databases, tables and functions after each test
   spark.sessionState.catalog.reset()
 } finally {
+  Utils.deleteRecursively(new File(spark.sessionState.conf.warehousePath))
--- End diff --

We dropped all the databases in line 54, right? Could you fix the test cases instead of deleting the directory here?


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223568531
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -351,7 +351,7 @@ def tearDown(self):
 super(SQLTests, self).tearDown()
 
 # tear down test_bucketed_write state
-self.spark.sql("DROP TABLE IF EXISTS pyspark_bucket")
+self.spark.sql("DROP DATABASE IF EXISTS some_db CASCADE")
--- End diff --

Could you update the corresponding test cases instead of dropping the 
database here? 


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-08 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223395695
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,16 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (fs.exists(dbPath)) {
+  val fileStatus = fs.listStatus(dbPath)
+  if (fileStatus.nonEmpty) {
+throw new AnalysisException(
+  s"Cannot create Database to the location which is not 
empty.Given path is ${dbPath}")
--- End diff --

That's fine, sure.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-08 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223394163
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,16 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (fs.exists(dbPath)) {
+  val fileStatus = fs.listStatus(dbPath)
+  if (fileStatus.nonEmpty) {
+throw new AnalysisException(
+  s"Cannot create Database to the location which is not 
empty.Given path is ${dbPath}")
--- End diff --

If the folder exists but is empty, we still allow creating the database, so doesn't this message conflict with what the user observes?


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-08 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r223389089
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -207,6 +207,16 @@ class SessionCatalog(
   "you cannot create a database with this name.")
 }
 validateName(dbName)
+// SPARK-25464 fail if DB location exists and is not empty
+val dbPath = new Path(dbDefinition.locationUri)
+val fs = dbPath.getFileSystem(hadoopConf)
+if (fs.exists(dbPath)) {
+  val fileStatus = fs.listStatus(dbPath)
+  if (fileStatus.nonEmpty) {
+throw new AnalysisException(
+  s"Cannot create Database to the location which is not 
empty.Given path is ${dbPath}")
--- End diff --

The message here has a few minor typos. Let's try: `s"Cannot create database at location $dbPath because it already exists."`
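
Combined with the condensed condition suggested elsewhere in the thread, the whole check could read (a sketch, not necessarily the final merged wording):

```scala
// Fail fast when the target location exists and already holds files.
if (fs.exists(dbPath) && fs.listStatus(dbPath).nonEmpty) {
  throw new AnalysisException(
    s"Cannot create database at location $dbPath because it already exists.")
}
```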


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-04 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r222891761
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -351,7 +351,7 @@ def tearDown(self):
 super(SQLTests, self).tearDown()
 
 # tear down test_bucketed_write state
-self.spark.sql("DROP TABLE IF EXISTS pyspark_bucket")
+self.spark.sql("DROP DATABASE IF EXISTS some_db CASCADE")
--- End diff --


`test_current_database`, `test_list_databases`, `test_list_tables`, `test_list_functions`, and `test_list_columns` all create the database and do not drop it, so the folder remains, which results in test case failures.


---




[GitHub] spark pull request #22466: [SPARK-25464][SQL] Create Database to the locatio...

2018-10-04 Thread sandeep-katta
Github user sandeep-katta commented on a diff in the pull request:

https://github.com/apache/spark/pull/22466#discussion_r222891525
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -407,6 +407,7 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
 
   test("create a managed table with the existing non-empty directory") {
 withTable("tab1") {
+  Utils.createDirectory(spark.sessionState.conf.warehousePath)
--- End diff --

As per line 414, it is trying to create an empty directory under the warehouse path; this statement will fail because the parent folder (the warehouse) does not exist.


---
