[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-27 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1259053324

   @cloud-fan,
   
   Thank you very much for your help!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-26 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1257736748

   > LGTM except for 2 minor comments
   
   Thanks @cloud-fan! I have implemented these two comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-25 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1257184837

   Hi @cloud-fan,
   
   All build issues have been fixed and all of your feedbacks have been 
implemented. Latest state:
   
   ```
   $ bin/spark-shell --conf 
spark.sql.catalog.spark_catalog.defaultDatabase=other_db
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).
   22/09/25 14:34:16 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
   Spark context Web UI available at http://localhost:4040
   Spark context available as 'sc' (master = local[*], app id = 
local-1664109256655).
   Spark session available as 'spark'.
   Welcome to
   __
/ __/__  ___ _/ /__
   _\ \/ _ \/ _ `/ __/  '_/
  /___/ .__/\_,_/_/ /_/\_\   version 3.4.0-SNAPSHOT
 /_/

   Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 1.8.0_345)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala> spark.sql("show databases").show()
   org.apache.spark.SparkDefaultDatabaseNotExistsException: 
[DEFAULT_DATABASE_NOT_EXISTS] Default database other_db not exist, please 
create it first or change default database to 'default'.
 at 
org.apache.spark.sql.errors.QueryExecutionErrors$.defaultDatabaseNotExistsError(QueryExecutionErrors.scala:1936)
 at 
org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:156)
 at 
org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:147)
 at 
org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$catalog$1(BaseSessionStateBuilder.scala:154)
 at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:122)
 at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:122)
 at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.listDatabases(SessionCatalog.scala:323)
 at 
org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.listNamespaces(V2SessionCatalog.scala:232)
 at 
org.apache.spark.sql.execution.datasources.v2.ShowNamespacesExec.run(ShowNamespacesExec.scala:42)
 at 
org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
 at 
org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
 at 
org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
 at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:111)
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:171)
 at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
 at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
 at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
   ``` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-21 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1253877919

   Hi @cloud-fan,
   
   Thanks for the feedbacks, I have resolved all of them. Unfortunatelly I did 
a mistake during the version magament but I have already resolved with git 
rebase / git push as you can see.  This is the reason that a lot of new labels 
have been added which are not necessary by this pull request. Only the SQL 
label is the good one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-19 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1251364841

   Thanks @cloud-fan, I have implemented this and all tests passed. As I see we 
have resolved all of your feedbacks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-19 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1250958346

   > Is this a common behavior in other databases?
   
   @cloud-fan Good question. The reason that we cannot delete the user 
specified default database because we have the following if statement in the 
actual code:
   
   ```
   if (dbName == defaultDatabase)
   ```
   
   and this is the latest state of master:
   
   
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L286
   
   ```
 def dropDatabase(db: String, ignoreIfNotExists: Boolean, cascade: 
Boolean): Unit = {
   val dbName = format(db)
   if (dbName == DEFAULT_DATABASE) {
 throw QueryCompilationErrors.cannotDropDefaultDatabaseError
   }
   ```
   
   As you can see that I am just using the same logic.
   
   If you think that we should only deny the database drop for "default" and 
allow for the value of spark.sql.catalog.spark_catalog.defaultDatabase, it is 
ok for me. The change is very simple:
   
   ```
   +++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
   @@ -284,7 +284,7 @@ class SessionCatalog(

  def dropDatabase(db: String, ignoreIfNotExists: Boolean, cascade: 
Boolean): Unit = {
val dbName = format(db)
   -if (dbName == defaultDatabase) {
   +if (dbName == "default") {
  throw QueryCompilationErrors.cannotDropDefaultDatabaseError
}
if (!ignoreIfNotExists) {
   ```
   
   and here is the validation:
   
   ```
$ ./spark-shell --conf spark.sql.catalogImplementation=hive --conf 
spark.sql.catalog.spark_catalog.defaultDatabase=xyz
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).
   22/09/19 14:21:07 WARN NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
   Spark context Web UI available at http://localhost:4040
   Spark context available as 'sc' (master = local[*], app id = 
local-1663590068068).
   Spark session available as 'spark'.
   Welcome to
   __
/ __/__  ___ _/ /__
   _\ \/ _ \/ _ `/ __/  '_/
  /___/ .__/\_,_/_/ /_/\_\   version 3.4.0-SNAPSHOT
 /_/

   Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 1.8.0_345)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala> spark.sql("show databases").show()
   +-+
   |namespace|
   +-+
   |  abc|
   |  default|
   |  xyz|
   +-+
   
   
   scala> spark.sql("use database abc")
   res1: org.apache.spark.sql.DataFrame = []
   
   scala> spark.sql("SELECT current_database() AS db").show()
   +---+
   | db|
   +---+
   |abc|
   +---+
   
   
   scala> 
   
   scala> spark.sql("drop database xyz")
   res3: org.apache.spark.sql.DataFrame = []
   
   scala> spark.sql("show databases").show()
   +-+
   |namespace|
   +-+
   |  abc|
   |  default|
   +-+
   
   
   scala> 
   ```
   Is this solution acceptable for you?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-18 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1250375409

   > > override val defaultNamespace: Array[String] = 
Array(SQLConf.get.defaultDatabase)
   > 
   > Yes
   
   @cloud-fan Ok, I have fixed this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-09 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1242406713

   > we should also update `V2SessionCatalog.defaultNamespace`
   
   @cloud-fan, for example this change will be good?
   
   ```
   diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
   index efbc9dd755..23775c3ae0 100644
   --- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
   +++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
   @@ -33,6 +33,7 @@ import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.errors.{QueryCompilationErrors, 
QueryExecutionErrors}
import org.apache.spark.sql.execution.datasources.DataSource
import org.apache.spark.sql.internal.connector.V1Function
   +import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

   @@ -43,7 +44,7 @@ class V2SessionCatalog(catalog: SessionCatalog)
  extends TableCatalog with FunctionCatalog with SupportsNamespaces with 
SQLConfHelper {
  import V2SessionCatalog._

   -  override val defaultNamespace: Array[String] = Array("default")
   +  override val defaultNamespace: Array[String] = 
Array(SQLConf.get.defaultDatabase)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-08 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1240616362

   Thanks @cloud-fan, I will check these!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] roczei commented on pull request #37679: [SPARK-35242][SQL] Support changing session catalog's default database

2022-09-06 Thread GitBox


roczei commented on PR #37679:
URL: https://github.com/apache/spark/pull/37679#issuecomment-1238330280

   @cloud-fan 
   
   Could you please take a look when you have some time?  This PR is a 
follow-up PR for https://github.com/apache/spark/pull/32364. Thanks!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org