cloud-fan commented on a change in pull request #25903: [SPARK-29215][SQL]
current namespace should be tracked in SessionCatalog if the current catalog is
session catalog
URL: https://github.com/apache/spark/pull/25903#discussion_r327459492
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala
##########
@@ -85,33 +91,53 @@ class CatalogManager(conf: SQLConf, defaultSessionCatalog: TableCatalog) extends
def currentNamespace: Array[String] = synchronized {
_currentNamespace.getOrElse {
- currentCatalog.map { catalogName =>
- getDefaultNamespace(catalog(catalogName))
- }.getOrElse(Array("default")) // The builtin catalog use "default" as the default database.
+ // For session catalog, the current namespace is kept in `SessionCatalog`. There are many
+ // commands that do not support v2 catalog API. They ignore the current catalog and blindly
+ // go to `SessionCatalog`. This means, we must keep track of the current namespace of session
+ // catalog even if the current catalog is not session catalog. `CatalogManager` only tracks
+ // the current namespace of the current catalog.
+ if (currentCatalog.name() == SESSION_CATALOG_NAME) {
+ Array(v1SessionCatalog.getCurrentDatabase)
+ } else {
+ getDefaultNamespace(currentCatalog)
+ }
}
}
def setCurrentNamespace(namespace: Array[String]): Unit = synchronized {
- _currentNamespace = Some(namespace)
+ // For session catalog, the current namespace is kept in `SessionCatalog`. There are many
+ // commands that do not support v2 catalog API. They ignore the current catalog and blindly
+ // go to `SessionCatalog`. This means, we must keep track of the current namespace of session
+ // catalog even if the current catalog is not session catalog. `CatalogManager` only tracks
+ // the current namespace of the current catalog.
+ if (currentCatalog.name() == SESSION_CATALOG_NAME) {
+ if (namespace.length != 1) {
+ throw new NoSuchNamespaceException(namespace)
+ }
+ v1SessionCatalog.setCurrentDatabase(namespace.head)
+ } else {
+ _currentNamespace = Some(namespace)
+ }
}
- private var _currentCatalog: Option[String] = None
+ private var _currentCatalogName: Option[String] = None
- // Returns the name of current catalog. None means the current catalog is the builtin catalog.
- def currentCatalog: Option[String] = synchronized {
- _currentCatalog.orElse(conf.defaultV2Catalog)
+ def currentCatalog: CatalogPlugin = synchronized {
+ _currentCatalogName.map(catalogName => catalog(catalogName))
+ .orElse(defaultCatalog)
+ .getOrElse(v2SessionCatalog)
}
def setCurrentCatalog(catalogName: String): Unit = synchronized {
- _currentCatalog = Some(catalogName)
+ _currentCatalogName = Some(catalogName)
_currentNamespace = None
Review comment:
Resetting it when switching is definitely reasonable, but what I was talking
about is v1 commands:
```
# Starts with session catalog
SELECT current_database() # default
ANALYZE TABLE t # analyze default.t from session catalog
USE db
SELECT current_database() # db
ANALYZE TABLE t # analyze db.t from session catalog
# switch catalog
USE CATALOG myCat
SELECT current_database() # [] (default namespace of myCat)
ANALYZE TABLE t
```
For the last `ANALYZE TABLE`, it's unclear what the behavior should be. There
are 3 options:
1. We should look up the table from `myCat`, so this should either throw a
NoSuchTable exception, or throw an unsupported exception because we don't have an
analyze table API in DS v2.
2. V1 commands should still go with the v1 code path. We should analyze
table `default.t` because the current catalog is not the session catalog and we
don't know what the current database of the session catalog is now.
3. V1 commands should still go with the v1 code path. We should analyze
table `db.t` because this is the most recent current database of the session
catalog.
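
To make the difference between the 3 options concrete, here is a rough toy model in Scala. It is only a sketch under assumptions: `ToyState`, `Outcome`, `option1`/`option2`/`option3`, and the catalog name `"session"` are hypothetical names made up for illustration, not Spark APIs, and option 1 is modeled as a single "unsupported" result rather than choosing between the two possible exceptions.

```scala
// Toy model only: none of these names are Spark APIs.
object AnalyzeResolutionSketch {
  sealed trait Outcome
  // The command runs against catalog.db.table.
  final case class Analyze(catalog: String, db: String, table: String) extends Outcome
  // The command is rejected (stands in for NoSuchTable / unsupported).
  final case class Unsupported(reason: String) extends Outcome

  // Hypothetical name for the session catalog in this sketch.
  val SessionCatalog = "session"

  final class ToyState {
    var currentCatalog: String = SessionCatalog
    // Last database selected while the session catalog was current.
    var sessionDb: String = "default"

    // USE db: only updates the session database when the session catalog is current.
    def use(db: String): Unit = {
      if (currentCatalog == SessionCatalog) sessionDb = db
    }

    // USE CATALOG name
    def useCatalog(name: String): Unit = {
      currentCatalog = name
    }
  }

  // Option 1: resolve against the current catalog; v2 catalogs have no analyze API.
  def option1(s: ToyState, table: String): Outcome =
    if (s.currentCatalog == SessionCatalog) Analyze(SessionCatalog, s.sessionDb, table)
    else Unsupported(s"ANALYZE TABLE is not available in catalog ${s.currentCatalog}")

  // Option 2: stay on the v1 path, but fall back to `default` once another catalog is current.
  def option2(s: ToyState, table: String): Outcome =
    if (s.currentCatalog == SessionCatalog) Analyze(SessionCatalog, s.sessionDb, table)
    else Analyze(SessionCatalog, "default", table)

  // Option 3: stay on the v1 path and reuse the most recent session database.
  def option3(s: ToyState, table: String): Outcome =
    Analyze(SessionCatalog, s.sessionDb, table)

  def main(args: Array[String]): Unit = {
    val s = new ToyState
    s.use("db")            // USE db
    s.useCatalog("myCat")  // USE CATALOG myCat
    println(option1(s, "t"))
    println(option2(s, "t"))
    println(option3(s, "t"))
  }
}
```

In this model, after `USE db` followed by `USE CATALOG myCat`, option 1 rejects the command, option 2 targets `default.t`, and option 3 targets `db.t`, all in the session catalog for the latter two.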