srielau commented on code in PR #55977:
URL: https://github.com/apache/spark/pull/55977#discussion_r3267793373
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -265,6 +275,42 @@ class CatalogManager(
currentCatalog, currentNamespace,
currentCatalog, currentNamespace)
+ /**
+ * Snapshot the live PATH-derived [[SessionCatalog.SessionFunctionKind]]
order used by
+ * unqualified function/table-function resolution.
+ *
+ * The kinds list is computed by reading the current catalog, current
namespace, and the
Review Comment:
Good catch. Reworked the docstring (commit ecf913ab) to spell out the
contract: the kinds extraction in `systemFunctionKindsFromPath` only retains
literal `system.builtin` / `system.session` entries, so even if `v1CurrentDb`
lags by one `USE SCHEMA`, a stale `CURRENT_SCHEMA` expansion can change the
path **content** but never the **kinds list** that this helper returns. Added
that justification + a forward pointer to the deadlock cycle so future readers
don't lift the v1 read inside the lock.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -330,15 +376,32 @@ class CatalogManager(
catalog(_currentCatalogName.getOrElse(conf.getConf(SQLConf.DEFAULT_CATALOG)))
}
- def setCurrentCatalog(catalogName: String): Unit = synchronized {
- // `setCurrentCatalog` is noop if it doesn't switch to a different catalog.
- if (currentCatalog.name() != catalogName) {
- catalog(catalogName)
- _currentCatalogName = Some(catalogName)
- _currentNamespace = None
+ def setCurrentCatalog(catalogName: String): Unit = {
+ // SPARK-56939: see [[setCurrentNamespace]]. Avoid nesting
[[CatalogManager]]'s lock
+ // across [[v1SessionCatalog.setCurrentDatabase]] (which synchronizes on
+ // [[SessionCatalog]]) to prevent a lock-order inversion with concurrent
unqualified
+ // function resolution.
+ val needsSwitch = synchronized {
+ // `setCurrentCatalog` is noop if it doesn't switch to a different
catalog.
+ if (currentCatalog.name() != catalogName) {
+ // Force-load the named catalog while holding the manager lock to keep
the
+ // not-found error semantics; if loading fails, throw before mutating
state.
+ catalog(catalogName)
+ true
+ } else {
+ false
+ }
+ }
+ if (needsSwitch) {
// Reset the current database of v1 `SessionCatalog` when switching
current catalog, so that
// when we switch back to session catalog, the current namespace
definitely is ["default"].
+ // Run this BEFORE publishing the new catalog name so that if a reader
observes the new
Review Comment:
Fair point -- the prior comment only told the half-story that the new
ordering forbids. Expanded it (ecf913ab) to also acknowledge the symmetric
window: between `v1.setCurrentDatabase(default)` and the publish, a concurrent
reader of `currentNamespace` sees `(oldCatalog, v1.currentDb = default)`, which
when the old catalog was the session catalog briefly hides the user's previous
namespace. Framed both halves as a concurrency trade-off against the deadlock
alternative so a future maintainer can't read the comment as "strictly better."
Did the same for `setCurrentNamespace`'s drifted `isSession` snapshot.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]