rdblue edited a comment on issue #25077: [SPARK-28301][SQL] fix the behavior of table name resolution with multi-catalog URL: https://github.com/apache/spark/pull/25077#issuecomment-509438061 -1 I agree with the idea behind this PR, which is that the v2 catalog default should be applied the same way in all cases... that's why I fixed this in #24768. From that PR's description: > Updates catalog lookup to always provide the default if it is set for consistent behavior. And from discussion: > After this PR, the rules to determine the catalog responsible for an identifier are: > 1. If the identifier starts with a known catalog, use it > 2. If there is a configured default v2 catalog, use that catalog > 3. Otherwise, the session catalog is responsible for the identifier While this PR includes some additional fixes for handling for temporary tables, I don't understand why this was necessary when another PR is proposing the same behavior change and could be updated to fix those issues as well. There are also significant problems with this alternative implementation. It reverses choices that were made to minimize the risk to existing users of adding multi-catalog support to Spark 3.0. The plan is to change as few parts of v1 as possible to ensure its behavior does not change, but this PR makes the v1 `SessionCatalog` responsible for proxying calls to load v2 tables by modifying the signature and behavior of `lookupRelation`. These changes are not needed to update the default catalog behavior. This may be the right approach, but should be considered independently. Similarly, `AsTableIdentifier` is used in rules that previously matched `TableIdentifier` and guarantees that the rule will not accidentally match an identifier with a catalog part. But this PR rewrites `AsTableIdentifier` so that it will match any 2-part identifier without the catalog check so it now must be applied after `CatalogObjectIdentifier`. Previously correct uses of `AsTableIdentifier` are now unsafe. Last, this PR makes changes that are unrelated to fixing the behavior of the default catalog. For example, this adds a `CatalogManager`, which may be a good idea, but should be proposed and added separately. To sum it up, this appears to be proposing a different implementation, not different behavior. But it is quite invasive and different from the design that we have been building to up to now. **Some of these changes may be good ideas, but let's separate them into their own PRs and discuss the merits of each**. If you want to change the design to add a proxy catalog, then please bring it up at the sync or write up a proposal.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
