rdblue commented on issue #25747: [SPARK-29039][SQL] centralize the catalog and table lookup logic
URL: https://github.com/apache/spark/pull/25747#issuecomment-532829480

@cloud-fan, the rules you posted aren't correct. For example, you refer to the default catalog for catalog resolution. The default catalog should not be considered for resolution; only the current catalog should be. I think the confusion here comes from mixing this work with SPARK-29014. I suggest cleaning up the catalogs first and then refactoring if you still want to.

After SPARK-29014 is done, table resolution should happen like this:

1. If the identifier starts with a known catalog, use it. This will be a v2 catalog, and the query should be converted to a v2 plan.
2. Otherwise, the _current_ catalog is responsible:
   2.1. If the current catalog is `spark_catalog` (the built-in session catalog), then load the table and check its provider to determine whether to use a v1 plan or a v2 plan.
   2.2. Otherwise, the converted plan should be v2.

We should implement this with an extractor that returns a catalog only if it is _not_ the `spark_catalog`. Then the rules in `Analyzer` can be written entirely for v2. A second extractor for the `spark_catalog` can be used by `DataSourceResolution`. After matching that extractor, those rules can look up the table to determine whether to use a v1 or v2 plan.

Does that make sense?
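To make the extractor idea concrete, here is a minimal, self-contained Scala sketch of the resolution rules above. It is not the actual Spark API: the catalog/identifier types, the extractor names (`NonSessionCatalogAndIdentifier`, `SessionCatalogAndIdentifier`), and the `providerIsV2` check are all hypothetical stand-ins for whatever the real implementation would use; only the `spark_catalog` name and the rule ordering come from the comment.

```scala
// Simplified, hypothetical model of the proposed resolution rules.
object ResolutionSketch {
  case class Catalog(name: String)

  val sessionCatalog: Catalog = Catalog("spark_catalog")
  val knownCatalogs: Map[String, Catalog] =
    Map("cat1" -> Catalog("cat1"), "spark_catalog" -> sessionCatalog)
  var currentCatalog: Catalog = sessionCatalog

  // Rule 1 / rule 2: resolve the leading catalog, falling back to the
  // current catalog when the identifier has no known catalog prefix.
  private def resolve(parts: Seq[String]): Option[(Catalog, Seq[String])] =
    parts match {
      case head +: rest if knownCatalogs.contains(head) =>
        Some((knownCatalogs(head), rest)) // identifier starts with a known catalog
      case _ =>
        Some((currentCatalog, parts))     // otherwise the current catalog is responsible
    }

  // Extractor that matches only when the resolved catalog is NOT the
  // built-in session catalog: Analyzer rules written entirely for v2.
  object NonSessionCatalogAndIdentifier {
    def unapply(parts: Seq[String]): Option[(Catalog, Seq[String])] =
      resolve(parts).filter(_._1 != sessionCatalog)
  }

  // Extractor that matches the session catalog; DataSourceResolution would
  // then load the table and inspect its provider to pick v1 vs v2.
  object SessionCatalogAndIdentifier {
    def unapply(parts: Seq[String]): Option[(Catalog, Seq[String])] =
      resolve(parts).filter(_._1 == sessionCatalog)
  }

  def plan(parts: Seq[String], providerIsV2: String => Boolean): String =
    parts match {
      case NonSessionCatalogAndIdentifier(_, _) => "v2" // rules 1 and 2.2
      case SessionCatalogAndIdentifier(_, ident) =>     // rule 2.1
        if (providerIsV2(ident.mkString("."))) "v2" else "v1"
    }
}
```

Since `resolve` always produces a catalog, exactly one of the two extractors matches any identifier, which is what lets the v2-only analyzer rules and the session-catalog rules stay cleanly separated.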
