cloud-fan opened a new pull request #25077: [SPARK-28301][SQL] fix the behavior of table name resolution with multi-catalog
URL: https://github.com/apache/spark/pull/25077

## What changes were proposed in this pull request?

Now users can register multiple catalogs in Spark, and table name resolution should be compatible with multi-catalog. The expected behavior is simple:
* For DDL commands that can only deal with tables
  * If the table name has only one name part, then it's a table in the default catalog.
  * If the table name has more than one name part, like `a.b.c`:
    * If `a` is a registered catalog, then it's a table `c` under namespace `b` in catalog `a`.
    * If `a` is not a registered catalog, then it's a table `c` under namespace `a.b` in the default catalog.
* For SELECT/INSERT that can handle both tables and temp views, first check if the table name is a temp view or global temp view; otherwise the rule is the same as for DDL commands.

However, we need to change the expected behavior a little bit because the builtin hive catalog hasn't migrated to the new catalog API yet:
1. If the default catalog config is set, pick it as the default catalog. Otherwise pick the hive catalog as the default catalog.
2. If the default catalog config is not set and the table name has more than 2 name parts, we should fail with "no catalog specified for table".

The current behavior of table name resolution is a little confusing:
* For DDL commands that can only deal with tables
  * If the first part of the table name matches a registered catalog, then it's a table in that catalog. (expected)
  * Otherwise, if the table name has fewer than 3 parts and the provider name is v1, go with the builtin Hive catalog. (This is not expected. By design, different catalogs can interpret the table provider name differently. We should go with the default catalog if the config is set, no matter what the table provider name is.)
* For SELECT/INSERT that can handle both tables and temp views
  * If the first part of the table name does not match a registered catalog and it has fewer than 3 parts, go with the builtin Hive catalog. (This is not expected, as we need to respect the default catalog config.)
  * If the first part of the table name does not match a registered catalog and it has more than 2 parts, the query is unresolved. (This is not expected, as we need to respect the default catalog config.)

This PR fixes the behavior of the table name resolution.

## How was this patch tested?

new test cases
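
For illustration only, the expected resolution rules listed in this description (including the temporary Hive fallback) could be sketched roughly as below. This is not the code in the PR: `resolve_table`, `resolve_relation`, `ResolvedTable`, and the `"hive"` label are hypothetical names, and modelling temp views as a set of name tuples is an assumption made to keep the sketch self-contained.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set, Tuple

# Hypothetical label standing in for the builtin Hive catalog.
HIVE_CATALOG = "hive"


@dataclass(frozen=True)
class ResolvedTable:
    catalog: str
    namespace: Tuple[str, ...]
    table: str


def resolve_table(
    parts: Sequence[str],
    registered_catalogs: Set[str],
    default_catalog: Optional[str] = None,
) -> ResolvedTable:
    """Resolve a multi-part table name for commands that only handle tables."""
    if not parts:
        raise ValueError("empty table name")
    *prefix, table = parts

    # More than one name part and the first part is a registered catalog:
    # the remaining leading parts form the namespace inside that catalog.
    if prefix and prefix[0] in registered_catalogs:
        return ResolvedTable(prefix[0], tuple(prefix[1:]), table)

    # Otherwise every leading part is a namespace in the default catalog.
    if default_catalog is not None:
        return ResolvedTable(default_catalog, tuple(prefix), table)

    # No default catalog config: fall back to Hive, which is assumed here
    # to accept at most two name parts.
    if len(parts) > 2:
        raise ValueError("no catalog specified for table: " + ".".join(parts))
    return ResolvedTable(HIVE_CATALOG, tuple(prefix), table)


def resolve_relation(
    parts: Sequence[str],
    temp_views: Set[Tuple[str, ...]],
    registered_catalogs: Set[str],
    default_catalog: Optional[str] = None,
):
    """SELECT/INSERT path: temp views win, then the table rules apply."""
    key = tuple(parts)
    if key in temp_views:
        return ("view", key)
    return ("table", resolve_table(parts, registered_catalogs, default_catalog))
```

Under these assumptions, `resolve_table(["a", "b", "c"], {"a"})` yields table `c` under namespace `b` in catalog `a`, while the same name with no registered catalog `a` and no default catalog config raises the "no catalog specified for table" error.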
