rdblue commented on a change in pull request #30554:
URL: https://github.com/apache/spark/pull/30554#discussion_r533833283



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
##########
@@ -657,14 +659,8 @@ class ResolveSessionCatalog(
       comment: Option[String],
       storageFormat: CatalogStorageFormat,
       external: Boolean): CatalogTable = {
-    if (external) {
-      if (DDLUtils.isHiveTable(Some(provider))) {
-        if (location.isEmpty) {
-          throw new AnalysisException(s"CREATE EXTERNAL TABLE must be 
accompanied by LOCATION")
-        }
-      } else {
-        throw new AnalysisException(s"Operation not allowed: CREATE EXTERNAL 
TABLE ... USING")
-      }
+    if (external && location.isEmpty) {
+      throw new AnalysisException(s"CREATE EXTERNAL TABLE must be accompanied 
by LOCATION")

Review comment:
       This PR adds support for `EXTERNAL` with data source tables, but only 
for tables with the `LOCATION` clause. I have no problem with adding that 
support, or for enforcing the `LOCATION` restriction in Spark's catalog.
   
   The problem is that it is done in the analyzer rather than in the catalog 
implementation, so the `LOCATION` clause restriction is applied before calling 
a v2 extension to the session catalog.
   
   We've now had several problems with behavior mismatches between the session 
catalog and other v2 catalogs, so I think that support for `EXTERNAL` should be 
added properly: pass the flag correctly to the v2 extension and enforce the 
`LOCATION` restriction in the catalog rather than in the analysis rules. I'd 
also like to have tests for the newly supported SQL cases to validate that 
behavior.
   
   I know that the reason behind this was to avoid test failures, but I think 
only partially adding this feature means that it isn't done right: it 
introduces a behavior difference between session and other v2 catalogs, and 
would be missing tests.
   
   If you want to update this to add the `LOCATION` restriction in the session 
catalog, pass the v2 property correctly to catalog extensions, and add tests, 
then I'm okay with this change. However, then we would be looking at a 
significantly larger PR, so it is unlikely that this should be done at the same 
time as adding configuration to use data source tables by default.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to