kbendick opened a new pull request #3573:
URL: https://github.com/apache/iceberg/pull/3573


   This closes issue https://github.com/apache/iceberg/issues/3541
   
   The Spark Catalog assumes that implementations will cascade by default. 
Currently, we don't do that, so if users attempt to drop a non-empty namespace 
with `CASCADE`, they get a rather unhelpful error message.
   
   ```scala
   scala> spark.sql("drop namespace iceberg.accounting cascade").show
   org.apache.iceberg.exceptions.NamespaceNotEmptyException: Namespace 
accounting is not empty. One or more tables exist.
   ```
   
   Unfortunately, Spark doesn't expose `ifNotExists` or `cascade` as parameters 
on the `SupportsNamespaces` interface, so we simply have to try (see sample 
code from Spark below). Spark itself _does_ know about all of these things (as 
evidenced by the tests), so it won't drop things unless `CASCADE` is used.
   
   This still maintains the default behavior of respecting `RESTRICT` (not 
dropping non-empty namespaces).
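
The split of responsibilities described above can be sketched with a toy model. This is only an illustration under stated assumptions: `CascadeSketch`, `ToyCatalog`, and `execDropNamespace` are hypothetical names, not Spark or Iceberg APIs. The engine-side method stands in for the `RESTRICT`/`CASCADE` check, and the catalog-side `dropNamespace` takes no cascade flag, so it always removes the namespace's contents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these names are real Spark or Iceberg APIs. */
class CascadeSketch {

  /** Hypothetical catalog whose drop call carries no cascade parameter. */
  static class ToyCatalog {
    private final Map<String, List<String>> tablesByNamespace = new HashMap<>();

    void createNamespace(String ns) {
      tablesByNamespace.putIfAbsent(ns, new ArrayList<>());
    }

    void createTable(String ns, String table) {
      tablesByNamespace.get(ns).add(table);
    }

    boolean namespaceExists(String ns) {
      return tablesByNamespace.containsKey(ns);
    }

    List<String> listTables(String ns) {
      return new ArrayList<>(tablesByNamespace.get(ns));
    }

    /** With no flag to consult, the catalog cascades: tables first, then the namespace. */
    boolean dropNamespace(String ns) {
      if (!namespaceExists(ns)) {
        return false;
      }
      tablesByNamespace.get(ns).clear();
      tablesByNamespace.remove(ns);
      return true;
    }
  }

  /** Hypothetical engine-side step: it knows about cascade and guards RESTRICT itself. */
  static boolean execDropNamespace(ToyCatalog catalog, String ns, boolean cascade) {
    if (!catalog.namespaceExists(ns)) {
      return false;
    }
    if (!cascade && !catalog.listTables(ns).isEmpty()) {
      // RESTRICT (the default): refuse before the catalog is ever called.
      throw new IllegalStateException("Namespace " + ns + " is not empty");
    }
    return catalog.dropNamespace(ns);
  }
}
```

In this model, calling `execDropNamespace(catalog, "accounting", false)` on a namespace that still holds a table throws, while passing `true` removes the tables and the namespace. The real check in Spark and the real catalog implementation may differ in detail.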
   
   Relevant code in Spark can be seen here:
   - The original PR that introduced this design shows that it was done for 
performance - https://github.com/apache/spark/pull/26476.
   - The sample catalog source used in all of the Spark tests that have 
`cascade` in them - 
https://github.com/apache/spark/blob/e99fdf9654481dd9b691a3c10e52f3f3db6ed2ba/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala#L216-L224
   
   I have tested that `DropNamespaceExec` is called properly (referenced in 
the first PR and still in Spark's master branch).
   
   The one thing I'm still worried about is the fact that we don't call 
`cascade` when using HMS, but the tests seem to be working. We should likely 
check HMS manually to see if things are cleared out or not: 
https://github.com/apache/iceberg/blob/f68d8d426661efc0d7e5686fe833b573b74eadab/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java#L271-L284
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


