RussellSpitzer commented on PR #11317:
URL: https://github.com/apache/iceberg/pull/11317#issuecomment-2604912801

   Note from our sync where we discussed this:
   
   Today we had a short discussion at the Apache Iceberg Catalog Community Sync
   about DROP and DROP WITH PURGE. Currently, the SparkCatalog implementation
   in the reference library handles DROP WITH PURGE differently from other
   implementations. The pseudocode is essentially:
   
   
   ```
   use Spark to list files to be removed and delete them
   send a drop table request to the Catalog
   ```
   
   As opposed to other systems:
   
   ```
   send a drop table request to the Catalog with the purge flag enabled
   ```
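
To make the contrast concrete, here is a minimal sketch of the two flows against a toy in-memory catalog. `FakeCatalog`, `client_side_purge`, and `delegated_purge` are hypothetical names invented for illustration and are not Iceberg APIs. The sketch also assumes that the final drop in the Spark flow carries no purge flag.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCatalog:
    """Hypothetical in-memory stand-in for a catalog; not an Iceberg API."""
    tables: dict = field(default_factory=dict)    # table name -> list of file paths
    storage: set = field(default_factory=set)     # files that currently exist
    requests: list = field(default_factory=list)  # log of (table, purge) drop requests

    def create(self, name, files):
        self.tables[name] = list(files)
        self.storage.update(files)

    def drop_table(self, name, purge):
        self.requests.append((name, purge))
        files = self.tables.pop(name)
        if purge:
            # The catalog decides how to purge (a real one could defer or ignore it).
            self.storage.difference_update(files)


def client_side_purge(catalog, name):
    """Spark flow as described above: delete the files, then send a plain drop."""
    files = list(catalog.tables[name])
    catalog.storage.difference_update(files)  # the client removes the files itself
    catalog.drop_table(name, purge=False)     # assumption: no purge flag is sent


def delegated_purge(catalog, name):
    """Flow used by other systems: one drop request with the purge flag enabled."""
    catalog.drop_table(name, purge=True)


a = FakeCatalog()
a.create("db.t", ["f1.parquet", "f2.parquet"])
client_side_purge(a, "db.t")

b = FakeCatalog()
b.create("db.t", ["f1.parquet", "f2.parquet"])
delegated_purge(b, "db.t")

print(a.requests)  # [('db.t', False)]
print(b.requests)  # [('db.t', True)]
```

In both cases the files end up deleted, but only in the second flow does the catalog learn that a purge was requested, so only there can it apply its own purge policy.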
   
   This makes it difficult for REST Catalogs with custom purge implementations
   (or those that ignore purge) to work properly with Spark, because the files
   are already deleted before the catalog receives the drop request.
   
   Bringing this behavior in line with non-Spark implementations could have
   dramatic impacts on users of the Iceberg library, but our consensus in the
   Catalog Sync today was that it should eventually become the default
   behavior. To this end, I propose the following:
   
   1. We support a flag that lets current Spark users delegate the purge to the
      REST Catalog (all other catalog behaviors remain the same). A PR is
      available [here](https://github.com/apache/iceberg/pull/11317) (credit to
      Tobias, who wrote the PR and brought up this topic).
   2. We deprecate the client-side delete for Spark.
   3. In the next major release (Iceberg 2.0?) we
      [change the behavior officially](https://github.com/apache/iceberg/issues/11754)
      to only send the drop request with the purge flag, with no client-side
      file removal.
   4. For all non-REST catalog implementations we keep the code the same for
      legacy compatibility.
   
   A user of 1.8 will then be able to choose, for Spark DROP PURGE against a
   REST Catalog, whether to purge locally (client side) or remotely (in the
   catalog).
   
   A user of 2.0 with a REST Catalog will only be able to do a remote purge.
   
   Users of non-REST Catalogs will see no change in behavior.
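
The resulting behavior can be summarized as a small decision function. This is only a sketch of the proposal as worded above: `purge_location` and its `delegate_purge` parameter are hypothetical names, not the actual configuration key from the PR, and the default of `False` assumes the 1.8 flag is opt-in.

```python
def purge_location(major_version: int, is_rest_catalog: bool,
                   delegate_purge: bool = False) -> str:
    """Return where files are removed for a Spark DROP PURGE.

    "client"  -> Spark deletes the files, then sends a plain drop.
    "catalog" -> Spark only sends the drop request with the purge flag.
    """
    if not is_rest_catalog:
        # Non-REST catalogs: no change in behavior under the proposal.
        return "client"
    if major_version >= 2:
        # Proposed 2.0 behavior for REST: remote purge only.
        return "catalog"
    # 1.x with REST: the user chooses through the (hypothetical) flag.
    return "catalog" if delegate_purge else "client"


print(purge_location(1, is_rest_catalog=True))                       # client
print(purge_location(1, is_rest_catalog=True, delegate_purge=True))  # catalog
print(purge_location(2, is_rest_catalog=True))                       # catalog
print(purge_location(2, is_rest_catalog=False))                      # client
```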


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

