rajarshisarkar commented on code in PR #4443:
URL: https://github.com/apache/iceberg/pull/4443#discussion_r841590718


##########
docs/integrations/aws.md:
##########
@@ -433,6 +433,25 @@ spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCata
 ```
 
 For the above example, the objects in S3 will be saved with tags: `my_key1=my_val1` and `my_key2=my_val2`.
+We can add tags before deleting the objects as well. For example, to add S3 delete tags with Spark 3.0, you can start the Spark SQL shell with:
+
+```
+sh spark-sql --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
+    --conf spark.sql.catalog.my_catalog.warehouse=s3://iceberg-warehouse/s3-tagging \
+    --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
+    --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
+    --conf spark.sql.catalog.my_catalog.s3.write.tags.my_ke1y=my_val1 \
+    --conf spark.sql.catalog.my_catalog.s3.write.tags.my_key2=my_val2 \
+    --conf spark.sql.catalog.my_catalog.s3.delete.tags.my_key3=my_val3 \
+    --conf spark.sql.catalog.my_catalog.s3.delete-enabled=false
+```
+
+When `s3.delete-enabled` is disabled, users are expected to set the delete tags with `s3.delete.tags` and manage the deleted files through S3 lifecycle policy.
+With the `s3.delete.tags` config, objects are tagged with the configured key-value pairs before deletion. This is considered a soft-delete, because users are able to configure tag-based object lifecycle policy at bucket level to transition objects to different tiers.
+For more details, see [managing your storage lifecycle documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html).

Review Comment:
I don't see note sections being added in the AWS docs. I have added a new section for `s3.delete-enabled`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
