jackye1995 commented on code in PR #4423:
URL: https://github.com/apache/iceberg/pull/4423#discussion_r897501883
##########
aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java:
##########
@@ -273,13 +278,24 @@ private boolean isGlueIcebergTable(Table table) {
public boolean dropTable(TableIdentifier identifier, boolean purge) {
try {
TableOperations ops = newTableOps(identifier);
- TableMetadata lastMetadata = ops.current();
+
+ GlueTableOperations glueOps = (GlueTableOperations) ops;
+ S3FileIO s3FileIO = (S3FileIO) glueOps.io();
+ TableMetadata lastMetadata = null;
+ boolean isTablePurged = isTablePurged(identifier, s3FileIO.client());
+ if (!isTablePurged) {
+ lastMetadata = ops.current();
+ }
+
glue.deleteTable(DeleteTableRequest.builder()
.catalogId(awsProperties.glueCatalogId())
.databaseName(IcebergToGlueConverter.getDatabaseName(identifier))
.name(identifier.name())
.build());
LOG.info("Successfully dropped table {} from Glue", identifier);
+ ValidationException.check(!purge || !awsProperties.glueLakeFormationEnabled() || isTablePurged,
+ "Cannot purge table with LakeFormation enabled because S3 access is lost after table is dropped. " +
Review Comment:
> I think we still want the Glue table to be dropped even if LF is enabled
and purge is set to true?
Let's think about this again...
What we know is that if the table is LF-enabled and purge is true, there is
no way to do this safely, because:
1. if we drop the table first, we can pre-fetch the S3 credentials, but they
might not last long enough to remove all the table data files
2. if we remove all the table data files first, we might fail in the middle,
which leaves orphan files.
Originally I suggested that, because there is no perfect way, we ask the
customer to purge the data themselves, and add a validation that disallows
purge in Iceberg when LF is enabled. But thinking about it again, orphan files
after a partial failure are known behavior even for normal tables, so it looks
like we should go with 2: first remove the table data files (all the failures
are already suppressed in that operation), and then drop the table.
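Roughly, the ordering in option 2 could look like the sketch below. This is
only an illustration: `PurgeThenDropSketch`, `FileStore`, `Catalog` and
`dropTable` are hypothetical stand-ins that I made up for this comment, not
Iceberg or AWS SDK classes, and the real change would go through the existing
table operations and file IO rather than these interfaces.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Iceberg or AWS SDK classes. */
class PurgeThenDropSketch {

  /** Stand-in for file storage reached with the table's credentials. */
  interface FileStore {
    void delete(String path) throws Exception;
  }

  /** Stand-in for the catalog call that removes the table entry. */
  interface Catalog {
    void deleteTable(String name);
  }

  /**
   * Option 2: delete data files while access is still valid, suppress
   * per-file failures, then drop the catalog entry.
   * Returns the paths that could not be deleted (the orphan files).
   */
  static List<String> dropTable(String table, List<String> files, boolean purge,
                                FileStore store, Catalog catalog) {
    List<String> orphans = new ArrayList<>();
    if (purge) {
      for (String path : files) {
        try {
          store.delete(path);
        } catch (Exception e) {
          // Suppressed on purpose: a failed delete must not block the drop.
          orphans.add(path);
        }
      }
    }
    // The catalog entry is removed last, after the purge attempt.
    catalog.deleteTable(table);
    return orphans;
  }

  public static void main(String[] args) {
    List<String> events = new ArrayList<>();
    FileStore store = path -> {
      if (path.endsWith("b.parquet")) {
        throw new IllegalStateException("simulated delete failure");
      }
      events.add("delete " + path);
    };
    Catalog catalog = name -> events.add("drop " + name);

    List<String> orphans = dropTable("db.tbl",
        List.of("a.parquet", "b.parquet", "c.parquet"), true, store, catalog);

    System.out.println("events:  " + events);
    System.out.println("orphans: " + orphans);
  }
}
```

The point of the sketch is only the ordering: a failed file delete is
recorded and skipped, and the table is still dropped afterwards, which matches
what already happens for non-LF tables.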
+ @amogh-jahagirdar any thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]