I would be hesitant to turn on any new feature by default, especially for Spark compaction, which is widely used in production.
+1 for providing a way for users to enable the feature manually.

Gabor Kaszab <gaborkas...@apache.org> wrote (on Fri, Mar 14, 2025, 12:19):

> Hi Iceberg Community,
>
> There were recent additions to RemoveSnapshots to expire the unused
> partition specs and schemas. This is controlled by a flag called
> 'cleanExpiredMetadata' and has a default value 'false'. Additionally,
> Spark
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java#L147>
> and Flink
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/ExpireSnapshotsProcessor.java#L86>
> don't offer a way to set this flag currently.
>
> 1) Default value of RemoveSnapshots.cleanExpiredMetadata
> I'm wondering if it's desired by the community to default this flag to
> true. The effect of that would be that each snapshot expiration would also
> clean up the unused partition specs and schemas too. This functionality is
> quite new so this might need some extra confidence by the community before
> turning on by default but I think it's worth a consideration.
>
> 2) Spark and Flink to support setting this flag
> I think it makes sense to add support in Spark's ExpireSnapshotProcedure
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java#L116>
> and ExpireSnapshotsSparkAction
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java#L147>
> also to Flink's ExpireSnapshotsProcessor
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/ExpireSnapshotsProcessor.java#L58>
> and ExpireSnapshots
> <https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/ExpireSnapshots.java#L44>
> to allow setting this flag based on (user) inputs.
>
> WDYT?
>
> Regards,
> Gabor
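
To make the effect of the flag concrete, here is a rough, self-contained sketch of the behaviour under discussion. It is a toy model, not Iceberg code: ExpireSketch, Snapshot, Metadata, expire and sample are all hypothetical names, and I am assuming that a schema counts as "unused" when it is neither the current schema nor referenced by any retained snapshot. The same idea would presumably apply to partition specs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ExpireSketch {
    /** Hypothetical stand-in for a snapshot: an id plus the schema it was written with. */
    static final class Snapshot {
        final long id;
        final int schemaId;

        Snapshot(long id, int schemaId) {
            this.id = id;
            this.schemaId = schemaId;
        }
    }

    /** Hypothetical stand-in for table metadata; not an Iceberg class. */
    static final class Metadata {
        final List<Snapshot> snapshots;
        final Set<Integer> schemaIds;
        final int currentSchemaId;

        Metadata(List<Snapshot> snapshots, Set<Integer> schemaIds, int currentSchemaId) {
            this.snapshots = snapshots;
            this.schemaIds = schemaIds;
            this.currentSchemaId = currentSchemaId;
        }
    }

    /**
     * Drops the expired snapshots. Only when cleanExpiredMetadata is true does it
     * also drop schemas that no retained snapshot (and not the current schema) uses.
     */
    static Metadata expire(Metadata meta, Set<Long> expiredIds, boolean cleanExpiredMetadata) {
        List<Snapshot> kept = new ArrayList<>();
        for (Snapshot s : meta.snapshots) {
            if (!expiredIds.contains(s.id)) {
                kept.add(s);
            }
        }

        Set<Integer> schemas = new TreeSet<>(meta.schemaIds);
        if (cleanExpiredMetadata) {
            // Assumption: reachable = current schema + schemas of retained snapshots.
            Set<Integer> reachable = new HashSet<>();
            reachable.add(meta.currentSchemaId);
            for (Snapshot s : kept) {
                reachable.add(s.schemaId);
            }
            schemas.retainAll(reachable);
        }
        return new Metadata(kept, schemas, meta.currentSchemaId);
    }

    /** Three snapshots, each written with a different schema; schema 2 is current. */
    static Metadata sample() {
        List<Snapshot> snaps = new ArrayList<>();
        snaps.add(new Snapshot(1L, 0));
        snaps.add(new Snapshot(2L, 1));
        snaps.add(new Snapshot(3L, 2));
        Set<Integer> schemas = new TreeSet<>();
        schemas.add(0);
        schemas.add(1);
        schemas.add(2);
        return new Metadata(snaps, schemas, 2);
    }

    public static void main(String[] args) {
        Set<Long> expired = new HashSet<>();
        expired.add(1L);
        // Flag off: only the snapshot is removed, all schemas stay.
        System.out.println(expire(sample(), expired, false).schemaIds);
        // Flag on: schema 0 is no longer referenced, so it is removed as well.
        System.out.println(expire(sample(), expired, true).schemaIds);
    }
}
```

In this model, expiring snapshot 1 with the flag off leaves all three schema ids in place, while the same call with the flag on also drops schema 0. That second, additional metadata change is why I would prefer it to stay opt-in for now.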