elkhand opened a new issue #2033: URL: https://github.com/apache/iceberg/issues/2033
Hello Iceberg community,

I'm facing an issue with metadata files in S3 for an Iceberg table when versioning is **enabled** on the S3 bucket. I'm using version `0.10.0` of the Iceberg Flink connector. The Flink job ingests data into S3 with a checkpointing interval of 1 hour.

For an S3 bucket with **versioning disabled**, 3 files are created in the metadata folder every hour:

- Metadata file: 00001-180a83f3-c229-4ed5-a9dc-2f9c235f6d52.metadata.json
- Manifest list file: snap-5807373598091371828-1-87p0b872-3f55-9b79-8cee-b1d354f2c378.avro
- Manifest file: 39d0b872-4f56-4b79-8cee-c0a354f2c575-m0.avro

For an S3 bucket with **versioning enabled**, only a single file is created in the metadata folder every hour:

- Manifest file: 4c806ffdb03c41e09337b90f18781570-00000-0-466-00001.avro

The metadata file and manifest list file are **NOT** created or updated. As a result, new data (partitions) is not available until the metadata and manifest list files are created. If I restart the Flink job, the new metadata and manifest list files are created, and the missing partitions become visible again.

A few questions:

1) Has anyone faced a similar issue with S3 bucket versioning?
2) Why are the metadata file and manifest list file not created/updated with every checkpoint?
3) How can S3 bucket versioning affect the manifest file version? "-m0.avro" suffix (V1 version) vs ".avro" suffix (V2 version)

Thank you.
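
For anyone trying to reproduce this, here is a minimal sketch of how the contents of the metadata folder could be tallied by file type after each checkpoint. The filename patterns are inferred only from the example names in this report, so they are an assumption and not an official Iceberg naming rule; `classify` and `summarize` are hypothetical helper names, and the list of file names is assumed to come from whatever S3 listing tool is at hand.

```python
import re
from collections import Counter

# Patterns inferred from the example file names in this report.
# They are assumptions, not an official Iceberg naming specification.
# Order matters: the manifest pattern would also match manifest lists.
PATTERNS = [
    ("metadata", re.compile(r"^\d+-[0-9a-f-]+\.metadata\.json$")),
    ("manifest_list", re.compile(r"^snap-\d+-\d+-[0-9a-z-]+\.avro$")),
    ("manifest", re.compile(r"^[0-9a-z-]+\.avro$")),
]


def classify(name):
    """Return the assumed file type for one file name in the metadata folder."""
    for kind, pattern in PATTERNS:
        if pattern.match(name):
            return kind
    return "other"


def summarize(names):
    """Count how many files of each assumed type appear in a listing."""
    return Counter(classify(name) for name in names)


# File names reported for the bucket with versioning disabled.
unversioned = [
    "00001-180a83f3-c229-4ed5-a9dc-2f9c235f6d52.metadata.json",
    "snap-5807373598091371828-1-87p0b872-3f55-9b79-8cee-b1d354f2c378.avro",
    "39d0b872-4f56-4b79-8cee-c0a354f2c575-m0.avro",
]

# File name reported for the bucket with versioning enabled.
versioned = ["4c806ffdb03c41e09337b90f18781570-00000-0-466-00001.avro"]

print("versioning disabled:", dict(summarize(unversioned)))
print("versioning enabled: ", dict(summarize(versioned)))
```

Comparing the two summaries per checkpoint interval makes the symptom easy to spot: a listing that only ever gains `manifest` entries, with no new `metadata` or `manifest_list` entries, matches the behavior described above.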
