hguercan commented on issue #13763:
URL: https://github.com/apache/iceberg/issues/13763#issuecomment-3183511355
> Do you have access to the table's `metadata.json` file? If so, can you
share the snapshot summary of the snapshots with `snapshot-sequence` `59331`
and `59330`? These are the sequences of the snapshots in which the file you
mentioned was seemingly added according to the values in `file_sequence_number`
(in a regular situation, there should be only one as you say).
>
> They can be found, if not expired, in the `snapshots` key which contains
an array of snapshots.
>
> If they no longer exist in this array, perhaps an older `metadata.json`
which does still have this information still exists.
>
> I am asking because this connector utilizes snapshot properties for some
of the consistency guarantee mechanisms, and this could help pointing in the
right direction (or away from the wrong direction).

Unfortunately, I don't have access to that snapshot and its metadata anymore,
but I have a fresh case where I did the same analysis and can share the
`snapshots` entries (including their summaries) for the corresponding snapshot IDs:
```json
{
  "sequence-number": 54478,
  "snapshot-id": 6176209833558056528,
  "parent-snapshot-id": 2306737064233706627,
  "timestamp-ms": 1755016698301,
  "summary": {
    "operation": "append",
    "kafka.connect.commit-id": "ec688e95-d729-41ce-86ce-a3ed1ce89ffb",
    "kafka.connect.offsets.wawi.iceberg-control-<my-table-name>-sink-iceberg": "{\"0\":5501096}",
    "added-data-files": "46015",
    "added-records": "13656877",
    "added-files-size": "904639986",
    "changed-partition-count": "4230",
    "total-records": "784700226",
    "total-files-size": "17779462489",
    "total-data-files": "90567",
    "total-delete-files": "0",
    "total-position-deletes": "0",
    "total-equality-deletes": "0",
    "iceberg-version": "Apache Iceberg 1.9.2 (commit 071d5606bc6199a0be9b3f274ec7fbf111d88821)"
  },
  "manifest-list": "abfss://<irrelevant-path>/metadata/snap-6176209833558056528-1-3be0fa19-154c-47fb-995a-631988706f20.avro",
  "schema-id": 3
},
{
  "sequence-number": 54479,
  "snapshot-id": 7322125787317837392,
  "parent-snapshot-id": 6176209833558056528,
  "timestamp-ms": 1755017224903,
  "summary": {
    "operation": "append",
    "kafka.connect.commit-id": "48b912ad-cb3e-40ca-be7f-81fbb6724b0e",
    "kafka.connect.offsets.wawi.iceberg-control-<my-table-name>-sink-iceberg": "{\"0\":5708861}",
    "added-data-files": "236825",
    "added-records": "80701320",
    "added-files-size": "4926359306",
    "changed-partition-count": "8828",
    "total-records": "865401546",
    "total-files-size": "22705821795",
    "total-data-files": "327392",
    "total-delete-files": "0",
    "total-position-deletes": "0",
    "total-equality-deletes": "0",
    "iceberg-version": "Apache Iceberg 1.9.2 (commit 071d5606bc6199a0be9b3f274ec7fbf111d88821)"
  },
  "manifest-list": "abfss://<irrelevant-path>/metadata/snap-7322125787317837392-1-2ac3f8c4-c119-479e-80a6-38c65bfdc7f5.avro",
  "schema-id": 3
},
```
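
In case it helps anyone repeat this lookup, here is a minimal Python sketch of how the two entries can be pulled out of the `snapshots` array by `sequence-number` and sanity-checked. The helper names (`snapshots_by_sequence`, `check_append_totals`) are made up for illustration, and the `metadata` dict is a trimmed copy of the two snapshots above rather than a full `metadata.json`. The check assumes that, for a plain `append` whose parent is the directly preceding snapshot, each running total should equal the parent's total plus the child's `added-*` value; that is simple arithmetic on the summary fields, not an official validation.

```python
def snapshots_by_sequence(metadata: dict, sequence_numbers: set) -> dict:
    """Return {sequence-number: snapshot} for the requested sequence numbers."""
    return {
        snap["sequence-number"]: snap
        for snap in metadata.get("snapshots", [])
        if snap.get("sequence-number") in sequence_numbers
    }


def check_append_totals(parent: dict, child: dict) -> dict:
    """Check parent total + child added == child total for each counter.

    Assumption: `child` is a plain append directly on top of `parent`.
    Summary values are strings in metadata.json, so they are cast to int.
    """
    pairs = {
        "total-records": "added-records",
        "total-data-files": "added-data-files",
        "total-files-size": "added-files-size",
    }
    p, c = parent["summary"], child["summary"]
    return {
        total: int(p[total]) + int(c[added]) == int(c[total])
        for total, added in pairs.items()
    }


# Trimmed copy of the two snapshots shared above (only the fields used here).
metadata = {
    "snapshots": [
        {
            "sequence-number": 54478,
            "snapshot-id": 6176209833558056528,
            "summary": {
                "operation": "append",
                "added-data-files": "46015",
                "added-records": "13656877",
                "added-files-size": "904639986",
                "total-records": "784700226",
                "total-files-size": "17779462489",
                "total-data-files": "90567",
            },
        },
        {
            "sequence-number": 54479,
            "snapshot-id": 7322125787317837392,
            "parent-snapshot-id": 6176209833558056528,
            "summary": {
                "operation": "append",
                "added-data-files": "236825",
                "added-records": "80701320",
                "added-files-size": "4926359306",
                "total-records": "865401546",
                "total-files-size": "22705821795",
                "total-data-files": "327392",
            },
        },
    ]
}

found = snapshots_by_sequence(metadata, {54478, 54479})
result = check_append_totals(found[54478], found[54479])
print(result)
# {'total-records': True, 'total-data-files': True, 'total-files-size': True}
```

Against a real table, `metadata` would come from `json.load` on a local copy of the table's `metadata.json`. For the two snapshots above, all three counters add up, so the summaries themselves are internally consistent.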