I think reading the manifest list is an option, but previously we've spent a lot of effort trying to avoid requiring a manifest list scan as a part of upgrade. I do think that's a much more friendly option for users but I'd hate to have to scan every manifest list during upgrade just in case.
On Fri, Sep 12, 2025 at 4:37 PM Ryan Blue <rdb...@gmail.com> wrote: > Yeah, I think a clear message is the right option. There isn't a need for > a spec change, but we may want to detect whether we can convert a table to > v3 by reading the manifest list. Failing to update the table and requiring > a repair operation is better than updating and then failing to write. > > On Fri, Sep 12, 2025 at 11:24 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> I'm in support of just converting the NPE for now into a clearer message >> and finally finishing RepairMetadata/Manifests or whatever we have so folks >> can get those stats back into their manifests. >> >> On Fri, Sep 12, 2025 at 1:04 PM Christian Thiel < >> christian.t.b...@gmail.com> wrote: >> >>> Dear all, >>> while implementing v3 support in Rust, I noticed an unhandled case in >>> Java, which, Java beeing Java, leads to a NullPointerException. >>> >>> In V1 tables, existing_rows_count and added_rows_count is optional. >>> When parsed into a GenericManifestFile, it is thus parsed as optional. >>> >>> When this (data) manifest is now added to a new Snapshot in a V3 table, >>> we get a NullPointerException. >>> >>> I talked with Russel and we believe the proper way forward is to fail >>> via a validation. Please note this only affects very old V1 tables, as >>> newer V1 clients produce these fields. >>> >>> More details, the proposed fix, and a test are in this PR: >>> https://github.com/apache/iceberg/pull/14064 >>> >>> Do we need a Vote for this? Also, we should probably put more work into >>> the RepairMetadata PR, as this would be the way to fix it. >>> >>> Best, >>> Christian >>> >>