+1. Comparing decoded values is a great fit for value-correctness cases
(variant, deletes, nested), but I am not sure it would catch type-level
divergence like equality_ids int-vs-long (apache/iceberg-go#880): the
values decode equal, and a plain-JSON expected can't even express "int, not
long". It may be worth scoping v1 around value-correctness first and
bringing the type-level cases in with a typed expected or a write-side
check (along the lines Tanmay described).
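
A minimal sketch of the gap described above, assuming a hypothetical decoded representation in which each field carries its Avro type next to its values. The dict layout and the helper names (`values_only`, `matches_plain`, `matches_typed`) are made up for illustration and are not taken from the proposal doc or any implementation.

```python
import json

# Hypothetical decoded manifest fields: same equality_ids values,
# tagged once with Avro type "int" and once with "long".
decoded_int = {"equality_ids": {"type": "int", "values": [1, 2]}}
decoded_long = {"equality_ids": {"type": "long", "values": [1, 2]}}

# A plain-JSON expected file can only state the values.
expected_plain = json.loads('{"equality_ids": [1, 2]}')

# A typed expected (assumed format) also states the physical type.
expected_typed = {"equality_ids": {"type": "int", "values": [1, 2]}}


def values_only(decoded):
    """Drop the type tags, keeping only what plain JSON can express."""
    return {name: field["values"] for name, field in decoded.items()}


def matches_plain(decoded):
    """Value-level check: blind to int vs long."""
    return values_only(decoded) == expected_plain


def matches_typed(decoded):
    """Type-aware check: compares the type tag as well as the values."""
    return decoded == expected_typed
```

With these definitions both files pass the value-level check, and only the typed expected separates them.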

Thanks
Xin

On Tue, Jun 30, 2026 at 9:32 AM Jones, Danny <[email protected]> wrote:

> +1, I’m broadly aligned with this proposal. I think having a reference
> physical artifact to then compare against is valuable.
>
>
>
> My team have been working on a few sets of tests that are of a similar
> nature. The motivation for us has been correctness of maintenance
> operations. I’ll share a bit of info here about these, in case it’s relevant
> and to understand how they may complement this proposal.
>
>
>
> The first set tackles the question: does the compaction/replace operation
> result in the same logical data? We generate a table using Spark and
> then run some compaction operation (using a runner harness API similar to
> the one proposed in this doc, i.e. “please compact table X”; it could be
> PySpark, or an API call to some maintenance service). Afterwards, we run a
> few common query engines (Spark, DuckDB, etc.) and verify that they agree
> with respect to an order-independent checksum, row count, and NDVs, and
> that they return the same presence or absence for sampled rows to exercise
> the metadata path.
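
One way an order-independent checksum, row count, and NDV check could look, as a sketch only. The thread does not say how the actual checksum is computed; summing per-row SHA-256 digests modulo 2^64 is an assumption here, and `row_digest`, `table_fingerprint`, and `ndv` are hypothetical names. Rows are assumed to be tuples of simple values whose `repr` is stable across engines.

```python
import hashlib


def row_digest(row):
    """Hash one row to a 64-bit integer (assumes repr() is stable for the values)."""
    data = repr(tuple(row)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def table_fingerprint(rows):
    """Return (checksum, row_count); addition makes the checksum order-independent."""
    checksum = 0
    count = 0
    for row in rows:
        # Modular addition rather than XOR, so duplicate rows do not cancel out.
        checksum = (checksum + row_digest(row)) % (1 << 64)
        count += 1
    return checksum, count


def ndv(rows, column):
    """Number of distinct values in one column position."""
    return len({row[column] for row in rows})


# Two engines returning the same rows in a different order should agree.
engine_a = [(1, "a"), (2, "b"), (2, "b")]
engine_b = [(2, "b"), (1, "a"), (2, "b")]
```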
>
>
>
> Second, we have a table builder that uses the small model property
> (formal methods) to build an exhaustive set of table layouts, which we
> can use as input to the above tests. For example, one test case is 3
> physical rows across 2 data files, with 2 rows deleted across 2 positional
> delete files.
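
A rough sketch of what exhaustive enumeration over a small bound could look like. This is not the table builder described above; the `layouts` function and its layout encoding are hypothetical, and it makes no attempt to remove layouts that are equivalent up to renaming files.

```python
from itertools import combinations, product


def layouts(n_rows, n_data_files, n_deleted, n_delete_files):
    """Yield (data_assignment, deletes) pairs for a small table.

    data_assignment[i] is the data file holding row i; deletes maps a
    deleted row index to the positional delete file that removes it.
    Every data file and every delete file must be used at least once.
    """
    for data_assign in product(range(n_data_files), repeat=n_rows):
        if set(data_assign) != set(range(n_data_files)):
            continue  # skip layouts that leave a data file empty
        for deleted in combinations(range(n_rows), n_deleted):
            for del_assign in product(range(n_delete_files), repeat=n_deleted):
                if set(del_assign) != set(range(n_delete_files)):
                    continue  # skip layouts that leave a delete file empty
                yield data_assign, dict(zip(deleted, del_assign))


# The case from the message: 3 rows, 2 data files, 2 deletes, 2 delete files.
small_cases = list(layouts(3, 2, 2, 2))
```

Each yielded layout could then be materialized as a real table and fed to the cross-engine checks.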
>
>
>
> Third, we have a validator we’ve written in Rust that just loads a
> metadata DAG (the JSON, manifest lists, manifests) and validates a bunch of
> invariants, e.g. data/file sequence numbers in manifest entries are
> optional ONLY for Added status, and sequence numbers are non-negative; for
> the replace operation, table-uuid is immutable, schema fields don’t change,
> etc. I think it’s a little imperfect since it relies on iceberg-rust, which
> very helpfully swaps some things under the hood, but that means we aren’t
> testing the actual physical artifacts.
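
The invariants listed above could be expressed roughly as follows. This is a Python sketch over already-decoded dicts, not the Rust validator itself; `check_entry` and `check_replace` are hypothetical names, and the entry dicts are a simplified stand-in for real manifest entries.

```python
# Manifest entry status codes from the Iceberg spec: 0 EXISTING, 1 ADDED, 2 DELETED.
ADDED = 1


def check_entry(entry):
    """Return a list of violations for one (simplified) manifest entry dict."""
    violations = []
    for field in ("sequence_number", "file_sequence_number"):
        value = entry.get(field)
        if value is None:
            # Sequence numbers may be omitted only for ADDED entries.
            if entry["status"] != ADDED:
                violations.append(f"{field} missing for status {entry['status']}")
        elif value < 0:
            violations.append(f"{field} is negative: {value}")
    return violations


def check_replace(before, after):
    """Check table metadata before and after a replace operation."""
    if before["table-uuid"] != after["table-uuid"]:
        return ["table-uuid changed"]
    return []
```

Run directly against the raw decoded files rather than a library's object model, checks like these would see the physical values without any helpful defaults filled in.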
>
>
>
> Happy to chat more about some of these. The first two I’ve been working
> with a colleague on getting into a shape to publish on GitHub.
>
>
>
> Danny
>
>
>
> On 2026/06/29 18:40:43 Tanmay Rauth wrote:
>
> > Thanks Neelesh, the doc lays this out really well, and +1 to Matt. The
> > framing I'd most want to underline is one you already make: the hardest
> > cases aren't bugs, they're where two implementations both follow the spec
> > faithfully and still disagree. The day-transform field type (iceberg#16414)
> > is a good example, and I think your point that writing down the expected
> > value is what forces the ambiguity to get resolved is one of the strongest
> > motivations for the proposal. That's something per-implementation CI can
> > never do on its own.
>
> >
>
> > A couple of small things, for whatever they're worth:
>
> >
>
> > - The decoded-values-not-bytes approach feels right. The day-transform case
> > (iceberg#16414) is exactly where a byte-level comparison would flag two
> > valid encodings as different, while a value comparison correctly treats
> > them as equivalent.
> > - On the open question of reads-only vs. reads+writes: iceberg-go#880
> > actually originated on the write side (Go wrote equality_ids as long, and
> > Java failed when reading it). It might be worth structuring each fixture
> > as input -> golden file -> expected value from the start. The same fixture
> > can then exercise both directions: read tests verify that the golden file
> > decodes to the expected value, while write tests verify that an
> > implementation produces a conforming golden file. That avoids having to
> > re-author fixtures when write conformance is added later.
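
The input -> golden file -> expected value structure can be sketched as a single record that drives both directions. Everything here is hypothetical: the `Fixture` class, the `read_conforms` and `write_conforms` helpers, and the JSON codec standing in for a real manifest or metadata codec. "Conforming" is taken to mean that a reference decoder reads the produced bytes back to the expected value, which is only one possible definition.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Fixture:
    input: Any      # logical input handed to a writer
    golden: bytes   # reference physical artifact
    expected: Any   # decoded value the artifact must yield


def read_conforms(fixture, decode):
    """Read test: the implementation's decoder yields the expected value."""
    return decode(fixture.golden) == fixture.expected


def write_conforms(fixture, encode, reference_decode):
    """Write test: bytes produced from the input decode to the expected value."""
    return reference_decode(encode(fixture.input)) == fixture.expected


# Toy codec standing in for an implementation under test.
def toy_encode(value):
    return json.dumps(value, sort_keys=True).encode("utf-8")


def toy_decode(data):
    return json.loads(data)


fixture = Fixture(input={"id": 1}, golden=b'{"id": 1}', expected={"id": 1})
```

Because the write check here compares decoded values, it shares the value-level blind spot for cases like int vs long unless the expected value or the reference decoder is type-aware.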
>
> >
>
> > Thanks for putting this together.
>
> >
>
> > Regards,
>
> > Tanmay Rauth
>
> >
>
> > On Mon, Jun 29, 2026 at 9:57 AM Sung Yun <[email protected]> wrote:
>
> >
>
> > > +1, thanks Neelesh. Linking my parallel thread and doc for anyone who
>
> > > wants the detail [1].
>
> > >
>
> > > Having read your write-up, I think the two are substantially the same
> > > proposal, with just a narrow difference around the proposed repo layout
> > > and the integration plan. I think it's a great sign that there's already
> > > a great amount of overlap in our thoughts. I agree that a community sync
> > > sounds worthwhile, and it would also be useful to converge the two docs in
> > > parallel so we bring one proposal back here for review and convergence
> > > through lazy consensus.
>
> > >
>
> > > A few areas from my version/poc [2] I think are worth folding in as points
> > > to discuss and converge on:
> > >
> > > - Contribution/README guides for adding and reviewing fixtures.
> > > - A submodule-based integration pattern, with each implementation pinning
> > > the fixture repo to a commit.
> > > - How each test surface is meant to be consumed and integrated by the
> > > individual implementations in their CI.
>
> > >
>
> > > Sung
>
> > >
>
> > > [1] https://lists.apache.org/thread/964630c6q0jovs579x1jzb1t0o19jgjg
>
> > > [2] https://github.com/sungwy/iceberg-testing/pull/1
>
> > >
>
> > > On 2026/06/29 16:47:18 Neelesh Salian wrote:
>
> > > > Thanks Matt. Seems like there is interest in doing this.
>
> > > > Separately, Sung has a similar proposal in the community and we are
>
> > > > connected offline to sync and converge since the proposals are along
>
> > > > similar lines.
>
> > > > Will update this thread as we discuss.
>
> > > > If there are more folks interested in this, it might be worth doing a
> > > > community one-off sync to brainstorm this as well.
>
> > > >
>
> > > > On Mon, Jun 29, 2026 at 8:30 AM Matt Topol <[email protected]> wrote:
>
> > > >
>
> > > > > Thanks for the proposal! I'm gonna read through this, but I just wanted
> > > > > to chime in that this is something I've been desiring and hoping for
> > > > > for a long time. We've encountered tons of cases during the development
> > > > > of iceberg-go where implementations diverged while still following the
> > > > > letter of the spec. This kind of testing is very much needed.
>
> > > > >
>
> > > > > --Matt
>
> > > > >
>
> > > > > On Mon, Jun 29, 2026, 11:11 AM Neelesh Salian <[email protected]> wrote:
>
> > > > >
>
> > > > >> Hi all,
>
> > > > >>
>
> > > > > >> Each Iceberg implementation has its own tests, but there isn't a
> > > > > >> shared way to check that a table written by one is read the same
> > > > > >> way by another. A few examples that have come up across the
> > > > > >> implementations: a manifest written by one client that another
> > > > > >> can't read, a metadata.json one writer produces that another
> > > > > >> rejects because they disagree on whether a field is required, and
> > > > > >> a partition transform that ends up encoded more than one way
> > > > > >> across implementations. Some of these turned out to be bugs,
> > > > > >> others were places where the spec is ambiguous.
>
> > > > >>
>
> > > > > >> We think this is worth solving with some form of shared
> > > > > >> cross-implementation conformance testing, and we'd like to align as
> > > > > >> a community on whether to take it on and how best to start. We've
> > > > > >> written up our current thinking, a possible direction, and a small
> > > > > >> prototype in the doc below.
>
> > > > >>
>
> > > > >> Details, a repo design, and the interop failures we've collected:
>
> > > > >>
>
> > > > > >> https://docs.google.com/document/d/1HRcUMcrqUjo4CjGdwAIw85f7miWOGJ4ZJ90AgHbahaw/edit?usp=sharing
>
> > > > >>
>
> > > > >>
>
> > > > > >> Feedback welcome on whether this is worth doing and how we might get
> > > > > >> started.
>
> > > > >>
>
> > > > >> Thanks,
>
> > > > >> Neelesh (with Andrei Tserakhau)
>
> > > > >>
>
> > > > >
>
> > > >
>
> > >
>
> >
>
