Please correct me if I'm wrong. The v3 spec for multi-arg transforms only advises using `source-ids` instead of `source-id`. Although it is implicit that only the bucket transform can take multiple arguments, the spec still does not say how the source columns are ordered or which algorithm is used to calculate the bucket value.
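To illustrate why the column order matters, here is a minimal sketch. Everything in it is an assumption made for illustration: `column_hash` uses CRC32 as a stand-in rather than the hash the spec defines for single-argument bucketing, and `combine_ordered` / `combine_unordered` are two hypothetical ways of merging per-column hashes, neither of which comes from the spec.

```python
import zlib


def column_hash(value) -> int:
    # Stand-in per-column hash (CRC32 of the value's repr).
    # Assumption for illustration only; not the hash defined by the spec.
    return zlib.crc32(repr(value).encode("utf-8"))


def combine_ordered(hashes) -> int:
    # Hypothetical order-sensitive combination: acc = 31 * acc + h (mod 2^32).
    acc = 0
    for h in hashes:
        acc = (31 * acc + h) & 0xFFFFFFFF
    return acc


def combine_unordered(hashes) -> int:
    # Hypothetical order-insensitive combination: XOR of all hashes.
    acc = 0
    for h in hashes:
        acc ^= h
    return acc


def bucket(values, num_buckets, combine) -> int:
    # Hash each source column value, merge the hashes, then take the modulus.
    return combine([column_hash(v) for v in values]) % num_buckets


if __name__ == "__main__":
    row = ("a", 1)
    swapped = (1, "a")
    print(bucket(row, 16, combine_ordered), bucket(swapped, 16, combine_ordered))
    print(bucket(row, 16, combine_unordered), bucket(swapped, 16, combine_unordered))
```

With an order-sensitive combination, writers and readers must agree on the order of `source-ids` to compute the same bucket. With an order-insensitive one the order is irrelevant. Either way, the spec would need to say which one applies.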
Is this something we need to clarify? A related question is whether the spec should state explicitly that duplicate values in `source-ids` are disallowed.

Best,
Gang

On Wed, Apr 30, 2025 at 7:07 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> We should probably come to a resolution on the compressed metadata.json
> name as well, although that's mostly retroactive. V3 would be the place
> where we could officially change the naming convention.
>
> I'm also interested in getting a release with the full implementation of
> V3 as it currently stands before we vote for the spec to be closed so
> folks can really kick the tires a bit before we really close things down.
>
> I don't think I have any other Spec items left
>
> On Tue, Apr 29, 2025 at 5:35 PM Ryan Blue <rdb...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> I think we’ve reached the point where it’s time to finalize and adopt the
>> changes for Iceberg v3. We’ve been working toward this for the last few
>> months and have now implemented the v3 features in the Java library to
>> reduce the risk of needing changes or hitting problems (row lineage support
>> in Spark 3.5 just went in!). We’ve also incorporated some clarifications
>> and minor changes back into the spec from what we’ve learned.
>>
>> At this point, I’m confident that the spec is reasonable and correct.
>> Thank you to everyone working on these reference implementations!
>>
>> The next step is to discuss any outstanding items or concerns about
>> moving forward, and then to have a vote thread to adopt the spec. I’ll
>> start off with a couple of items:
>>
>> One potential concern is that the upstream Variant spec hasn’t yet been
>> finalized by the Parquet community, but we’ve built a full, independent
>> implementation in Iceberg to validate the spec. I think the Parquet
>> community is primarily waiting on getting the PRs in to have a Java
>> reference implementation, so the risk of changes to the Variant spec is
>> small.
>>
>> There’s also an on-going vote to add encryption keys in support of full
>> table encryption that I think we want to get in.
>>
>> Any other items we may want to clear up?
>>
>> Ryan