Re: [DISCUSS] Finalizing the v3 spec

Gang Wu Tue, 29 Apr 2025 18:49:12 -0700

Please correct me if I'm wrong.

For multi-arg transforms, the v3 spec only advises using `source-ids`
instead of `source-id`. Although it is implicit that only the bucket
transform can take multiple arguments, the spec leaves two things unclear:
the order of the source columns, and the algorithm used to calculate the
bucket value.


Is this something we need to clarify? A related question is whether to
state explicitly that duplicate values in `source-ids` are disallowed.
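
To make the duplicate question concrete, here is a minimal sketch of the
check a reader of partition specs might apply. It is only an illustration:
`validate_source_ids` is a hypothetical helper, not part of any Iceberg
library, and the no-duplicates rule is the assumption under discussion, not
something the spec states today. It also preserves the listed order of the
ids, since the order may matter once the bucket algorithm is defined.

```python
# Hypothetical helper for illustration only; not an Iceberg API.
def validate_source_ids(partition_field: dict) -> list[int]:
    """Return the source column ids of a partition field, in listed order.

    Accepts the single-arg `source-id` form or the multi-arg `source-ids`
    form. Rejecting duplicates is an ASSUMED rule (the open question above).
    """
    if "source-ids" in partition_field:
        ids = list(partition_field["source-ids"])
    elif "source-id" in partition_field:
        ids = [partition_field["source-id"]]
    else:
        raise ValueError("partition field has neither source-id nor source-ids")

    if not ids:
        raise ValueError("source-ids must not be empty")

    seen = set()
    for column_id in ids:
        if column_id in seen:
            raise ValueError(f"duplicate source column id: {column_id}")
        seen.add(column_id)

    # Order is preserved: [1, 2] and [2, 1] are treated as different inputs.
    return ids


# Example partition field using the multi-arg form (values are made up).
field = {
    "source-ids": [1, 2],
    "field-id": 1000,
    "name": "bucket_ab",
    "transform": "bucket[16]",
}
ordered_ids = validate_source_ids(field)
```

Whether `[1, 2]` and `[2, 1]` should produce the same bucket value is exactly
the ordering question that would need to be settled in the spec.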

Best,
Gang

On Wed, Apr 30, 2025 at 7:07 AM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> We should probably come to a resolution on the compressed metadata.json
> name as well, although that's mostly retroactive. V3 would be the place
> where we could officially change the naming convention.
>
> I'm also interested in getting a release with the full implementation of
> V3 as it currently stands before we vote for the spec to be closed so
> folks can really kick the tires a bit before we really close things down.
>
> I don't think I have any other Spec items left
>
> On Tue, Apr 29, 2025 at 5:35 PM Ryan Blue <rdb...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> I think we’ve reached the point where it’s time to finalize and adopt the
>> changes for Iceberg v3. We’ve been working toward this for the last few
>> months and have now implemented the v3 features in the Java library to
>> reduce the risk of needing changes or hitting problems (row lineage support
>> in Spark 3.5 just went in!). We’ve also incorporated some clarifications
>> and minor changes back into the spec from what we’ve learned.
>>
>> At this point, I’m confident that the spec is reasonable and correct.
>> Thank you to everyone working on these reference implementations!
>>
>> The next step is to discuss any outstanding items or concerns about
>> moving forward, and then to have a vote thread to adopt the spec. I’ll
>> start off with a couple of items:
>>
>> One potential concern is that the upstream Variant spec hasn’t yet been
>> finalized by the Parquet community, but we’ve built a full, independent
>> implementation in Iceberg to validate the spec. I think the Parquet
>> community is primarily waiting on getting the PRs in to have a Java
>> reference implementation, so the risk of changes to the Variant spec is
>> small.
>>
>> There’s also an on-going vote to add encryption keys in support of full
>> table encryption that I think we want to get in.
>>
>> Any other items we may want to clear up?
>>
>> Ryan
>>
>
