Hi Peter

Thanks for the update. I will do a new pass on the PR.

Regards
JB

On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
> Hi Team,
> I have rebased the File Format API proposal 
> (https://github.com/apache/iceberg/pull/12298) to include the new changes 
> needed for the Variant types. I would love to hear your feedback, especially 
> Dan and Ryan, as you were the most active during our discussions. If I can 
> help in any way to make the review easier, please let me know.
> Thanks,
> Peter
>
> On Fri, Feb 28, 2025 at 5:50 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>
>> Hi everyone,
>> Thanks for all of the actionable, relevant feedback on the PR 
>> (https://github.com/apache/iceberg/pull/12298).
>> I have updated the code to address most of it. Please check whether you 
>> agree with the general approach.
>> If there is consensus on the general approach, I could split the PR into 
>> smaller pieces so that we can review and merge them more easily, step by 
>> step.
>> Thanks,
>> Peter
>>
>> On Thu, Feb 20, 2025 at 2:14 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>
>>> Hi Peter
>>>
>>> Sorry for the late reply on this.
>>>
>>> I did a pass on the proposal; it's very interesting and well written.
>>> I like the DataFile API, and it is definitely worth discussing all together.
>>>
>>> Maybe we can schedule a dedicated meeting to discuss the DataFile API?
>>>
>>> Thoughts?
>>>
>>> Regards
>>> JB
>>>
>>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <peter.vary.apa...@gmail.com> 
>>> wrote:
>>> >
>>> > Hi Team,
>>> >
>>> > As mentioned earlier at our Community Sync, I am exploring the possibility 
>>> > of defining a FileFormat API for accessing different file formats. I have 
>>> > put together a proposal based on my findings.
>>> >
>>> > -------------------
>>> > Iceberg currently supports 3 different file formats: Avro, Parquet, ORC. 
>>> > The Iceberg V3 specification adds many new features to Iceberg. Some of 
>>> > these features, like new column types and default values, require changes 
>>> > at the file format level. These changes are added by individual 
>>> > developers, each with a different focus on the different file formats. 
>>> > As a result, not all of the features are available for every supported 
>>> > file format.
>>> > There are also emerging file formats, like Vortex [1] or Lance [2], which, 
>>> > either through specialization or by applying newer research results, could 
>>> > provide better alternatives for certain use cases, like random access to 
>>> > data or storing ML models.
>>> > -------------------
>>> >
>>> > Please check the detailed proposal [3] and the Google document [4], and 
>>> > comment there or reply on the dev list if you have any suggestions.
>>> >
>>> > Thanks,
>>> > Peter
>>> >
>>> > [1] - https://github.com/spiraldb/vortex
>>> > [2] - https://lancedb.github.io/lance/
>>> > [3] - https://github.com/apache/iceberg/issues/12225
>>> > [4] - 
>>> > https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>> >
