Hi Peter

Thanks for the update. I will do a new pass on the PR.

Regards
JB

On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
> Hi Team,
> I have rebased the File Format API proposal
> (https://github.com/apache/iceberg/pull/12298) to include the new changes
> needed for the Variant types. I would love to hear your feedback, especially
> Dan and Ryan, as you were the most active during our discussions. If I can
> help in any way to make the review easier, please let me know.
> Thanks,
> Peter
>
> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Feb 28, 2025,
> 17:50):
>>
>> Hi everyone,
>> Thanks for all of the actionable, relevant feedback on the PR
>> (https://github.com/apache/iceberg/pull/12298).
>> I updated the code to address most of it. Please check if you agree with the
>> general approach.
>> If there is a consensus about the general approach, I could separate out
>> the PR into smaller pieces so we can have an easier time reviewing and
>> merging those step-by-step.
>> Thanks,
>> Peter
>>
>> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Feb 20, 2025,
>> 14:14):
>>>
>>> Hi Peter
>>>
>>> Sorry for the late reply on this.
>>>
>>> I did a pass on the proposal, it's very interesting and well written.
>>> I like the DataFile API, and it is definitely worth discussing all together.
>>>
>>> Maybe we can schedule a specific meeting to discuss the DataFile API?
>>>
>>> Thoughts?
>>>
>>> Regards
>>> JB
>>>
>>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <peter.vary.apa...@gmail.com>
>>> wrote:
>>> >
>>> > Hi Team,
>>> >
>>> > As mentioned earlier on our Community Sync, I am exploring the possibility
>>> > of defining a FileFormat API for accessing different file formats. I have
>>> > put together a proposal based on my findings.
>>> >
>>> > -------------------
>>> > Iceberg currently supports 3 different file formats: Avro, Parquet, ORC.
>>> > With the introduction of the Iceberg V3 specification, many new features
>>> > are added to Iceberg.
>>> > Some of these features, like new column types and default
>>> > values, require changes at the file format level. The changes are added by
>>> > individual developers with different focus on the different file formats.
>>> > As a result, not all of the features are available for every supported
>>> > file format.
>>> > Also, there are emerging file formats like Vortex [1] or Lance [2] which,
>>> > either by specialization or by applying newer research results, could
>>> > provide better alternatives for certain use-cases like random access for
>>> > data, or storing ML models.
>>> > -------------------
>>> >
>>> > Please check the detailed proposal [3] and the Google document [4], and
>>> > comment there or reply on the dev list if you have any suggestions.
>>> >
>>> > Thanks,
>>> > Peter
>>> >
>>> > [1] - https://github.com/spiraldb/vortex
>>> > [2] - https://lancedb.github.io/lance/
>>> > [3] - https://github.com/apache/iceberg/issues/12225
>>> > [4] -
>>> > https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>> >