Let's bring this up in the next community sync on Wed 8/13 and see if it warrants another ad hoc sync about FileIO.
Best,
Kevin Liu

On Wed, Aug 6, 2025 at 1:21 PM Stubbs, Michael <michs...@amazon.co.uk.invalid> wrote:

> I think Kevin has raised a few good points on the future of FileIO and the
> maintainability of the project going forward with the AAL default-on
> proposal.
>
> I think we should schedule a community sync about this.
>
> Thank you!
>
> On 2025/07/31 13:12:48 Steve Loughran wrote:
> > On Fri, 25 Jul 2025 at 17:28, Kevin Liu <ke...@apache.org> wrote:
> >
> > *> I think it would be great to also make these improvements available to
> > older Iceberg clients.*
> >
> > Use the S3A connector and turn on vector reads through Parquet and you
> > currently get the same performance, about a 30% speedup in TPC benchmarks
> > (I know, but what else do we have?). The S3A connector is going to move to
> > making the AAL input stream the default in a future release because it's a
> > better architecture overall.
> >
> > The vector IO stuff can also deliver speedups on Azure and abfs if their
> > connectors support it (oh, and local fs too, FWIW). Scatter/gather IO for
> > the win.
> >
> > What AAL adds is format awareness, which could allow for extra
> > opportunities.
> >
> > As an aside, it'd be really good if FileIO added an overloaded
> > newInputFile() API call that passed in the file type too, so that AAL etc.
> > would know what type to optimise for, rather than just guessing from the
> > extension. Knowing the v1 and maybe v2 schema offsets would save that GET
> > request on the footer. AAL does a GET of a range at the bottom with the
> > goal of including the schema, but knowing the exact range would be better.
> >
> > *> BTW, we have one-off community syncs about specific topics, I would be
> > interested to talk more about this as well as other FileIOs. We use the
> > "Iceberg Dev Events" calendar for scheduling if there's interest.*
> >
> > I'd like that too.
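
To make Steve's footer point concrete, here is a minimal Python sketch of why a known footer range can beat a speculative tail read. It is not AAL or Iceberg code. `CountingStore`, `fake_parquet`, `read_footer_guessing`, `read_footer_with_hint` and the `TAIL_GUESS` size are all hypothetical names and values invented for illustration. The only fact taken as given is the Parquet trailer layout: the file metadata is followed by a 4-byte little-endian length and the `PAR1` magic.

```python
import struct

PARQUET_MAGIC = b"PAR1"
TAIL_GUESS = 64 * 1024  # hypothetical speculative tail size, not AAL's real setting


class CountingStore:
    """In-memory stand-in for an object store that counts ranged GETs."""

    def __init__(self, data: bytes):
        self.data = data
        self.gets = 0

    def get_range(self, start: int, end: int) -> bytes:
        """Return bytes [start, end) and record one request."""
        self.gets += 1
        return self.data[start:end]


def fake_parquet(body_len: int, meta: bytes) -> bytes:
    """Build a blob with a Parquet-style trailer: metadata, length, magic."""
    trailer = struct.pack("<I", len(meta)) + PARQUET_MAGIC
    return PARQUET_MAGIC + b"\x00" * body_len + meta + trailer


def read_footer_guessing(store, file_len, tail_guess=TAIL_GUESS):
    """No hint: GET a guessed tail range, then a second GET if it was too small."""
    start = max(0, file_len - max(tail_guess, 8))
    tail = store.get_range(start, file_len)
    if tail[-4:] != PARQUET_MAGIC:
        raise ValueError("missing Parquet magic")
    (meta_len,) = struct.unpack("<I", tail[-8:-4])
    meta_start = file_len - 8 - meta_len
    if meta_start >= start:
        # The guess covered the whole footer: slice it out of the tail.
        return tail[meta_start - start:len(tail) - 8]
    # The guess was too small: one more round trip for the exact range.
    return store.get_range(meta_start, file_len - 8)


def read_footer_with_hint(store, meta_start, meta_len):
    """Caller supplies the exact metadata range: always a single GET."""
    return store.get_range(meta_start, meta_start + meta_len)
```

With a 100-byte footer and a 32-byte guess, the guessing reader needs two GETs, while the hinted reader always needs one. A larger guess also gets down to one request, but at the cost of fetching bytes that may not be needed.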