Okay, I am looking at scheduling a discussion on this for Wednesday the 27th, 4:00 – 5:00 pm UTC. Does this time not work for anyone who would like to attend?

From: Tushar Choudhary <tushar.choudhary...@gmail.com>
Reply to: "dev@iceberg.apache.org" <dev@iceberg.apache.org>
Date: Thursday, 14 August 2025 at 04:39
To: "dev@iceberg.apache.org" <dev@iceberg.apache.org>
Subject: RE: [EXTERNAL] [Discuss] Analytics Accelerator Library for Amazon S3 as default S3 Input Stream

I agree

Cheers,
Tushar Choudhary

On Wed, 13 Aug 2025 at 10:39 PM, Kevin Liu <kevinjq...@apache.org> wrote:

Hey everyone,

As discussed on the community sync today, there's enough interest around AAL, S3FileIO, and FileIO in general that we would like to schedule an ad hoc sync for this topic. I'll work with Michael to find a suitable time. We'll add it to the "Iceberg Dev Events" calendar <https://iceberg.apache.org/community/#apache-iceberg-community-calendar> and also post on the devlist when we figure out more details.

Best,
Kevin Liu

On Mon, Aug 11, 2025 at 8:00 AM Kevin Liu <kevinjq...@apache.org> wrote:

Let's bring this up in the next community sync on Wed 8/13 and see if it warrants another ad hoc sync about FileIO.

Best,
Kevin Liu

On Wed, Aug 6, 2025 at 1:21 PM Stubbs, Michael <michs...@amazon.co.uk.invalid> wrote:

I think Kevin has raised a few good points on the future of FileIO and the maintainability of the project going forward with the AAL default-on proposal. I think we should schedule a community sync about this. Thank you!

On 2025/07/31 13:12:48 Steve Loughran wrote:
> On Fri, 25 Jul 2025 at 17:28, Kevin Liu <ke...@apache.org> wrote:
>
> *> I think it would be great to also make these improvements available to
> older Iceberg clients.*
>
> Use the S3A connector and turn on vector reads through Parquet and you
> currently get the same performance, about a 30% speedup in TPC benchmarks
> (I know, but what else do we have?). The S3A connector is going to move to
> making the AAL input stream the default in a future release because it's a
> better architecture overall.
>
> The vector IO stuff can also give a speedup on Azure and abfs if their
> connectors support it (oh, and local fs too, FWIW). Scatter/gather IO for
> the win.
>
> What AAL adds is format awareness, which could allow for extra
> opportunities.
>
> As an aside, it'd be really good if FileIO added an overloaded
> newInputFile() API call which passed in the file type too, so that AAL &c
> would know what type to optimise for, rather than just guess from the
> extension. Knowing what the v1 and maybe v2 schema offsets are would save
> that GET request on the footer. AAL does a GET of a range at the bottom
> with the goal of including the schema, but knowing the exact range would
> be better.
>
> *> BTW, we have one-off community syncs about specific topics, I would be
> interested to talk more about this as well as other FileIOs. We use the
> "Iceberg Dev Events" calendar for scheduling if there's interest.*
>
> I'd like that too.
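
The aside about passing the file type (and, where known, the footer range) into newInputFile() can be illustrated with a small sketch. This is not Iceberg's FileIO API or AAL's implementation: FileType, FooterHint, first_read_range and the 1 MiB speculative tail size are all hypothetical names and values chosen for illustration, and Python is used only to keep the example short. It shows the difference between guessing from the extension and reading a speculative tail, versus being told the type and the exact range up front.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class FileType(Enum):
    """Hypothetical format hint a caller could pass alongside the path."""
    PARQUET = "parquet"
    AVRO = "avro"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class FooterHint:
    """Hypothetical: exact byte offset and length of the footer, if known."""
    offset: int
    length: int


# Illustrative value only; not taken from AAL or any connector.
SPECULATIVE_TAIL_BYTES = 1024 * 1024


def guess_type_from_extension(path: str) -> FileType:
    """Fallback when no hint is given: guess from the file extension."""
    lowered = path.lower()
    if lowered.endswith(".parquet"):
        return FileType.PARQUET
    if lowered.endswith(".avro"):
        return FileType.AVRO
    return FileType.UNKNOWN


def first_read_range(
    path: str,
    file_length: int,
    file_type: Optional[FileType] = None,
    footer: Optional[FooterHint] = None,
) -> Optional[Tuple[int, int]]:
    """Return the (start, end_exclusive) byte range to fetch first.

    Returns None when this sketch would not prefetch a tail at all.
    """
    # Prefer the caller's hint; only guess when nothing was passed in.
    resolved = file_type if file_type is not None else guess_type_from_extension(path)
    if resolved is not FileType.PARQUET:
        return None
    if footer is not None:
        # Exact range known: fetch only the footer bytes.
        return (footer.offset, footer.offset + footer.length)
    # No exact range: read a speculative tail and hope it covers the footer.
    start = max(0, file_length - SPECULATIVE_TAIL_BYTES)
    return (start, file_length)
```

With a hint, a file with no recognisable extension is still treated as Parquet, and a known footer range replaces the speculative tail read with one exact request.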