Perhaps a newbie question, but if the requirement is just to read v2 tables with equality and/or position delete files, does that also require Spark 3.2, or is it supported in Spark 2.4 as well (even if in a sub-optimal way)?

Thanks,
- Puneet

On Wed, Nov 17, 2021 at 10:07 AM Ryan Blue <[email protected]> wrote:

> The plan is to support it in 3.2. I think that we're very close but Anton
> is the expert there.
>
> On Tue, Nov 16, 2021 at 6:22 AM Sreeram Garlapati <[email protected]>
> wrote:
>
>> This makes sense, thanks a lot @Ryan Blue <[email protected]>.
>>
>> Are all building blocks for MOR support (features like - delta-based
>> plans) fully available in Spark 3.2 - or is there any reason we would need
>> Spark 3.3? Or is there more ongoing work needed to fully validate this? I
>> am in need of this specific data point *about the Spark version* - to
>> move our organization into the correct Spark version. Truly appreciate your
>> help.
>>
>> Best regards,
>> Sreeram
>>
>> On Mon, Nov 15, 2021 at 4:37 PM Ryan Blue <[email protected]> wrote:
>>
>>> Sreeram,
>>>
>>> The project tracking this is here:
>>> https://github.com/apache/iceberg/projects/11
>>>
>>> It isn’t easy to get a good picture, since most of the PRs are merged.
>>> But Anton is working on the next set of PRs for Spark. Maybe Anton can find
>>> some time to add a few notes about what's left to be done.
>>>
>>> What’s been done so far is pretty significant:
>>>
>>> - Add new writers that can handle deletes across multiple partition
>>>   specs
>>> - Add Spark 3.2 module and refactor Spark builds
>>> - Add metadata columns to Spark 3.2
>>> - Add support for required distribution and ordering in Spark 3.2
>>> - Support Spark 3.2 dynamic filtering
>>>
>>> Many of those are the building blocks for the delta-based plans. And
>>> it’s really amazing to finally have support for some major improvements:
>>> dynamic filtering on all queries, metadata columns, and required
>>> distribution and ordering!
>>>
>>> Ryan
>>>
>>> On Thu, Nov 11, 2021 at 11:46 PM Sreeram Garlapati <
>>> [email protected]> wrote:
>>>
>>>> Hello Iceberg devs!
>>>>
>>>> After going through the mail threads (especially "Spark version support
>>>> strategy") and relevant PRs - it looks like - *Merge on Read* Support
>>>> (ie., Spark writers writing equality deletes) will be available with
>>>> *Iceberg + Spark 3.2*. Is this understanding correct!? Or is this
>>>> something that will be available only with Iceberg on Spark 3.3!?
>>>>
>>>> Would really appreciate it if someone can point me to any place - which
>>>> tracks - the remaining work.
>>>>
>>>> Thanks,
>>>> Sreeram
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
