> As for the evolution, I do think it is a good strategy to evolve step by
> step, instead of trying to standardize everything in one shot.
This approach makes sense to me, but we need to be explicit about it in
the spec.

Cheers,
Dmitri.

On Wed, Jun 11, 2025 at 5:45 PM yun zou <yunzou.colost...@gmail.com> wrote:

> > I mean a doc page similar to [1] that explains what Generic Tables
> > are, how to use them in Spark, how to use them in some other query
> > engine, and most importantly the planned evolution for the Generic
> > Tables API and specification.
>
> Yes, we can definitely add a webpage to describe the current guarantee
> of generic table support, and we can mention that it is currently a
> beta version. I am currently working on this.
>
> As for the evolution, I do think it is a good strategy to evolve step
> by step, instead of trying to standardize everything in one shot.
> As we have mentioned during design discussion, standardization of some
> fields across different engines and different formats is very
> challenging, such as schema, where different engines support different
> data types. So we will need more thought when adding those fields; the
> base location is just one of the easy fields.
>
> Best Regards,
> Yun
>
> On Wed, Jun 11, 2025 at 2:17 PM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > > Can you explain what is a proper plain English spec for this
> > > feature?
> >
> > I mean a doc page similar to [1] that explains what Generic Tables
> > are, how to use them in Spark, how to use them in some other query
> > engine, and most importantly the planned evolution for the Generic
> > Tables API and specification.
> >
> > IMHO, given this discussion thread, we can only offer a "beta" in
> > 1.0. Meaning the spec and API are subject to change without backward
> > compatibility guarantees.
> >
> > [1] https://polaris.apache.org/in-dev/unreleased/policy/
> >
> > Cheers,
> > Dmitri.
> >
> > On Wed, Jun 11, 2025 at 4:46 PM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > > There are solid use cases for adding generic-table support with the
> > > Spark plugin:
> > >
> > > - Single Catalog, Many Formats – Keep Delta, CSV, Parquet (and
> > >   future formats) side-by-side in one place instead of juggling
> > >   separate catalogs.
> > > - Seamless Migrations – Let teams move data from one format to
> > >   another without breaking queries or governance workflows.
> > >
> > > Happy to brainstorm more improvements and next steps!
> > >
> > > > Now that [1543] is merged and adds some concrete specialization
> > > > to Generic Tables API, I believe it is even more important to
> > > > make a proper plain English spec for this feature before 1.0.
> > >
> > > We've cut the branch for 1.0 release already, and PR 1543 won't be
> > > a part of 1.0 release. Can you explain what is a proper plain
> > > English spec for this feature? I am glad to review it if you
> > > propose one.
> > >
> > > Yufei
> > >
> > > On Wed, Jun 11, 2025 at 11:53 AM Dmitri Bourlatchkov
> > > <di...@apache.org> wrote:
> > >
> > > > Thanks, Laurent, for bringing up spec "readiness" and, I guess,
> > > > by extension backward compatibility concerns.
> > > >
> > > > Regardless of how deep the current spec is in Polaris, I believe
> > > > it is important to have it written down as an artifact in the
> > > > Polaris repo. I know we had a design doc at some point, but the
> > > > project is defined by what is in the repository, plus discussion
> > > > docs can quickly get out of sync with actual code. I believe I
> > > > raised this point before.
> > > >
> > > > The API change merged under [1543] is not sufficient to inform
> > > > users of Polaris about the Generic Tables feature. I tend to
> > > > regard comments in OpenAPI YAML files as similar to Javadoc. They
> > > > are good for developers working with that specific aspect of the
> > > > system, but do not provide a holistic view.
> > > >
> > > > Now that [1543] is merged and adds some concrete specialization
> > > > to Generic Tables API, I believe it is even more important to
> > > > make a proper plain English spec for this feature before 1.0.
> > > >
> > > > [1543] https://github.com/apache/polaris/pull/1543
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Wed, Jun 11, 2025 at 10:56 AM Laurent Goujon
> > > > <laur...@dremio.com.invalid> wrote:
> > > >
> > > > > What I was trying to say is that I'm sure there's plenty of
> > > > > value for Spark, but in its current state the value is little
> > > > > from a Polaris point of view as an open catalog service?
> > > > >
> > > > > Of course we can follow up on that, but is the current spec
> > > > > still considered WIP, or, when 1.0 is released, would we have
> > > > > to keep supporting it even if we come up with something more
> > > > > comprehensive?
> > > > >
> > > > > On Wed, Jun 11, 2025, 00:22 Eric Maynard
> > > > > <eric.w.mayn...@gmail.com> wrote:
> > > > >
> > > > > > > I don't think there's a lot of value where the
> > > > > > > specification of a table format is left to the client
> > > > > >
> > > > > > Considering that you currently can use non-Iceberg tables in
> > > > > > Polaris with the Spark client and it works end-to-end, I'd
> > > > > > have a hard time agreeing that there is no value.
> > > > > >
> > > > > > But I think this discussion is maybe best moved to another
> > > > > > thread. The incremental change to add a location may make
> > > > > > sense for the existing generic table implementation, even if
> > > > > > later we reach a consensus to rip it out and replace it with
> > > > > > something more "comprehensive".
> > > > > >
> > > > > > --EM