> I mean a doc page similar to [1] that explains what Generic Tables are, how to use them in Spark, how to use them is some other query engine, and most importantly the planned evolution for the Generic Tables API and specification.
Yes, we can definitely add a webpage describing the current guarantees of
generic table support, and it can state that the feature is currently in
beta. I am working on this now.

As for the evolution, I do think evolving step by step is a good strategy,
rather than trying to standardize everything in one shot. As we mentioned
during the design discussion, standardizing some fields across different
engines and formats is very challenging. Schema is one example, since
different engines support different data types. So we will need more
thought before adding those fields; the base location is just one of the
easy ones.

Best Regards,
Yun

On Wed, Jun 11, 2025 at 2:17 PM Dmitri Bourlatchkov <di...@apache.org> wrote:

> > Can you explain what is a proper plain English spec for this feature?
>
> I mean a doc page similar to [1] that explains what Generic Tables are, how
> to use them in Spark, how to use them is some other query engine, and most
> importantly the planned evolution for the Generic Tables API and
> specification.
>
> IMHO, given this discussion thread, we can only offer a "beta" in 1.0.
> Meaning the spec and API are subject to change without backward
> compatibility guarantees.
>
> [1] https://polaris.apache.org/in-dev/unreleased/policy/
>
> Cheers,
> Dmitri.
>
> On Wed, Jun 11, 2025 at 4:46 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
> > There are solid use cases for adding generic-table support with the Spark
> > plugin:
> >
> >    - Single Catalog, Many Formats – Keep Delta, CSV, Parquet (and future
> >    formats) side-by-side in one place instead of juggling separate
> >    catalogs.
> >    - Seamless Migrations – Let teams move data from one format to another
> >    without breaking queries or governance workflows.
> >
> > Happy to brainstorm more improvements and next steps!
> >
> > > Now that [1543] is merged and adds some concrete specialization to
> > > Generic Tables API, I believe it is even more important to make a
> > > proper plain English spec for this feature before 1.0.
> >
> > We've cut the branch for 1.0 release already, and PR 1543 won't be a part
> > of 1.0 release. Can you explain what is a proper plain
> > English spec for this feature? I am glad to review it if you propose one.
> >
> >
> > Yufei
> >
> >
> > On Wed, Jun 11, 2025 at 11:53 AM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> > > Thanks, Laurent, for bringing up spec "readiness" and, I guess, by
> > > extension backward compatibility concerns.
> > >
> > > Regardless of how deep current spec is in Polaris, I believe it is
> > > important to have it written down as an artifact in the Polaris repo. I
> > > know we had a design doc at some point, but the project is defined by
> > > what is in the repository, plus discussion docs can quickly get out of
> > > sync with actual code. I believe I raised this point before.
> > >
> > > The API change merged under [1543] is not sufficient to inform users of
> > > Polaris about the Generic Tables feature. I tend to regard comments in
> > > Open API yaml files as similar to javadoc. They are good for developers
> > > working with that specific aspect of the system, but do not provide a
> > > holistic view.
> > >
> > > Now that [1543] is merged and adds some concrete specialization to
> > > Generic Tables API, I believe it is even more important to make a
> > > proper plain English spec for this feature before 1.0.
> > >
> > > [1543] https://github.com/apache/polaris/pull/1543
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Wed, Jun 11, 2025 at 10:56 AM Laurent Goujon
> > > <laur...@dremio.com.invalid> wrote:
> > >
> > > > What I was trying to say is that i'm sure there's plenty of value for
> > > > spark, but in it's current state the value is little from a Polaris
> > > > point of view as an open catalog service?
> > > >
> > > > Of course we can follow-up on that but is the current spec still
> > > > considered wip or when 1.0 will be released, we would have to keep
> > > > supporting it even if we come up with something more comprehensive?
> > > >
> > > > On Wed, Jun 11, 2025, 00:22 Eric Maynard <eric.w.mayn...@gmail.com>
> > > > wrote:
> > > >
> > > > > > I don't think there's a lot of value where the specification of a
> > > > > > table format is left to the client
> > > > > Considering that you currently can use non-Iceberg tables in
> > > > > Polaris with the Spark client and it works end-to-end, I'd have a
> > > > > hard time agreeing that there is no value.
> > > > >
> > > > > But I think this discussion is maybe best moved to another thread.
> > > > > The incremental change to add a location may make sense for the
> > > > > existing generic table implementation, even if later we reach a
> > > > > consensus to rip it out and replace it with something more
> > > > > "comprehensive".
> > > > >
> > > > > --EM