Thanks for all the replies. Since I opened the topic of the specification, let
me come up with a proposal that I'm looking forward to iterating on with the
help of the community.

Laurent

On Wed, Jun 11, 2025 at 4:12 PM yun zou <yunzou.colost...@gmail.com> wrote:

> Yes. I think we have agreed that we will make sure things are described
> clearly in both the spec and website for
> the critical fields added.
>
> We are currently trying to get a webpage out for the Generic Table support
> in Polaris.
>
> Best Regards,
> Yun
>
> On Wed, Jun 11, 2025 at 3:09 PM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > As for the evolution, I do think it is a good strategy to evolve step by
> > step, instead of trying to standardize
> > everything in one shot.
> >
> >
> > This approach makes sense to me, but we need to be explicit about it in
> the
> > spec.
> >
> > Cheers,
> > Dmitri.
> >
> > On Wed, Jun 11, 2025 at 5:45 PM yun zou <yunzou.colost...@gmail.com>
> > wrote:
> >
> > > > I mean a doc page similar to [1] that explains what Generic Tables
> are,
> > > how
> > > > to use them in Spark, how to use them in some other query engine, and
> > most
> > > importantly the planned evolution for the Generic Tables API and
> > > specification.
> > >
> > > Yes, we can definitely add a webpage to describe the current guarantee
> of
> > > generic table
> > > support, and we can mention that it is currently a beta version. I am
> > > currently working on this.
> > >
> > >
> > > > As for the evolution, I do think it is a good strategy to evolve step by
> > > step, instead of trying to standardize
> > > everything in one shot.
> > > As we have mentioned during design discussion, standardization of some
> > > fields across different
> > > > engines and different formats is very challenging, such as schema
> where
> > > different engines support different
> > > data types.  So we will need more thoughts when adding those fields,
> the
> > > base location is just one of the easy fields.
> > >
> > >
> > > Best Regards,
> > > Yun
> > >
> > >
> > >
> > >
> > > On Wed, Jun 11, 2025 at 2:17 PM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > > Can you explain what is a proper plain English spec for this
> feature?
> > > >
> > > > I mean a doc page similar to [1] that explains what Generic Tables
> are,
> > > how
> > > > > to use them in Spark, how to use them in some other query engine, and
> > > most
> > > > importantly the planned evolution for the Generic Tables API and
> > > > specification.
> > > >
> > > > IMHO, given this discussion thread, we can only offer a "beta" in
> 1.0.
> > > > Meaning the spec and API are subject to change without backward
> > > > compatibility guarantees.
> > > >
> > > > [1] https://polaris.apache.org/in-dev/unreleased/policy/
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Wed, Jun 11, 2025 at 4:46 PM Yufei Gu <flyrain...@gmail.com>
> wrote:
> > > >
> > > > > There are solid use cases for adding generic-table support with the
> > > Spark
> > > > > plugin:
> > > > >
> > > > >    - Single Catalog, Many Formats – Keep Delta, CSV, Parquet (and
> > > future
> > > > >    formats) side-by-side in one place instead of juggling separate
> > > > > catalogs.
> > > > >    - Seamless Migrations – Let teams move data from one format to
> > > another
> > > > >    without breaking queries or governance workflows.
> > > > >
> > > > > Happy to brainstorm more improvements and next steps!
> > > > >
> > > > > > Now that [1543] is merged and adds some concrete specialization to
> > > > Generic
> > > > > > Tables API, I believe it is even more important to make a proper
> > > plain
> > > > > > English spec for this feature before 1.0.
> > > > >
> > > > > We've cut the branch for 1.0 release already, and PR 1543 won't be
> a
> > > part
> > > > > of 1.0 release. Can you explain what is a proper plain
> > > > > English spec for this feature? I am glad to review it if you
> propose
> > > one.
> > > > >
> > > > >
> > > > > Yufei
> > > > >
> > > > >
> > > > > On Wed, Jun 11, 2025 at 11:53 AM Dmitri Bourlatchkov <
> > di...@apache.org
> > > >
> > > > > wrote:
> > > > >
> > > > > > Thanks, Laurent, for bringing up spec "readiness" and, I guess,
> by
> > > > > > extension backward compatibility concerns.
> > > > > >
> > > > > > > Regardless of how deep the current spec is in Polaris, I believe it
> is
> > > > > > important to have it written down as an artifact in the Polaris
> > > repo. I
> > > > > > know we had a design doc at some point, but the project is
> defined
> > by
> > > > > what
> > > > > > is in the repository, plus discussion docs can quickly get out of
> > > sync
> > > > > with
> > > > > > actual code. I believe I raised this point before.
> > > > > >
> > > > > > The API change merged under [1543] is not sufficient to inform
> > users
> > > of
> > > > > > Polaris about the Generic Tables feature. I tend to regard
> comments
> > > in
> > > > > Open
> > > > > > API yaml files as similar to javadoc. They are good for
> developers
> > > > > working
> > > > > > with that specific aspect of the system, but do not provide a
> > > holistic
> > > > > > view.
> > > > > >
> > > > > > > Now that [1543] is merged and adds some concrete specialization
> to
> > > > > Generic
> > > > > > Tables API, I believe it is even more important to make a proper
> > > plain
> > > > > > English spec for this feature before 1.0.
> > > > > >
> > > > > > [1543] https://github.com/apache/polaris/pull/1543
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Wed, Jun 11, 2025 at 10:56 AM Laurent Goujon
> > > > > <laur...@dremio.com.invalid
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > > What I was trying to say is that I'm sure there's plenty of
> value
> > > for
> > > > > > > > Spark, but in its current state the value is little from a
> > Polaris
> > > > > point
> > > > > > > of view as an open catalog service?
> > > > > > >
> > > > > > > Of course we can follow-up on that but is the current spec
> still
> > > > > > considered
> > > > > > > wip or when 1.0 will be released, we would have to keep
> > supporting
> > > it
> > > > > > even
> > > > > > > if we come up with something more comprehensive?
> > > > > > >
> > > > > > > On Wed, Jun 11, 2025, 00:22 Eric Maynard <
> > eric.w.mayn...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > > I don't think there's a lot of value where the
> specification
> > > of a
> > > > > > table
> > > > > > > > format is left to the client
> > > > > > > > Considering that you currently can use non-Iceberg tables in
> > > > Polaris
> > > > > > with
> > > > > > > > the Spark client and it works end-to-end, I'd have a hard
> time
> > > > > agreeing
> > > > > > > > that there is no value.
> > > > > > > >
> > > > > > > > But I think this discussion is maybe best moved to another
> > > thread.
> > > > > The
> > > > > > > > incremental change to add a location may make sense for the
> > > > existing
> > > > > > > > generic table implementation, even if later we reach a
> > consensus
> > > to
> > > > > rip
> > > > > > > it
> > > > > > > > out and replace it with something more "comprehensive".
> > > > > > > >
> > > > > > > > --EM
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
