What I'm getting from this conversation is that if one wants to know what to do with a generic table, the answer is to check the Spark code, which is not what I was hoping for as the standard for new entities added to the specification. Compare with Iceberg itself: how to access Iceberg tables and interpret their data is part of the table format, and when IRC was introduced, the interaction between the REST API and a client was also part of the Iceberg project, not the Spark project.
As I'm myself a query engine developer, should I also propose my own generic table format with just a JSON schema and forward people to my source code if they want to be interoperable? I'm aware that the initial spec was approved some time ago, but you said the location is a critical addition to the spec, so isn't this a good opportunity to reflect on the specification and see whether it can meet our long-term goals or not?

Laurent

On Wed, Jun 11, 2025 at 11:11 AM yun zou <yunzou.colost...@gmail.com> wrote:

> Hi Laurent,
>
> I do agree that the generic table spec needs to evolve to provide
> richer standardization. However, the current generic table support does
> provide a good amount of value:
> 1) Polaris can be a centralized catalog service for discovering all
> tables, not just Iceberg tables. (Of course, there is large room to
> improve; for Delta tables, as long as the base location is available,
> different engines will be able to access them.)
> 2) Engine-specific plugins help engines interpret the spec. Spark is
> the first one, which covers a large number of use cases, and we do have
> users. We are also working on providing a connector for Trino to
> unblock the Trino use case.
>
> As Polaris is currently Iceberg-native, the strategy we are taking is
> to start with simple but sufficient support to unblock the Spark use
> cases. Since we own the spec, the spec should evolve quickly to adapt
> to different use cases and standardization across different formats.
> In other words, the 1.0 generic table support is the first step for
> Polaris to move towards non-Iceberg table support. There is definitely
> a long way to go until we can fully support all operations across
> various table formats, and we would like to evolve the spec quickly
> based on specific use cases.
>
> The strategy and direction have been brought up to the community and
> there is agreement.
> If there are general concerns about the direction of how generic tables
> should evolve, I think we can definitely open a different thread and
> have a discussion there. This thread is only intended to discuss
> evolving the current spec one step forward, to standardize one of the
> critical pieces of information for cross-engine sharing, and eventually
> to help support credential vending.
>
> Best Regards,
> Yun
>
> On Wed, Jun 11, 2025 at 8:15 AM Laurent Goujon <laur...@dremio.com.invalid>
> wrote:
>
> > What I was trying to say is that I'm sure there's plenty of value for
> > Spark, but in its current state the value is little from a Polaris
> > point of view as an open catalog service.
> >
> > Of course we can follow up on that, but is the current spec still
> > considered WIP? Or, once 1.0 is released, would we have to keep
> > supporting it even if we come up with something more comprehensive?
> >
> > On Wed, Jun 11, 2025, 00:22 Eric Maynard <eric.w.mayn...@gmail.com>
> > wrote:
> >
> > > > I don't think there's a lot of value where the specification of a
> > > > table format is left to the client
> > >
> > > Considering that you currently can use non-Iceberg tables in Polaris
> > > with the Spark client and it works end-to-end, I'd have a hard time
> > > agreeing that there is no value.
> > >
> > > But I think this discussion is maybe best moved to another thread.
> > > The incremental change to add a location may make sense for the
> > > existing generic table implementation, even if later we reach a
> > > consensus to rip it out and replace it with something more
> > > "comprehensive".
> > >
> > > --EM
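[Editor's note: for readers following along, the change under discussion, adding a base location to the generic table spec, can be sketched roughly as below. This is a hedged illustration only: the field names (`name`, `format`, `base-location`, `properties`) and the helper function are assumptions based on this thread, not an authoritative rendering of the Polaris generic table REST spec.]

```python
# Hypothetical sketch of a create-generic-table request body, assuming the
# fields discussed in this thread (notably the proposed "base-location").
# NOT the authoritative Polaris spec; consult the project for the real API.
import json


def build_generic_table_payload(name, fmt, base_location, properties=None):
    """Assemble the JSON body a client might send when registering a
    non-Iceberg (e.g. Delta) table with a catalog service."""
    payload = {
        "name": name,
        "format": fmt,                   # e.g. "delta"
        "base-location": base_location,  # proposed addition: lets any engine
                                         # locate the table files, and gives
                                         # the catalog a path to vend
                                         # credentials for
        "properties": properties or {},
    }
    return json.dumps(payload)


body = build_generic_table_payload("sales", "delta", "s3://bucket/wh/sales")
print(body)
```

The point of the sketch is the one field at issue in the thread: with `base-location` standardized in the spec, an engine other than Spark can find the table's files without reading engine-specific code, which is the "cross-engine sharing" Yun refers to.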