Hi Ryan,

I think I wasn't clear (sorry about that): by the
"catalog-level-versioning" capability, I don't mean actually defining
any specific version. It's meant to indicate to the client how the
catalog behaves in terms of versioning (per table/view or global to the
catalog). It's a "pure" capability informing the client about catalog
behavior.
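
To illustrate what I mean, here is a rough sketch of how a client could
read such a flag. The "catalog-level-versioning" name, the helper
functions, and the exact shape of the response are only assumptions for
the example, not something defined in the spec or in the PR. The
fallback to oauth2 / tables for servers that don't return the field is
the default suggested earlier in this thread.

```python
# Sketch only: the capability name and both helpers are hypothetical,
# used for illustration; they are not part of the REST spec.

# Fallback suggested in this thread for older servers that know nothing
# about capabilities.
DEFAULT_CAPABILITIES = frozenset({"tables", "oauth2"})


def server_capabilities(config_response: dict) -> frozenset:
    """Return the declared capabilities, or the defaults if the field is absent."""
    declared = config_response.get("capabilities")
    if declared is None:
        return DEFAULT_CAPABILITIES
    return frozenset(declared)


def versioning_scope(config_response: dict) -> str:
    """Report whether versioning is global to the catalog or per table/view."""
    if "catalog-level-versioning" in server_capabilities(config_response):
        return "catalog"
    return "table"


# An older server that returns no capabilities field at all.
old_server = {}
# A server that declares the (hypothetical) capability.
versioned_server = {"capabilities": ["tables", "views", "catalog-level-versioning"]}

print(versioning_scope(old_server))
print(versioning_scope(versioned_server))
```

The flag carries no version number; the client only branches on whether
it is present.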

Regards
JB

On Fri, Jun 21, 2024 at 1:44 AM Ryan Blue <b...@databricks.com.invalid> wrote:
>
> I think the capabilities proposal is intended to let people build in a 
> different way than a versioning system would. It's probably valuable to think 
> through the differences between the approaches.
>
> The capabilities that are proposed let catalogs declare sets of features that 
> are supported, like support for tables, views, server-side planning, etc. 
> While you can think of that as a sort of versioning, the intent is to be more 
> flexible. A catalog implementation might not choose to implement server-side 
> planning because it doesn't have access to the underlying metadata files. For 
> example, see #10089 for a use case where object store permissions aren't what 
> you might expect. Capabilities allow the service to tell the client what is 
> supported for graceful fallback, not just backward compatibility.
>
> Versioning the API is a different approach because to support the latest 
> version, an implementation would need to support everything. If we had tables 
> in v1, views in v2, and server-side planning in v3, then to support 
> server-side planning a catalog would also need to support views. That's not 
> necessarily a bad thing since it creates strong compatibility requirements; 
> that's what we chose to do for the table format.
>
> I think the question to help us choose between the two options is whether we 
> expect catalogs to all support the same set of features over time, or if we 
> expect some differences. I have a weakly-held opinion that we expect catalogs 
> not to support the same features, but it is very likely based on the 
> assumption that catalogs have significant differences because of limited 
> back-ends (like Hive). That may not be correct since there are quite a few 
> new catalog implementations using the protocol. Perhaps we should consider 
> stronger requirements for what needs to be provided through versioning.
>
> Ryan
>
> On Thu, Jun 20, 2024 at 7:15 AM Jean-Baptiste Onofré <j...@nanthrax.net> 
> wrote:
>>
>> Hi Eduard,
>>
>> That makes sense. Thanks.
>>
>> Maybe we can already anticipate a little and add a "catalog-level
>> versioning" capability, since it's a feature supported by the Nessie
>> catalog, for instance?
>> We can also imagine a more generic capability like "version scope".
>>
>> Regards
>> JB
>>
>> On Thu, Jun 20, 2024 at 3:47 PM Eduard Tudenhoefner
>> <etudenhoef...@apache.org> wrote:
>> >
>> > Hey JB,
>> >
>> > If adding UDFs would require adding new endpoints, then you'd also add a 
>> > udf capability when adding UDF support to the REST catalog.
>> > That way a client knows whether it's safe to call the UDF endpoints on a 
>> > given server.
>> >
>> > Eduard
>> >
>> > On Thu, Jun 20, 2024 at 1:59 PM Jean-Baptiste Onofré <j...@nanthrax.net> 
>> > wrote:
>> >>
>> >> Hi Eduard
>> >>
>> >> It looks good to me. I have a question however :)
>> >>
>> >> Later, imagine we add UDF support in Iceberg. Does it mean that you
>> >> will need to update the REST spec (ConfigResponse/capabilities) to add
>> >> this capability?
>> >> For consistency, I think it makes sense, as I don't think we often add
>> >> new capabilities. And also, as every REST server would have to implement
>> >> it, /config is generic enough to add custom/new capabilities (but the
>> >> client will have to deal with the capability).
>> >>
>> >> Am I right?
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On Thu, Jun 20, 2024 at 1:28 PM Eduard Tudenhoefner
>> >> <etudenhoef...@apache.org> wrote:
>> >> >
>> >> > Hey everyone,
>> >> >
>> >> > I'd like to bring up the discussion around describing REST server 
>> >> > capabilities via the /config endpoint.
>> >> > There is PR #9940 that describes the OpenAPI spec changes.
>> >> >
>> >> > Mainly we'd like to have a capabilities field in the ConfigResponse 
>> >> > that allows servers to indicate to clients which capabilities are 
>> >> > supported.
>> >> >
>> >> > So far we have the following capabilities:
>> >> >
>> >> > - tables
>> >> > - views
>> >> > - remote-signing
>> >> > - vended-credentials
>> >> > - multi-table-commit
>> >> > - register-table
>> >> > - table-metrics
>> >> > - oauth2
>> >> >
>> >> >
>> >> > The general idea behind a capability is that if e.g. a server supports 
>> >> > views, then that server must implement all endpoints grouped under that 
>> >> > capability.
>> >> > It's worth noting that the /config endpoint is currently implicit 
>> >> > (meaning that every REST server would have to implement it).
>> >> >
>> >> > One discussion point that came up during review is how we want to 
>> >> > handle capabilities and backwards compatibility and what the default 
>> >> > capability would be, since older servers don't know anything about 
>> >> > capabilities (in such a case we could assume that the default 
>> >> > capabilities would be oauth2 / tables).
>> >> >
>> >> > Are there any other capabilities that we'd like to include in the list?
>> >> >
>> >> > Eduard
>
>
>
> --
> Ryan Blue
> Databricks
