Hi Ryan,

I think I wasn't clear (sorry about that): by a "catalog-level-versioning" capability, I don't mean defining any specific version. It's meant to tell the client how the catalog behaves in terms of versioning (per table/view, or global to the catalog). It's a "pure" capability informing the client about catalog behavior.

Regards
JB

On Fri, Jun 21, 2024 at 1:44 AM Ryan Blue <b...@databricks.com.invalid> wrote:
>
> I think the capabilities proposal is intended to let people build in a
> different way than a versioning system would. It's probably valuable to think
> through the differences between the approaches.
>
> The capabilities that are proposed let catalogs declare sets of features that
> are supported, like support for tables, views, server-side planning, etc.
> While you can think of that as a sort of versioning, the intent is to be more
> flexible. A catalog implementation might not choose to implement server-side
> planning because it doesn't have access to the underlying metadata files. For
> example see #10089 for a use case where object store permissions aren't what
> you might expect. Capabilities allow the service to tell the client what is
> supported for graceful fallback, not just backward compatibility.
>
> Versioning the API is a different approach because to support the latest
> version, an implementation would need to support everything. If we had tables
> in v1, views in v2, and server-side planning in v3, then to support
> server-side planning a catalog would also need to support views. That's not
> necessarily a bad thing since it creates strong compatibility requirements;
> that's what we chose to do for the table format.
>
> I think the question to help us choose between the two options is whether we
> expect catalogs to all support the same set of features over time, or if we
> expect some differences. I have a weakly-held opinion that we expect catalogs
> not to support the same features, but it is very likely based on the
> assumption that catalogs have significant differences because of limited
> back-ends (like Hive). That may not be correct since there are quite a few
> new catalog implementations using the protocol. Perhaps we should consider
> stronger requirements for what needs to be provided through versioning.
>
> Ryan
>
> On Thu, Jun 20, 2024 at 7:15 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>>
>> Hi Eduard,
>>
>> That makes sense. Thanks.
>>
>> Maybe we can already anticipate a little and add a "catalog-level
>> versioning" capability as it's a feature supported by Nessie catalog
>> for instance ?
>> We can also imagine a more generic capability like "version scope".
>>
>> Regards
>> JB
>>
>> On Thu, Jun 20, 2024 at 3:47 PM Eduard Tudenhoefner
>> <etudenhoef...@apache.org> wrote:
>> >
>> > Hey JB,
>> >
>> > If adding UDFs would require adding new endpoints, then you'd also add a
>> > udf capability when adding UDF support to the REST catalog.
>> > That way a client knows whether it's safe to call the UDF endpoints on a
>> > given server.
>> >
>> > Eduard
>> >
>> > On Thu, Jun 20, 2024 at 1:59 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>> > wrote:
>> >>
>> >> Hi Eduard
>> >>
>> >> It looks good to me. I have a question however :)
>> >>
>> >> Later, Imagine, we add UDF support in Iceberg. Does it mean that you
>> >> will need to update REST Spec (ConfigResponse/capabilities) to add
>> >> this capability ?
>> >> For consistency, I think it makes sense as I don't think we often add
>> >> new capability. And also as every REST server would have to implement
>> >> it, /config is generic enough to add custom/new capabilities (but the
>> >> client will have to deal with capability).
>> >>
>> >> Am I right?
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On Thu, Jun 20, 2024 at 1:28 PM Eduard Tudenhoefner
>> >> <etudenhoef...@apache.org> wrote:
>> >> >
>> >> > Hey everyone,
>> >> >
>> >> > I'd like to bring up the discussion around describing REST server
>> >> > capabilities via the /config endpoint.
>> >> > There is PR #9940 that describes the OpenAPI spec changes.
>> >> >
>> >> > Mainly we'd like to have a capabilities field in the ConfigResponse
>> >> > that allows servers to indicate to clients which capabilities are being
>> >> > supported.
>> >> >
>> >> > So far we have the following capabilities:
>> >> >
>> >> > tables
>> >> > views
>> >> > remote-signing
>> >> > vended-credentials
>> >> > multi-table-commit
>> >> > register-table
>> >> > table-metrics
>> >> > oauth2
>> >> >
>> >> >
>> >> > The general idea behind a capability is that if e.g. a server supports
>> >> > views, then that server must implement all endpoints grouped under that
>> >> > capability.
>> >> > It's worth noting that the /config endpoint is currently being implicit
>> >> > (meaning that every REST server would have to implement it).
>> >> >
>> >> > One discussion point that came up during review is how we want to
>> >> > handle capabilities and backwards compatibility and what the default
>> >> > capability would be, since older servers don't know anything about
>> >> > capabilities (in such a case we could assume that the default
>> >> > capabilities would be oauth2 / tables).
>> >> >
>> >> > Are there any other capabilities that we'd like to include in the list?
>> >> >
>> >> > Eduard
>
>
>
> --
> Ryan Blue
> Databricks
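
To make the fallback behavior discussed in this thread concrete, here is a minimal client-side sketch in Python. It assumes the /config response is parsed into a dict whose "capabilities" entry is a list of strings. That JSON shape, and the helper names `resolve_capabilities` and `supports`, are illustrative assumptions and not part of the spec or PR #9940. The oauth2 / tables default is only the option floated above for older servers, not a settled rule.

```python
from typing import Any, FrozenSet, Mapping

# Assumed fallback for servers that predate the capabilities field. The thread
# only suggests that oauth2 / tables "could" be the default.
DEFAULT_CAPABILITIES: FrozenSet[str] = frozenset({"oauth2", "tables"})


def resolve_capabilities(config_response: Mapping[str, Any]) -> FrozenSet[str]:
    """Return the capabilities a server declares, or the assumed default.

    A missing "capabilities" field is treated as an older server. An empty
    list is treated as an explicit declaration that nothing is supported.
    """
    declared = config_response.get("capabilities")
    if declared is None:
        return DEFAULT_CAPABILITIES
    return frozenset(declared)


def supports(config_response: Mapping[str, Any], capability: str) -> bool:
    """Check one capability before calling the endpoints grouped under it."""
    return capability in resolve_capabilities(config_response)


if __name__ == "__main__":
    old_server = {"defaults": {}, "overrides": {}}
    new_server = {"capabilities": ["tables", "views", "register-table"]}

    # An older server falls back to the assumed default, so views are skipped.
    print(supports(old_server, "views"))
    # A newer server declares views explicitly, without needing every feature.
    print(supports(new_server, "views"))
    print(supports(new_server, "multi-table-commit"))
```

Unlike a single API version number, the set lets a server declare views without also declaring, say, multi-table-commit, which is the flexibility Ryan contrasts with versioning.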