I think the capabilities proposal is intended to let people build in a different way than a versioning system would. It's probably valuable to think through the differences between the approaches.
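One way to picture the difference, as a minimal sketch and not anything taken from the REST spec: the Python below models versioning as cumulative feature sets and capabilities as a free-form set declared by the server. The feature string "server-side-planning", the v1/v2/v3 mapping, and all function names are assumptions made for illustration.

```python
# Hypothetical feature names for illustration; "server-side-planning" is an
# assumed string, not one of the capabilities listed in the proposal.
TABLES, VIEWS, PLANNING = "tables", "views", "server-side-planning"

# Versioning model: a version implies every feature of the versions before it.
VERSION_FEATURES = {
    1: frozenset({TABLES}),
    2: frozenset({TABLES, VIEWS}),
    3: frozenset({TABLES, VIEWS, PLANNING}),
}


def min_version_for(feature: str) -> int:
    """Lowest version a catalog must fully implement to offer `feature`."""
    return min(v for v, feats in VERSION_FEATURES.items() if feature in feats)


def expressible_as_version(capabilities: frozenset) -> bool:
    """True if the capability set matches some version exactly."""
    return capabilities in VERSION_FEATURES.values()


def choose_planning(capabilities: frozenset) -> str:
    """Capabilities model: the client falls back when a feature is absent."""
    return "server" if PLANNING in capabilities else "client"


if __name__ == "__main__":
    # A catalog that plans scans server-side but has no view support.
    no_views = frozenset({TABLES, PLANNING})
    print(expressible_as_version(no_views))                       # False
    print(VIEWS in VERSION_FEATURES[min_version_for(PLANNING)])   # True
    print(choose_planning(frozenset({TABLES, VIEWS})))            # client
```

Under these assumptions, a catalog that can plan scans but has no views cannot be described by any single version, while a capability set describes it directly and lets the client fall back to client-side planning on servers that lack the feature.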
The capabilities that are proposed let catalogs declare sets of features that are supported, like support for tables, views, server-side planning, etc. While you can think of that as a sort of versioning, the intent is to be more flexible. A catalog implementation might choose not to implement server-side planning because it doesn't have access to the underlying metadata files. For example, see #10089 <https://github.com/apache/iceberg/issues/10089> for a use case where object store permissions aren't what you might expect. Capabilities allow the service to tell the client what is supported for graceful fallback, not just backward compatibility.

Versioning the API is a different approach because to support the latest version, an implementation would need to support everything. If we had tables in v1, views in v2, and server-side planning in v3, then to support server-side planning a catalog would also need to support views. That's not necessarily a bad thing since it creates strong compatibility requirements; that's what we chose to do for the table format.

I think the question to help us choose between the two options is whether we expect catalogs to all support the same set of features over time, or if we expect some differences. I have a weakly-held opinion that we expect catalogs not to support the same features, but it is very likely based on the assumption that catalogs have significant differences because of limited back-ends (like Hive). That may not be correct since there are quite a few new catalog implementations using the protocol. Perhaps we should consider stronger requirements for what needs to be provided through versioning.

Ryan

On Thu, Jun 20, 2024 at 7:15 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi Eduard,
>
> That makes sense. Thanks.
>
> Maybe we can already anticipate a little and add a "catalog-level
> versioning" capability as it's a feature supported by Nessie catalog
> for instance ?
> We can also imagine a more generic capability like "version scope".
>
> Regards
> JB
>
> On Thu, Jun 20, 2024 at 3:47 PM Eduard Tudenhoefner
> <etudenhoef...@apache.org> wrote:
> >
> > Hey JB,
> >
> > If adding UDFs would require adding new endpoints, then you'd also add a udf capability when adding UDF support to the REST catalog.
> > That way a client knows whether it's safe to call the UDF endpoints on a given server.
> >
> > Eduard
> >
> > On Thu, Jun 20, 2024 at 1:59 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >>
> >> Hi Eduard
> >>
> >> It looks good to me. I have a question however :)
> >>
> >> Later, Imagine, we add UDF support in Iceberg. Does it mean that you
> >> will need to update REST Spec (ConfigResponse/capabilities) to add
> >> this capability ?
> >> For consistency, I think it makes sense as I don't think we often add
> >> new capability. And also as every REST server would have to implement
> >> it, /config is generic enough to add custom/new capabilities (but the
> >> client will have to deal with capability).
> >>
> >> Am I right?
> >>
> >> Thanks !
> >> Regards
> >> JB
> >>
> >> On Thu, Jun 20, 2024 at 1:28 PM Eduard Tudenhoefner
> >> <etudenhoef...@apache.org> wrote:
> >> >
> >> > Hey everyone,
> >> >
> >> > I'd like to bring up the discussion around describing REST server capabilities via the /config endpoint.
> >> > There is PR #9940 that describes the OpenAPI spec changes.
> >> >
> >> > Mainly we'd like to have a capabilities field in the ConfigResponse that allows servers to indicate to clients which capabilities are being supported.
> >> >
> >> > So far we have the following capabilities:
> >> >
> >> > tables
> >> > views
> >> > remote-signing
> >> > vended-credentials
> >> > multi-table-commit
> >> > register-table
> >> > table-metrics
> >> > oauth2
> >> >
> >> >
> >> > The general idea behind a capability is that if e.g. a server supports views, then that server must implement all endpoints grouped under that capability.
> >> > It's worth noting that the /config endpoint is currently being implicit (meaning that every REST server would have to implement it).
> >> >
> >> > One discussion point that came up during review is how we want to handle capabilities and backwards compatibility and what the default capability would be, since older servers don't know anything about capabilities (in such a case we could assume that the default capabilities would be oauth2 / tables).
> >> >
> >> > Are there any other capabilities that we'd like to include in the list?
> >> >
> >> > Eduard
>

--
Ryan Blue
Databricks