Re: [DISCUSS] Describing REST Server capabilities

Micah Kornfield Thu, 20 Jun 2024 17:15:17 -0700

>
> The general idea behind a capability is that if e.g. a server supports
> *views*, then that server must implement all endpoints grouped under that
> capability.



I haven't thought deeply about this, but is there a reason to be
prescriptive by grouping endpoints into capabilities?  Another option
would be to just list all the endpoints the server actually supports (and
maybe even which operations on each) and let clients take appropriate
action (i.e. grouping could happen on the client side).  This could be done
by vending the OpenAPI spec the server supports at its own endpoint. I
think this avoids the future problem of having to classify new endpoints
into a specific capability.
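
To make the client-side grouping idea concrete, here is a rough sketch.
The endpoint strings, the group names, and the helper supported_groups are
only illustrative assumptions on my part; nothing here is taken from PR
#9940 or meant as the way a real client would have to do it.

```python
# Hypothetical client-side grouping. The endpoint strings and group names
# are illustrative assumptions, not a definitive list from the REST spec.
CLIENT_GROUPS = {
    "tables": {
        "GET /v1/{prefix}/namespaces/{namespace}/tables",
        "POST /v1/{prefix}/namespaces/{namespace}/tables",
    },
    "views": {
        "GET /v1/{prefix}/namespaces/{namespace}/views",
        "POST /v1/{prefix}/namespaces/{namespace}/views",
    },
}


def supported_groups(advertised, groups=CLIENT_GROUPS):
    """Return the groups whose endpoints are all advertised by the server."""
    advertised = set(advertised)
    # A group counts as supported only if every endpoint in it is present.
    return {name for name, needed in groups.items() if needed <= advertised}


# Example: the server lists both table endpoints but only one view endpoint.
server_endpoints = [
    "GET /v1/{prefix}/namespaces/{namespace}/tables",
    "POST /v1/{prefix}/namespaces/{namespace}/tables",
    "GET /v1/{prefix}/namespaces/{namespace}/views",
]
print(supported_groups(server_endpoints))  # {'tables'}
```

With this shape, a new endpoint never has to be classified on the server
side; each client decides which endpoints it needs for a given feature.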

On the versioning aspects, I'm not sure if this is what J.B. meant, but
another way to model this could be as a list of objects where each object
is {"capability": "version (or other metadata relevant to the capability)"}.
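
As a sketch of what that could look like (the "capabilities" field layout,
the version strings, and the helper capability_versions are assumptions for
illustration only, not what the PR currently defines):

```python
import json

# Hypothetical /config fragment: a list of single-key objects mapping a
# capability name to a version string. Layout and values are assumptions.
config_response = json.loads("""
{
  "capabilities": [
    {"tables": "1"},
    {"views": "1"}
  ]
}
""")


def capability_versions(config):
    """Flatten the list of single-key objects into {capability: version}."""
    versions = {}
    # An older server that omits the field yields an empty mapping.
    for entry in config.get("capabilities", []):
        versions.update(entry)
    return versions


versions = capability_versions(config_response)
print(versions.get("views"))  # 1
print("udf" in versions)      # False
```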

Thanks,
Micah

On Thu, Jun 20, 2024 at 4:45 PM Ryan Blue <b...@databricks.com.invalid>
wrote:

> I think the capabilities proposal is intended to let people build in a
> different way than a versioning system would. It's probably valuable to
> think through the differences between the approaches.
>
> The capabilities that are proposed let catalogs declare sets of features
> that are supported, like support for tables, views, server-side planning,
> etc. While you can think of that as a sort of versioning, the intent is to
> be more flexible. A catalog implementation might choose not to implement
> server-side planning because it doesn't have access to the underlying
> metadata files. For example, see #10089
> <https://github.com/apache/iceberg/issues/10089> for a use case where
> object store permissions aren't what you might expect. Capabilities allow
> the service to tell the client what is supported for graceful fallback, not
> just backward compatibility.
>
> Versioning the API is a different approach because to support the latest
> version, an implementation would need to support everything. If we had
> tables in v1, views in v2, and server-side planning in v3, then to support
> server-side planning a catalog would also need to support views. That's not
> necessarily a bad thing since it creates strong compatibility requirements;
> that's what we chose to do for the table format.
>
> I think the question to help us choose between the two options is whether
> we expect catalogs to all support the same set of features over time, or if
> we expect some differences. I have a weakly-held opinion that we expect
> catalogs not to support the same features, but it is very likely based on
> the assumption that catalogs have significant differences because of
> limited back-ends (like Hive). That may not be correct since there are
> quite a few new catalog implementations using the protocol. Perhaps we
> should consider stronger requirements for what needs to be provided through
> versioning.
>
> Ryan
>
> On Thu, Jun 20, 2024 at 7:15 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Eduard,
>>
>> That makes sense. Thanks.
>>
>> Maybe we can already anticipate a little and add a "catalog-level
>> versioning" capability, as it's a feature supported by the Nessie
>> catalog, for instance?
>> We can also imagine a more generic capability like "version scope".
>>
>> Regards
>> JB
>>
>> On Thu, Jun 20, 2024 at 3:47 PM Eduard Tudenhoefner
>> <etudenhoef...@apache.org> wrote:
>> >
>> > Hey JB,
>> >
>> > If adding UDFs would require adding new endpoints, then you'd also add
>> > a udf capability when adding UDF support to the REST catalog.
>> > That way a client knows whether it's safe to call the UDF endpoints on
>> > a given server.
>> >
>> > Eduard
>> >
>> > On Thu, Jun 20, 2024 at 1:59 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>> > wrote:
>> >>
>> >> Hi Eduard
>> >>
>> >> It looks good to me. I have a question however :)
>> >>
>> >> Later, imagine we add UDF support in Iceberg. Does that mean you
>> >> will need to update the REST spec (ConfigResponse/capabilities) to add
>> >> this capability?
>> >> For consistency, I think it makes sense, as I don't think we often add
>> >> new capabilities. Also, as every REST server would have to implement
>> >> it, /config is generic enough to carry custom/new capabilities (but the
>> >> client will have to deal with each capability).
>> >>
>> >> Am I right?
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On Thu, Jun 20, 2024 at 1:28 PM Eduard Tudenhoefner
>> >> <etudenhoef...@apache.org> wrote:
>> >> >
>> >> > Hey everyone,
>> >> >
>> >> > I'd like to bring up the discussion around describing REST server
>> >> > capabilities via the /config endpoint.
>> >> > There is PR #9940 that describes the OpenAPI spec changes.
>> >> >
>> >> > Mainly we'd like to have a capabilities field in the ConfigResponse
>> >> > that allows servers to indicate to clients which capabilities are being
>> >> > supported.
>> >> >
>> >> > So far we have the following capabilities:
>> >> >
>> >> >   - tables
>> >> >   - views
>> >> >   - remote-signing
>> >> >   - vended-credentials
>> >> >   - multi-table-commit
>> >> >   - register-table
>> >> >   - table-metrics
>> >> >   - oauth2
>> >> >
>> >> > The general idea behind a capability is that if e.g. a server
>> >> > supports views, then that server must implement all endpoints grouped
>> >> > under that capability.
>> >> > It's worth noting that the /config endpoint is currently implicit
>> >> > (meaning that every REST server would have to implement it).
>> >> >
>> >> > One discussion point that came up during review is how we want to
>> >> > handle capabilities and backwards compatibility, and what the default
>> >> > capabilities would be, since older servers don't know anything about
>> >> > capabilities (in such a case we could assume that the default
>> >> > capabilities would be oauth2 / tables).
>> >> >
>> >> > Are there any other capabilities that we'd like to include in the
>> >> > list?
>> >> >
>> >> > Eduard
>>
>
>
> --
> Ryan Blue
> Databricks
>
