I have clarified the wording in #9940
<https://github.com/apache/iceberg/pull/9940> around the requirement on
having to implement all endpoints under a particular capability.

For servers that only *partially* implement endpoints under a capability
the spec requires the server to throw a *501 Not Implemented*. This was
suggested by Jack and it seems reasonable to do that.

Regarding the inclusion of table-spec / *view-spec *as a capability: I
think this might make sense for the next iteration of the REST spec but as
I mentioned earlier I don't see any clear benefit for the current REST spec
as the client wouldn't do anything with that information.
If there is a clear benefit of having this, then this can still be added
later to the current REST spec but I believe we should rather have a few
well-defined and actionable capabilities rather than too many.

Eduard

On Wed, Jul 3, 2024 at 5:44 AM Renjie Liu <liurenjie2...@gmail.com> wrote:

> Spec is an interesting topic we did not discuss. Robert, how do you
>> envision this to be used?
>> In my mind, if a new table format v3 is launched, there are 2 approaches
>> we can go with, taking CreateTable as an example:
>> (1) increment the related operation version, which means that POST
>> /v2/{prefix}/namespaces/{ns}/tables will be created and allow creating
>> tables in the v3 version.
>> (2) update the existing table metadata model to support both v2 and v3
>> fields, and the server enforces the payload differently based on the
>> TableMetadata.format-version field. If the server does not support v3, it
>> can return unsupported at that time.
>> Either way we go, the table-spec version does not need to be a
>> capability. (1) seems to be cleaner, but has some overhead in provisioning
>> a new endpoint compared to (2).
>> Do you see another way to do this leveraging the table-spec version?
>
>
> 2 is cleaner but maybe inconsistent with current behavior, since
> /v1/tables operation supports both v1 and v3. We should only go to 2 only
> when we have incompatible fields/break changes according to discussion.
>
> Generally I agree with adding table-spec into capabilities. For example,
> we can expose this to user in api so that user could choose a supported
> table format version without throwing exception.
>
> On Wed, Jul 3, 2024 at 12:18 AM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Spec is an interesting topic we did not discuss. Robert, how do you
>> envision this to be used?
>>
>> In my mind, if a new table format v3 is launched, there are 2 approaches
>> we can go with, taking CreateTable as an example:
>>
>> (1) increment the related operation version, which means that POST
>> /v2/{prefix}/namespaces/{ns}/tables will be created and allow creating
>> tables in the v3 version.
>>
>> (2) update the existing table metadata model to support both v2 and v3
>> fields, and the server enforces the payload differently based on the
>> TableMetadata.format-version field. If the server does not support v3, it
>> can return unsupported at that time.
>>
>> Either way we go, the table-spec version does not need to be a
>> capability. (1) seems to be cleaner, but has some overhead in provisioning
>> a new endpoint compared to (2).
>>
>> Do you see another way to do this leveraging the table-spec version?
>>
>> -Jack
>>
>>
>>
>>
>>
>> On Tue, Jul 2, 2024 at 6:03 AM Eduard Tudenhöfner
>> <eduard.tudenhoef...@databricks.com.invalid> wrote:
>>
>>>
>>> I couldn't make it to the catalog sync meeting yesterday but I watched
>>> the recording today (thanks for providing that).
>>>
>>> The missing piece is how (new, capabilities-aware) clients handle the
>>>> case when a service does _not_ return the capabilities field (absent). My
>>>> proposal would be that a client should in this case assume that all
>>>> _currently_ existing capabilities are supported.
>>>>
>>>> - tables: [1]
>>>> - views: [1]
>>>> - remote-signing: [1]
>>>> - multi-table-commit: [1]
>>>> - register-table: [1]
>>>> - table-metrics: [1]
>>>> - table-spec: [1,2]
>>>> - view-spec: [1,2]
>>>>
>>>>
>>>> The one thing I would like to add here is that the current PR uses the
>>> *tables* capability (as version 1) as the default when a server doesn't
>>> return *capabilities *but it might be also ok to include *views *(as
>>> version 1) because the current client impl has *some* code to deal with
>>> errors in case endpoints don't exist.
>>>
>>> Unless we agree that the currently existing functionality in the REST
>>> spec is the *default* behavior to be assumed for older server, I'm not
>>> sure about including *remote-signing / multi-table-commit /
>>> register-table / table-metrics* as it has been indicated in earlier
>>> comments on the PR/ML that not every REST server supports these.
>>>
>>> That being said, we should discuss whether we want the *default*
>>> behavior (when an older server doesn't send back *capabilities*) to be
>>> a) *tables* (version 1) only
>>> b) the currently existing functionality as defined in the REST spec (as
>>> version 1)
>>>
>>>
>>> On another note: Including *table-spec / view-spec* seems to be more
>>> informative in its nature as I don't think a client would act differently
>>> right now when seeing these.
>>>
>>> Thanks
>>> Eduard
>>>
>>>

Reply via email to