I think that we really need to let a service express which table/view/udf/etc-spec versions it supports. Otherwise, as you noted, a client would only receive some error. If a "table-spec v4" aware client knows that the service just knows about "table-spec v2", it can just use that version - preventing an error that a user has to deal with. Users will just come to the mailing-list or Slack and say: "here's an exception" - we can prevent this.

On 02.07.24 16:59, Jack Ye wrote:
Spec is an interesting topic we did not discuss. Robert, how do you envision this to be used?

In my mind, if a new table format v3 is launched, there are 2 approaches we can go with, taking CreateTable as an example:

(1) increment the related operation version, which means that POST /v2/{prefix}/namespaces/{ns}/tables will be created and allow creating tables in the v3 version.

(2) update the existing table metadata model to support both v2 and v3 fields, and the server enforces the payload differently based on the TableMetadata.format-version field. If the server does not support v3, it can return unsupported at that time.

Either way we go, the table-spec version does not need to be a capability. (1) seems to be cleaner, but has some overhead in provisioning a new endpoint compared to (2).

Do you see another way to do this leveraging the table-spec version?

-Jack





On Tue, Jul 2, 2024 at 6:03 AM Eduard Tudenhöfner <eduard.tudenhoef...@databricks.com.invalid> wrote:


    I couldn't make it to the catalog sync meeting yesterday but I
    watched the recording today (thanks for providing that).

        The missing piece is how (new, capabilities-aware) clients
        handle the case when a service does _not_ return the
        capabilities field (absent). My proposal would be that a
        client should in this case assume that all _currently_
        existing capabilities are supported.

        - tables: [1]
        - views: [1]
        - remote-signing: [1]
        - multi-table-commit: [1]
        - register-table: [1]
        - table-metrics: [1]
        - table-spec: [1,2]
        - view-spec: [1,2]


    The one thing I would like to add here is that the current PR uses
    the *tables* capability (as version 1) as the default when a
    server doesn't return *capabilities *but it might be also ok to
    include *views *(as version 1) because the current client impl has
    /some/ code to deal with errors in case endpoints don't exist.

    Unless we agree that the currently existing functionality in the
    REST spec is the *default* behavior to be assumed for older
    server, I'm not sure about including *remote-signing /
    multi-table-commit / register-table / table-metrics* as it has
    been indicated in earlier comments on the PR/ML that not every
    REST server supports these.

    That being said, we should discuss whether we want the *default*
    behavior (when an older server doesn't send back *capabilities*) to be
    a) *tables* (version 1) only
    b) the currently existing functionality as defined in the REST
    spec (as version 1)


    On another note: Including *table-spec / view-spec* seems to be
    more informative in its nature as I don't think a client would act
    differently right now when seeing these.

    Thanks
    Eduard

--
Robert Stupp
@snazy

Reply via email to