dennishuo commented on code in PR #1026:
URL: https://github.com/apache/polaris/pull/1026#discussion_r2000161707
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
Review Comment:
I'd actually prefer to move away from more "type-differentiation" in favor
of "capability differentiation", since we've talked about wanting the ability
to convert from an external catalog to an internal catalog in the future
someday.
Then, a catalog simply may or may not have a `ConnectionConfigInfo`; during
a migration, it doesn't exactly matter what "type" we call the catalog, but it
should be able to start functioning like a normal `INTERNAL` catalog at some
point, while still using the `ConnectionConfigInfo` to detect updates in the
remote catalog.
Aside from conversion to INTERNAL, there's also a close relationship between
EXTERNAL catalogs with and without ConnectionConfigInfo; maybe people start out
with a typical migration tool and plain notification-based EXTERNAL catalog,
but then want to enable federation on the existing catalog that might already
have grants defined. The entity metadata we might have "cached" locally would
then be able to be "verified" during `loadTable` from the remote catalog, and
then `updateTable` requests could be accepted as well.
It might still be good to have an enum of some sort so that we're not just
inferring a mode-of-operation based on what fields are present, but maybe that
would be better as a separate `modeOfOperation` enum than `type`, with key
differences being:
- The mode of operation doesn't necessarily strictly define the set of
attributes in the catalog object
- There would be no API-spec "discriminator" for sub-object based on the
`modeOfOperation` -- the discriminator-style inheritance is honestly quite
painful to work with
- The mode of operation is more fluid, intended to be able to be changed
over time
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
+
+ ConnectionConfigInfo:
Review Comment:
Thinking about it more, I replied to @dimas-b 's comment here as well:
https://github.com/apache/polaris/pull/1026/files#r1978040280
I think we might want to rework how we express updates so that it's less
confusing. Right now it *almost* looks like the API takes a strict REST-style
"replace entire object" approach, but it's already subtly conditioned on which
fields are "special" to be ignored vs deleted vs modified.
If we do it Iceberg's `updateTable`-style, we'd flatten out verbs to take
update objects like `UpdateCatalogConnectionConfigRemoteUri` and
`UpdateCatalogConnectionConfigSecrets`.
This is possibly going to be a bit complicated to hash out, so maybe it's
best to leave ConnectionConfigInfo out of `UpdateCatalog` for now.
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
Review Comment:
Yeah, the way we express updates probably needs some rework anyways. Even
though the original structure of `UpdateCatalogRequest` kind of looks a
"replace all", it also says:
> Any fields which are required in the Catalog will remain unaltered if
omitted from the contents of this Update request.
Which is a bit ambiguous, especially when it comes to partially specifying
optional portions of a required outer struct (e.g. if `StorageConfigInfo` is
being partially updated).
In practice, there's basically currently some very specialized logic for
deciding which fields are supposed to be "total replace", or "delete through
omission" or "ignore if omitted" which is definitely confusing when using the
API.
On the plus side at least we didn't just make the body of
`UpdateCatalogRequest` be the full `Catalog` like a naive REST pattern would
do. Probably we need to switch to a more imperative/verb-style like used in the
Iceberg `updateTable` API and basically just flatten out all the possible
individual-field updates, maybe accepting a list of updates in a single request.
In theory the HTTP verb should've been `PATCH` to most correctly match
"partial-update" semantics, but there's some inconsistency in support of
`PATCH`. Also it's interesting that Iceberg's `updateTable` is a `POST` as well
as `createTable` also being a `POST` (on the parent namespace), which is
presumably why they needed to make namespace-update a `POST` on
`/v1/{prefix}/namespaces/{namespace}/properties` instead of on
`/v1/{prefix}/namespaces/{namespace}`, instead of normally `PUT` being the
update verb.
I'm starting to think I'll leave `ConnectionConfigInfo` out of scope of the
current `UpdateCatalog` definition so we can put more thought into updates
overall without blocking basic progress on federation.
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
+
+ ConnectionConfigInfo:
+ type: object
+ description: A connection configuration representing a remote catalog
service
+ properties:
+ connectionType:
+ type: string
+ enum:
+ - ICEBERG_REST
+ description: The type of remote catalog service represented by this
connection
+ remoteUri:
+ type: string
+ description: URI to the remote catalog service
+ required:
+ - connectionType
+ discriminator:
+ propertyName: connectionType
+ mapping:
+ ICEBERG_REST: "#/components/schemas/IcebergRestConnectionConfigInfo"
+
+ IcebergRestConnectionConfigInfo:
+ type: object
+ description: Configuration necessary for connecting to an Iceberg REST
Catalog
+ allOf:
+ - $ref: '#/components/schemas/ConnectionConfigInfo'
+ properties:
+ remoteCatalogName:
+ type: string
+ description: The name of a remote catalog instance within the remote
catalog service; in some older systems
+ this is specified as the 'warehouse' when multiple logical
catalogs are served under the same base
+ remoteUri, and often translates into a 'prefix' added to all REST
resource paths
+ restAuthentication:
+ $ref: "#/components/schemas/RestAuthenticationInfo"
+
+ RestAuthenticationInfo:
+ type: object
+ description: Authentication-specific information for a REST connection
+ properties:
+ restAuthenticationType:
+ type: string
+ enum:
+ - OAUTH
+ - BEARER
+ description: The type of authentication to use when connecting to
the remote rest service
+ required:
+ - restAuthenticationType
+ discriminator:
+ propertyName: restAuthenticationType
+ mapping:
+ OAUTH: "#/components/schemas/OauthRestAuthenticationInfo"
+ BEARER: "#/components/schemas/BearerRestAuthenticationInfo"
+
+ OauthRestAuthenticationInfo:
+ type: object
+ description: OAuth authentication based on client_id/client_secret
Review Comment:
Done, including the casing `OAuth` to match conventions elsewhere.
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
+
+ ConnectionConfigInfo:
+ type: object
+ description: A connection configuration representing a remote catalog
service
+ properties:
+ connectionType:
+ type: string
+ enum:
+ - ICEBERG_REST
+ description: The type of remote catalog service represented by this
connection
+ remoteUri:
Review Comment:
Changed to `uri`
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
+
+ ConnectionConfigInfo:
+ type: object
+ description: A connection configuration representing a remote catalog
service
+ properties:
+ connectionType:
+ type: string
+ enum:
+ - ICEBERG_REST
+ description: The type of remote catalog service represented by this
connection
+ remoteUri:
+ type: string
+ description: URI to the remote catalog service
+ required:
+ - connectionType
+ discriminator:
+ propertyName: connectionType
+ mapping:
+ ICEBERG_REST: "#/components/schemas/IcebergRestConnectionConfigInfo"
+
+ IcebergRestConnectionConfigInfo:
+ type: object
+ description: Configuration necessary for connecting to an Iceberg REST
Catalog
+ allOf:
+ - $ref: '#/components/schemas/ConnectionConfigInfo'
+ properties:
+ remoteCatalogName:
+ type: string
+ description: The name of a remote catalog instance within the remote
catalog service; in some older systems
+ this is specified as the 'warehouse' when multiple logical
catalogs are served under the same base
+ remoteUri, and often translates into a 'prefix' added to all REST
resource paths
+ restAuthentication:
+ $ref: "#/components/schemas/RestAuthenticationInfo"
+
+ RestAuthenticationInfo:
+ type: object
+ description: Authentication-specific information for a REST connection
+ properties:
+ restAuthenticationType:
+ type: string
+ enum:
+ - OAUTH
+ - BEARER
+ description: The type of authentication to use when connecting to
the remote rest service
+ required:
+ - restAuthenticationType
+ discriminator:
+ propertyName: restAuthenticationType
+ mapping:
+ OAUTH: "#/components/schemas/OauthRestAuthenticationInfo"
+ BEARER: "#/components/schemas/BearerRestAuthenticationInfo"
+
+ OauthRestAuthenticationInfo:
+ type: object
+ description: OAuth authentication based on client_id/client_secret
+ allOf:
+ - $ref: '#/components/schemas/RestAuthenticationInfo'
+ properties:
+ tokenUri:
+ type: string
+ description: Token server URI
+ clientId:
+ type: string
+ description: oauth client id
+ clientSecret:
+ type: string
+ format: password
+ description: oauth client secret (input-only)
+ scopes:
+ type: array
+ items:
+ type: string
+ description: oauth scopes to specify when exchanging for a
short-lived access token
+
+ BearerRestAuthenticationInfo:
Review Comment:
Done, renamed to `BearerAuthenticationParameters`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]