dennishuo commented on code in PR #1026:
URL: https://github.com/apache/polaris/pull/1026#discussion_r2004463034
##########
spec/polaris-management-service.yml:
##########
@@ -850,9 +850,92 @@ components:
- $ref: "#/components/schemas/Catalog"
- type: object
properties:
- remoteUrl:
- type: string
- description: URL to the remote catalog API
+ connectionConfigInfo:
+ $ref: "#/components/schemas/ConnectionConfigInfo"
+
+ ConnectionConfigInfo:
+ type: object
+ description: A connection configuration representing a remote catalog
service
+ properties:
+ connectionType:
+ type: string
+ enum:
+ - ICEBERG_REST
+ description: The type of remote catalog service represented by this
connection
+ uri:
+ type: string
+ description: URI to the remote catalog service
+ required:
+ - connectionType
+ discriminator:
+ propertyName: connectionType
+ mapping:
+ ICEBERG_REST: "#/components/schemas/IcebergRestConnectionConfigInfo"
+
+ IcebergRestConnectionConfigInfo:
+ type: object
+ description: Configuration necessary for connecting to an Iceberg REST
Catalog
+ allOf:
+ - $ref: '#/components/schemas/ConnectionConfigInfo'
+ properties:
+ remoteCatalogName:
+ type: string
+ description: The name of a remote catalog instance within the remote
catalog service; in some older systems
+ this is specified as the 'warehouse' when multiple logical
catalogs are served under the same base
+ uri, and often translates into a 'prefix' added to all REST
resource paths
+ restAuthentication:
+ $ref: "#/components/schemas/AuthenticationParameters"
+
+ AuthenticationParameters:
+ type: object
+ description: Authentication-specific information for a REST connection
+ properties:
+ restAuthenticationType:
+ type: string
+ enum:
+ - OAUTH
+ - BEARER
+ description: The type of authentication to use when connecting to
the remote rest service
+ required:
+ - restAuthenticationType
+ discriminator:
+ propertyName: restAuthenticationType
+ mapping:
+ OAUTH: "#/components/schemas/OAuthClientCredentialsParameters"
+ BEARER: "#/components/schemas/BearerAuthenticationParameters"
+
+ OAuthClientCredentialsParameters:
+ type: object
+ description: OAuth authentication based on client_id/client_secret
+ allOf:
+ - $ref: '#/components/schemas/AuthenticationParameters'
+ properties:
+ tokenUri:
+ type: string
+ description: Token server URI
+ clientId:
+ type: string
+ description: oauth client id
+ clientSecret:
+ type: string
+ format: password
+ description: oauth client secret (input-only)
+ scopes:
+ type: array
+ items:
+ type: string
+ description: oauth scopes to specify when exchanging for a
short-lived access token
+
+ BearerAuthenticationParameters:
+ type: object
+ description: Bearer authentication directly embedded in request auth
headers
+ allOf:
+ - $ref: '#/components/schemas/AuthenticationParameters'
+ properties:
+ bearerToken:
Review Comment:
Yeah, the intent here is to support a variety of authentication models, this
could be something we hash out more on the dev mailing list as well.
I agree we could iterate quickly by marking the feature as alpha initially,
as I suspect the end-to-end interaction will make the considerations more
obvious as we iterate on it.
As for "storing", I think there are two aspects:
1. *Expressing* the secret directly in the ConnectionConfig struct of the
API object
2. *Persisting* the secret somewhere
Are you concerned about the API structure for (1)?
I agree we probably don't want (2) to default to storing plaintext in the
persistence DB at any point in time. There are a few different models that
could coexist for (2):
1. Fully require an external "secret" to already exist as a reference, e.g.
to a Vault URI, and specify that Vault URI in the config info -- but this only
shifts the problem to the meta-secret for accessing Vault, so at some point the
caller needs to configure some secret on the Polaris side
2. Require a separate SecretIntegration type of Polaris entity to already be
created whose sole purpose is plumbing the secrets into some other configurable
secret store (e.g. Polaris would manage Vault in this case), but this is
somewhat complex for the API and ultimately the same kinds of internal secrets
plumbing would need to be built.
3. (Plan to do this): Within Polaris's logic for handling entities that
define secrets, we can have a SecretsManager class whose purpose is to take
secrets from API models and actually store them somewhere else that's safe -
Vault, KMS, some other keystore, etc., and then add internal references to the
stored secret inside the Polaris entity so that the secret can be found when
needed. Callsites that need the unpacked secret again go through the
SecretsManager to "extract" the secret from the Polaris entity; the
implementation of the SecretsManager gets to decide for itself how it wants to
store a reference and then unpack it again.
4. (Should consider also supporting this model): Similar to (3) but if we
want non-uniform types of secrets but a uniform secrets-management layer, the
SecretsManager might, instead of only storing a reference to an external
resource, could be leasing a new encryption key from the external system and
then embedding the encrypted secret in some field within the Polaris entity.
Then when the callsite again needs the unpacked secret, this SecretsManager
would recognize to decrypt the encrypted blob in order to retrieve the original
secrets
The nice thing about such a model is that it's fairly flexible for different
secrets-management flows under a single interface. For example, to add direct
sigv4-based auth to AWS Glue or S3Tables, the part where the SecretsManager
transforms the entity would *not* actually be handling a secret directly, but
instead performing the assumeRole relationship like how the
StorageConfiguration does it; in this world, true "secrets" are only implicitly
handled within the environment, but the flow looks the same -- embed the
`userArn` and `externalId` into the Polaris entity, and then later when you
want to unpack a secret, you use something externally-managed to become that
`userArn` before doing an `assumeRole` to mint a new subscoped token.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]