Hi Fabio,

Thanks for the detailed explanation! It helps a lot.

A couple of things are still not clear to me, though :)

Are table metadata JSON files ever encrypted with KMS?

If so, can Polaris read them without knowing the KMS FileIO properties
(e.g. if the table came via registerTable with pre-encrypted metadata)?

Sorry if these are trivial things and I missed some doc somewhere...
please give a pointer if so.

Thanks,
Dmitri.

On Thu, Oct 23, 2025 at 5:28 AM Rizzo Cascio, Fabio
<[email protected]> wrote:

> Hi Dmitri,
>
> Let me try to explain what I mean:
>
> Let’s say we’re creating a catalog using AWS S3, just using the FileIO
> Iceberg properties (see:
> https://iceberg.apache.org/docs/nightly/aws/#s3-server-side-encryption).
>
> First, we create the catalog with a request like this:
>
> POST http://localhost:8181/api/management/v1/catalogs
>
> {
>   "catalog": {
>     "name": "test_catalog",
>     "type": "INTERNAL",
>     "readOnly": true,
>     "properties": {
>       "default-base-location": "s3://mybucket"
>     },
>     "storageConfigInfo": {
>       "storageType": "S3",
>       "roleArn": "arn:aws:iam::*****:role/myrole",
>       "region": "us-east-1"
>     }
>   }
> }
>
> Next, we create a namespace:
>
> POST http://localhost:8181/api/catalog/v1/test_catalog/namespaces
>
> {
>   "namespace": ["fns"],
>   "properties": {
>     "location": "s3://mybucket/fns/"
>   }
> }
>
> Then, we create a table:
>
> POST http://localhost:8181/api/catalog/v1/test_catalog/namespaces/fns/tables
>
> {
>   "name": "t3",
>   "schema": {
>     "type": "struct",
>     "fields": [
>       {"id": 1, "name": "field1", "type": "int", "required": false},
>       {"id": 2, "name": "field2", "type": "int", "required": false}
>     ]
>   },
>   "stage-create": false
> }
>
> To insert data into the table using Spark with a KMS key:
>
> bin/spark-sql \
>   --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0 \
>   --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
>   --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
>   --conf spark.sql.catalog.polaris.type=rest \
>   --conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
>   --conf spark.sql.catalog.polaris.token-refresh-enabled=false \
>   --conf spark.sql.catalog.polaris.warehouse=test_catalog \
>   --conf spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL \
>   --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials \
>   --conf spark.sql.catalog.polaris.s3.sse.key=arn:aws:kms:us-east-1:****:key/mykey \
>   --conf spark.sql.catalog.polaris.s3.sse.type=kms \
>   --conf spark.sql.catalog.polaris.credential=38a0ebdbcc97c58f:654af690485a32f02ce17dda7cd2c36d
>
> Then run:
>
> insert into t3 values (7,9);
>
> This will create an entry in the bucket with the data encrypted using the
> specified key.
>
> If you remove these two properties from the connection command:
>
> * --conf spark.sql.catalog.polaris.s3.sse.key=arn:aws:kms:us-east-1:****:key/mykey
> * --conf spark.sql.catalog.polaris.s3.sse.type=kms
>
> then you can insert data into the table without encryption.
>
> In Polaris, at the point of the insert, we don’t know the KMS key, so
> there’s no way to add that key in the policy. This is where adding
> arn:aws:kms:%s:%s:key/* helps.
>
> Adding a new property in the storageConfigInfo for the KMS key will help
> with creating a more restrictive IAM policy, but we also need to support
> the existing properties defined in Iceberg.
>
> The reason for my changes (passing the properties down to
> AwsCredentialsStorageIntegration) is that you can pass extra properties
> to any Polaris request, and those will feed down to Iceberg.
> For example:
>
> {
>   "name": "t3",
>   "schema": {
>     "type": "struct",
>     "fields": [
>       {"id": 1, "name": "field1", "type": "int", "required": false},
>       {"id": 2, "name": "field2", "type": "int", "required": false}
>     ]
>   },
>   "properties": {
>     "s3.sse.type": "kms",
>     "s3.sse.key": "arn:aws:kms:us-east-1:***:key/mykey"
>   },
>   "stage-create": false
> }
>
> With my changes I was trying to support both ways, not sure if this is the
> best approach or just going for the easier way of allowing all the keys
> under the account is better.
>
> Let me know if you need any more details.
>
> Thanks
>
> Fabio
>
> From: Dmitri Bourlatchkov <[email protected]>
> Date: Wednesday, 22 October 2025 at 18:50
> To: [email protected] <[email protected]>
> Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
>
> Hi Fabio,
>
> I agree that we seem to have some confusion here :) let's try and clarify.
>
> What do you mean exactly by "s3 iceberg properties"? Are they Iceberg table
> metadata properties? Iceberg (REST) Catalog properties? FileIO properties?
>
> Why would Polaris need to support them? What would an end-to-end flow be
> between Polaris and the client (e.g. Spark)?
>
> Thanks,
> Dmitri.
>
> On Wed, Oct 22, 2025 at 12:08 PM Rizzo Cascio, Fabio
> <[email protected]> wrote:
>
> > I feel like there’s something I’m overlooking.
> > Setting the kms property at the catalog level works, but it doesn’t
> > support the s3 iceberg properties, right?
> >
> > To support those, the best approach is to let clients use the s3 iceberg
> > properties directly. Then, in AwsCredentialsStorageIntegration, we can
> > simply add a policy that allows any key within that account. This way, we
> > don’t need to rely on metadata properties to retrieve the key value.
> >
> > Does that make sense?
> >
> > Thanks
> > Fabio
> >
> > From: Dmitri Bourlatchkov <[email protected]>
> > Date: Wednesday, 22 October 2025 at 15:42
> > To: [email protected] <[email protected]>
> > Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
> >
> > Hi Fabio,
> >
> > Yes, that would be the preferable solution for KMS from my POV.
> >
> > I wonder what other people think about this too :)
> >
> > Cheers,
> > Dmitri.
> >
> > On Wed, Oct 22, 2025 at 9:45 AM Rizzo Cascio, Fabio
> > <[email protected]> wrote:
> >
> > > Hi Dmitri,
> > >
> > > If I add the kms props to AccessConfig in AWSCredentialStorage I can see
> > > it in the LoadTable response coming from Polaris.
> > >
> > > Fabio
> > >
> > > From: Dmitri Bourlatchkov <[email protected]>
> > > Date: Tuesday, 21 October 2025 at 15:23
> > > To: [email protected] <[email protected]>
> > > Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
> > >
> > > Hi Fabio,
> > >
> > > Yes, I glimpsed that from your email. Sorry if my post caused confusion.
> > > I just wanted to reply to the top email as what I'm proposing seems to
> > > be a key feature for KMS support.
> > >
> > > Would you be able to validate whether sending KMS FileIO properties to
> > > clients from LoadTable responses works in practice (e.g. in Spark)?
> > >
> > > I believe this can be done by adding KMS properties as "extra"
> > > properties to AccessConfig.
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Tue, Oct 21, 2025 at 4:15 AM Rizzo Cascio, Fabio
> > > <[email protected]> wrote:
> > >
> > > > Hi Dmitri,
> > > >
> > > > This is what I was saying in my other email. Anyway I’m gonna update
> > > > my PR with the changes I have made to get it working, the project
> > > > won’t build because I haven’t updated the tests etc, I just want to
> > > > show my changes and see if we can agree on a direction before I make
> > > > all the changes.
> > > >
> > > > Thanks
> > > >
> > > > Fabio
> > > >
> > > > From: Dmitri Bourlatchkov <[email protected]>
> > > > Date: Monday, 20 October 2025 at 17:38
> > > > To: [email protected] <[email protected]>
> > > > Subject: [EXTERNAL]Re: KMS Key addition for s3
> > > >
> > > > Hi Fabio, Ashok and All,
> > > >
> > > > Apologies if I'm missing something obvious, but the two WIP KMS PRs
> > > > [1424] [2802] appear to be dealing only with AWS policies on the
> > > > vended credential session. They do not appear to deal with client
> > > > configuration (in LoadTable responses).
> > > >
> > > > As far as I understand, Iceberg clients need certain FileIO
> > > > properties to be set in order to utilize KMS.
> > > >
> > > > I'd imagine that Polaris ought to provide these FileIO properties in
> > > > LoadTable responses in addition to granting privileges for KMS access
> > > > to the vended (session) credentials.
> > > >
> > > > In other words, the decision whether to use KMS rests with Polaris
> > > > (we can discuss how to configure that). If that is enabled, clients
> > > > should not need any extra configuration, they should get complete and
> > > > usable configuration + credentials from Polaris.
> > > >
> > > > WDYT?
> > > >
> > > > [1424] https://github.com/apache/polaris/pull/1424
> > > > [2802] https://github.com/apache/polaris/pull/2802
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Mon, Oct 13, 2025 at 3:50 AM Rizzo Cascio, Fabio
> > > > <[email protected]> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I have created a new PR to be able to use a kms key for the S3
> > > > > bucket, it is mandatory for me to use any S3 storage and hopefully
> > > > > a good addition for other people that want to use it.
> > > > >
> > > > > PR link: https://github.com/apache/polaris/pull/2802
> > > > >
> > > > > Thanks
> > > > >
> > > > > Fabio
> > > > >
> > > > > This message is confidential and subject to terms at:
> > > > > https://www.jpmorgan.com/emaildisclaimer including on confidential,
> > > > > privileged or legal entity information, malicious content and
> > > > > monitoring of electronic messages. If you are not the intended
> > > > > recipient, please delete this message and notify the sender
> > > > > immediately. Any unauthorized use is strictly prohibited.
