Maybe I caused more confusion than I meant to 😅
In my mind, there are two ways to support table-level overrides for KMS
keys:
1. We can support StorageConfig overrides at the table level - I think we
previously talked about this being the right long term approach, as it
would be great to allow overrides for more than just KMS keys (for example,
using a base location that's in the catalog's allowed locations, but not
the default base location). This is a bigger change, however. We also
haven't defined an easy way for users to create tables with these storage
overrides.
2. Support existing S3FileIO table properties. This feels like a necessary
change anyway, since users would need it to migrate existing tables into
Polaris. It's also a smaller change, so I was kinda hoping we could sneak
it in here :)
I think Dmitri's question of whether the metadata.json file is stored
encrypted is an important one, as we'd need to know which KMS key to add to
the IAM policy before we can read the metadata.json file to read the
table's properties 🤪 . Maybe we can continue with work like #2735 and add
the TableMetadata properties to the Table Entity's properties map? That
would allow us to read the encryption properties before we read the file
from storage. This supports existing CREATE TABLE syntax, as users can use
something like this to specify a table-level override:
CREATE TABLE ... TBLPROPERTIES("s3.sse.type"="kms",
"s3.sse.key"="arn:aws:kms:us-east-1:***:key/mykey")
However, that doesn't fix the registerTable issue raised. The only way I
can see to support registering tables that use a different KMS key from the
catalog default is to follow the pattern we established with
allowedLocations and defaultBaseLocation. The catalog can support a list of
KMS keys that can be used for decryption with one specified as the default.
Again, this feels like an issue that needs to be tackled anyway as there
are definitely catalogs that use a mix of encryption keys.
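To make that concrete, here is a minimal sketch of the lookup, in Python for brevity. The names CatalogKmsConfig, default_kms_key, allowed_kms_keys and resolve_kms_key are hypothetical, not existing Polaris APIs; they only mirror the defaultBaseLocation / allowedLocations pattern:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class CatalogKmsConfig:
    # Hypothetical catalog-level settings (not an existing Polaris class).
    default_kms_key: str
    allowed_kms_keys: frozenset = field(default_factory=frozenset)


def resolve_kms_key(config: CatalogKmsConfig, table_key: Optional[str]) -> str:
    """Pick the KMS key to grant in the vended credentials for one table."""
    if table_key is None:
        # No table-level override: fall back to the catalog default.
        return config.default_kms_key
    if table_key == config.default_kms_key or table_key in config.allowed_kms_keys:
        return table_key
    # The table asks for a key the catalog admin never allowed.
    raise PermissionError(f"KMS key not allowed for this catalog: {table_key}")
```

With something like this, registerTable would only need the table's key to appear in the catalog's allowed list; it would not need to read the metadata.json first to learn which key to grant.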
The encryption keys do need to be returned in the LoadTableResponse, which
I think the current PR does. That should eliminate the need to specify the
--conf spark.sql.catalog.polaris.s3.sse.* key/value pairs in the client
configuration. Only cases like CREATE TABLE AS SELECT... would need to know
about the KMS keys beforehand. Simple CREATE TABLE statements can use the
TBLPROPERTIES config, as in the above example.
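Roughly what I have in mind for the response side, again as a Python sketch under assumptions: with_sse_config is a made-up helper, and I'm assuming the s3.sse.* properties travel in the "config" map of the Iceberg REST load-table result so the client's FileIO can pick them up:

```python
SSE_PREFIX = "s3.sse."  # prefix of the Iceberg S3FileIO encryption properties


def with_sse_config(load_table_response: dict, table_props: dict) -> dict:
    """Return a copy of the response with the table's s3.sse.* properties in "config"."""
    sse = {k: v for k, v in table_props.items() if k.startswith(SSE_PREFIX)}
    # Keep whatever the server already put in "config" and add the SSE settings.
    config = {**load_table_response.get("config", {}), **sse}
    return {**load_table_response, "config": config}
```

Only the encryption-related properties are copied over; unrelated table properties stay where they are in the table metadata.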
This is maybe a lot for one PR, so I'm open to hearing how we can split
this up and move forward incrementally. But I think it would be good for
the APIs to be thought through up front.
Mike
On Thu, Oct 23, 2025 at 7:42 AM Rizzo Cascio, Fabio
<[email protected]> wrote:
> Hi Dmitri,
>
> You mean reading the table, I assume, like this:
>
> http://localhost:8181/api/catalog/v1/test_catalog/namespaces/fns/tables/t3
>
> The answer is yes. The key is used for server-side encryption, so if
> the role running the pod has the correct entitlement to use the key to
> decrypt the data, there is no issue.
>
> Btw, this is also the case using MinIO on Docker.
>
> Response is:
> {
> "metadata-location": "s3://mybucket/fns/t3/metadata/a.metadata.json",
> "metadata": {
> "format-version": 2,
> "table-uuid": "4d452e08-cf39-434b-b003-a21ccd6c4a42",
> "location": "s3://mybucket/fns/t3",
> "last-sequence-number": 1,
> "last-updated-ms": 1761059868331,
> "last-column-id": 2,
> "current-schema-id": 0,
> "schemas": [
> {
> "type": "struct",
> "schema-id": 0,
> "fields": [
> {
> "id": 1,
> "name": "field1",
> "required": false,
> "type": "int"
> },
> {
> "id": 2,
> "name": "field2",
> "required": false,
> "type": "int"
> }
> ]
> }
> ],
> "default-spec-id": 0,
> "partition-specs": [
> {
> "spec-id": 0,
> "fields": []
> }
> ],
> "last-partition-id": 999,
> "default-sort-order-id": 0,
> "sort-orders": [
> {
> "order-id": 0,
> "fields": []
> }
> ],
> "properties": {
> "created-at": "2025-10-21T15:14:52.093498193Z",
> "s3.sse.type": "kms",
> "s3.sse.key": "arn:aws:kms:us-east-1:***:key/mykey",
> "write.parquet.compression-codec": "zstd"
> },
> "current-snapshot-id": 2031898559602022053,
> "refs": {
> "main": {
> "snapshot-id": 2031898559602022053,
> "type": "branch"
> }
> },
> "snapshots": [
> {
> "sequence-number": 1,
> "snapshot-id": 2031898559602022053,
> "timestamp-ms": 1761059868331,
> "summary": {
> "operation": "append",
> "spark.app.id": "local-1761059845206",
> "added-data-files": "1",
> "added-records": "1",
> "added-files-size": "657",
> "changed-partition-count": "1",
> "total-records": "1",
> "total-files-size": "657",
> "total-data-files": "1",
> "total-delete-files": "0",
> "total-position-deletes": "0",
> "total-equality-deletes": "0",
> "engine-version": "3.5.7",
> "app-id": "local-1761059845206",
> "engine-name": "spark",
> "iceberg-version": "Apache Iceberg unspecified (commit 7dbafb438ee1e68d0047bebcb587265d7d87d8a1)"
> },
> "manifest-list": "s3://mybucket/fns/t3/metadata/snap-1.avro",
> "schema-id": 0
> }
> ],
> "statistics": [],
> "partition-statistics": [],
> "snapshot-log": [
> {
> "timestamp-ms": 1761059868331,
> "snapshot-id": 2031898559602022053
> }
> ],
> "metadata-log": [
> {
> "timestamp-ms": 1761059692095,
> "metadata-file": "s3://mybucket/fns/t3/metadata/0.metadata.json"
> }
> ]
> }
> }
>
> Thanks
>
> Fabio
>
>
> From: Dmitri Bourlatchkov <[email protected]>
> Date: Thursday, 23 October 2025 at 15:24
> To: [email protected] <[email protected]>
> Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
>
> Hi Fabio,
>
> Thanks for the detailed explanation! It helps a lot.
>
> A couple of things are still not clear to me, though :)
>
> Are table metadata JSON files ever encrypted with KMS?
>
> If so, can Polaris read them without knowing the KMS FileIO properties
> (e.g. if the table came via registerTable with pre-encrypted metadata)?
>
> Sorry, if these are trivial things and I missed some doc somewhere...
> please give a pointer if so.
>
> Thanks,
> Dmitri.
>
> On Thu, Oct 23, 2025 at 5:28 AM Rizzo Cascio, Fabio
> <[email protected]> wrote:
>
> > Hi Dmitri,
> >
> > Let me try to explain what I mean:
> >
> > Let’s say we’re creating a catalog using AWS S3, using only the Iceberg
> > FileIO properties (see:
> > https://iceberg.apache.org/docs/nightly/aws/#s3-server-side-encryption).
> >
> > First, we create the catalog with a request like this:
> >
> > POST http://localhost:8181/api/management/v1/catalogs
> >
> > {
> > "catalog": {
> > "name": "test_catalog",
> > "type": "INTERNAL",
> > "readOnly": true,
> > "properties": {
> > "default-base-location": "s3://mybucket"
> > },
> > "storageConfigInfo": {
> > "storageType": "S3",
> > "roleArn": "arn:aws:iam::*****:role/myrole",
> > "region": "us-east-1"
> > }
> > }
> > }
> >
> >
> > Next, we create a namespace:
> >
> > POST http://localhost:8181/api/catalog/v1/test_catalog/namespaces
> >
> > {
> > "namespace": ["fns"],
> > "properties": {
> > "location": "s3://mybucket/fns/"
> > }
> > }
> >
> >
> > Then, we create a table:
> >
> > POST http://localhost:8181/api/catalog/v1/test_catalog/namespaces/fns/tables
> >
> > {
> > "name": "t3",
> > "schema": {
> > "type": "struct",
> > "fields": [
> > {"id": 1, "name": "field1", "type": "int", "required": false},
> > {"id": 2, "name": "field2", "type": "int", "required": false}
> > ]
> > },
> > "stage-create": false
> > }
> >
> >
> > To insert data into the table using Spark with a KMS key:
> >
> > bin/spark-sql \
> >   --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0 \
> >   --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
> >   --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
> >   --conf spark.sql.catalog.polaris.type=rest \
> >   --conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
> >   --conf spark.sql.catalog.polaris.token-refresh-enabled=false \
> >   --conf spark.sql.catalog.polaris.warehouse=test_catalog \
> >   --conf spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL \
> >   --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials \
> >   --conf spark.sql.catalog.polaris.s3.sse.key=arn:aws:kms:us-east-1:****:key/mykey \
> >   --conf spark.sql.catalog.polaris.s3.sse.type=kms \
> >   --conf spark.sql.catalog.polaris.credential=38a0ebdbcc97c58f:654af690485a32f02ce17dda7cd2c36d
> >
> >
> > Then run:
> >
> > insert into t3 values (7,9);
> >
> >
> > This will create an entry in the bucket with the data encrypted using the
> > specified key.
> >
> > If you remove these two properties from the connection command:
> >
> > * --conf spark.sql.catalog.polaris.s3.sse.key=arn:aws:kms:us-east-1:****:key/mykey
> > * --conf spark.sql.catalog.polaris.s3.sse.type=kms
> >
> > then you can insert data into the table without encryption.
> >
> > In Polaris, at the point of the insert, we don’t know the KMS key, so
> > there’s no way to add that key to the policy. This is where adding
> > arn:aws:kms:%s:%s:key/* helps.
> >
> > Adding a new property in the storageConfigInfo for the KMS key will help
> > with creating a more restrictive IAM policy, but we also need to support
> > the existing properties defined in Iceberg.
> >
> > The reason for my changes (passing the properties down to
> > AwsCredentialsStorageIntegration) is that you can pass extra properties
> > to any Polaris request, and those will feed down to Iceberg. For example:
> >
> > {
> > "name": "t3",
> > "schema": {
> > "type": "struct",
> > "fields": [
> > {"id": 1, "name": "field1", "type": "int", "required": false},
> > {"id": 2, "name": "field2", "type": "int", "required": false}
> > ]
> > },
> > "properties": {
> > "s3.sse.type": "kms",
> > "s3.sse.key": "arn:aws:kms:us-east-1:***:key/mykey"
> > },
> > "stage-create": false
> > }
> >
> >
> > With my changes I was trying to support both ways. I'm not sure whether
> > this is the best approach, or whether the easier way of allowing all the
> > keys under the account is better.
> >
> > Let me know if you need any more details.
> >
> > Thanks
> >
> > Fabio
> >
> >
> >
> >
> > From: Dmitri Bourlatchkov <[email protected]>
> > Date: Wednesday, 22 October 2025 at 18:50
> > To: [email protected] <[email protected]>
> > Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
> >
> > Hi Fabio,
> >
> > I agree that we seem to have some confusion here :) Let's try to clarify.
> >
> > What do you mean exactly by "s3 iceberg properties"? Are they Iceberg table
> > metadata properties? Iceberg (REST) Catalog properties? FileIO properties?
> >
> > Why would Polaris need to support them? What would an end-to-end flow be
> > between Polaris and the client (e.g. Spark)?
> >
> > Thanks,
> > Dmitri.
> >
> > On Wed, Oct 22, 2025 at 12:08 PM Rizzo Cascio, Fabio
> > <[email protected]> wrote:
> >
> > > I feel like there’s something I’m overlooking.
> > > Setting the kms property at the catalog level works, but it doesn’t
> > > support the s3 iceberg properties, right?
> > >
> > > To support those, the best approach is to let clients use the s3 iceberg
> > > properties directly. Then, in AwsCredentialsStorageIntegration, we can
> > > simply add a policy that allows any key within that account. This way, we
> > > don’t need to rely on metadata properties to retrieve the key value.
> > >
> > > Does that make sense?
> > >
> > >
> > > Thanks
> > > Fabio
> > >
> > >
> > > From: Dmitri Bourlatchkov <[email protected]>
> > > Date: Wednesday, 22 October 2025 at 15:42
> > > To: [email protected] <[email protected]>
> > > Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
> > >
> > > Hi Fabio,
> > >
> > > Yes, that would be the preferable solution for KMS from my POV.
> > >
> > > I wonder what other people think about this too :)
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Wed, Oct 22, 2025 at 9:45 AM Rizzo Cascio, Fabio
> > > <[email protected]> wrote:
> > >
> > > > Hi Dmitri,
> > > >
> > > > If I add the KMS props to AccessConfig in AWSCredentialStorage I can see
> > > > them in the LoadTable response coming from Polaris.
> > > >
> > > > Fabio
> > > >
> > > > From: Dmitri Bourlatchkov <[email protected]>
> > > > Date: Tuesday, 21 October 2025 at 15:23
> > > > To: [email protected] <[email protected]>
> > > > Subject: Re: [EXTERNAL]Re: KMS Key addition for s3
> > > >
> > > > Hi Fabio,
> > > >
> > > > Yes, I glimpsed that from your email. Sorry if my post caused confusion. I
> > > > just wanted to reply to the top email as what I'm proposing seems to be a
> > > > key feature for KMS support.
> > > >
> > > > Would you be able to validate whether sending KMS FileIO properties to
> > > > clients from LoadTable responses works in practice (e.g. in Spark)?
> > > >
> > > > I believe this can be done by adding KMS properties as "extra" properties
> > > > to AccessConfig.
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Tue, Oct 21, 2025 at 4:15 AM Rizzo Cascio, Fabio
> > > > <[email protected]> wrote:
> > > >
> > > > > Hi Dmitri,
> > > > >
> > > > > This is what I was saying in my other email. Anyway, I’m going to update
> > > > > my PR with the changes I have made to get it working. The project won’t
> > > > > build because I haven’t updated the tests etc.; I just want to show my
> > > > > changes and see if we can agree on a direction before I make all the
> > > > > changes.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Fabio
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > From: Dmitri Bourlatchkov <[email protected]>
> > > > > Date: Monday, 20 October 2025 at 17:38
> > > > > To: [email protected] <[email protected]>
> > > > > Subject: [EXTERNAL]Re: KMS Key addition for s3
> > > > >
> > > > > Hi Fabio, Ashok and All,
> > > > >
> > > > > Apologies if I'm missing something obvious, but the two WIP KMS PRs [1424]
> > > > > [2802] appear to be dealing only with AWS policies on the vended credential
> > > > > session. They do not appear to deal with client configuration (in LoadTable
> > > > > responses).
> > > > >
> > > > > As far as I understand, Iceberg clients need certain FileIO properties to
> > > > > be set in order to utilize KMS.
> > > > >
> > > > > I'd imagine that Polaris ought to provide these FileIO properties in
> > > > > LoadTable responses in addition to granting privileges for KMS access to
> > > > > the vended (session) credentials.
> > > > >
> > > > > In other words, the decision whether to use KMS rests with Polaris (we can
> > > > > discuss how to configure that). If that is enabled, clients should not need
> > > > > any extra configuration; they should get complete and usable
> > > > > configuration + credentials from Polaris.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > [1424] https://github.com/apache/polaris/pull/1424
> > > > > [2802] https://github.com/apache/polaris/pull/2802
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > >
> > > > > On Mon, Oct 13, 2025 at 3:50 AM Rizzo Cascio, Fabio
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > Hi guys,
> > > > > >
> > > > > > I have created a new PR to be able to use a KMS key for the S3 bucket.
> > > > > > It is mandatory for me in order to use any S3 storage, and hopefully a
> > > > > > good addition for other people who want to use it.
> > > > > >
> > > > > > PR link: https://github.com/apache/polaris/pull/2802
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Fabio
> > > > > >
> > > > > > This message is confidential and subject to terms at:
> > > > > > https://www.jpmorgan.com/emaildisclaimer including on confidential,
> > > > > > privileged or legal entity information, malicious content and monitoring
> > > > > > of electronic messages. If you are not the intended recipient, please
> > > > > > delete this message and notify the sender immediately. Any unauthorized
> > > > > > use is strictly prohibited.
> > > > > >
> > > > >
> > > > > This message is confidential and subject to terms at:
> > > > > https://www.jpmorgan.com/emaildisclaimer including on
> confidential,
> > > > > privileged or legal entity information, malicious content and
> > > monitoring
> > > > of
> > > > > electronic messages. If you are not the intended recipient, please
> > > delete
> > > > > this message and notify the sender immediately. Any unauthorized
> use
> > is
> > > > > strictly prohibited.
> > > > >
> > > >
> > > > This message is confidential and subject to terms at:
> > > > https://www.jpmorgan.com/emaildisclaimer including on confidential,
> > > > privileged or legal entity information, malicious content and
> > monitoring
> > > of
> > > > electronic messages. If you are not the intended recipient, please
> > delete
> > > > this message and notify the sender immediately. Any unauthorized use
> is
> > > > strictly prohibited.
> > > >
> > >
> > > This message is confidential and subject to terms at:
> > > https://www.jpmorgan.com/emaildisclaimer including on confidential,
> > > privileged or legal entity information, malicious content and
> monitoring
> > of
> > > electronic messages. If you are not the intended recipient, please
> delete
> > > this message and notify the sender immediately. Any unauthorized use is
> > > strictly prohibited.
> > >
> >
> > This message is confidential and subject to terms at:
> > https://www.jpmorgan.com/emaildisclaimer including on confidential,
> > privileged or legal entity information, malicious content and monitoring
> of
> > electronic messages. If you are not the intended recipient, please delete
> > this message and notify the sender immediately. Any unauthorized use is
> > strictly prohibited.
> >
>
> This message is confidential and subject to terms at:
> https://www.jpmorgan.com/emaildisclaimer including on confidential,
> privileged or legal entity information, malicious content and monitoring of
> electronic messages. If you are not the intended recipient, please delete
> this message and notify the sender immediately. Any unauthorized use is
> strictly prohibited.
>