> if it is practically possible right now to produce an Iceberg table with
encrypted data files so that Polaris could be tested in a realistic setting?

Yes, with the caveat that certain operations are not possible, as we
discussed, such as drop-by-purge and future scan planning.

Yufei


On Tue, Jun 9, 2026 at 8:53 AM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi Adam,
>
> Working incrementally on this makes sense. I agree that handling internal
> Polaris workflows that deal with encrypted files sounds like a good
> starting point.
>
> I wonder, though, if it is practically possible right now to produce an
> Iceberg table with encrypted data files so that Polaris could be tested in
> a realistic setting? Do you mean something like storing encrypted files
> directly from a client and later registering the table with Polaris? This
> is not a blocker for starting KMS work of course. I'm just trying to
> understand how much of that feature can be practically usable ATM.
>
> Cheers,
> Dmitri.
>
> On Tue, Jun 2, 2026 at 10:55 AM Adam Szita <[email protected]> wrote:
>
> > Thanks everyone, this helps clarify the discussion.
> >
> > I think we should separate two related but different topics:
> >
> >    1. KMS/Vault credential vending to clients via Iceberg REST.
> >    2. KMS configuration used by Polaris itself for server-side
> >       operations.
> >
> > I agree that #1 should be discussed on the Iceberg side and should not be
> > invented as Polaris-specific behavior. I’m also happy to participate in
> > it as I already have a working dev setup with REST client-side encryption
> > enabled, plus a POC for catalog-level KMS configuration. I can help
> > brainstorm/test concrete options but I do see this as a parallel
> > workstream.
> >
> > For Polaris, though, I think #2 will be needed regardless of the final
> > REST credential-vending implementation.
> > Iceberg table encryption is coming, and Polaris server-side operations
> > that read encrypted Iceberg artifacts will need KMS support. The immediate
> > example is drop table with purge / table cleanup, where Polaris reads
> > manifest lists and manifests to enumerate files for deletion. Those paths
> > will need an EncryptingFileIO initialized with catalog-level KMS
> > configuration.
> >
> > I also agree with the RFC that metadata integrity protection should be
> > part of the first Polaris effort, since metadata.json is not encrypted and
> > Polaris should detect out-of-band modification before trusting it for
> > encrypted tables.
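
To make the integrity point concrete, here is a minimal Java sketch of one
possible check. It assumes the catalog records a SHA-256 digest of
metadata.json at commit time in its own store and compares it on read. That
design is an assumption for illustration, not what the RFC specifies, and
MetadataIntegritySketch and its methods are hypothetical names that do not
exist in Polaris or Iceberg.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not a Polaris or Iceberg API.
final class MetadataIntegritySketch {

    // SHA-256 digest of the raw metadata.json bytes.
    static byte[] digest(byte[] metadataJson) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(metadataJson);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True only if the bytes read back match the digest recorded at commit
    // time. MessageDigest.isEqual performs a constant-time comparison.
    static boolean isUnmodified(byte[] metadataJson, byte[] recordedDigest) {
        return MessageDigest.isEqual(digest(metadataJson), recordedDigest);
    }
}
```

A plain digest only helps if the recorded value lives somewhere the storage
writer cannot reach (for example the catalog's own persistence). A keyed MAC
would be the alternative if the digest had to sit next to the file.
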
> >
> > So my suggested first phase would be limited to:
> >
> >    - catalog-level KMS configuration (separate from storage
> >      configuration)
> >    - AWS KMS wiring for Polaris server-side operations
> >    - metadata integrity checks for encrypted tables
> >
> > The current RFC seems structured around a broader end-to-end
> > table-encryption story (including client credential vending, key
> > rotation, governance/lifecycle topics, and general Iceberg encryption
> > background). Those are important, but I think it would be easier to make
> > progress if we first split out and design the narrower Polaris server-side
> > building block above, and discuss the broader pieces separately.
> >
> > Does that separation sound reasonable?
> >
> > Cheers,
> > Adam
> >
> > On Thu, 28 May 2026 at 03:26, Yufei Gu <[email protected]> wrote:
> >
> > > Thanks Adam for raising this. I think it's a great feature to have.
> > >
> > > Agreed on what Prashant said. We need some work on the IRC side to
> > > avoid any premature implementation in Polaris.
> > >
> > > Yufei
> > >
> > >
> > > On Wed, May 27, 2026 at 9:14 AM Prashant Singh via dev <
> > > [email protected]> wrote:
> > >
> > > > Hey Adam,
> > > >
> > > > Thanks for starting a thread on this in the Polaris community.
> > > > > I believe we need a dedicated field in the loadTable response in IRC
> > > > > to vend KMS credentials. Currently, KMS credentials are mixed with
> > > > > storage credentials to achieve SSE, but there is no consistent way to
> > > > > enforce this because the spec is silent about it.
> > > > > With CSE (Iceberg v3 encryption), things get more involved because
> > > > > one can use Vault with S3 as the combination of their KMS and
> > > > > ObjectStore.
> > > > > Consequently, a catalog cannot provide access to both as part of the
> > > > > loadTable response. My take is that if the catalog grants access to a
> > > > > caller because it has the SELECT privilege, it should provide access
> > > > > to both KMS and storage.
> > > >
> > > > > I have an open thread in the *Iceberg community* [1]. Let's conclude
> > > > > there what the IRC response should look like after consulting with
> > > > > the broader Iceberg catalog community (I added REST catalog
> > > > > encryption support to the last catalog community sync agenda, but we
> > > > > ran out of time [2]), and then we can circle back in the Polaris
> > > > > community to see what it would look like to support this here.
> > > >
> > > > Best,
> > > > Prashant
> > > >
> > > > [1] https://lists.apache.org/thread/z48t5wgx778j17pzto9kqxwysw4ysxxo
> > > > [2]
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1iPGVCIcr-M0XtAiudOguWAvmqIdVgpYN5vz5ohO8PKw/edit?tab=t.0#heading=h.cr6o1g2rn5hc
> > > >
> > > > On Wed, May 27, 2026 at 8:38 AM Alexandre Dutra <[email protected]>
> > > wrote:
> > > >
> > > > > Hi Adam, hi all,
> > > > >
> > > > > I did some archaeology on this topic and (unless I'm reading this
> > > > > wrong) it seems there is some previous work on this topic by Anand
> > > > > Sankaran. He sent his proposal to the Polaris dev mailing list in
> > > > > > February [1] and wrote a design doc: [2]. Yufei also opened an
> > > > > > issue a while ago: [3].
> > > > >
> > > > > > I think that the best next step would be to revive Anand's design
> > > > > > doc and see if it aligns with what you have in mind.
> > > > >
> > > > > > I agree that this feature should be prioritized as it is extremely
> > > > > > useful for users running on untrusted storage providers. However,
> > > > > > if I understand the situation correctly, it seems that on the
> > > > > > Iceberg side the feature is already in the REST spec, but
> > > > > > client-side support is still pending [4] – it's been under review
> > > > > > for a year. Is that assessment correct? (If so, this would be a
> > > > > > good candidate for a feature branch on our side, while we wait for
> > > > > > the 1.12 release to land.)
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > > > [1]:
> > https://lists.apache.org/thread/mpg46o0w2bzy75hyhx2j74dgwzjh2ob7
> > > > > [2]:
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1f4Mgg5W1t4NT6R7KLq5K3S4pHlAwYwXTFwUR9uNNpSU/edit?tab=t.0#heading=h.7ucqpo88io4u
> > > > > [3]: https://github.com/apache/polaris/issues/2829
> > > > > [4]: https://github.com/apache/iceberg/pull/13225
> > > > >
> > > > > On Wed, May 27, 2026 at 10:55 AM Adam Szita <[email protected]>
> > wrote:
> > > > > >
> > > > > > Thanks for your replies Dmitri and JB,
> > > > > >
> > > > > > > IIUC, the KMS integration you’re referring to is closely tied to
> > > > > > > AWS S3 storage. It is storage-layer encryption at rest: Polaris
> > > > > > > can record AWS KMS key ARNs in the S3 storage configuration, and
> > > > > > > during storage credential vending it grants the vended AWS
> > > > > > > credentials the required KMS permissions such as
> > > > > > > decrypt/encrypt/data-key operations. That lets clients read/write
> > > > > > > SSE-KMS encrypted S3 objects, but it is still a low-level storage
> > > > > > > concern and does not know whether the object is an Iceberg data
> > > > > > > file, manifest, or anything else.
> > > > > >
> > > > > > > Iceberg table encryption is different. It is one abstraction
> > > > > > > level higher and is table-format aware:
> > > > > >
> > > > > > >    - under the hood an EncryptingFileIO is used to access
> > > > > > >    encrypted artifacts
> > > > > > >    - it uses envelope encryption to encrypt data files, manifest
> > > > > > >    files and snapshot files, defining a master table key to be
> > > > > > >    managed in a KMS (for some more context:
> > > > > > >    https://www.youtube.com/watch?v=G7Y2eNS_d-s)
> > > > > > >    - table metadata carries encryption metadata and key
> > > > > > >    references; a KMS-backed `KeyManagementClient` wraps/unwraps
> > > > > > >    the keys.
> > > > > > >    - it provides better portability of encrypted tables, it's
> > > > > > >    vendor independent - in theory you could have a combination of
> > > > > > >    S3 storage with GCP KMS, or even a custom KMS client
> > > > > > >    implementation should enterprise users favor that
> > > > > > >    - supporting catalogs would have to bear additional
> > > > > > >    responsibilities such as protecting metadata integrity and
> > > > > > >    preventing master encryption key changes (which is an Iceberg
> > > > > > >    table property)
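
For readers less familiar with the envelope-encryption bullet above, here is
a minimal Java sketch using only the JDK. A random data key encrypts the file
bytes with AES-GCM, and a master key wraps that data key with AES key wrap.
It is an illustration under stated assumptions, not Iceberg's implementation:
the master key is a local AES key standing in for a key held in a KMS, and
EnvelopeEncryptionSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch; a local master key stands in for a KMS-held key.
final class EnvelopeEncryptionSketch {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // "KMS side": wrap the data key under the master key (AES key wrap).
    static byte[] wrapKey(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.WRAP_MODE, master);
        return c.wrap(dataKey);
    }

    // "KMS side": recover the data key from its wrapped form.
    static SecretKey unwrapKey(SecretKey master, byte[] wrapped) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, master);
        return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Encrypt file bytes with the data key; output is IV followed by ciphertext.
    static byte[] encrypt(SecretKey dataKey, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = c.doFinal(plaintext);
        byte[] out = Arrays.copyOf(iv, IV_BYTES + ct.length);
        System.arraycopy(ct, 0, out, IV_BYTES, ct.length);
        return out;
    }

    static byte[] decrypt(SecretKey dataKey, byte[] blob) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
        return c.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
    }

    // Full round trip: only the wrapped key and the ciphertext are "stored".
    static String roundTrip(String text) {
        try {
            SecretKey master = newAesKey();
            byte[] wrapped = wrapKey(master, newAesKey());
            byte[] stored = encrypt(unwrapKey(master, wrapped), text.getBytes(StandardCharsets.UTF_8));
            return new String(decrypt(unwrapKey(master, wrapped), stored), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point for the catalog is the last method: a reader that holds only the
stored bytes and the wrapped key can do nothing until something with access
to the master key performs the unwrap, which is the role a KMS-backed client
plays.
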
> > > > > >
> > > > > > > The catalog-level KMS config I’m proposing is for Iceberg table
> > > > > > > encryption, not for S3 SSE-KMS. It also shouldn't be modeled as
> > > > > > > storage configuration because the storage backend and
> > > > > > > table-encryption KMS provider do not have to match. Perhaps we
> > > > > > > could use a more concrete name, such as
> > > > > > > icebergTableEncryptionKmsConfigInfo, to avoid confusion.
> > > > > > > In any case I'm happy to draft a design doc and share it here.
> > > > > >
> > > > > > Cheers,
> > > > > > Adam
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, 27 May 2026 at 08:07, Jean-Baptiste Onofré <
> > [email protected]>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Adam,
> > > > > > >
> > > > > > > Thanks for the proposal.
> > > > > > >
> > > > > > > > I share Dmitri's question; my understanding is that this
> > > > > > > > pertains to client-side encryption. I can confirm that KMS
> > > > > > > > should work, as I recall an issue regarding this being fixed in
> > > > > > > > the past.
> > > > > > >
> > > > > > > Adam, could you please clarify the scope of this work?
> > > > > > >
> > > > > > > Regards,
> > > > > > > JB
> > > > > > >
> > > > > > >
> > > > > > > On Tue, May 26, 2026 at 8:01 PM Dmitri Bourlatchkov <
> > > > [email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Adam,
> > > > > > > >
> > > > > > > > Thanks for this proposal!
> > > > > > > >
> > > > > > > > > Polaris should already support storage-side KMS in AWS (and
> > > > > > > > > compatible systems) via [2802] (cf. [1]).
> > > > > > > >
> > > > > > > > > I guess the new features you mention relate to client-side
> > > > > > > > > encryption, right?
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
> https://polaris.apache.org/blog/2025/12/24/securing-s3-data-with-aws-kms/
> > > > > > > >
> > > > > > > > [2802] https://github.com/apache/polaris/pull/2802
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Tue, May 26, 2026 at 11:06 AM Adam Szita <
> [email protected]>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > > Iceberg 1.11 shipped the base implementation for table
> > > > > > > > > > encryption, including KMS-based key wrapping/unwrapping and
> > > > > > > > > > encrypted data/delete, manifest, and manifest-list files.
> > > > > > > > > > REST catalog support is also being worked on in Iceberg
> > > > > > > > > > (see https://github.com/apache/iceberg/pull/13225).
> > > > > > > > >
> > > > > > > > > > I have been testing Polaris with Iceberg REST client-side
> > > > > > > > > > encryption enabled. Basic catalog operations such as
> > > > > > > > > > loadTable, commit/drop without purge, list, etc. work
> > > > > > > > > > without Polaris changes because Polaris only needs the
> > > > > > > > > > table metadata JSON for those paths, and metadata.json is
> > > > > > > > > > not encrypted.
> > > > > > > > >
> > > > > > > > > > The places where Polaris does need encryption awareness are
> > > > > > > > > > the server-side paths that read encrypted Iceberg
> > > > > > > > > > artifacts. The first concrete example is drop table with
> > > > > > > > > > purge: TableCleanupTask reads snapshot manifest lists and
> > > > > > > > > > manifests to enumerate files for deletion, so it needs to
> > > > > > > > > > use an EncryptingFileIO. The same would apply to any
> > > > > > > > > > Polaris-side table maintenance/optimization,
> > > > > > > > > > orphan/snapshot cleanup logic, or any future remote
> > > > > > > > > > scan/planning capability that reads manifests or
> > > > > > > > > > data/delete files.
> > > > > > > > >
> > > > > > > > > > There is also a related but separate topic around vending
> > > > > > > > > > KMS credentials to clients. That likely needs Iceberg REST
> > > > > > > > > > spec work first, similar in spirit to current storage
> > > > > > > > > > credential vending, so I think it should be designed for
> > > > > > > > > > but not required as the first Polaris step.
> > > > > > > > >
> > > > > > > > > > The first Polaris-side building block I would propose is to
> > > > > > > > > > allow Iceberg catalogs to carry KMS configuration,
> > > > > > > > > > similarly to how catalogs currently carry
> > > > > > > > > > StorageConfigurationInfo. This should be separate from
> > > > > > > > > > storage configuration because the storage backend and KMS
> > > > > > > > > > provider may differ, for example GCS storage with AWS KMS.
> > > > > > > > > > AWS KMS would be a reasonable first implementation target,
> > > > > > > > > > using Iceberg’s existing KeyManagementClient/AWS KMS
> > > > > > > > > > support, while leaving the model extensible for Azure and
> > > > > > > > > > GCP.
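
As a rough picture of the separation described above, here is a small Java
sketch of a catalog that carries storage configuration and KMS configuration
as independent fields, so GCS storage can be paired with AWS KMS. Every type
and field here (CatalogKmsConfigSketch, KmsConfig, region, and so on) is
hypothetical and chosen for illustration. None of it is the Polaris model or
API, and the real shape would come out of the design discussion.

```java
import java.util.Optional;

// Hypothetical model for illustration only; not the Polaris API.
final class CatalogKmsConfigSketch {
    enum StorageType { S3, GCS, AZURE }
    enum KmsType { AWS_KMS, GCP_KMS, AZURE_KEY_VAULT }

    // Where table files live.
    record StorageConfig(StorageType type, String baseLocation) {}

    // Which KMS provider the server would use to unwrap table keys.
    // "region" is an assumed example of provider-specific settings.
    record KmsConfig(KmsType type, String region) {}

    // The two configurations are independent; KMS config is optional so
    // catalogs without encrypted tables are unaffected.
    record Catalog(String name, StorageConfig storage, Optional<KmsConfig> kms) {
        // Server-side paths that read manifests (e.g. purge) would need this.
        boolean canReadEncryptedArtifacts() {
            return kms.isPresent();
        }
    }

    // The mixed-provider case from the text: GCS storage with AWS KMS.
    static Catalog example() {
        return new Catalog(
            "analytics",
            new StorageConfig(StorageType.GCS, "gs://example-bucket/warehouse"),
            Optional.of(new KmsConfig(KmsType.AWS_KMS, "us-east-1")));
    }
}
```

Keeping the KMS part optional and outside the storage block is the design
choice being illustrated: nothing about the storage type constrains the KMS
type.
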
> > > > > > > > >
> > > > > > > > > > I have already been experimenting with this locally and
> > > > > > > > > > would be happy to work on the Polaris changes. A possible
> > > > > > > > > > first PR could be limited to:
> > > > > > > > >
> > > > > > > > > 1. Add catalog-level KMS configuration model/API support.
> > > > > > > > > 2. Add AWS KMS server-side configuration wiring.
> > > > > > > > >
> > > > > > > > > Any feedback is welcome.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Adam
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>
