Thanks for starting an S3 signing doc, Alex!

Just a small nitpick: it looks like this thread was hijacked for the S3
remote signing discussion :) This is fine from my POV; I just wanted to
clarify that this thread was started to discuss options for Polaris to
send specific credentials to clients (a.k.a. vended credentials) when STS
is not available.

For the sake of clarity, let's re-title the thread for replies related to
the remote signing discussion.

Cheers,
Dmitri.

On Mon, Aug 18, 2025 at 11:54 AM Alexandre Dutra <adu...@apache.org> wrote:

> Hi Yufei,
>
> Yes, sure! There you go:
>
>
> https://docs.google.com/document/d/1ygdia7u4bUHUt6n8XhZo48aKoIyyrCvKqan3XP25iB8/edit?usp=sharing
>
> Thanks,
> Alex
>
>
> On Thu, Aug 14, 2025 at 11:10 PM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > Thanks Robert, Alex for working on this. Thanks Prashant for chiming in.
> > This is a big feature deserving a design doc and community discussion.
> > Can we have a design doc first?
> >
> > Yufei
> >
> >
> > On Thu, Aug 14, 2025 at 8:53 AM Alexandre Dutra <adu...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I've drafted an initial version of remote signing enablement in
> > > Polaris [1]. Your comments are welcome, either here or directly on the
> > > PR, where there's already some valuable discussion.
> > >
> > > This PR aims to be a minimum viable product for remote signing, not a
> > > comprehensive implementation. Notably, it doesn't include Nessie's
> > > cryptographically-signed request parameters.
> > >
> > > One aspect of remote signing not covered by the IRC specification is
> > > RBAC. For this, I've introduced a new table privilege and authorizable
> > > operation in the PR, with access checks based on these table-like
> > > validations. This is admittedly coarse-grained, but can be refined
> > > later.
> > >
> > > A consequence of implementing RBAC for remote signing is that it's
> > > impractical to use the spec's default endpoint – /v1/aws/s3/sign –
> > > because it cannot properly identify the table and catalog.
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]: https://github.com/apache/polaris/pull/2280
> > >
> > > On Thu, Aug 14, 2025 at 5:40 PM Prashant Singh
> > > <prashant.si...@snowflake.com.invalid> wrote:
> > > >
> > > > IMHO, encoding this information in the URL so that we can avoid a
> > > > reverse lookup is the right thing to do!
> > > > Since we are relying on this, signing with a key that the catalog owns
> > > > seems a logical, natural step to avoid tampering.
> > > > In any case, it is standard practice in S3 to carry the signature in
> > > > the pre-signed URL, for example:
> > > > https://amzn-s3-demo-bucket.s3.amazonaws.com/object.txt?AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Signature=vjbyNxybdZaMmLa%2ByT372YEAiv4%3D&Expires=1741978496
> > > >
> > > > Looking forward to the design doc / proposal for Polaris.
> > > >
> > > > Best,
> > > > Prashant Singh
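For illustration, the query-string signature in the example URL above follows the legacy S3 "Signature Version 2" shape, which can be sketched in Python as below. This is only a sketch under stated assumptions: `presign_v2` is a hypothetical helper name, the bucket, key, and credentials are placeholders, and current S3 deployments normally use Signature Version 4, which is more involved. The point is just that the signature is an HMAC over the request details, so the URL cannot be altered without invalidating it.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def presign_v2(bucket: str, key: str, access_key_id: str,
               secret_key: str, expires: int) -> str:
    """Build a legacy (SigV2-style) pre-signed GET URL. Hypothetical helper."""
    encoded_key = quote(key)
    # String to sign for query-string auth: verb, empty Content-MD5,
    # empty Content-Type, expiry (epoch seconds), canonicalized resource.
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{encoded_key}"
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # urlencode percent-encodes '+', '/' and '=' in the base64 signature.
    query = urlencode({
        "AWSAccessKeyId": access_key_id,
        "Signature": signature,
        "Expires": expires,
    })
    return f"https://{bucket}.s3.amazonaws.com/{encoded_key}?{query}"


if __name__ == "__main__":
    # Placeholder credentials, not real ones.
    print(presign_v2("amzn-s3-demo-bucket", "object.txt",
                     "AKIAIOSFODNN7EXAMPLE", "not-a-real-secret", 1741978496))
```

Changing the object key or the expiry changes the string to sign, so a client cannot reuse the signature for a different request.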
> > > >
> > > > On Tue, Aug 5, 2025 at 6:23 AM Robert Stupp <sn...@snazy.de> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I can contribute what we did in Nessie:
> > > > >
> > > > > S3 request signing requires one additional request against the
> > > > > catalog for each request the client sends to S3 (HTTP/REST here).
> > > > > The catalog has to enforce the access rules (allow-listing, allowed
> > > > > read & write locations).
> > > > > Doing the access privilege "dance" for such a huge number of requests
> > > > > is quite expensive, so those S3 signing requests have to be as fast
> > > > > as possible, ideally without any backend access, while still allowing
> > > > > the catalog to make a secure decision whether a particular request is
> > > > > allowed.
> > > > > We have to keep in mind that a single loadTable() can easily lead to
> > > > > thousands of S3 requests, and each requires its own signature.
> > > > >
> > > > > So how can that be done? As the catalog still has to perform checks
> > > > > against the above-mentioned access rules, it has to know them. We can
> > > > > pass the (encoded) access rules and an expiration timestamp in the
> > > > > catalog's request signing URL. We "just" have to ensure that clients
> > > > > cannot tamper with the access rules, which is where cryptographic
> > > > > signing comes into play.
> > > > >
> > > > > When a client performs a "loadTable()" to get the S3 request signing
> > > > > URL, the catalog collects the access rules, encodes them in a
> > > > > serialized structure, and signs it with a secret key that is known
> > > > > only to the catalog.
> > > > >
> > > > >    client: loadTable()
> > > > >     ---> catalog identifies the table
> > > > >     ---> catalog performs authZ checks
> > > > >     ---> catalog collects access rules
> > > > >     ---> catalog serializes access rules
> > > > >     ---> catalog signs serialized object
> > > > >     ---> catalog returns S3 signing endpoint
> > > > >
> > > > > Such an S3 signing endpoint may look like this:
> > > > >
> > > > > https://my-polaris.local/s3-signing/v1/sign/aGVsbG9wb2V3ZmtvcGV3a29wazMybzRpb3VoMjNpdXJoaXVoNGlwdWhqcGl1Z2pyb2lnam9pZWpnb3BpNGppb3B1Z2pocGl1aGdpdXAzNGhnaXVlcmhpdXBnaHJlaXB1Z2h1aXBoaXB1MmhiM3JpdWJuMzJpdXJ0bgo=
> > > > >
> > > > > When the catalog receives a signing request, it verifies the
> > > > > signature [1] and validates [2] the S3 request against those rules.
> > > > > This happens in Nessie without any database access, so each S3
> > > > > signing request executes very quickly.
> > > > >
> > > > > The trick is to manage the secret keys. This is where the
> > > > > signing-keys-service [3] comes into play. This service ensures that
> > > > > all Nessie instances have a secret key for signing purposes and have
> > > > > access to the keys that have been used before, to enable automatic
> > > > > key rotation.
> > > > >
> > > > > There is no knob that a user has to tune or set; it's standard
> > > > > functionality in Nessie. And it works for all Nessie instances (pods)
> > > > > accessing the same backend.
> > > > >
> > > > > We can certainly contribute this functionality, which already works
> > > > > in many production environments, to Polaris.
> > > > >
> > > > > Robert
> > > > >
> > > > >
> > > > > [1] https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/rest/src/main/java/org/projectnessie/catalog/service/rest/IcebergApiV1S3SignResource.java#L104-L106
> > > > > [2] https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/rest/src/main/java/org/projectnessie/catalog/service/rest/IcebergS3SignParams.java#L118
> > > > > [3] https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/impl/src/main/java/org/projectnessie/catalog/service/impl/SignerKeysServiceImpl.java#L46
> > > > >
> > > > > On Tue, Aug 5, 2025 at 6:04 AM Yufei Gu <flyrain...@gmail.com> wrote:
> > > > > >
> > > > > > Hi Pat,
> > > > > >
> > > > > > Remote signing sounds like a good idea! Looking forward to a
> > > > > > proposal/design doc.
> > > > > >
> > > > > > Yufei
> > > > > >
> > > > > >
> > > > > > On Fri, Aug 1, 2025 at 8:44 AM Pat Patterson <p...@backblaze.com.invalid>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I'm Pat Patterson, Chief Technical Evangelist at Backblaze. I've
> > > > > > > been working with Backblaze B2, our S3-compatible cloud object
> > > > > > > store, and Iceberg for a little while now, showing how to use it
> > > > > > > from Snowflake, Trino, DuckDB, etc.
> > > > > > >
> > > > > > > I'm replying here as requested by Dmitri on the "Support for
> > > > > > > non-AWS S3 compatible storage with STS" GitHub issue [1]. I think
> > > > > > > S3 signing would work well with Backblaze B2, since we don't
> > > > > > > currently have an STS. I'm happy to help in any way I can - I just
> > > > > > > left a reply to Alexandre Dutra on the "On-Premise S3 & Remote
> > > > > > > Signing" GitHub issue [2].
> > > > > > >
> > > > > > > [1] https://github.com/apache/polaris/issues/1530#issuecomment-3138005897
> > > > > > > [2] https://github.com/apache/polaris/issues/32#issuecomment-3144991873
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Pat
> > > > > > >
> > > > > > > On 2025/07/31 15:35:55 Robert Stupp wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I'm not sure whether exposing the object storage credentials
> > > > > > > > given to Polaris to all clients isn't going to cause a "false
> > > > > > > > impression of security" (aka: "our credentials are vended by
> > > > > > > > Polaris, so we're safe" - nope...).
> > > > > > > > With my "evil user" hat on, I'd try to figure out the
> > > > > > > > configuration option (is it realm-specific?) to tell Polaris to
> > > > > > > > yield its "master" object storage credentials for a few seconds,
> > > > > > > > just long enough so I can gain access to them and have access to
> > > > > > > > all the data.
> > > > > > > >
> > > > > > > > No doubt, there are S3 implementations (software and appliances)
> > > > > > > > that do not support STS, which is admittedly not great. I can
> > > > > > > > imagine that at least some appliance vendors and software
> > > > > > > > projects/products will get STS.
> > > > > > > >
> > > > > > > > For the non-STS use cases, I think S3 signing is the way to go.
> > > > > > > > Sure, it requires one more request, but we can make those
> > > > > > > > requests fast (aka not require any persistence access) as we did
> > > > > > > > in Nessie. With that we could still ensure that clients don't
> > > > > > > > have access to everything, respecting the object-storage level
> > > > > > > > read/write/list privileges.
> > > > > > > >
> > > > > > > > Another option is still to configure the object storage
> > > > > > > > credentials at the clients. It's not great, but it's still an
> > > > > > > > option. Admins can give each client individual credentials to
> > > > > > > > reduce potential risks, being able to revoke access for
> > > > > > > > individual clients, and/or audit those.
> > > > > > > > On Thu, Jul 31, 2025 at 2:51 AM Yufei Gu <fl...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Thanks for raising this, Dmitri!
> > > > > > > > >
> > > > > > > > > For non-STS use cases, some users may be more comfortable
> > > > > > > > > without credential vending. They could configure the storage
> > > > > > > > > credentials on the engine side. Can we first confirm that
> > > > > > > > > vending raw credentials is really what users are asking for?
> > > > > > > > >
> > > > > > > > > If that's the case, raw credential vending should be at least
> > > > > > > > > optional, which could be guarded by feature flags.
> > > > > > > > >
> > > > > > > > > And I don't see much difference between option 1 and option 2.
> > > > > > > > > Both provide raw credentials and need rotation. Either way is
> > > > > > > > > fine with me.
> > > > > > > > >
> > > > > > > > > Yufei
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Jul 30, 2025 at 3:24 PM Dmitri Bourlatchkov <di...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > Recent conversations [1] [2] about non-AWS S3 storage brought
> > > > > > > > > > up user needs for operating with S3-compatible storage that
> > > > > > > > > > does not have STS.
> > > > > > > > > >
> > > > > > > > > > Remote request signing can be used to support those use
> > > > > > > > > > cases, but it is a considerable development effort to add to
> > > > > > > > > > Polaris, plus it has different performance characteristics
> > > > > > > > > > than vended credentials.
> > > > > > > > > >
> > > > > > > > > > I propose two short-term options to support users of non-STS
> > > > > > > > > > S3 storage.
> > > > > > > > > >
> > > > > > > > > > 1) Add a configuration option to vend the same credentials
> > > > > > > > > > that Polaris has to clients.
> > > > > > > > > >
> > > > > > > > > > While this may (rightly) be considered suboptimal from the
> > > > > > > > > > security perspective, this option does give users a choice to
> > > > > > > > > > operate clients without explicitly configuring storage
> > > > > > > > > > credentials for them. Polaris Servers still control the
> > > > > > > > > > rotation of those credentials.
> > > > > > > > > >
> > > > > > > > > > 2) Add secondary plain credentials for vending to clients.
> > > > > > > > > > Polaris itself will use one key/secret pair. Clients will be
> > > > > > > > > > issued another key/secret pair. Rotation of the client
> > > > > > > > > > credentials should be possible to implement too.
> > > > > > > > > >
> > > > > > > > > > WDYT?
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/apache/polaris/issues/1530#issuecomment-3137374380
> > > > > > > > > > [2] https://github.com/apache/polaris/issues/2207
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Dmitri.
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > >
> > >
>
