IMHO, encoding the rules in the URL so that we can avoid a reverse lookup is
the right thing to do!
Since we are relying on this, signing with a key that the catalog owns seems
a natural next step to prevent tampering.
Incidentally, it's also standard practice with S3 itself, which gives you the
signature directly in the pre-signed URL:
https://amzn-s3-demo-bucket.s3.amazonaws.com/object.txt?AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Signature=vjbyNxybdZaMmLa%2ByT372YEAiv4%3D&Expires=1741978496

Looking forward to the design doc / proposal for Polaris.

Best,
Prashant Singh
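For context, the signature in a pre-signed URL like the one above is just an HMAC over a canonical string. A minimal sketch of the legacy SigV2 query-string scheme with made-up credentials (illustrative only, not how Polaris would necessarily do it):

```python
import base64
import hashlib
import hmac
import urllib.parse

def presign_v2(access_key: str, secret_key: str, bucket: str, key: str,
               expires: int) -> str:
    # Legacy SigV2 string-to-sign: verb, content-md5, content-type,
    # expiry timestamp, canonicalized resource.
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{key}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    # The base64 signature is URL-encoded into the query string.
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return (f"https://{bucket}.s3.amazonaws.com/{key}"
            f"?AWSAccessKeyId={access_key}&Signature={signature}"
            f"&Expires={expires}")

# made-up credentials, matching the shape of the example URL
url = presign_v2("AKIAIOSFODNN7EXAMPLE", "not-a-real-secret",
                 "amzn-s3-demo-bucket", "object.txt", 1741978496)
```

Anyone holding the URL can make exactly that one request until the expiry, without ever seeing the secret key.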

On Tue, Aug 5, 2025 at 6:23 AM Robert Stupp <sn...@snazy.de> wrote:

> Hi,
>
> I can contribute what we did in Nessie:
>
> S3 request signing requires one additional request to the catalog for
> each request performed against S3 (HTTP/REST here). The catalog has to
> enforce the access rules (allow-listing, allowed read & write
> locations).
> Doing the access-privilege "dance" for such a huge volume of requests
> is quite expensive, so those S3 signing requests have to be as fast as
> possible, ideally without any backend access, while still allowing the
> catalog to make a secure decision about whether a particular request
> is allowed.
> We have to keep in mind that a single loadTable() can easily lead to
> thousands of S3 requests, and each requires its individual signature.
>
> So how can that be done? Since the catalog still has to perform checks
> against the above-mentioned access rules, it has to know them. We can
> pass the (encoded) access rules and an expiration timestamp in the
> catalog's request signing URL. We "just" have to ensure that clients
> cannot tamper with the access rules, which is where cryptographic
> signing comes into play.
>
> When a client performs a loadTable() to get the S3 request signing
> URL, the catalog collects the access rules, serializes them, and signs
> the result with a secret key that is known only to the catalog.
>
>    client: loadTable()
>     ---> catalog identifies the table
>     ---> catalog performs authZ checks
>     ---> catalog collects access rules
>     ---> catalog serializes access rules
>     ---> catalog signs serialized object
>     ---> catalog returns S3 signing endpoint
> Such an S3 signing endpoint may look like this:
>     --->
> https://my-polaris.local/s3-signing/v1/sign/aGVsbG9wb2V3ZmtvcGV3a29wazMybzRpb3VoMjNpdXJoaXVoNGlwdWhqcGl1Z2pyb2lnam9pZWpnb3BpNGppb3B1Z2pocGl1aGdpdXAzNGhnaXVlcmhpdXBnaHJlaXB1Z2h1aXBoaXB1MmhiM3JpdWJuMzJpdXJ0bgo=
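To make the flow above concrete, here is a rough sketch of how such a tamper-proof signing URL could be built. All names, the JSON payload layout, and the HMAC-SHA256 choice are my assumptions for illustration, not the actual Nessie code:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"catalog-only-secret"  # hypothetical; known only to the catalog

def make_signing_path(read_prefixes, write_prefixes, ttl_seconds=3600):
    # Serialize the access rules plus an expiration timestamp.
    payload = json.dumps({
        "read": read_prefixes,
        "write": write_prefixes,
        "exp": int(time.time()) + ttl_seconds,
    }).encode()
    # Sign the serialized rules so clients cannot tamper with them.
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # Pack rules and signature into one opaque URL path segment.
    token = (base64.urlsafe_b64encode(payload).decode() + "."
             + base64.urlsafe_b64encode(sig).decode())
    return f"/s3-signing/v1/sign/{token}"

path = make_signing_path(["s3://bucket/warehouse/tbl/"], [])
```

The catalog never needs to remember the token it handed out; everything required for the later decision travels inside the URL itself.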
>
> When the catalog receives a signing request, it verifies the signature
> [1] and validates [2] the S3 request against those rules. This happens
> in Nessie without any database access, so each S3 signing request
> executes very quickly.
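The verification side can be sketched the same way (again illustrative; the token layout, field names, and HMAC-SHA256 are assumptions, not the actual Nessie code): the endpoint recomputes the HMAC, checks the expiry, and matches the requested object path against the embedded rules, with no database access.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"catalog-only-secret"  # hypothetical catalog-side secret

def verify_and_check(token: str, method: str, request_path: str) -> bool:
    payload_b64, sig_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # Constant-time comparison: reject any tampered rules.
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        return False
    rules = json.loads(payload)
    if time.time() > rules["exp"]:
        return False  # signing URL expired
    writes = ("PUT", "POST", "DELETE")
    allowed = rules["write"] if method in writes else rules["read"]
    return any(request_path.startswith(p) for p in allowed)

# demo: build a token the way the catalog would
payload = json.dumps({"read": ["bucket/warehouse/tbl/"], "write": [],
                      "exp": int(time.time()) + 60}).encode()
sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
token = (base64.urlsafe_b64encode(payload).decode() + "."
         + base64.urlsafe_b64encode(sig).decode())
ok = verify_and_check(token, "GET", "bucket/warehouse/tbl/data/f.parquet")
denied = verify_and_check(token, "PUT", "bucket/warehouse/tbl/data/f.parquet")
```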
>
> The trick is to manage the secret keys. This is where the
> signing-keys-service [3] comes into play. This service ensures that
> all Nessie instances have a secret key for signing purposes and have
> access to the keys that have been used before, to enable automatic key
> rotation.
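A toy in-memory version of that idea (the real service [3] coordinates keys through the shared backend; the class and parameter names here are invented):

```python
import secrets
import time

class SignerKeys:
    """Sketch of a signing-keys service: one current key for signing,
    older keys kept around so in-flight URLs still verify."""

    def __init__(self, rotation_period=86400, grace=3600):
        self.rotation_period = rotation_period
        self.grace = grace
        self.keys = {}          # key_id -> (secret, created_at)
        self.current_id = None
        self.rotate()

    def rotate(self):
        key_id = secrets.token_hex(4)
        self.keys[key_id] = (secrets.token_bytes(32), time.time())
        self.current_id = key_id
        # Drop keys too old to verify anything still in flight.
        cutoff = time.time() - (self.rotation_period + self.grace)
        self.keys = {k: v for k, v in self.keys.items() if v[1] >= cutoff}

    def signing_key(self):
        return self.current_id, self.keys[self.current_id][0]

    def verification_key(self, key_id):
        entry = self.keys.get(key_id)
        return entry[0] if entry else None

keys = SignerKeys()
key_id, secret = keys.signing_key()
```

Embedding the key id in each signed token would let any instance pick the right verification key after a rotation.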
>
> There is no knob that a user has to tune or set; it's standard
> functionality in Nessie. And it works for all Nessie instances (pods)
> accessing the same backend.
>
> We can certainly contribute this functionality, which already works in
> many production environments, to Polaris.
>
> Robert
>
>
> [1]
> https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/rest/src/main/java/org/projectnessie/catalog/service/rest/IcebergApiV1S3SignResource.java#L104-L106
> [2]
> https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/rest/src/main/java/org/projectnessie/catalog/service/rest/IcebergS3SignParams.java#L118
> [3]
> https://github.com/projectnessie/nessie/blob/17ab7e5f58bf8e8e62d3bafe8c7f97378f28fe12/catalog/service/impl/src/main/java/org/projectnessie/catalog/service/impl/SignerKeysServiceImpl.java#L46
>
> On Tue, Aug 5, 2025 at 6:04 AM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > Hi Pat,
> >
> > Remote signing sounds like a good idea! Looking forward to a
> > proposal/design doc.
> >
> > Yufei
> >
> >
> > On Fri, Aug 1, 2025 at 8:44 AM Pat Patterson <p...@backblaze.com.invalid>
> > wrote:
> >
> > > Hi,
> > >
> > > I'm Pat Patterson, Chief Technical Evangelist at Backblaze. I've
> > > been working with Backblaze B2, our S3-compatible cloud object
> > > store, and Iceberg for a little while now, showing how to use it
> > > from Snowflake, Trino, DuckDB, etc.
> > >
> > > I'm replying here as requested by Dmitri on the "Support for non-AWS S3
> > > compatible storage with STS" GitHub issue [1]. I think S3 signing would
> > > work well with Backblaze B2, since we don't currently have an STS. I'm
> > > happy to help in any way I can - I just left a reply to Alexandre
> > > Dutra on the "On-Premise S3 & Remote Signing" GitHub issue [2].
> > >
> > > [1] https://github.com/apache/polaris/issues/1530#issuecomment-3138005897
> > > [2] https://github.com/apache/polaris/issues/32#issuecomment-3144991873
> > >
> > > Cheers,
> > >
> > > Pat
> > >
> > > On 2025/07/31 15:35:55 Robert Stupp wrote:
> > > > Hi,
> > > >
> > > > I'm not sure that exposing the object storage credentials given to
> > > > Polaris to all clients won't create a "false impression of
> > > > security" (aka: "our credentials are vended by Polaris, so we're
> > > > safe" - nope...).
> > > > With my "evil user" hat on, I'd try to figure out the configuration
> > > > option (is it realm-specific?) to tell Polaris to yield its "master"
> > > > object storage credentials for a few seconds, just long enough to
> > > > grab them and gain access to all the data.
> > > >
> > > > No doubt, there are S3 implementations (software and appliances) that
> > > > do not support STS, which is admittedly not great. I can imagine that
> > > > at least some appliance vendors and software projects/products will
> > > > get STS.
> > > >
> > > > For the non-STS use cases, I think S3 signing is the way to go.
> > > > Sure, it requires one more request, but we can make those requests
> > > > fast (i.e., without any persistence access), as we did in Nessie.
> > > > With that we could still ensure that clients don't have access to
> > > > everything, respecting the object-storage-level read/write/list
> > > > privileges.
> > > >
> > > > Another option is still to configure the object storage credentials
> > > > at the clients. It's not great, but it's still an option. Admins
> > > > can give each client individual credentials to reduce potential
> > > > risk, revoke access for individual clients, and/or audit them.
> > > >
> > > > On Thu, Jul 31, 2025 at 2:51 AM Yufei Gu <fl...@gmail.com> wrote:
> > > > >
> > > > > Thanks for raising this, Dmitri!
> > > > >
> > > > > For non-STS use cases, some users may be more comfortable without
> > > > > credential vending. They could configure the storage credentials
> > > > > at the engine side. Can we first confirm that vending raw
> > > > > credentials is really what users are asking for?
> > > > >
> > > > > If that's the case, raw credential vending should be at least
> > > > > optional, which could be guarded by feature flags.
> > > > >
> > > > > And I didn't see much difference between option 1 and option 2.
> > > > > Both provide raw credentials and need rotation. Either way is
> > > > > fine with me.
> > > > >
> > > > > Yufei
> > > > >
> > > > >
> > > > > On Wed, Jul 30, 2025 at 3:24 PM Dmitri Bourlatchkov <di...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Recent conversations [1] [2] about non-AWS S3 storage brought
> > > > > > up user needs for operating with S3-compatible storage that
> > > > > > does not have STS.
> > > > > >
> > > > > > Remote request signing can be used to support those use cases,
> > > > > > but it is a considerable development effort to add to Polaris,
> > > > > > plus it has different performance characteristics than vended
> > > > > > credentials.
> > > > > >
> > > > > > I propose two short-term options to support users of non-STS
> > > > > > S3 storage.
> > > > > >
> > > > > > 1) Add a configuration option to vend the same credentials
> > > > > > that Polaris has to clients.
> > > > > >
> > > > > > While this may (rightly) be considered suboptimal from a
> > > > > > security perspective, this option does give users a choice to
> > > > > > operate clients without explicitly configuring storage
> > > > > > credentials for them. Polaris servers still control the
> > > > > > rotation of those credentials.
> > > > > >
> > > > > > 2) Add secondary plain credentials for vending to clients.
> > > > > > Polaris itself will use one key/secret pair. Clients will be
> > > > > > issued another key/secret pair. Rotation of the client
> > > > > > credentials should be possible to implement too.
> > > > > >
> > > > > > WDYT?
> > > > > >
> > > > > > [1] https://github.com/apache/polaris/issues/1530#issuecomment-3137374380
> > > > > > [2] https://github.com/apache/polaris/issues/2207
> > > > > >
> > > > > > Thanks,
> > > > > > Dmitri.
> > > > > >
> > > >
> > >
>
