Re: [DISCUSS] Simplifying Authorization API in Polaris

Yufei Gu Fri, 08 May 2026 09:23:42 -0700

Hi Dmitri,

+1, folding this into the broader Authorizer SPI discussion makes sense to
me.

The core question here is the same one the general SPI discussion needs to
settle: what semantics should the API expose to authorizer implementations,
and how explicit should those semantics be in the request model. The point
I would like to carry forward is that, whichever shape we choose, the
evaluation semantics should be explicit, so that different authorizer
implementations do not interpret the same request differently. That matters
more to me than the specific class shape.

Yufei

On Thu, 7 May 2026 16:58:56 -0400, Dmitri Bourlatchkov
[email protected] wrote:

Hi All,

I propose folding this thread into the more general Authorizer SPI
discussion.

Minutes doc:
https://docs.google.com/document/d/1C_SSaZH1i83UUGXrnVBur1fR_FHKYWZ75ISFfcb3kns/edit?tab=t.0#heading=h.wodibvbtg7qj

Last meeting:
https://lists.apache.org/thread/wjjj9dxg9zqkc7ys3kyylkvdwpp9omxv

General sync thread:
https://lists.apache.org/thread/ndl802wrkz9993kwpwpp9vx4zvcpkt88

Cheers,
Dmitri.

On Fri, Apr 24, 2026 at 7:48 PM Yufei Gu [email protected] wrote:

Thanks Dmitri, this makes sense, and I agree the current SPI can support
the view use case as you described.

My concern is less about whether the SPI can support it, and more about
whether the current modeling (binding vs two lists) makes the intended
semantics clear. In particular, with two lists (targets and secondaries),
it’s not obvious whether we are evaluating pairs, independent sets, or
something else. That ambiguity could lead to different interpretations
across authorizer implementations.

My motivation for the simpler model is to avoid over-constraining the API
to 1:1 bindings while keeping the semantics flexible and future-proof. Even
if today’s use cases look mostly 1:1, I’m hesitant to encode that
assumption into the model. Maybe the right next step is to make the
evaluation semantics explicit (independent vs pairwise), rather than
relying on the shape of the API to imply it. Curious what you think.
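
One way to picture "explicit evaluation semantics" is a mode carried on the request itself. The following is only a minimal sketch under stated assumptions: EvaluationMode, SketchRequest and evaluationUnits are hypothetical names that do not exist in Polaris, and securables are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in Polaris.
enum EvaluationMode {
  INDEPENDENT, // every target and every secondary is checked on its own
  PAIRWISE     // targets.get(i) is checked together with secondaries.get(i)
}

record SketchRequest(List<String> targets, List<String> secondaries, EvaluationMode mode) {

  // Expands the request into the units an authorizer would evaluate.
  List<List<String>> evaluationUnits() {
    List<List<String>> units = new ArrayList<>();
    if (mode == EvaluationMode.INDEPENDENT) {
      for (String t : targets) units.add(List.of(t));
      for (String s : secondaries) units.add(List.of(s));
      return units;
    }
    // PAIRWISE only makes sense when both lists line up.
    if (targets.size() != secondaries.size()) {
      throw new IllegalArgumentException("PAIRWISE requires lists of equal size");
    }
    for (int i = 0; i < targets.size(); i++) {
      units.add(List.of(targets.get(i), secondaries.get(i)));
    }
    return units;
  }
}
```

With the mode stated in the request, two authorizer implementations given the same two lists would expand them into the same units instead of inferring the intent from the shape of the API.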

Yufei

On Mon, Apr 20, 2026 at 8:38 PM Dmitri Bourlatchkov [email protected]
wrote:

Thanks for your example, Madhan!

I believe your use case (while still hypothetical in Polaris) fits well
within the current AuthZ SPI, which takes a
List per authorization call [1][2]. Each of
those entries would represent one reference from the view to a table.
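
To illustrate the point about one entry per view-to-table reference, here is a minimal sketch. Binding and ViewBindings are hypothetical stand-ins, not the actual Polaris classes, and securables are reduced to plain strings.

```java
import java.util.List;

// Hypothetical stand-in for one target/secondary entry; not the Polaris class.
record Binding(String target, String secondary) {}

final class ViewBindings {
  // One entry per table the view references, all sent in a single call.
  static List<Binding> forView(String view, List<String> tables) {
    return tables.stream().map(table -> new Binding(view, table)).toList();
  }
}
```

A view built from two tables would then yield two entries that share the same target, and the authorizer would receive both in the same call.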

I do not see an immediate need for additional SPI changes to support this
use case.

[1] https://github.com/apache/polaris/blob/0272f42fb3a5f6a542d9317c6c06c3b5e0dc8195/polaris-core/src/main/java/org/apache/polaris/core/auth/AuthorizationRequest.java#L58

[2] https://github.com/apache/polaris/blob/26dcf0b401bc6ee5eb6971bd4ef9edb0cfd2675b/polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisAuthorizer.java#L51

Cheers,
Dmitri.

On Mon, Apr 20, 2026 at 4:11 PM Madhan Neethiraj [email protected]
wrote:

Hi Dmitri,

> mainly because there are no realistic use cases

Authorization to create a view from multiple tables would be a use case
for a single target (the view being created) with multiple secondaries
(tables from which the view is created), right?

Authorizing such operations in a single call, with all entities accessed,
might be critical for authorizer implementations, for example to prevent
toxic joins involving 2 specific tables. I suggest retaining N:M
association between targets and secondaries in authorization calls.
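
The toxic-join concern can be sketched as follows. This is only an illustration under stated assumptions: ToxicJoinSketch, allowCreateView and the table names are hypothetical, and the rule shown is an invented example of a policy, not anything Polaris implements.

```java
import java.util.List;
import java.util.Set;

final class ToxicJoinSketch {
  // Hypothetical policy: these two tables must never feed the same view.
  static final Set<String> TOXIC_PAIR = Set.of("hr.salaries", "crm.contacts");

  // Sees every source table at once, so it can reject the combination.
  static boolean allowCreateView(List<String> sourceTables) {
    return !sourceTables.containsAll(TOXIC_PAIR);
  }
}
```

Each table passes when checked alone; only a call that carries all source tables together lets the authorizer notice the forbidden combination.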

Thanks,
Madhan

On 4/20/26, 12:22 PM, “Dmitri Bourlatchkov” [email protected] wrote:

Hi Yufei,

I appreciate having this discussion, as it will hopefully lead to more
clarity on the AuthZ SPI use cases.

However, at this point I find the many-to-many (N:M) association between
targets and secondaries very confusing (mainly because there are no
realistic use cases).

Transferring my comment from GH [1] for visibility:

If targets and secondaries form a non-trivial (size > 1) set, I believe a
more coherent approach would be for the caller to perform multiple
authorization checks for each tuple as appropriate for the use case. I
believe the existing SPI covers that by List.

Re: the Cartesian product of targets and secondaries - does anyone have
such a use case in practice?

[1] https://github.com/apache/polaris/pull/4201#issuecomment-4247944484

Thanks,
Dmitri.

On Mon, Apr 20, 2026 at 2:25 PM Yufei Gu [email protected] wrote:

Thanks for sharing more context. I copied the reason for introducing the
binding here.

> Given current use cases, I wonder if a List-to-List relationship between
> targets and secondary is actually correct in general. I’d imagine in all
> relevant cases it is a 1:1 association. When there are multiple targets,
> there are going to be multiple target/secondary pairs.

I think the intuition may well be right for many current use cases; many
of them do look like 1:1 associations in practice. But I am not sure that
means 1:1 should be treated as a required constraint in the model.

I am hesitant to make that assumption for a few reasons.

First, from a modeling perspective, the relationship is not necessarily
1:1. A target may depend on multiple secondaries; for example, it might
grant multiple privileges to a role. You might argue this could be done by
flattening them into multiple pairs, but the flattening itself seems
unnecessary. Conversely, multiple targets may also share the same
secondary, such as supporting the attachment of one policy to multiple
tables.

Second, even if today’s cases mostly look 1:1, keeping the API as list to
list gives us more flexibility going forward. It avoids having to redesign
the API when new operations introduce more complex dependency
relationships, and also prevents the SPI from being constrained to an
overly narrow expressive model.

To answer your questions and concerns:

> Should these generally be treated as independent sets, where each target
> and secondary is evaluated on its own without any pairing semantics?

I’d prefer to treat them as independent sets because the privileges are
the same for either the whole target list or secondary list, using RBAC as
an example. I think the pairing semantics make more sense when we can
apply different privileges to different pairs.

> it’s not immediately clear whether we’re evaluating over pairs, or over
> the Cartesian product of targets and secondaries, or treating the two
> lists completely independently.

Maybe we just need to document the behavior here? I’m open to ideas.

Yufei

On Tue, Apr 14, 2026 at 3:49 PM Sung Yun [email protected] wrote:

Thanks for putting this together, Yufei. I think this is a very useful
discussion.

For context, the notion of “binding” in the new SPI came out of the
earlier PR discussion [1], and at the time it felt like a reasonable
direction, especially since we don’t have concrete M:N or even 1:N use
cases today.

One thing I’m still trying to understand is the intended semantic model
behind having two lists (targets and secondaries) without an explicit
binding. For example, for operations like ATTACH_POLICY_TO_TARGET, the name
suggests a pairwise relationship between the policy and target. With the
existing shape (List targets, List secondaries), it’s not immediately clear
whether we’re evaluating over pairs, or over the Cartesian product of
targets and secondaries, or treating the two lists completely
independently.

I don’t have a strong preference on enforcing 1:1 vs reverting to two
independent lists, but I do think it would help to make the evaluation
semantics explicit. Otherwise different implementations may interpret the
same request differently.

Curious how you’re thinking about this:
- Should these generally be treated as independent sets, where each target
  and secondary is evaluated on its own without any pairing semantics?
- Or should some operations still be interpreted as relationship-oriented,
  even without explicit bindings?

And importantly, should these two questions be open to interpretation per
each PolarisAuthorizer implementation?

Sung

[1] https://github.com/apache/polaris/pull/3760#discussion_r2830890135

On 2026/04/14 21:30:44 Yufei Gu wrote:

Hi folks,

I’d like to get feedback on a proposal to simplify the authorization API:
https://github.com/apache/polaris/pull/4201

This PR removes AuthorizationTargetBinding and replaces it with a simpler
model based on two lists: a target list and a secondary list.

This avoids enforcing a 1:1 mapping in the binding class (I might be
missing something regarding this enforcement, feel free to chime in),
making it more flexible to support 1:1, 1:N or even N:M relationships. For
example, with the binding class, supporting the attachment of one policy
to multiple tables requires duplicating bindings, which are then flattened
anyway. This design also aligns better with the existing RBAC semantics,
where target securables are evaluated as one group and secondary
securables as another, instead of enforcing pairwise mappings.

Open question: We may not need N:M relationships. I couldn’t come up with
a clear use case.

Note: This interface was introduced recently and is not part of any
release, so it can be removed without deprecation.

Would love to hear feedback, especially on the intended semantics and real
use cases.

Yufei

