Hi Dmitri, +1, folding this into the broader Authorizer SPI discussion makes sense to me.
The core question here is the same one the general SPI discussion needs to settle: what semantics should the API expose to authorizer implementations, and how explicit should those semantics be in the request model?

The point I would like to carry forward is that, whichever shape we choose, the evaluation semantics should be explicit enough that different authorizer implementations do not interpret the same request differently. That matters more to me than the specific class shape.

Yufei

On Thu, 7 May 2026 16:58:56 -0400, Dmitri Bourlatchkov wrote:

Hi All,

I propose folding this thread into the more general Authorizer SPI discussion.

Minutes doc: https://docs.google.com/document/d/1C_SSaZH1i83UUGXrnVBur1fR_FHKYWZ75ISFfcb3kns/edit?tab=t.0#heading=h.wodibvbtg7qj
Last meeting: https://lists.apache.org/thread/wjjj9dxg9zqkc7ys3kyylkvdwpp9omxv
General sync thread: https://lists.apache.org/thread/ndl802wrkz9993kwpwpp9vx4zvcpkt88

Cheers,
Dmitri.

On Fri, Apr 24, 2026 at 7:48 PM Yufei Gu wrote:

Thanks Dmitri, this makes sense, and I agree the current SPI can support the view use case as you described.

My concern is less about whether the SPI can support it, and more about whether the current modeling (binding vs two lists) makes the intended semantics clear. In particular, with two lists (targets and secondaries), it’s not obvious whether we are evaluating pairs, independent sets, or something else. That ambiguity could lead to different interpretations across authorizer implementations.

My motivation for the simpler model is to avoid over-constraining the API to 1:1 bindings while keeping the semantics flexible and future-proof. Even if today’s use cases look mostly 1:1, I’m hesitant to encode that assumption into the model.

Maybe the right next step is to make the evaluation semantics explicit (independent vs pairwise), rather than relying on the shape of the API to imply it. Curious what you think.

Yufei

On Mon, Apr 20, 2026 at 8:38 PM Dmitri Bourlatchkov wrote:

Thanks for your example, Madhan!

I believe your use case (while still hypothetical in Polaris) fits well within the current AuthZ SPI, which takes a List per authorization call [1][2]. Each of those entries would represent one reference from the view to a table.

I do not see an immediate need for additional SPI changes to support this use case.

[1] https://github.com/apache/polaris/blob/0272f42fb3a5f6a542d9317c6c06c3b5e0dc8195/polaris-core/src/main/java/org/apache/polaris/core/auth/AuthorizationRequest.java#L58
[2] https://github.com/apache/polaris/blob/26dcf0b401bc6ee5eb6971bd4ef9edb0cfd2675b/polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisAuthorizer.java#L51

Cheers,
Dmitri.

On Mon, Apr 20, 2026 at 4:11 PM Madhan Neethiraj wrote:

Hi Dmitri,

> mainly because there are no realistic use cases

Authorization to create a view from multiple tables would be a use case for a single target (the view being created) with multiple secondaries (the tables from which the view is created), right? Authorizing such operations in a single call, with all accessed entities included, might be critical for authorizer implementations, for example to prevent toxic joins involving two specific tables.

I suggest retaining the N:M association between targets and secondaries in authorization calls.

Thanks,
Madhan

On 4/20/26, 12:22 PM, “Dmitri Bourlatchkov” wrote:

Hi Yufei,

I appreciate having this discussion, as it will hopefully lead to more clarity on the AuthZ SPI use cases. However, at this point I find the many-to-many (N:M) association between targets and secondaries very confusing (mainly because there are no realistic use cases).

Transferring my comment from GH [1] for visibility:

If targets and secondaries form a non-trivial (size > 1) set, I believe a more coherent approach would be for the caller to perform multiple authorization checks for each tuple as appropriate for the use case. I believe the existing SPI covers that by List.

Re: the Cartesian product of targets and secondaries - does anyone have such a use case in practice?

[1] https://github.com/apache/polaris/pull/4201#issuecomment-4247944484

Thanks,
Dmitri.

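The difference between per-tuple checks and a single call that sees every entity, as raised in the replies above, can be sketched roughly as follows. This is an illustrative Python sketch, not Polaris code: the rule table, the function names, and the table and view names are all hypothetical, and the "toxic combination" rule is an assumed policy chosen only to show the effect.

```python
# Hypothetical rule: these two tables must never be combined in one view.
TOXIC_COMBINATIONS = [frozenset({"patients", "billing"})]


def authorize_create_view(view, source_tables):
    """Single call: the authorizer sees every source table at once."""
    sources = set(source_tables)
    # Deny if any forbidden combination is fully contained in the sources.
    return not any(combo <= sources for combo in TOXIC_COMBINATIONS)


def authorize_pair(view, source_table):
    """Per-tuple call: only one (view, table) reference is visible."""
    return authorize_create_view(view, [source_table])


view = "v_patient_costs"
tables = ["patients", "billing"]

# One call carrying all accessed entities.
single_call = authorize_create_view(view, tables)

# One call per (view, table) tuple, combined by the caller.
per_pair = all(authorize_pair(view, table) for table in tables)

print("single call allowed:", single_call)
print("per-tuple calls allowed:", per_pair)
```

In this sketch each per-tuple check passes on its own, while the single call is denied, because only the single call can observe both tables together.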
On Mon, Apr 20, 2026 at 2:25 PM Yufei Gu wrote:

Thanks for sharing more context. I copied the reason for introducing the binding here.

> Given current use cases, I wonder if a List-to-List relationship between targets and secondary is actually correct in general. I’d imagine in all relevant cases it is a 1:1 association. When there are multiple targets, there are going to be multiple target/secondary pairs.

I think the intuition may well be right for many current use cases; many of them do look like 1:1 associations in practice. But I am not sure that means 1:1 should be treated as a required constraint in the model. I am hesitant to make that assumption for a few reasons.

First, from a modeling perspective, the relationship is not necessarily 1:1. A target may depend on multiple secondaries; for example, a single operation might grant multiple privileges to a role. You might argue this could be done by flattening them into multiple pairs, but the flattening itself seems unnecessary. Conversely, multiple targets may also share the same secondary, such as when attaching one policy to multiple tables.

Second, even if today’s cases mostly look 1:1, keeping the API as list-to-list gives us more flexibility going forward. It avoids having to redesign the API when new operations introduce more complex dependency relationships, and it also prevents the SPI from being constrained to an overly narrow expressive model.

To answer your questions and concerns:

> Should these generally be treated as independent sets, where each target and secondary is evaluated on its own without any pairing semantics?

I’d prefer to treat them as independent sets because, using RBAC as an example, the privileges are the same across the whole target list or the whole secondary list. I think pairing semantics make more sense when we can apply different privileges to different pairs.

> it’s not immediately clear whether we’re evaluating over pairs, or over the Cartesian product of targets and secondaries, or treating the two lists completely independently.

Maybe we just need to document the behavior here? I’m open to ideas.

Yufei

On Tue, Apr 14, 2026 at 3:49 PM Sung Yun wrote:

Thanks for putting this together Yufei, I think this is a very useful discussion.

For context, the notion of “binding” in the new SPI came out of the earlier PR discussion [1], and at the time it felt like a reasonable direction, especially since we don’t have concrete M:N or even 1:N use cases today.

One thing I’m still trying to understand is the intended semantic model behind having two lists (targets and secondaries) without an explicit binding. For example, for operations like ATTACH_POLICY_TO_TARGET, the name suggests a pairwise relationship between the policy and target. With the existing shape (List targets, List secondaries), it’s not immediately clear whether we’re evaluating over pairs, or over the Cartesian product of targets and secondaries, or treating the two lists completely independently.

I don’t have a strong preference on enforcing 1:1 vs reverting to two independent lists, but I do think it would help to make the evaluation semantics explicit. Otherwise different implementations may interpret the same request differently.

Curious how you’re thinking about this:

- Should these generally be treated as independent sets, where each target and secondary is evaluated on its own without any pairing semantics?
- Or should some operations still be interpreted as relationship-oriented, even without explicit bindings?

And importantly, should these two questions be open to interpretation by each PolarisAuthorizer implementation?

Sung

[1] https://github.com/apache/polaris/pull/3760#discussion_r2830890135

On 2026/04/14 21:30:44 Yufei Gu wrote:

Hi folks,

I’d like to get feedback on a proposal to simplify the authorization API: https://github.com/apache/polaris/pull/4201.

This PR removes AuthorizationTargetBinding and replaces it with a simpler model based on two lists: a target list and a secondary list. This avoids enforcing a 1:1 mapping in the binding class (I might be missing something regarding this enforcement; feel free to chime in), making it more flexible to support 1:1, 1:N or even N:M relationships. For example, with bindings, supporting the attachment of one policy to multiple tables requires duplicating bindings, which are then flattened anyway.

This design also aligns better with the existing RBAC semantics, where target securables are evaluated as one group and secondary securables as another, instead of enforcing pairwise mappings.

Open question: We may not need N:M relationships. I couldn’t come up with a clear use case.

Note: This interface was introduced recently and is not part of any release, so it can be removed without deprecation.

Would love to hear feedback, especially on the intended semantics and real use cases.

Yufei
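
The three readings of a two-list request that come up in this thread (pairwise, Cartesian product, independent sets) can be sketched as follows. This is a minimal Python illustration, not the Polaris SPI: the function names and the string identifiers are hypothetical, and each function only lists what would be evaluated under that reading of the same request.

```python
from itertools import product


def pairwise(targets, secondaries):
    """Reading 1: the i-th target is bound to the i-th secondary."""
    if len(targets) != len(secondaries):
        raise ValueError("pairwise evaluation needs lists of equal length")
    return list(zip(targets, secondaries))


def cartesian(targets, secondaries):
    """Reading 2: every target is related to every secondary."""
    return list(product(targets, secondaries))


def independent(targets, secondaries):
    """Reading 3: two unrelated groups, with no pairing between them."""
    return [(t, None) for t in targets] + [(None, s) for s in secondaries]


# Hypothetical request: the same two lists under each reading.
targets = ["table_a", "table_b"]
secondaries = ["policy_x", "policy_y"]

for interpret in (pairwise, cartesian, independent):
    print(interpret.__name__, interpret(targets, secondaries))
```

The same two lists produce a different set of checks under each reading, which is why the thread asks for the evaluation semantics to be stated explicitly instead of being left to each authorizer implementation.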
