> Otherwise, B would not have been provided to the engine. Are there cases
> where an engine might load B but not intend to allow access to the tables
> it references?

To me, this sounds like the definition of an invoker view. A user is able to load the view definition, but the table load itself is on a per-user basis, so we don't really have DEFINER behavior, imho.

I honestly don't have strong feelings either way here. If we want to move forward with the full chain, that's fine with me, since I feel like catalogs will get to make these decisions based on what their particular permission structures allow. Personally, I wouldn't want to give someone permission to modify a view that is run as another user if they don't have the permissions as that user to access the underlying tables ;)

On Wed, Feb 4, 2026 at 3:49 PM Ryan Blue <[email protected]> wrote:

> The DEFINER view referenced by a DEFINER view is a good case to think
> about, but I don’t think that it requires the entire reference chain in
> order to be secure.
>
> Using the object names from Russell’s response, when view B is loaded and
> referenced-by is A, the catalog must trust that the engine is setting
> referenced-by correctly. It trusts that the engine will not lie and say
> that B is referenced from A instead of another view, and it trusts that
> projections, filters, etc. from A will be applied to data from B.
>
> I think the question here is whether the first guarantee, that A was
> loaded and referenced B, is sufficient when deciding whether the query
> has access to B and the tables it references. The catalog *could* assume
> that because B is the referenced-by for C from a trusted engine, that the
> query must have access to B. Otherwise, B would not have been provided to
> the engine. Are there cases where an engine might load B but not intend
> to allow access to the tables it references?
>
> I think there’s a fair argument that those cases exist. When tables or
> views are loaded, there’s no intent included. The catalog doesn’t know
> whether a view was loaded for a SHOW HISTORY command or because it is
> being updated or being run.
> So a view could be loaded because a user has
> some other permission, like MODIFY, but not SELECT. Or maybe a permission
> to audit the view but not see data. If the catalog allows those cases, then
> being able to load B doesn’t necessarily mean the query has access to the
> data that B produces. In that case, you would need to check the
> permissions that A has on B to determine whether to load/vend credentials
> for C.
>
> In writing this email, I think I’ve been convinced that Christian is
> correct and that it is best to keep the reference chain. Russell and
> Prashant, what do you think?
>
> Ryan
>
> On Wed, Feb 4, 2026 at 1:12 PM Russell Spitzer <[email protected]>
> wrote:
>
>> I understand the logging concern but not the correctness one.
>>
>> Are you saying we have to re-check to make sure nothing has changed since
>> we started?
>>
>> I would assume in this auth chain we could get by with a referenced_by in
>> the view request as well?
>> A (View) => B (View) => C (Table)
>> LoadView(A) gets the first view
>> LoadView(B, referenced_by A) is for the second view using
>> "referenced_by" the first view
>> LoadTable(C, referenced_by B) Finally we request the table using
>> referenced_by the second view
>>
>> Do we need the full chain in this case?
>>
>> I'm kind of convinced though by the logging argument since that would be
>> useful information to have, although I'm not
>> sure the Catalog couldn't piece this back together. It would definitely
>> be simpler to have it just always present.
>>
>> On Wed, Feb 4, 2026 at 2:34 PM Christian Thiel <[email protected]> wrote:
>>
>>> Your assumption is correct: the 1st DEFINER view is authorized before the
>>> query engine retrieves its content and learns it references the 2nd DEFINER.
>>>
>>> Let me clarify the setup I had in mind: Query engines increasingly
>>> support passing user tokens to the catalog for authorization.
>>> Examples include Starburst's OAuth2 Token Passthrough [1] and StarRocks'
>>> JWT authentication [2].
>>>
>>> In such setups, the second request to the 2nd DEFINER view becomes
>>> problematic: the catalog receives a request from a user / invoker lacking
>>> direct access. Using the hypothetical "referenced-by" field (and assuming a
>>> trust relationship with the engine guaranteeing correctness), we must
>>> validate both:
>>>
>>> 1. The authorization decision for the 1st DEFINER still holds
>>> 2. The 1st DEFINER's owner has access to the 2nd
>>>
>>> While catalogs could issue short-lived authorization proof when
>>> returning the 1st DEFINER, re-authorizing is equally valid and arguably
>>> preferable, as the information is more current.
>>>
>>> Extending this to the TABLE level: we can either provide authorization
>>> proof with the 2nd DEFINER (presented when querying the TABLE), or
>>> re-authorize the entire chain.
>>>
>>> Without carrying client-side trust between requests, having the full
>>> (trusted) chain is the only way to authorize TABLE access (again requiring
>>> correctness guarantees through other trust mechanisms). Therefore,
>>> authorizing table access can only be seamlessly explained with the complete
>>> chain. Providing this information explicitly is preferable to
>>> reconstructing it from the TABLE metadata plus all prior authorization
>>> requests, in my opinion, if only for audit logging.
>>>
>>> Does that make my thoughts clear?
>>>
>>> [1]
>>> https://docs.starburst.io/latest/object-storage/metastores.html#oauth-2-0-token-pass-through
>>> [2]
>>> https://docs.starrocks.io/docs/data_source/catalog/iceberg/iceberg_rest_security/#security-mechanisms
>>>
>>> Best,
>>>
>>> Christian
>>>
>>> On Wed, 4 Feb 2026 at 20:20, Prashant Singh <[email protected]>
>>> wrote:
>>>
>>>> Thank you for the feedback, Christian!
>>>> I agree that having full context could help for audit purposes.
>>>>
>>>> Though, I am not able to fully understand your feedback from an AuthZ
>>>> point of view. Can you please elaborate?
>>>> IIUC, in your example 1st DEFINER => 2nd DEFINER => TABLE, the
>>>> user's access to the 1st DEFINER view would have been authorized before
>>>> the query engine could learn that the 1st DEFINER references the 2nd
>>>> DEFINER; I am assuming it succeeded in getting the view definition? All
>>>> it needs to know when loading the table is what the view is referencing,
>>>> when it's authorizing the loadTable.
>>>>
>>>> Regarding the referenced-by in loadView, that's a good
>>>> recommendation; let me think more.
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>>
>>>> On Tue, Feb 3, 2026 at 11:28 AM Christian Thiel <[email protected]> wrote:
>>>>
>>>>> I prefer to keep the full chain.
>>>>>
>>>>> Consider this scenario:
>>>>> 1st DEFINER => 2nd DEFINER => TABLE
>>>>>
>>>>> When a user has access only to the outer view and the load table
>>>>> endpoint is called, the following authorization conditions must be
>>>>> ensured:
>>>>>
>>>>> 1. Owners of the DEFINER views still have access to their
>>>>> referenced objects
>>>>> 2. The querying user has access to their entry point, the 1st
>>>>> DEFINER view
>>>>>
>>>>> If the load table endpoint receives only the immediate parent in
>>>>> referenced-by, we lose critical information for check (2). This means
>>>>> the request data alone, even if trusted, is insufficient to make a complete
>>>>> authorization decision unless the server internally correlates the call to
>>>>> the 2nd DEFINER load with the load table request, as we can't trace it
>>>>> back to the 1st DEFINER otherwise. To make this work consistently, we would
>>>>> require referenced-by also for the load view endpoint.
>>>>>
>>>>> Additionally, knowing the user's entry point is valuable for auditing
>>>>> purposes, particularly in DEFINER-heavy implementations.
>>>>>
>>>>> I kind of disagree that Postgres DEFINER views don't require deeply
>>>>> nested context.
>>>>>
>>>>> Postgres just handles this chain internally:
>>>>> 1. User is allowed to query 1st DEFINER
>>>>> 2. thus 2nd DEFINER may be used to respond to the query
>>>>> 3. thus TABLE may be used to respond to the query
>>>>> But propagating this trust relationship in Iceberg REST is more
>>>>> complex as objects are queried individually, so we can't just validate the
>>>>> full plan, but instead need to be able to validate access to each
>>>>> individual component it requires.
>>>>>
>>>>> Best,
>>>>> Christian
>>>>>
>>>>> On Mon, 2 Feb 2026 at 19:44, Russell Spitzer <[email protected]> wrote:
>>>>>
>>>>>> Just to re-up my comments from the discussion.
>>>>>>
>>>>>> I'm in favor of Immediate Parent only. Full chain seems to be for
>>>>>> situations where we want to be able to "override" the security
>>>>>> definition of an inner nested view. For users who want to
>>>>>> do this, I would encourage them to just make a brand new definer view
>>>>>> without referencing the "invoker" view.
>>>>>>
>>>>>> For example
>>>>>>
>>>>>> DEFINER => INVOKER => TABLE
>>>>>>
>>>>>> The "definer" should not be able to remove the "invoked" nature of
>>>>>> access to the table. If a user really
>>>>>> wants that behavior they should construct
>>>>>>
>>>>>> DEFINER (Combined with INVOKER SQL) => TABLE
>>>>>>
>>>>>> I'd rather we didn't encourage more complicated constructions
>>>>>>
>>>>>> On Mon, Feb 2, 2026 at 12:34 PM Prashant Singh <[email protected]> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I’m currently working on passing additional context via the
>>>>>>> referenced-by parameter in loadTable calls. This is a foundational
>>>>>>> step toward enabling catalogs to make authorization decisions based on
>>>>>>> query execution context.
>>>>>>>
>>>>>>> While the broader trust relationships and AuthZ constructs are
>>>>>>> outside the scope of IRC, I’d like to align on the level of detail we
>>>>>>> should provide. Specifically: *Should we send the entire view
>>>>>>> reference chain, or only the immediate parent view on nested views?*
>>>>>>>
>>>>>>> The following are trade-offs:
>>>>>>>
>>>>>>> - *Full Chain:* Provides maximum flexibility for the server to
>>>>>>> make complex AuthZ decisions but increases client-side overhead for
>>>>>>> tracking nested references.
>>>>>>> - *Immediate Parent:* Simpler for the client to implement but
>>>>>>> provides limited context for sophisticated authorization policies.
>>>>>>>
>>>>>>> *Prior Art & Research:* As noted in this discussion
>>>>>>> <https://github.com/apache/iceberg/pull/13810#discussion_r2747121401>
>>>>>>> (thanks Ryan and Russell), Postgres handles this via DEFINER (owner
>>>>>>> permissions) and INVOKER (query permissions) without requiring
>>>>>>> deeply nested context. My research into other engines hasn't yielded a
>>>>>>> standard "gold level" approach yet, as some platforms simply restrict
>>>>>>> nested view complexity.
>>>>>>>
>>>>>>> I’d love to hear your thoughts on which approach aligns better.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Prashant Singh
>>>>>>>
>>>>>>
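
For anyone reading the thread later, the two checks Christian lists for the 1st DEFINER => 2nd DEFINER => TABLE scenario can be sketched in Python as a toy model. Nothing here is the Iceberg REST API: `ToyCatalog`, `DefinerView`, `authorize_load`, the principal names, and the grant tuples are all hypothetical, and the sketch assumes every view in the chain is a DEFINER view and that the engine reports the chain truthfully.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefinerView:
    """Hypothetical DEFINER view: referenced objects are read as `owner`."""
    name: str
    owner: str


class ToyCatalog:
    """Toy stand-in for a catalog's authorization layer (not a real API)."""

    def __init__(self, views, grants):
        self.views = {v.name: v for v in views}
        # Each grant is a (principal, object_name) pair meaning "may SELECT".
        self.grants = set(grants)

    def can_select(self, principal, obj):
        return (principal, obj) in self.grants

    def authorize_load(self, user, target, referenced_by):
        """Authorize loading `target` given a chain of view names.

        `referenced_by` lists view names, outermost first. The querying
        user must have access to the first entry; after that, each view's
        owner must have access to whatever that view references.
        """
        principal = user
        for view_name in referenced_by:
            if not self.can_select(principal, view_name):
                return False
            # Past a DEFINER view, access is checked as the view's owner.
            principal = self.views[view_name].owner
        return self.can_select(principal, target)


# A (view) => B (view) => C (table); alice only has access to A.
catalog = ToyCatalog(
    views=[DefinerView("A", "owner_a"), DefinerView("B", "owner_b")],
    grants=[("alice", "A"), ("owner_a", "B"), ("owner_b", "C")],
)

# Full chain: the request alone carries enough to check every hop.
full_chain = catalog.authorize_load("alice", "C", ["A", "B"])

# Immediate parent only: alice has no direct grant on B, so the request
# alone cannot be authorized without correlating it to earlier calls.
parent_only = catalog.authorize_load("alice", "C", ["B"])
```

In this toy setup `full_chain` comes out `True` and `parent_only` comes out `False`. A catalog that only receives the immediate parent would need some other mechanism, such as server-side correlation or a short-lived authorization proof as mentioned above, to reach the same answer.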
