Re: [DISCUSS][REST] Granularity of referenced-by context in loadTable calls

Prashant Singh Wed, 04 Feb 2026 11:20:15 -0800

Thank you for the feedback, Christian!
I agree that having the full context could help for audit purposes.


However, I don't fully understand your feedback from the AuthZ point of view.
Could you please elaborate?
IIUC, in your example 1st DEFINER => 2nd DEFINER => TABLE, the
user's access to the 1st DEFINER view would already have been authorized before
the query engine could learn that the 1st DEFINER references the 2nd DEFINER. I
am assuming it succeeded in getting the view definition? All it needs
to know when authorizing the loadTable is what the view is referencing.
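
To make the scenario concrete, here is a minimal sketch of the two checks
from your list, assuming a stateless server that sees only the loadTable
request. Everything in it is hypothetical and not taken from the REST spec:
the View class, the GRANTS set, authorize_load_table, and the idea that
referenced-by arrives as a list ordered from the entry point down to the
immediate parent.

```python
from dataclasses import dataclass


# Hypothetical model; none of these names come from the Iceberg REST spec.
@dataclass(frozen=True)
class View:
    name: str
    owner: str
    references: str  # name of the object this DEFINER view selects from


# Hypothetical grants: (principal, object) pairs that are allowed to read.
GRANTS = {
    ("alice", "view_1"),    # the querying user may read only the outer view
    ("owner_1", "view_2"),  # owner of view_1 may read view_2
    ("owner_2", "table"),   # owner of view_2 may read the table
}

VIEWS = {
    "view_1": View("view_1", "owner_1", "view_2"),
    "view_2": View("view_2", "owner_2", "table"),
}


def can_read(principal, obj):
    return (principal, obj) in GRANTS


def authorize_load_table(user, table, referenced_by):
    """Decide a loadTable call using only the data in the request.

    referenced_by is assumed to list view names from the entry point
    down to the immediate parent of the table.
    """
    if not referenced_by:
        # Direct table access: plain check against the user's grants.
        return can_read(user, table)
    # Check (2): the user must have access to the entry point,
    # which is the first view in the chain the server was given.
    if not can_read(user, referenced_by[0]):
        return False
    # Check (1): each view's owner must have access to what it references.
    targets = referenced_by[1:] + [table]
    for view_name, target in zip(referenced_by, targets):
        view = VIEWS[view_name]
        if view.references != target or not can_read(view.owner, target):
            return False
    return True


full_chain = authorize_load_table("alice", "table", ["view_1", "view_2"])
parent_only = authorize_load_table("alice", "table", ["view_2"])
```

In this sketch full_chain is True and parent_only is False. With only the
immediate parent, the server treats the 2nd DEFINER as the entry point, so
it either rejects the user or has to skip check (2) and rely on the earlier
loadView authorization, which is the part I would like to understand better.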

Regarding referenced-by in loadView, that's a good recommendation;
let me think about it more.

Best,
Prashant Singh


On Tue, Feb 3, 2026 at 11:28 AM Christian Thiel <[email protected]>
wrote:

> I prefer to keep the full chain.
>
> Consider this scenario:
> 1st DEFINER => 2nd DEFINER => TABLE
>
> When a user has access only to the outer view and the load table endpoint
> is called, the following authorization conditions must be ensured:
>
>    1. Owners of the DEFINER views still have access to their referenced
>    objects
>    2. The querying user has access to their entry point, the 1st DEFINER
>    view
>
> If the load table endpoint receives only the immediate parent in
> referenced-by, we lose critical information for check (2). This means the
> request data alone—even if trusted—is insufficient to make a complete
> authorization decision unless the server internally correlates the call to
> the 2nd DEFINER load with the load table request, as we can't trace it back
> to the 1st DEFINER otherwise. To make this work consistently we would
> require referenced-by also for the load View endpoint.
>
> Additionally, knowing the user's entry point is valuable for auditing
> purposes, particularly in DEFINER-heavy implementations.
>
> I kind of disagree that Postgres DEFINER views don't require deeply nested
> context.
>
> Postgres just handles this chain internally:
> 1. The user is allowed to query the 1st DEFINER
> 2. thus the 2nd DEFINER may be used to respond to the query
> 3. thus the TABLE may be used to respond to the query
> But propagating this trust relationship in Iceberg REST is more complex
> because objects are queried individually, so we can't just validate the full
> plan, but instead need to be able to validate access to each individual
> component it requires.
>
> Best,
> Christian
>
> On Mon, 2 Feb 2026 at 19:44, Russell Spitzer <[email protected]>
> wrote:
>
>> Just to re-up my comments from the discussion.
>>
>> I'm in favor of Immediate Parent only. Full chain seems to be for
>> situations where we want to be able to "override" the security
>> definition of an inner nested view. For users who want to
>> do this, I would encourage them to just make a brand new definer view
>> without referencing the "invoker" view.
>>
>> For example
>>
>> DEFINER => INVOKER => TABLE
>>
>> The "definer" should not be able to remove the "invoked" nature of access
>> to the table. If a user really wants that behavior they should construct
>>
>> DEFINER (Combined with INVOKER SQL) => TABLE
>>
>> I'd rather we didn't encourage more complicated constructions.
>>
>> On Mon, Feb 2, 2026 at 12:34 PM Prashant Singh <[email protected]>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I’m currently working on passing additional context via the
>>> referenced-by parameter in loadTable calls. This is a foundational step
>>> toward enabling catalogs to make authorization decisions based on query
>>> execution context.
>>>
>>> While the broader trust relationships and AuthZ constructs are outside
>>> the scope of IRC, I’d like to align on the level of detail we should
>>> provide. Specifically: *Should we send the entire view reference chain,
>>> or only the immediate parent view on nested views?*
>>>
>>> The following are trade-offs:
>>>
>>>    - *Full Chain:* Provides maximum flexibility for the server to make
>>>    complex AuthZ decisions but increases client-side overhead for tracking
>>>    nested references.
>>>    - *Immediate Parent:* Simpler for the client to implement but provides
>>>    limited context for sophisticated authorization policies.
>>>
>>> *Prior Art & Research:* As noted in this discussion
>>> <https://github.com/apache/iceberg/pull/13810#discussion_r2747121401>
>>> (thanks Ryan and Russell), Postgres handles this via DEFINER (owner
>>> permissions) and INVOKER (query permissions) without requiring deeply
>>> nested context. My research into other engines hasn't yielded a standard
>>> "gold level" approach yet, as some platforms simply restrict nested view
>>> complexity.
>>>
>>> I’d love to hear your thoughts on which approach aligns better.
>>>
>>> Best regards,
>>>
>>> Prashant Singh
>>>
>>
