Thank you all for joining the sync today (06/09).

Summarizing the sync for folks who couldn't attend (AI-assisted).
Recording: https://youtu.be/GUhCr-PnSo8

Where the room landed:

1. *Capabilities header* - leaning toward dropping it. It was effectively a
versioning workaround, and as a security mechanism it's spoofable. Trust must
be established out-of-band (mTLS / OAuth / on-behalf-of), not via a
client-set header.
2. *V1 carries read-restrictions now, with conditional wording*. I adjusted
the spec - please have a look. The field stays optional. This lets
implementations move without waiting on V2.
3. *V2 endpoint wording*. V2 would tighten this to an unconditional MUST:
opting into V2 means committing to parse read-restrictions and fail closed
on anything unknown.
4. Two contracts, kept separate:
  - Correctness contract (in spec): parse + fail-on-unknown.
  - Trust contract (out of spec): admin/catalog decides which clients are
    non-malicious and wired up to enforce.
  Both are required; neither replaces the other.
5. *No V2 capability header for now*. A V2-side header to signal "I can
enforce" was discussed for fail-fast, but the case felt narrow. Open to
revisiting.
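
To make the correctness contract in items 3 and 4 concrete, here is a minimal Python sketch of "parse + fail-on-unknown" on the client side. The payload shape (`read-restrictions`, `column-actions`, `row-filter`, `action`) and the exception name are hypothetical placeholders, not the wording in the spec PR; only the action names are taken from this thread.

```python
# Hypothetical payload keys for illustration; the real names live in the spec PR.
KNOWN_KEYS = {"column-actions", "row-filter"}

# Action names as mentioned in this thread (sha-256 variants omitted here).
KNOWN_ACTIONS = {
    "mask-to-default", "replace-with-null", "mask-alphanum",
    "show-first-4", "show-last-4", "truncate-to-year", "truncate-to-month",
}


class UnsupportedReadRestriction(Exception):
    """Raised when the client cannot fully understand the restrictions."""


def parse_read_restrictions(load_table_response: dict) -> dict:
    """Return the restrictions to enforce, or fail closed on anything unknown."""
    restrictions = load_table_response.get("read-restrictions")
    if restrictions is None:
        return {}  # optional field absent: nothing to enforce

    unknown_keys = set(restrictions) - KNOWN_KEYS
    if unknown_keys:
        raise UnsupportedReadRestriction(f"unknown keys: {sorted(unknown_keys)}")

    for entry in restrictions.get("column-actions", []):
        action = entry.get("action")
        if action not in KNOWN_ACTIONS:
            raise UnsupportedReadRestriction(f"unknown action: {action!r}")

    return restrictions
```

The point of the sketch is only that an unrecognized key or action aborts the read instead of being silently ignored. Whether the client is trusted to run this check at all remains the separate, out-of-spec trust contract.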

Open items rolling into the broader V2 conversation:
- Optional fail-fast capability header.
- V4 table support (null vs empty snapshot list).

*Call for review:*
- Ryan's expressions-spec PR (read-restrictions has a dependency on it):
https://github.com/apache/iceberg/pull/16652
- Read restrictions PR — updated the wording ....:
https://github.com/apache/iceberg/pull/13879

Best,
Prashant Singh

On Wed, May 27, 2026 at 11:57 PM Prashant Singh <[email protected]>
wrote:

> Hi all,
>   Quick summary from yesterday's community sync for those who couldn't
> attend. Recording: https://youtu.be/-KEesN1udyY
>   1. *Client Capabilities Header (PR #16394)
> <https://github.com/apache/iceberg/pull/16394>*
>   Re-discussed the generic vs. per-feature header debate from the 05/12
>   sync. Rough consensus leaned toward a single generic
>   X-Iceberg-Client-Capabilities header, with these clarifications:
>   - *Advisory, not authoritative*. Servers MUST NOT use the header for
>     trust or authorization decisions. Trust is established out-of-band
>     (mTLS / OAuth / engine identity).
>   - *No versioning at the header level yet*. The fail-closed contract
>     ("client MUST fail on unrecognized payload contents") handles forward
>     compatibility. New incompatible behavior would be a new capability
>     token, not a version suffix.
>   - *No change to existing per-request directives*.
>     X-Iceberg-Access-Delegation stays as-is for vended-credentials vs.
>     remote-signing selection. Capabilities and per-request preferences are
>     kept as separate mechanisms.
>
>   Dan was absent for the final stretch, following up separately on the
> list before revising the PR.
>
>   2. *Expression Spec Enhancement*
>   ID-based field references (needed for unbounded row filter expressions
> in read restrictions) will move forward. Spec
>   write-up to follow. In the meantime, the read-restrictions spec PR will
> reference the planned change.
>
>   3. *Read Restrictions Spec - Call for Review*
>   The spec PR (#13879) <https://github.com/apache/iceberg/pull/13879> is
>   ready for another review pass. It will be updated with the expression
>   spec reference once that write-up lands, but the rest of the content is
>   in shape for review now. Would appreciate eyes on it.
>
>   4. *API for Action functions (PR #16198
> <https://github.com/apache/iceberg/pull/16198>) - Call for Review*
>   Generic functions module + actions wrapper is in the PR with end-to-end
> plumbing.
>
>   Thanks,
>   Prashant
>
> On Mon, May 18, 2026 at 12:01 PM Prashant Singh <[email protected]>
> wrote:
>
>> Hi all,
>>
>>   Sharing a summary of the Iceberg Read Restrictions sync on May 12, 2026,
>>   for folks who couldn't attend. (As always, syncs are for discussion only.)
>>   Recording: https://youtu.be/b9p6mI-k-0I
>>
>>  Topics discussed
>>
>>   1. NULL handling for mask-to-default
>>
>>   Question: should mask-to-default preserve NULL inputs (NULL → NULL) or
>>   replace them with the type-specific default (NULL → 0 / "" / epoch /
>>   etc.)? The direction in the room was to NOT preserve NULL - preserving
>>   leaks the existence of NULL, which can itself be sensitive information.
>>   The other actions (replace-with-null, mask-alphanum, show-first-4 /
>>   show-last-4, sha-256 variants, truncate-to-year / truncate-to-month)
>>   keep their natural NULL-preserving behavior; mask-to-default is the
>>   exception. This is reflected in the most recent push to PR #13879.
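
The NULL rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spec's definition: the `DEFAULTS` table only mirrors the "0 / "" / epoch" examples from the summary, and the type names (`int`, `string`, `date`) and function signatures are hypothetical.

```python
import datetime

# Illustrative per-type defaults, following the "0 / "" / epoch" examples above.
# The type names used as keys are assumptions for this sketch.
DEFAULTS = {
    "int": 0,
    "string": "",
    "date": datetime.date(1970, 1, 1),
}


def mask_to_default(value, type_name):
    """The exception: NULL is NOT preserved, every input maps to the default."""
    return DEFAULTS[type_name]


def mask_alphanum(value):
    """A NULL-preserving action: letters/digits become X, punctuation stays."""
    if value is None:
        return None
    return "".join("X" if ch.isalnum() else ch for ch in value)


print(mask_to_default(None, "int"))   # 0
print(mask_to_default(42, "int"))     # 0
print(mask_alphanum(None))            # None
print(mask_alphanum("ab-12"))         # XX-XX
```

With mask-to-default, a reader cannot tell a NULL from a populated value; with the other actions, NULL stays visible as NULL.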
>>
>>  2. Older clients without read-restriction support
>>
>>   How should the spec handle clients that don't understand the
>>   read-restrictions field returned by loadTable? Direction: introduce a
>>   generic client-capability header (X-Iceberg-Client-Capabilities) as a
>>   forward-compat signal, separate from per-request signals like
>>   X-Iceberg-Access-Delegation. Trust establishment between client and
>>   catalog stays out of scope - operator/catalog-implementation concern,
>>   not spec.
>>
>>   A follow-up PR (#16394) has been raised to add the header and the
>>   parameter component; a separate [DISCUSS] thread is being raised in
>>   parallel.
>>
>>   3. Identity propagation (Trino, Spark, etc.)
>>
>>   Surfaced briefly. Folks acknowledged identity propagation across
>> multi-tenant
>>   query engines is a real problem, but it's orthogonal to the spec - it's
>>   a catalog / auth-manager concern. Not in scope for #13879.
>>
>>   4. Actions API placement
>>
>>   Continued discussion on where action definitions live in the Java API.
>>   Direction: ship as utility functions in the api module, mirroring the
>>   existing Transform pattern. Don't expose a top-level "Action" type in
>>   public Java API — keep "Action" as a REST-spec construct only.
>>
>>
>>   Follow-ups
>>
>>     - PR #13879 (FGAC read restrictions): incorporated the discussion
>>       above; will continue iterating on reviewer feedback as it comes in
>>       (please take another look).
>>     - PR #16394 (X-Iceberg-Client-Capabilities header): raised, [DISCUSS]
>>       thread incoming.
>>     - Next sync: bi-weekly cadence - see the sync notes doc for date.
>>
>>   Links
>>
>>     - Recording: https://youtu.be/b9p6mI-k-0I
>>     - Sync notes doc:
>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit
>>     - PR #13879: https://github.com/apache/iceberg/pull/13879
>>     - PR #16394: https://github.com/apache/iceberg/pull/16394
>>       - DISCUSS:
>> https://lists.apache.org/thread/xlqx6k7g625p38bxxy141wt02d00w2h4
>>
>>   Thanks to everyone who joined.
>>
>>   Prashant
>>
>> On Mon, Apr 27, 2026 at 9:05 AM Prashant Singh <[email protected]>
>> wrote:
>>
>>>   Thank you all for joining the syncs so far!
>>>
>>>   After much discussion and debate, we've narrowed things down to a
>>> final list of 9 predefined actions.
>>>
>>>   Spec update: I updated the spec PR [1] last week - bumping it here as
>>> well. Please take a look when you get a chance!
>>>
>>>   POC progress: I've been prototyping with both a SQL client and a NoSQL
>>> client. The Apache Spark integration [2] fits cleanly. For NoSQL, I'm
>>> working on the Iceberg Generics reader [3] and PyIceberg in parallel.
>>>
>>>   Upcoming sync - *04/28* agenda:
>>>   1. Default mask values per type (I've added some initial proposals to
>>>   kick off the conversation)
>>>   2. Where to put Actions in the Iceberg Java API so that engine
>>>   integration is seamless and reusable (engines are, of course, free to
>>>   implement their own)
>>>
>>>   Past sync notes and recordings are available here [4].
>>>
>>>   Looking forward to seeing everyone at the next sync. Thank you for all
>>> your valuable feedback!
>>>
>>>   Best,
>>>   Prashant Singh
>>>
>>>   [1] https://github.com/apache/iceberg/pull/13879
>>>   [2] https://github.com/apache/iceberg/pull/16082
>>>   [3] https://github.com/apache/iceberg/pull/16131
>>>   [4]
>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>
>>> On Mon, Mar 23, 2026 at 7:22 PM Prashant Singh <[email protected]>
>>> wrote:
>>>
>>>>   Hi all,
>>>>
>>>>   Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.
>>>>
>>>>   Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
>>>>   Design Doc:
>>>> https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit
>>>>
>>>>   *Execution Order (agreed)*
>>>>   1. Authorization predicates and row filters on unmasked data
>>>>   2. Column masks applied
>>>>   3. User query filters on masked data
>>>>
>>>>   This prevents point attacks where users craft filters to deduce
>>>>   masked values. If an engine pushes user filters down ahead of masking,
>>>>   it is the engine's responsibility to do so correctly and not leave
>>>>   this open to exploitation.
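
A rough Python sketch of the agreed execution order, with rows modeled as plain dicts. The function name, parameters, and sample data are hypothetical; a real engine would do this inside its query planner rather than over in-memory lists.

```python
def read_with_restrictions(rows, row_filter, column_masks, user_filter):
    """Apply restrictions in the agreed order (all names are illustrative)."""
    # 1. Authorization predicates / row filters see the unmasked data.
    allowed = [row for row in rows if row_filter(row)]

    # 2. Column masks are applied; columns without a mask pass through.
    masked = [
        {col: column_masks.get(col, lambda v: v)(val) for col, val in row.items()}
        for row in allowed
    ]

    # 3. The user's own query filter only ever sees masked values.
    return [row for row in masked if user_filter(row)]


rows = [
    {"region": "US", "ssn": "123-45-6789"},
    {"region": "EU", "ssn": "987-65-4321"},
]
only_us = lambda row: row["region"] == "US"
masks = {"ssn": lambda v: "XXX-XX-XXXX"}

# A "point attack" filter on the real SSN matches nothing once masking ran first.
probe = lambda row: row["ssn"] == "123-45-6789"
print(read_with_restrictions(rows, only_us, masks, probe))  # []
```

If step 3 ran before step 2, the probe filter would return one row and confirm that the guessed SSN exists, which is the leak the ordering is meant to prevent.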
>>>>
>>>>   *Masking Functions*
>>>>   - Mask to Default — type-specific constant values; preserves schema
>>>> shape for downstream BI tools. Essential for non-nullable columns where
>>>> nulls would break
>>>>   joins or engine operations.
>>>>   - Replace with Null — kept as a separate option for optional columns.
>>>> Serves a different policy intent than mask-to-default.
>>>>   - Alphanumeric Masking — preserves punctuation, redacts
>>>> letters/numbers with Xs.
>>>>   - Show First/Last Four — partial visibility for identifiers like
>>>> SSNs. For short strings (<4 chars), the team favored a "dumb mask, smarter
>>>> admin" approach -
>>>>   admins should pick appropriate masks rather than building complex
>>>> padding/hashing into the function itself.
>>>>   - Date Truncation — truncate to year or month (day/month replaced
>>>> with 01). Standard and undisputed.
>>>>   - SHA-256 Hashing — two approaches discussed, both are needed:
>>>>     - Query-local (random salt): allows joins within a single query but
>>>> not across sessions.
>>>>     - Global stable hash: consistent across sessions for semantic
>>>> layering.
>>>>     - We plan to continue discussing this in upcoming syncs.
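
Two of the functions above are simple enough to sketch in Python: date truncation and the two SHA-256 flavors. Everything here is an assumption for illustration (function and class names, the 16-byte salt, prepending the salt to the UTF-8 value); the thread does not fix any of these details, and the hashing design was still under discussion.

```python
import datetime
import hashlib
import os


def truncate_to_year(d):
    """Month and day replaced with 01; NULL stays NULL."""
    return None if d is None else d.replace(month=1, day=1)


def truncate_to_month(d):
    """Day replaced with 01; NULL stays NULL."""
    return None if d is None else d.replace(day=1)


def sha256_hex(value, salt=b""):
    """Global stable hash when salt is empty: same input, same output, always."""
    if value is None:
        return None
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


class QueryLocalHasher:
    """One random salt per query: joinable within the query, not across queries."""

    def __init__(self):
        self._salt = os.urandom(16)  # salt length is an arbitrary choice here

    def __call__(self, value):
        return sha256_hex(value, self._salt)


query_a, query_b = QueryLocalHasher(), QueryLocalHasher()
print(query_a("alice") == query_a("alice"))  # True: joins work inside one query
print(query_a("alice") == query_b("alice"))  # False (with overwhelming probability)
print(truncate_to_month(datetime.date(2026, 3, 17)))  # 2026-03-01
```

The unsalted variant gives the cross-session stability that semantic layers need. The per-query salt trades that away so hashed values cannot be correlated between sessions.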
>>>>
>>>>   *Action Items*
>>>>   - I will research how Apache Ranger and Oracle handle short-string
>>>> masking before finalizing the spec
>>>>   - Check with the BigQuery team on why some of their masking behaviors
>>>> are the way they are (Thanks Talat)
>>>>
>>>>   Looking forward to seeing you all in the next sync!
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>> On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]>
>>>> wrote:
>>>>
>>>>> Thank you, everyone, for joining the call!
>>>>> Please find the recording at [1].
>>>>> At a high level, we discussed the following:
>>>>> - *Deny list vs. allow list*:
>>>>> What should the client assume if a given column is not part of the
>>>>> required column projection: is it allowed to see that column or not?
>>>>> The consensus seemed to be to use *DENY* as the representation,
>>>>> considering that an allow list can be huge for a very wide table. This
>>>>> does not dictate what a catalog stores when defining its policy; some
>>>>> catalogs have both ALLOW and DENY.
>>>>> Essentially, the DENY list is what a client *should* expect when
>>>>> consuming policy evaluation results.
>>>>> Note: *DENY* is generally not recommended, since it can cause issues,
>>>>> for example when a column is added and a user automatically gets access
>>>>> to it. In this case, however, the policy evaluation results are coupled
>>>>> with the loadTable request, so we compute the *DENY* list against the
>>>>> latest schema present at the time the table was loaded. Any new column
>>>>> added to the schema creates a new Iceberg schema, and clients will not
>>>>> have access to it.
>>>>> I will update the PR soon with this recommendation (I request you all
>>>>> to please participate).
>>>>>
>>>>> - *Why Policy Evaluation over Policy Exchange*: We discussed this
>>>>> briefly and touched on why the community has been considering this
>>>>> approach, mostly because of the multitude of policy definitions and
>>>>> dialects out there. It is analogous to vended credentials, which are
>>>>> issued based on the grants the user has, and it defines clear
>>>>> instructions in a portable way that can be enforced across engines.
>>>>>
>>>>> - *Predefined masks over dynamic masks*: The spec is trying to have a
>>>>> set of predefined actions, mostly inspired by Apache Ranger. There was
>>>>> a discussion / debate around this, and there seemed to be support for
>>>>> having both rather than choosing one of them, especially for masks such
>>>>> as nullify / hash etc.
>>>>>
>>>>> - *Expression Expansion*: Extending Iceberg expressions to be more than
>>>>> predicates, including UDF references (the Iceberg UDF spec was ratified
>>>>> recently). Ryan said he will take a look at it soon (thank you so
>>>>> much!). We also debated dialects etc. from a UDF point of view.
>>>>>
>>>>> We plan to keep this discussion going. I see some new feedback on the
>>>>> spec PR [2]; I will address it and add it to the agenda for further
>>>>> discussion!
>>>>>
>>>>> [1] https://www.youtube.com/watch?v=_wKszzNtP48
>>>>> [2]
>>>>> https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>>>>>
>>>>> Best,
>>>>> Prashant Singh
>>>>>
>>>>> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Bumping the thread ^^
>>>>>>
>>>>>> Looking forward to seeing you all tomorrow
>>>>>> Meeting details: Tuesday, Feb 3⋅9:00 – 10:00am Pacific (recurring
>>>>>> biweekly): https://meet.google.com/gwy-jxos-jif
>>>>>>
>>>>>> I proactively added some comments from the spec PR to the agenda:
>>>>>> https://github.com/apache/iceberg/pull/13879
>>>>>>
>>>>>> Best,
>>>>>> Prashant Singh
>>>>>>
>>>>>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Having the Iceberg REST catalog return policy evaluation results for
>>>>>>> fine-grained access control enforcement has been discussed a couple of
>>>>>>> times in the past, as well as recently in the community. We pretty much
>>>>>>> have broad agreement on what we want to do at a high level, but there
>>>>>>> are still some open questions and details to hash out before the spec
>>>>>>> can be ratified [1].
>>>>>>>
>>>>>>> I wanted to propose a dedicated sync for discussing and closing these.
>>>>>>> The time slot we got (thanks, Steven) is:
>>>>>>>
>>>>>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST)*. You can see it
>>>>>>> in your dev event calendar if you subscribe to "Iceberg Dev Events".
>>>>>>>
>>>>>>> Please do join. We will record the sync and capture notes in the
>>>>>>> doc [2].
>>>>>>>
>>>>>>> [1] https://github.com/apache/iceberg/pull/13879
>>>>>>> [2]
>>>>>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>>>>>
>>>>>>> Best,
>>>>>>> Prashant Singh
>>>>>>>
