Hi all,

  Sharing a summary of the Iceberg Read Restrictions sync on May 12, 2026,
  for folks who couldn't attend. (As always, syncs are for discussion only.)
  Recording: https://youtu.be/b9p6mI-k-0I

  Topics discussed

  1. NULL handling for mask-to-default

  Question: should mask-to-default preserve NULL inputs (NULL → NULL) or
  replace them with the type-specific default (NULL → 0 / "" / epoch / etc.)?
  The direction in the room was to NOT preserve NULL - preserving it leaks
  the existence of NULL, which can itself be sensitive information. The other
  actions (replace-with-null, mask-alphanum, show-first-4 / show-last-4,
  sha-256 variants, truncate-to-year / truncate-to-month) keep their natural
  NULL-preserving behavior; mask-to-default is the exception. This is
  reflected in the most recent push to PR #13879.
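
  To make the NULL-handling difference concrete, here is a rough Python
  sketch (not the spec or the PR implementation). The ASSUMED_DEFAULTS table
  and both function names are hypothetical; the real per-type defaults are
  whatever PR #13879 defines.

```python
from datetime import date, datetime, timezone
from typing import Any, Optional

# Hypothetical per-type defaults, following the examples in the summary
# (0 / "" / epoch). The authoritative values live in PR #13879, not here.
ASSUMED_DEFAULTS = {
    "int": 0,
    "string": "",
    "date": date(1970, 1, 1),
    "timestamptz": datetime(1970, 1, 1, tzinfo=timezone.utc),
}


def mask_to_default(value: Any, type_name: str) -> Any:
    """Return the type default for every input, including NULL (None).

    Ignoring the input entirely means a reader cannot tell whether the
    underlying value was NULL.
    """
    return ASSUMED_DEFAULTS[type_name]


def truncate_to_year(value: Optional[date]) -> Optional[date]:
    """NULL-preserving, like the other actions: None in, None out."""
    if value is None:
        return None
    # Month and day are replaced with 01.
    return value.replace(month=1, day=1)
```

  With this sketch, mask_to_default(None, "int") and mask_to_default(42, "int")
  both give 0, while truncate_to_year(None) stays None.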

  2. Older clients without read-restriction support

  How should the spec handle clients that don't understand the
  read-restrictions field returned by loadTable? Direction: introduce a
  generic client-capability header (X-Iceberg-Client-Capabilities) as a
  forward-compat signal, separate from per-request signals like
  X-Iceberg-Access-Delegation. Trust establishment between client and
  catalog stays out of scope - it is an operator/catalog-implementation
  concern, not a spec concern.

  A follow-up PR (#16394) has been raised to add the header and the
  parameter component; a separate [DISCUSS] thread is being raised in
  parallel.
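
  As an illustration only, a catalog-side check for the header could look
  like the Python sketch below. The comma-separated value format and the
  "read-restrictions" capability token are assumptions on my part; the
  actual format is whatever PR #16394 settles on. What the catalog does for
  clients that lack the capability is deliberately left out here.

```python
from typing import Mapping, Set

CAPABILITIES_HEADER = "X-Iceberg-Client-Capabilities"


def parse_capabilities(headers: Mapping[str, str]) -> Set[str]:
    """Return the capability tokens a client advertised.

    Assumes a comma-separated header value (hypothetical format). HTTP
    header names are case-insensitive, so the name is compared in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == CAPABILITIES_HEADER.lower():
            return {tok.strip().lower() for tok in value.split(",") if tok.strip()}
    # Older clients send no header at all.
    return set()


def supports(headers: Mapping[str, str], capability: str) -> bool:
    """True if the client advertised the given capability token."""
    return capability.lower() in parse_capabilities(headers)
```

  For example, supports({}, "read-restrictions") is False for an older
  client that sends no header.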

  3. Identity propagation (Trino, Spark, etc.)

  Surfaced briefly. Folks acknowledged that identity propagation across
  multi-tenant query engines is a real problem, but it's orthogonal to the
  spec - it's a catalog / auth-manager concern. Not in scope for #13879.

  4. Actions API placement

  Continued discussion on where action definitions live in the Java API.
  Direction: ship as utility functions in the api module, mirroring the
  existing Transform pattern. Don't expose a top-level "Action" type in the
  public Java API - keep "Action" as a REST-spec construct only.


  Follow-ups

    - PR #13879 (FGAC read restrictions): incorporates the discussion above;
      I'll continue iterating on reviewer feedback as it comes in (please
      take another look).
    - PR #16394 (X-Iceberg-Client-Capabilities header): raised, [DISCUSS]
      thread incoming.
    - Next sync: bi-weekly cadence - see the sync notes doc for date.

  Links

    - Recording: https://youtu.be/b9p6mI-k-0I
    - Sync notes doc: https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit
    - PR #13879: https://github.com/apache/iceberg/pull/13879
    - PR #16394: https://github.com/apache/iceberg/pull/16394
      - DISCUSS: https://lists.apache.org/thread/xlqx6k7g625p38bxxy141wt02d00w2h4

  Thanks to everyone who joined.

  Prashant

On Mon, Apr 27, 2026 at 9:05 AM Prashant Singh <[email protected]>
wrote:

>   Thank you all for joining the syncs so far!
>
>   After much discussion and debate, we've narrowed things down to a final
> list of 9 predefined actions.
>
>   Spec update: I updated the spec PR [1] last week - bumping it here as
> well. Please take a look when you get a chance!
>
>   POC progress: I've been prototyping with both a SQL client and a NoSQL
> client. The Apache Spark integration [2] fits cleanly. For NoSQL, I'm
> working on the Iceberg Generics reader [3] and PyIceberg in parallel.
>
>   Upcoming sync - *04/28* agenda:
>   1. Default mask values per type (I've added some initial proposals to
> kick off the conversation)
>   2. How and where to put Actions in the Iceberg Java API so that engine
> integration is seamless and reusable (of course, engines are free to
> implement their own)
>
>   Past sync notes and recordings are available here [4].
>
>   Looking forward to seeing everyone at the next sync. Thank you for all
> your valuable feedback!
>
>   Best,
>   Prashant Singh
>
>   [1] https://github.com/apache/iceberg/pull/13879
>   [2] https://github.com/apache/iceberg/pull/16082
>   [3] https://github.com/apache/iceberg/pull/16131
>   [4]
> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>
> On Mon, Mar 23, 2026 at 7:22 PM Prashant Singh <[email protected]>
> wrote:
>
>>   Hi all,
>>
>>   Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.
>>
>>   Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
>>   Design Doc:
>> https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit
>>
>>   *Execution Order (agreed)*
>>   1. Authorization predicates and row filters on unmasked data
>>   2. Column masks applied
>>   3. User query filters on masked data
>>
>>   This prevents point attacks where users craft filters to deduce masked
>> values. If user filters get pushed down before masking, it is the engine's
>> responsibility to handle that correctly so the pushdown cannot be
>> exploited.
>>
>>   *Masking Functions*
>>   - Mask to Default: type-specific constant values; preserves schema
>> shape for downstream BI tools. Essential for non-nullable columns where
>> nulls would break joins or engine operations.
>>   - Replace with Null: kept as a separate option for optional columns.
>> Serves a different policy intent than mask-to-default.
>>   - Alphanumeric Masking: preserves punctuation, redacts letters/numbers
>> with Xs.
>>   - Show First/Last Four: partial visibility for identifiers like SSNs.
>> For short strings (<4 chars), the team favored a "dumb mask, smarter admin"
>> approach: admins should pick appropriate masks rather than building
>> complex padding/hashing into the function itself.
>>   - Date Truncation: truncate to year or month (day/month replaced with
>> 01). Standard and undisputed.
>>   - SHA-256 Hashing: two approaches discussed, both are needed:
>>     - Query-local (random salt): allows joins within a single query but
>> not across sessions.
>>     - Global stable hash: consistent across sessions for semantic
>> layering.
>>     - We plan to continue discussing this in upcoming syncs.
>>
>>   *Action Items*
>>   - I will research how Apache Ranger and Oracle handle short-string
>> masking before finalizing the spec
>>   - Check with the BigQuery team on why some of their masking behaviors
>> are the way they are (Thanks Talat)
>>
>>   Looking forward to seeing you all in the next sync!
>>
>> Best,
>> Prashant Singh
>>
>> On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]>
>> wrote:
>>
>>> Thank you everyone for joining the call!
>>> Please find the recording attached [1].
>>> At a high level, we discussed the following:
>>> - *Deny list vs allow list*:
>>> What does the client assume if a given column is not part of the
>>> required column projection: is it allowed to see that column or not?
>>> The consensus seemed to be to use *DENY* as the representation,
>>> considering the allow list can be huge for a very wide table. This does
>>> not dictate what a catalog should store when defining its policy; some
>>> catalogs have both ALLOW and DENY.
>>> Essentially, the DENY list describes what a client *should* expect
>>> when consuming policy evaluation results.
>>> Note: *DENY* is generally not recommended, since it can cause issues,
>>> for example when a column is added and a user gets access to it
>>> automatically. In this case, however, the policy evaluation results are
>>> coupled with the loadTable request, so we compute the *DENY* list against
>>> the latest schema that was present at load time. Any new column added to
>>> the schema creates a new Iceberg schema, and clients will not have access
>>> to it.
>>> I will update the PR soon with this recommendation (I request you all to
>>> please participate).
>>>
>>> - *Why Policy Evaluation over Policy Exchange*: we discussed this for a
>>> bit and touched on why the community has been considering this approach,
>>> mostly due to the multitude of policy definitions / dialects out there.
>>> It is equivalent to vended credentials, which are issued based on the
>>> grants the user has, and it defines clear instructions in a portable way
>>> to be enforced across engines.
>>> - *Predefined masks over dynamic masks*: The spec is trying to have a
>>> set of predefined actions, mostly inspired by Apache Ranger. There was a
>>> discussion / debate around it, and there seemed to be support for having
>>> both rather than choosing one of them, especially for masks such as
>>> nullify / hash etc.
>>>
>>> - *Expression Expansion*: extending Iceberg expressions to be more than
>>> predicates, including UDF references (the Iceberg UDF spec got ratified
>>> recently). Ryan said he will be taking a look into it soon (thank you so
>>> much!). We also debated dialects etc. from the UDF point of view.
>>>
>>> We plan to keep this discussion going. I see some new feedback on the
>>> spec PR [2]; I will address it and add it to the agenda for further
>>> discussion!
>>>
>>> [1] https://www.youtube.com/watch?v=_wKszzNtP48
>>> [2] https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>>>
>>> Best,
>>> Prashant Singh
>>>
>>> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <[email protected]>
>>> wrote:
>>>
>>>> Bumping the thread ^^
>>>>
>>>> Looking forward to seeing you all tomorrow.
>>>> Meeting details: Tuesday, Feb 3, 9:00 - 10:00am Pacific (recurring
>>>> biweekly): https://meet.google.com/gwy-jxos-jif
>>>>
>>>> I proactively added some comments from the spec PR to the agenda:
>>>> https://github.com/apache/iceberg/pull/13879
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> The Iceberg REST catalog returning policy evaluation results for
>>>>> fine-grained access control enforcement has been discussed a couple of
>>>>> times in the past, as well as recently in the community. We pretty much
>>>>> have broad agreement on what we want to do at a high level, but there
>>>>> are still some open questions and details to hash out for the spec to
>>>>> get ratified [1].
>>>>>
>>>>> I wanted to propose a dedicated sync for discussing these and closing
>>>>> them. The time slot we got was (thanks Steven):
>>>>>
>>>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST)*. You can see
>>>>> it in your dev event calendar if you subscribe to "Iceberg Dev
>>>>> Events".
>>>>>
>>>>> Please do join; we will record the sync and capture notes in
>>>>> the doc [2].
>>>>>
>>>>> [1] https://github.com/apache/iceberg/pull/13879
>>>>> [2]
>>>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>>>
>>>>> Best,
>>>>> Prashant Singh