Thank you all for joining the sync today (06/09). Summarizing the sync for folks who couldn't attend (AI assisted).
Recording: https://youtu.be/GUhCr-PnSo8

Where the room landed:

1. *Capabilities header* - leaning drop. It was effectively a versioning workaround, and as a security mechanism it's spoofable. Trust must be established out-of-band (mTLS / OAuth / on-behalf-of), not via a client-set header.
2. *V1 carries read-restrictions now, with conditional wording*. Adjusted the spec - please have a look. Field stays optional. Let implementations move without waiting on V2.
3. *V2 endpoint wording*. V2 would tighten this to an unconditional MUST - opting into V2 = committing to parse read-restrictions and fail-closed on anything unknown.
4. Two contracts, kept separate:
   - Correctness contract (in spec): parse + fail-on-unknown.
   - Trust contract (out of spec): admin/catalog decides which clients are non-malicious and wired up to enforce.
   Both are required; neither replaces the other.
5. *No V2 capability header for now*. A V2-side header to signal "I can enforce" was discussed for fail-fast, but the case felt narrow. Open to revisiting.

Open items rolling into the broader V2 conversation:
- Optional fail-fast capability header.
- V4 table support (null vs empty snapshot list).

*Call for review:*
- Ryan's expressions-spec PR (read-restrictions has a dependency on it): https://github.com/apache/iceberg/pull/16652
- Read restrictions PR - updated the wording: https://github.com/apache/iceberg/pull/13879

Best,
Prashant Singh

On Wed, May 27, 2026 at 11:57 PM Prashant Singh <[email protected]> wrote:

> Hi all,
> Quick summary from yesterday's community sync for those who couldn't attend.
> Recording: https://youtu.be/-KEesN1udyY
>
> 1. *Client Capabilities Header (PR #16394) <https://github.com/apache/iceberg/pull/16394>*
> Re-discussed the generic vs. per-feature header debate from the 05/12 sync. Rough consensus leaned toward a single generic X-Iceberg-Client-Capabilities header, with these clarifications:
> - *Advisory, not authoritative*. Servers MUST NOT use the header for trust or authorization decisions.
>   Trust is established out-of-band (mTLS / OAuth / engine identity).
> - *No versioning at the header level yet*. The fail-closed contract ("client MUST fail on unrecognized payload contents") handles forward compatibility. New incompatible behavior would be a new capability token, not a version suffix.
> - *No change to existing per-request directives*. X-Iceberg-Access-Delegation stays as-is for vended-credentials vs. remote-signing selection. Capabilities and per-request preferences are kept as separate mechanisms.
>
> Dan was absent for the final stretch, following up separately on the list before revising the PR.
>
> 2. *Expression Spec Enhancement*
> ID-based field references (needed for unbounded row filter expressions in read restrictions) will move forward. Spec write-up to follow. In the meantime, the read-restrictions spec PR will reference the planned change.
>
> 3. *Read Restrictions Spec - Call for Review*
> The spec PR (#13879) <https://github.com/apache/iceberg/pull/13879> is ready for another review pass. It will be updated with the expression spec reference once that write-up lands, but the rest of the content is in shape for review now. Would appreciate eyes on it.
>
> 4. *API for Action functions (PR #16198 <https://github.com/apache/iceberg/pull/16198>) - Call for Review*
> Generic functions module + actions wrapper is in the PR with end-to-end plumbing.
>
> Thanks,
> Prashant
>
> On Mon, May 18, 2026 at 12:01 PM Prashant Singh <[email protected]> wrote:
>
>> Hi all,
>>
>> Sharing a summary of the Iceberg Read Restrictions sync on May 12, 2026, for folks who couldn't attend. (As always, syncs are for discussion only.)
>> Recording: https://youtu.be/b9p6mI-k-0I
>>
>> Topics discussed
>>
>> 1. NULL handling for mask-to-default
>>
>> Question: should mask-to-default preserve NULL inputs (NULL → NULL) or replace them with the type-specific default (NULL → 0 / "" / epoch / etc.)?
>> The direction in the room was to NOT preserve NULL - preserving leaks the existence of NULL, which can itself be sensitive information. The other actions (replace-with-null, mask-alphanum, show-first-4 / show-last-4, sha-256 variants, truncate-to-year / truncate-to-month) keep their natural NULL-preserving behavior; mask-to-default is the exception. This is reflected in the most recent push to PR #13879.
>>
>> 2. Older clients without read-restriction support
>>
>> How should the spec handle clients that don't understand the read-restrictions field returned by loadTable? Direction: introduce a generic client-capability header (X-Iceberg-Client-Capabilities) as a forward-compat signal, separate from per-request signals like X-Iceberg-Access-Delegation. Trust establishment between client and catalog stays out of scope - it is an operator/catalog-implementation concern, not a spec concern.
>>
>> A follow-up PR (#16394) has been raised to add the header and the parameter component; a separate [DISCUSS] thread is being raised in parallel.
>>
>> 3. Identity propagation (Trino, Spark, etc.)
>>
>> Surfaced briefly. Folks acknowledged identity propagation across multi-tenant query engines is a real problem, but it's orthogonal to the spec - it's a catalog / auth-manager concern. Not in scope for #13879.
>>
>> 4. Actions API placement
>>
>> Continued discussion on where action definitions live in the Java API. Direction: ship as utility functions in the api module, mirroring the existing Transform pattern. Don't expose a top-level "Action" type in the public Java API - keep "Action" as a REST-spec construct only.
>>
>>
>> Follow-ups
>>
>> - PR #13879 (FGAC read restrictions): incorporated the discussion above; will continue iterating on reviewer feedback as it comes in (please take another look).
>> - PR #16394 (X-Iceberg-Client-Capabilities header): raised, [DISCUSS] thread incoming.
>> - Next sync: bi-weekly cadence - see the sync notes doc for date.
>>
>> Links
>>
>> - Recording: https://youtu.be/b9p6mI-k-0I
>> - Sync notes doc: https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit
>> - PR #13879: https://github.com/apache/iceberg/pull/13879
>> - PR #16394: https://github.com/apache/iceberg/pull/16394
>> - DISCUSS: https://lists.apache.org/thread/xlqx6k7g625p38bxxy141wt02d00w2h4
>>
>> Thanks to everyone who joined.
>>
>> Prashant
>>
>> On Mon, Apr 27, 2026 at 9:05 AM Prashant Singh <[email protected]> wrote:
>>
>>> Thank you all for joining the syncs so far!
>>>
>>> After much discussion and debate, we've narrowed things down to a final list of 9 predefined actions.
>>>
>>> Spec update: I updated the spec PR [1] last week - bumping it here as well. Please take a look when you get a chance!
>>>
>>> POC progress: I've been prototyping with both a SQL client and a NoSQL client. The Apache Spark integration [2] fits cleanly. For NoSQL, I'm working on the Iceberg Generics reader [3] and py-iceberg in parallel.
>>>
>>> Upcoming sync - *04/28* agenda:
>>> 1. Default mask values per type (I've added some initial proposals to kick off the conversation)
>>> 2. How and where to put Actions in Iceberg Java so that engine integration is seamless and reusable (of course, engines are free to implement their own)
>>>
>>> Past sync notes and recordings are available here [4].
>>>
>>> Looking forward to seeing everyone at the next sync. Thank you for all your valuable feedback!
>>>
>>> Best,
>>> Prashant Singh
>>>
>>> [1] https://github.com/apache/iceberg/pull/13879
>>> [2] https://github.com/apache/iceberg/pull/16082
>>> [3] https://github.com/apache/iceberg/pull/16131
>>> [4] https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>
>>> On Mon, Mar 23, 2026 at 7:22 PM Prashant Singh <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.
>>>>
>>>> Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
>>>> Design Doc: https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit
>>>>
>>>> *Execution Order (agreed)*
>>>> 1. Authorization predicates and row filters on unmasked data
>>>> 2. Column masks applied
>>>> 3. User query filters on masked data
>>>>
>>>> This prevents point attacks where users craft filters to deduce masked values. If user filters get pushed down before masking, it is the engine's responsibility to handle that correctly and not leave it open to exploitation.
>>>>
>>>> *Masking Functions*
>>>> - Mask to Default - type-specific constant values; preserves schema shape for downstream BI tools. Essential for non-nullable columns where nulls would break joins or engine operations.
>>>> - Replace with Null - kept as a separate option for optional columns. Serves a different policy intent than mask-to-default.
>>>> - Alphanumeric Masking - preserves punctuation, redacts letters/numbers with Xs.
>>>> - Show First/Last Four - partial visibility for identifiers like SSNs. For short strings (<4 chars), the team favored a "dumb mask, smarter admin" approach - admins should pick appropriate masks rather than building complex padding/hashing into the function itself.
>>>> - Date Truncation - truncate to year or month (day/month replaced with 01). Standard and undisputed.
>>>> - SHA-256 Hashing - two approaches discussed, both are needed:
>>>>   - Query-local (random salt): allows joins within a single query but not across sessions.
>>>>   - Global stable hash: consistent across sessions for semantic layering.
>>>>   - We plan to continue discussing this in upcoming syncs.
>>>>
>>>> *Action Items*
>>>> - I will research how Apache Ranger and Oracle handle short-string masking before finalizing the spec
>>>> - Check with the BigQuery team on why some of their masking behaviors are the way they are (Thanks Talat)
>>>>
>>>> Looking forward to seeing you all in the next sync!
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>> On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]> wrote:
>>>>
>>>>> Thank you everyone for joining the call!
>>>>> Please find the recording attached [1].
>>>>> At a high level, we discussed the following:
>>>>> - *Deny list vs allow list*: what does the client assume if a given column is not part of the required column projection: is it allowed to see that column or not?
>>>>> The consensus seemed to be to use *DENY* as the representation, considering the allowlist can be huge for a very wide table. This does not dictate what a catalog should store when defining its policy; some catalogs have both ALLOW and DENY.
>>>>> Essentially, the DENY list is what a client *should* expect when consuming policy evaluation results.
>>>>> Note: *DENY* is generally not recommended, since it can cause issues, for example a column being added and the user automatically getting access to it. In this case, though, the policy evaluation results are coupled with the loadTable request, so we compute the *DENY* list against the latest schema that was present at the time the table was loaded. Any new column added to the schema will create a new Iceberg schema, and clients will not have access to it.
>>>>> I will update the PR soon with this recommendation (I request you all to please participate).
>>>>>
>>>>> - *Why Policy Evaluation over Policy Exchange*: we discussed this for a bit and touched on why the community has been considering this approach, mostly due to the multitude of policy definitions / dialects out there. It is equivalent to vended creds, which are issued based on the grants the user has, and it defines clear instructions in a portable way to be enforced across engines.
>>>>>
>>>>> - *Predefined masks over dynamic masks*: the spec is trying to have a set of predefined actions, mostly inspired by Apache Ranger. There was a discussion / debate around it, and there seemed to be support for having both rather than choosing one of them, especially for masks such as nullify / hash etc.
>>>>>
>>>>> - *Expression Expansion*: Iceberg expressions to be more than predicates, expanded to have UDF references (the Iceberg UDF spec got ratified recently). Ryan said he will take a look at it soon (thank you so much!). We also debated dialects etc. from a UDF point of view.
>>>>>
>>>>> We plan to keep this discussion going. I see some new feedback on the spec PR [2]; I will address it and add it to the agenda for further discussion!
>>>>>
>>>>> [1] https://www.youtube.com/watch?v=_wKszzNtP48
>>>>> [2] https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>>>>>
>>>>> Best,
>>>>> Prashant Singh
>>>>>
>>>>> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <[email protected]> wrote:
>>>>>
>>>>>> Bumping the thread ^^
>>>>>>
>>>>>> Looking forward to seeing you all tomorrow.
>>>>>> Meeting details: Tuesday, Feb 3, 9:00 - 10:00am Pacific (recurring biweekly): https://meet.google.com/gwy-jxos-jif
>>>>>>
>>>>>> I proactively added some comments from the spec PR to the agenda: https://github.com/apache/iceberg/pull/13879
>>>>>>
>>>>>> Best,
>>>>>> Prashant Singh
>>>>>>
>>>>>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> The Iceberg REST catalog returning policy evaluation results for fine-grained access control enforcement has been discussed a couple of times in the past, as well as recently in the community. We pretty much have broad agreement on what we want to do at a high level, but there are still some open questions and details to hash out for the spec to get ratified [1].
>>>>>>>
>>>>>>> I wanted to propose a dedicated sync for discussing and closing these. The time slot we got was (thanks Steven):
>>>>>>>
>>>>>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST)*. You can see it in your dev event calendar if you subscribe to "Iceberg Dev Events".
>>>>>>>
>>>>>>> Please do join; we will keep the sync recorded and capture notes in the doc [2].
>>>>>>>
>>>>>>> [1] https://github.com/apache/iceberg/pull/13879
>>>>>>> [2] https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>>>>>
>>>>>>> Best,
>>>>>>> Prashant Singh
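
For readers catching up on the thread, here is a minimal sketch of the execution order agreed in the 03/17 sync (row filters on unmasked data, then column masks, then user filters on masked data), together with a few of the masks discussed. Everything here is illustrative: the function names, the `TYPE_DEFAULTS` values, and the row/dict representation are assumptions made for this example and are not taken from the spec PR or the Java API. Short-string handling for show-last-4 was left open in the notes, so the sketch does nothing special for it.

```python
from datetime import date

# Illustrative defaults only; per-type default values were still under discussion.
TYPE_DEFAULTS = {"int": 0, "string": "", "date": date(1970, 1, 1)}


def mask_to_default(value, type_name):
    """Return the type default; NULL is NOT preserved (05/12 direction)."""
    return TYPE_DEFAULTS[type_name]


def mask_alphanum(value):
    """Redact letters and digits with 'X', keep punctuation; NULL stays NULL."""
    if value is None:
        return None
    return "".join("X" if ch.isalnum() else ch for ch in value)


def show_last_4(value):
    """Keep the last four characters, redact the rest; NULL stays NULL."""
    if value is None:
        return None
    return mask_alphanum(value[:-4]) + value[-4:]


def truncate_to_month(value):
    """Replace the day with 01; NULL stays NULL."""
    if value is None:
        return None
    return value.replace(day=1)


def apply_restrictions(rows, row_filter, column_masks, user_filter):
    # 1. Authorization row filter runs on unmasked data.
    allowed = [row for row in rows if row_filter(row)]
    # 2. Column masks are applied.
    masked = [
        {col: (column_masks[col](val) if col in column_masks else val)
         for col, val in row.items()}
        for row in allowed
    ]
    # 3. User query filter only ever sees masked values.
    return [row for row in masked if user_filter(row)]


rows = [
    {"region": "EU", "ssn": "123-45-6789", "signup": date(2024, 5, 17)},
    {"region": "US", "ssn": "987-65-4321", "signup": date(2023, 11, 2)},
]
masks = {"ssn": show_last_4, "signup": truncate_to_month}


def only_eu(row):
    return row["region"] == "EU"


# A "point attack" filter guessing the raw SSN matches nothing after masking.
probe = apply_restrictions(rows, only_eu, masks, lambda r: r["ssn"] == "123-45-6789")
visible = apply_restrictions(rows, only_eu, masks, lambda r: True)
```

Because the user filter runs last, the guess against the raw SSN returns no rows, which is the point-attack protection the ordering is meant to give. An engine that pushes user filters below the masks would need to preserve this behavior on its own.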

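The correctness contract from the 06/09 sync (parse + fail-on-unknown) can be sketched roughly as below. This is only an illustration under stated assumptions: the payload keys (`column-masks`, `field-id`, `action`) and the exception name are hypothetical and not taken from the spec PR. The action names are the ones listed in the 05/12 notes; the sha-256 variant names are not spelled out in the thread, so they are left out here.

```python
class UnsupportedReadRestriction(Exception):
    """Raised when the payload contains anything this client cannot enforce."""


# Action names as listed in the 05/12 notes (sha-256 variants omitted, see above).
KNOWN_ACTIONS = {
    "mask-to-default",
    "replace-with-null",
    "mask-alphanum",
    "show-first-4",
    "show-last-4",
    "truncate-to-year",
    "truncate-to-month",
}

# Hypothetical payload keys, for illustration only.
KNOWN_TOP_LEVEL_KEYS = {"column-masks"}
KNOWN_MASK_KEYS = {"field-id", "action"}


def parse_read_restrictions(payload):
    """Return a list of (field_id, action) pairs, failing closed on anything unknown."""
    if payload is None:
        # The field is optional: no restrictions were returned.
        return []
    unknown = set(payload) - KNOWN_TOP_LEVEL_KEYS
    if unknown:
        raise UnsupportedReadRestriction(f"unknown keys: {sorted(unknown)}")
    masks = []
    for entry in payload.get("column-masks", []):
        if set(entry) != KNOWN_MASK_KEYS:
            raise UnsupportedReadRestriction(f"unexpected mask entry: {sorted(entry)}")
        if entry["action"] not in KNOWN_ACTIONS:
            raise UnsupportedReadRestriction(f"unknown action: {entry['action']}")
        masks.append((entry["field-id"], entry["action"]))
    return masks
```

Note that this covers only the correctness contract. Whether a given client is trusted to enforce what it parses remains the separate, out-of-spec trust contract.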