rdblue commented on code in PR #13879:
URL: https://github.com/apache/iceberg/pull/13879#discussion_r3399534956


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3632,6 +3632,343 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row filter expressions.
+
+          A client SHOULD support the read-restrictions field. If a client supports
+          read-restrictions, it MUST fail if it cannot apply any returned restriction
+          (including unrecognized action or expression types). Read restrictions returned
+          in a loadTable response apply to every read operation on the loaded table
+          performed using this response, including subsequent planTableScan and
+          fetchScanTasks calls.
+
+          In this section, "reader" refers to the read-side actor that applies restrictions
+          per row or per column. "Engine" refers to the broader query-execution context
+          that defines query lifetime and scope (e.g. a SQL session, a single PyIceberg
+          scan), and is the actor responsible for query-scoped behavior such as salt
+          generation in sha-256-query-local.
+
+          These restrictions apply only to the authenticated principal, user, or account
+          associated with the request. They MUST NOT be interpreted as global policy and
+          MUST NOT be applied beyond the entity identified by the Authentication header
+          (or other applicable authentication mechanism).
+
+          An empty ReadRestrictions object (no required-column-projections and no
+          required-row-filter) imposes no restrictions and is equivalent to the field
+          being absent from the response.
+          A server MUST NOT return an action for a column whose type is not listed in
+          that action's "Applicable to" set.
+
+          NULL handling is action-specific. Each action's description specifies its
+          behavior on NULL input.
+
+          If a column projection targets a struct-typed field, other column projections
+          in the same ReadRestrictions MUST NOT target any of that struct's subfields
+          (at any depth). This avoids ambiguity about which action governs a given
+          leaf value.
+
+          Example:
+
+            {
+              "required-column-projections": [
+                { "field-id": 4, "action": "show-last-4" },
+                { "field-id": 6, "action": "replace-with-null" },
+                { "field-id": 8, "action": "truncate-to-year" },
+                { "field-id": 10, "action": "sha-256-global" },
+                { "field-id": 12, "action": "mask-alphanum" }
+              ],
+              "required-row-filter": {
+                "type": "eq",
+                "term": "region",
+                "value": "US"
+              }
+            }
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when reading.
+
+            If this property is absent, a reader MAY access all columns of the table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column with the result
+              of the action, presenting the result under the original column name. For
+              example, if the action for column cc is mask-alphanum, the reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader (including
+              unrecognized action types), the reader MUST fail the query with an error to
+              the caller. The reader MUST NOT silently return raw, partial, or empty
+              results to mask the failure.
+
+            6. Each action defines the output type for its column. For all predefined
+              actions, the output type matches the input column type.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean (TRUE or FALSE; Iceberg expressions
+              never produce NULL). A reader MUST discard any row for which
+              the filter evaluates to FALSE, and no information derived from discarded rows
+              MAY be included in the query result.
+
+            2. Row filters MUST be evaluated against the original, untransformed column values.
+              Required projections MUST be applied only after row filters are applied.
+
+            3. If a reader cannot interpret or evaluate a provided filter expression, it MUST
+              fail the query with an error to the caller. The reader MUST NOT silently return
+              partial, raw, or empty results to mask the failure.
+
+            4. If this property is absent, null, or always true then no mandatory filtering is required.
+          allOf:
+            - $ref: '#/components/schemas/Expression'
+
+    Action:
+      discriminator:
+        propertyName: action
+        mapping:
+          mask-alphanum: '#/components/schemas/MaskAlphanum'
+          mask-to-fixed-value: '#/components/schemas/MaskToFixedValue'
+          replace-with-null: '#/components/schemas/ReplaceWithNull'
+          show-first-4: '#/components/schemas/ShowFirst4'
+          show-last-4: '#/components/schemas/ShowLast4'
+          truncate-to-year: '#/components/schemas/TruncateToYear'
+          truncate-to-month: '#/components/schemas/TruncateToMonth'
+          sha-256-global: '#/components/schemas/Sha256Global'
+          sha-256-query-local: '#/components/schemas/Sha256QueryLocal'
+      type: object
+      required:
+        - action
+        - field-id
+      properties:
+        action:
+          type: string
+        field-id:
+          type: integer
+          description: Field ID of the column being projected.
+
+    MaskAlphanum:
+      description: >
+        Redacts the column value Unicode code point by code point using the following rules:
+
+        - Digits (U+0030–U+0039, 0-9) are replaced with 'n'
+        - The following punctuation characters are kept as-is:
+            U+0028 '('  LEFT PARENTHESIS
+            U+0029 ')'  RIGHT PARENTHESIS
+            U+002C ','  COMMA
+            U+002E '.'  FULL STOP
+            U+002D '-'  HYPHEN-MINUS
+            U+0040 '@'  COMMERCIAL AT
+        - All other Unicode characters (including letters, whitespace, and any punctuation
+          not listed above) are replaced with 'x'
+
+        For example: "[email protected]" → "[email protected]"
+
+        NULL input is preserved (NULL → NULL).
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-alphanum"
+
+    MaskToFixedValue:
+      description: >
+        Replaces the column value with a predefined type-specific fixed value.
+        Readers MUST use exactly the values listed below to ensure consistency
+        across implementations.
+
+        Fixed values by type:
+        - boolean: false
+        - int: 0
+        - long: 0
+        - float: 0.0
+        - double: 0.0
+        - decimal(p, s): 0 (zero with s digits after the decimal point, e.g. 0.00 for decimal(p,2))
+        - string: "XXXXXXXX"
+        - date: 1970-01-01
+        - time: 00:00:00
+        - timestamp: 1970-01-01T00:00:00
+        - timestamptz: 1970-01-01T00:00:00+00:00
+        - timestamp_ns: 1970-01-01T00:00:00.000000000
+        - timestamptz_ns: 1970-01-01T00:00:00.000000000+00:00
+        - uuid: 00000000-0000-0000-0000-000000000000
+        - fixed(n): n zero bytes
+        - binary: empty byte sequence
+        - variant: {}
+        - geometry: POINT EMPTY
+        - geography: POINT EMPTY
+        - list: empty list []
+        - map: empty map {}
+        - struct: struct with each field set to its type-specific default (applied recursively)
+
+        NULL input is also replaced with the type-specific fixed value; NULL is not preserved.
+
+        Applicable to: all data types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-to-fixed-value"
+
+    ReplaceWithNull:
+      description: >
+        Replaces the entire column value with NULL. NULL input is preserved (NULL → NULL).
+        A server MUST NOT return this action for a non-nullable (required) column.
+
+        Applicable to: all nullable types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "replace-with-null"
+
+    ShowFirst4:
+      description: >
+        Preserves the first 4 Unicode code points of the column value and redacts the remainder
+        using mask-alphanum rules (see MaskAlphanum for the exact character rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "[email protected]" → "[email protected]"
+
+        NULL input is preserved (NULL → NULL).
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-first-4"
+
+    ShowLast4:
+      description: >
+        Redacts all Unicode code points except the last 4 using mask-alphanum rules
+        (see MaskAlphanum for the exact character rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "4111-1111-1111-4444" → "nnnn-nnnn-nnnn-4444"
+
+        NULL input is preserved (NULL → NULL).
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-last-4"
+
+    TruncateToYear:
+      description: >
+        Truncates the column value to year precision, setting month, day, and time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-01-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        NULL input is preserved (NULL → NULL).
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-year"
+
+    TruncateToMonth:
+      description: >
+        Truncates the column value to year and month precision, setting day and time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-07-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        NULL input is preserved (NULL → NULL).
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-month"
+
+    Sha256Global:
+      description: |
+        Applies SHA-256 as specified in NIST FIPS 180-4. Deterministic across all queries

Review Comment:
   Is there value to citing the NIST FIPS section? I think most people know what SHA-256 means.
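
   For readers following the quoted hunk, the string-masking rules it describes (mask-alphanum, show-first-4, show-last-4) can be sketched in Python as below. This is only an illustrative reading of the quoted descriptions, not code from the PR: the function names and the KEPT_PUNCTUATION constant are hypothetical, and Python's None stands in for NULL.

```python
# Hypothetical sketch of the masking rules quoted above; names are not from the PR.

# Punctuation the quoted MaskAlphanum description says is kept as-is.
KEPT_PUNCTUATION = frozenset("(),.-@")


def mask_alphanum(value):
    """Redact a string code point by code point; None (NULL) is preserved."""
    if value is None:
        return None
    out = []
    for ch in value:  # iterating a Python str yields Unicode code points
        if "0" <= ch <= "9":
            out.append("n")  # ASCII digits U+0030..U+0039 become 'n'
        elif ch in KEPT_PUNCTUATION:
            out.append(ch)  # listed punctuation is kept
        else:
            out.append("x")  # everything else becomes 'x'
    return "".join(out)


def show_first_4(value):
    """Keep the first 4 code points and mask the remainder."""
    if value is None or len(value) <= 4:
        return value  # NULL and short values are returned unchanged
    return value[:4] + mask_alphanum(value[4:])


def show_last_4(value):
    """Mask everything except the last 4 code points."""
    if value is None or len(value) <= 4:
        return value  # NULL and short values are returned unchanged
    return mask_alphanum(value[:-4]) + value[-4:]
```

   With this reading, show_last_4("4111-1111-1111-4444") gives "nnnn-nnnn-nnnn-4444", which matches the example in the quoted ShowLast4 description.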



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
