laurentgo commented on code in PR #13879:
URL: https://github.com/apache/iceberg/pull/13879#discussion_r2450351928
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
Review Comment:
(nit) isn't it `truncate[4](cc)`?
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
Review Comment:
It could be more than that (for example, could be based on the environment
as well). Should we just summarize it as the authorization context?
I guess it also implies that caching may be impacted as well:
- for example `Etag` should be different for each response
- `Vary` header should be set by the server to include `Authorization` (or
any other header related to authorization) and clients supporting caching
should check
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
+ and truncate(3, col)) MUST NOT appear in the required-projection.
+
+ 5. If a projection entry includes a transform that the reader
cannot evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 6. An identity transform is equivalent to projecting the column
directly.
+ A reader MAY represent it in either form.
+
+ 7. The data type of the projected column MUST match the data type
defined for the transform result.
+
+ type: array
+ items:
+ $ref: '#/components/schemas/Term'
Review Comment:
I would favor a more explicit structure to associate a column id/name with a
term (can still be a list of structures) over just a list of `Term`.
The current proposal means that the column whose projection is being applied
to needs to be inferred from the expression instead of being explicit, which
probably explain some of those rules/restriction above like the fact that a
column name can only appear once for all expressions.
It also means that complex expressions (like calling UDF) may be quite
limited (if for example a UDF uses multiple columns in order to compute the
appropriate transformation).
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
Review Comment:
This could be interpreted as ambiguous because an empty list could either
mean no projection or that all columns are not readable (according to 1. ->
"Columns not listed in the required-projection MUST NOT be read.")
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,74 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
+ and truncate(3, col)) MUST NOT appear in the required-projection.
+
+ 5. If a projection entry includes a transform that the reader
cannot evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 6. An identity transform is equivalent to projecting the column
directly.
+ A reader MAY represent it in either form.
+
+ 7. The data type of the projected column MUST match the data type
defined for the transform result.
+
+ type: array
+ items:
+ $ref: '#/components/schemas/Term'
+ required-row-filter:
+ description: >
+ An expression that filters rows in the table.
+
+ 1. A reader MUST discard any row for which the filter evaluates to
false, and
Review Comment:
If we apply the same standard as SQL, the resulting collection is that the
collection of rows for which the filter expression returns `TRUE` (and so it
means in practice that if the expression returns `FALSE` or `null`, row is
being discarded).
Maybe phrased this way, we do not need to really discuss nullability?
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
Review Comment:
I'm trying to understand how this is different from 2. and 4.? Do you have
an example in mind?
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
Review Comment:
I would agree with @adutra that `required-projections` (with an s) seems
more idiomatic as the representation is an array (and the description says `a
list of projections` as well. Although it is kind of implied by the term
"projection", I miss the column mention in the field name (especially because
we have `required-row-filter` vs `required-filter`). Would you be open to use
`required-column-projections` then? or just `column-projections` and
`row-filter` (considering the fact that since this is the `ReadRestriction`
object, everything is required?)
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
Review Comment:
It's actually not quite accurate as row filter should be applied before
column projection, so if `cc` is present in the row filter, the expression
should be on the raw value, not on the transformed one.
(There's a mention about it down below but I believe we should avoid
appearances of contradiction if we can avoid it)
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,74 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
+ and truncate(3, col)) MUST NOT appear in the required-projection.
+
+ 5. If a projection entry includes a transform that the reader
cannot evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 6. An identity transform is equivalent to projecting the column
directly.
+ A reader MAY represent it in either form.
+
+ 7. The data type of the projected column MUST match the data type
defined for the transform result.
Review Comment:
Rules for implicit conversions are tricky (and may be engine specific) so I
would be okay with strict type matching
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
Review Comment:
Not sure why we need such a restriction (except if the reason is because of
the chosen representation)
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
+ and truncate(3, col)) MUST NOT appear in the required-projection.
+
+ 5. If a projection entry includes a transform that the reader
cannot evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 6. An identity transform is equivalent to projecting the column
directly.
+ A reader MAY represent it in either form.
+
+ 7. The data type of the projected column MUST match the data type
defined for the transform result.
+
+ type: array
+ items:
+ $ref: '#/components/schemas/Term'
+ required-row-filter:
+ description: >
+ An expression that filters rows in the table.
+
+ 1. A reader MUST discard any row for which the filter evaluates to
false, and
+ no information derived from discarded rows MAY be included in
the query result.
+
+ 2. Row filters MUST be evaluated against the original,
untransformed column values.
+ Required projections MUST be applied only after row filters are
applied.
+
+ 3. If the catalog supports multiple row access filters for the
table, it is
Review Comment:
not sure if this is necessary considering only only expression can be
returned?
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
Review Comment:
It may be "obvious" but it may be worth mentioning that the expressions are
to be evaluated in the context of this table (meaning that any reference
present in each `Term` instance should be interpreted as a column name from the
table) so that there's no ambiguity about it (if we want to be formal about it)
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3260,6 +3260,71 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projection and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projection:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent or empty, no
mandatory projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate(4, cc) MUST be projected as truncate(4,
cc) AS cc,
+ and all references to cc during query evaluation MUST resolve
to this alias).
+ - If a listed column has no transform, the reader MUST read it
as-is.
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. A projection entry MUST reference either the column itself or
exactly one
+ transformed version of the column, but not both.
+
+ 4. Multiple transformed versions of the same column (e.g.,
truncate(5, col)
+ and truncate(3, col)) MUST NOT appear in the required-projection.
+
+ 5. If a projection entry includes a transform that the reader
cannot evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 6. An identity transform is equivalent to projecting the column
directly.
+ A reader MAY represent it in either form.
+
+ 7. The data type of the projected column MUST match the data type
defined for the transform result.
+
+ type: array
+ items:
+ $ref: '#/components/schemas/Term'
+ required-row-filter:
+ description: >
+ An expression that filters rows in the table.
Review Comment:
It may be "obvious" but it may be worth mentioning that the expression is to
be evaluated in the context of this table (meaning that any reference present
in the expression should be interpreted as a column name from the table) so
that there's no ambiguity about it (if we want to be formal about it)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]