stevenzwu commented on code in PR #12584:
URL: https://github.com/apache/iceberg/pull/12584#discussion_r3370074048


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -4387,338 +4367,393 @@ components:
           items:
             $ref: '#/components/schemas/TableIdentifier'
 
-    ListFunctionsResponse:
+    ListNamespacesResponse:
       type: object
       properties:
         next-page-token:
           $ref: '#/components/schemas/PageToken'
-        identifiers:
+        namespaces:
           type: array
           uniqueItems: true
           items:
-            $ref: '#/components/schemas/CatalogObjectIdentifier'
+            $ref: '#/components/schemas/Namespace'
 
-    ListNamespacesResponse:
+    UpdateNamespacePropertiesResponse:
       type: object
+      required:
+        - updated
+        - removed
       properties:
-        next-page-token:
-          $ref: '#/components/schemas/PageToken'
-        namespaces:
+        updated:
+          description: List of property keys that were added or updated
           type: array
           uniqueItems: true
           items:
-            $ref: '#/components/schemas/Namespace'
-
-    LoadFunctionResult:
-      description: |
-        Result returned when a function is loaded from the catalog.
-
+            type: string
+        removed:
+          description: List of properties that were removed
+          type: array
+          items:
+            type: string
+        missing:
+          type: array
+          items:
+            type: string
+          description:
+            List of properties requested for removal that were not found
+            in the namespace's properties. Represents a partial success 
response.
+            Server's do not need to implement this.
+          nullable: true
 
-        The function metadata JSON is returned in the `metadata` field. The 
location of the metadata
-        file is returned in the `metadata-location` field, if available.
+    CommitTableResponse:
       type: object
       required:
+        - metadata-location
         - metadata
       properties:
         metadata-location:
           type: string
         metadata:
-          $ref: '#/components/schemas/FunctionMetadata'
-
-    FunctionMetadata:
-      description: |
-        Portable UDF metadata format.
-
+          $ref: '#/components/schemas/TableMetadata'
 
-        Each function is represented by a self-contained metadata file. The 
`format-version` field
-        identifies the UDF metadata format.
+    QueryEventsResponse:
       type: object
       required:
-        - function-uuid
-        - format-version
-        - definitions
-        - definition-log
+        - continuation-token
+        - events
       properties:
-        function-uuid:
-          type: string
-          format: uuid
-          description: A UUID that identifies this UDF, generated once at 
creation.
-        format-version:
-          type: integer
-          description: UDF specification format version (must be 1).
-          minimum: 1
-          maximum: 1
-        definitions:
-          type: array
-          description: List of function definition entities.
-          items:
-            $ref: '#/components/schemas/FunctionDefinition'
-        definition-log:
+        continuation-token:
+          type: string
+          description: >
+            An opaque continuation token to fetch the next page of events.
+            This token encodes the server's cursor position and filter state.
+            Clients should treat this as an opaque value and pass it 
unmodified in subsequent requests.
+            When no more events are currently available, the server returns an 
empty `events` array
+            and a `continuation-token` that the client can re-issue later to 
receive events that
+            occur after this point.
+        events:
           type: array
-          description: History of versions within the function's definitions.
           items:
-            $ref: '#/components/schemas/FunctionDefinitionLogEntry'
-        location:
-          type: string
-          description: The function's base location. This is used to store 
function metadata files.
-        properties:
-          type: object
-          description: A string-to-string map of properties.
-          additionalProperties:
-            type: string
-        secure:
-          type: boolean
-          description: Whether it is a secure function.
-          default: false
-        doc:
-          type: string
-          description: Documentation string.
+            $ref: "#/components/schemas/Event"
 
-    FunctionDefinition:
+    Event:
       type: object
       required:
-        - definition-id
-        - parameters
-        - return-type
-        - versions
-        - current-version-id
-        - function-type
+        - event-id
+        - request-id
+        - request-event-count
+        - timestamp-ms
+        - operation
       properties:
-        definition-id:
+        event-id:
           type: string
-          description: A canonical string derived from the parameter types, 
formatted as a comma-separated list with no spaces.
-        parameters:
-          type: array
-          description: Ordered list of function parameters. Invocation order 
must match this list.
-          items:
-            $ref: '#/components/schemas/FunctionParameter'
-        return-type:
-          $ref: '#/components/schemas/FunctionDataType'
-        return-nullable:
-          type: boolean
-          description: A hint to indicate whether the return value is nullable 
or not.
-          default: true
-        versions:
-          type: array
-          description: Versioned implementations of this definition.
-          items:
-            $ref: '#/components/schemas/FunctionDefinitionVersion'
-        current-version-id:
-          type: integer
-          description: Identifier of the current version for this definition.
-        function-type:
-          type: string
-          description: Function type. When set to "udtf", "return-type" must 
be a struct describing the output schema.
-          enum: ["udf", "udtf"]
-        doc:
+          description: Unique ID of this event. Clients should perform 
deduplication based on this ID.
+        request-id:
+          description: >
+            Opaque ID of the request this change belongs to.
+            This ID can be used to identify events that were part of the same 
request.
           type: string
-          description: Documentation string.
-
-    FunctionParameter:
+        request-event-count:
+          type: integer
+          description: >
+            Number of events produced by the originating catalog request (e.g. 
updateTable
+            or commitTransaction) that generated this event. Such requests can 
apply multiple
+            updates atomically, each surfaced as a separate event sharing the 
same `request-id`;
+            this count reports how many events that originating request 
produced in total.
+        timestamp-ms:
+          type: integer
+          format: int64
+          description: >
+            Timestamp when this event occurred (epoch milliseconds).
+            Timestamps are not guaranteed to be unique. Typically all events in
+            a transaction will have the same timestamp.
+        actor:
+          type: object
+          description: >
+            The actor who performed the operation, such as a user or service 
account.
+            Implementations may add arbitrary additional fields; the optional 
`id` field is
+            recommended as a portable identifier that consumers can render and 
key on.
+          properties:
+            id:
+              type: string
+              description: >
+                Recommended, optional. Stable identifier of the actor (e.g. a 
user or
+                service account). Provided when the server can attribute the 
operation.
+          additionalProperties: true
+        operation:
+          description: >
+            The operation that was performed, such as creating or updating a 
table.
+            The concrete type is selected by the `operation-type` 
discriminator defined on
+            `BaseOperation`. Clients should discard events with unknown 
operation types.
+            Operations are emitted only when the underlying change is 
committed; staged changes
+            are not surfaced.
+          anyOf:
+            - $ref: "#/components/schemas/CreateTableOperation"
+            - $ref: "#/components/schemas/RegisterTableOperation"
+            - $ref: "#/components/schemas/DropTableOperation"
+            - $ref: "#/components/schemas/UpdateTableOperation"
+            - $ref: "#/components/schemas/RenameTableOperation"
+            - $ref: "#/components/schemas/CreateViewOperation"
+            - $ref: "#/components/schemas/DropViewOperation"
+            - $ref: "#/components/schemas/UpdateViewOperation"
+            - $ref: "#/components/schemas/RenameViewOperation"
+            - $ref: "#/components/schemas/CreateNamespaceOperation"
+            - $ref: "#/components/schemas/UpdateNamespacePropertiesOperation"
+            - $ref: "#/components/schemas/DropNamespaceOperation"
+
+    BaseOperation:
       type: object
+      description: >
+        Base type for all catalog operations carried by an `Event`. The 
`operation-type`
+        discriminator selects the concrete operation schema. Standard 
operation types are
+        enumerated in `OperationType`. Implementation-specific operation types 
use an `x-` prefix
+        (see `OperationType`) and are carried as a `BaseOperation` with 
implementation-defined
+        additional properties; clients should discard operations with unknown 
`operation-type`

Review Comment:
   On reflection, this is over-specifying parser implementation. Apologies — my 
earlier comments pushed in this direction; on closer look, the existing 
precedent goes the other way.
   
   The two schemas in the spec where Java parsers actually fall through to an 
Unknown variant — `ReportMetricsRequest` and `ViewRepresentation` — have **no** 
prose at all telling clients what to do with unknown types. They just list the 
known concrete subtypes and stop:
   
   Whether and how to dispatch unknowns is an impl decision, not a wire-level 
contract. The "clients should ignore (discard) events with unknown operation 
types" + `x-` prefix guidance already lives on `OperationType.description` — 
one place is enough.
   
   Suggest trimming to:
   
   ```
   Base type for all catalog operations carried by an `Event`. The 
`operation-type`
   discriminator selects the concrete operation schema. Standard operation 
types are
   enumerated in `OperationType`; implementation-specific operation types use an
   `x-` prefix (see `OperationType`).
   ```



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -4387,338 +4367,393 @@ components:
           items:
             $ref: '#/components/schemas/TableIdentifier'
 
-    ListFunctionsResponse:
+    ListNamespacesResponse:
       type: object
       properties:
         next-page-token:
           $ref: '#/components/schemas/PageToken'
-        identifiers:
+        namespaces:
           type: array
           uniqueItems: true
           items:
-            $ref: '#/components/schemas/CatalogObjectIdentifier'
+            $ref: '#/components/schemas/Namespace'
 
-    ListNamespacesResponse:
+    UpdateNamespacePropertiesResponse:
       type: object
+      required:
+        - updated
+        - removed
       properties:
-        next-page-token:
-          $ref: '#/components/schemas/PageToken'
-        namespaces:
+        updated:
+          description: List of property keys that were added or updated
           type: array
           uniqueItems: true
           items:
-            $ref: '#/components/schemas/Namespace'
-
-    LoadFunctionResult:
-      description: |
-        Result returned when a function is loaded from the catalog.
-
+            type: string
+        removed:
+          description: List of properties that were removed
+          type: array
+          items:
+            type: string
+        missing:
+          type: array
+          items:
+            type: string
+          description:
+            List of properties requested for removal that were not found
+            in the namespace's properties. Represents a partial success 
response.
+            Server's do not need to implement this.
+          nullable: true
 
-        The function metadata JSON is returned in the `metadata` field. The 
location of the metadata
-        file is returned in the `metadata-location` field, if available.
+    CommitTableResponse:
       type: object
       required:
+        - metadata-location
         - metadata
       properties:
         metadata-location:
           type: string
         metadata:
-          $ref: '#/components/schemas/FunctionMetadata'
-
-    FunctionMetadata:
-      description: |
-        Portable UDF metadata format.
-
+          $ref: '#/components/schemas/TableMetadata'
 
-        Each function is represented by a self-contained metadata file. The 
`format-version` field
-        identifies the UDF metadata format.
+    QueryEventsResponse:
       type: object
       required:
-        - function-uuid
-        - format-version
-        - definitions
-        - definition-log
+        - continuation-token
+        - events
       properties:
-        function-uuid:
-          type: string
-          format: uuid
-          description: A UUID that identifies this UDF, generated once at 
creation.
-        format-version:
-          type: integer
-          description: UDF specification format version (must be 1).
-          minimum: 1
-          maximum: 1
-        definitions:
-          type: array
-          description: List of function definition entities.
-          items:
-            $ref: '#/components/schemas/FunctionDefinition'
-        definition-log:
+        continuation-token:
+          type: string
+          description: >
+            An opaque continuation token to fetch the next page of events.
+            This token encodes the server's cursor position and filter state.
+            Clients should treat this as an opaque value and pass it 
unmodified in subsequent requests.
+            When no more events are currently available, the server returns an 
empty `events` array
+            and a `continuation-token` that the client can re-issue later to 
receive events that
+            occur after this point.
+        events:
           type: array
-          description: History of versions within the function's definitions.
           items:
-            $ref: '#/components/schemas/FunctionDefinitionLogEntry'
-        location:
-          type: string
-          description: The function's base location. This is used to store 
function metadata files.
-        properties:
-          type: object
-          description: A string-to-string map of properties.
-          additionalProperties:
-            type: string
-        secure:
-          type: boolean
-          description: Whether it is a secure function.
-          default: false
-        doc:
-          type: string
-          description: Documentation string.
+            $ref: "#/components/schemas/Event"
 
-    FunctionDefinition:
+    Event:
       type: object
       required:
-        - definition-id
-        - parameters
-        - return-type
-        - versions
-        - current-version-id
-        - function-type
+        - event-id
+        - request-id
+        - request-event-count
+        - timestamp-ms
+        - operation
       properties:
-        definition-id:
+        event-id:
           type: string
-          description: A canonical string derived from the parameter types, 
formatted as a comma-separated list with no spaces.
-        parameters:
-          type: array
-          description: Ordered list of function parameters. Invocation order 
must match this list.
-          items:
-            $ref: '#/components/schemas/FunctionParameter'
-        return-type:
-          $ref: '#/components/schemas/FunctionDataType'
-        return-nullable:
-          type: boolean
-          description: A hint to indicate whether the return value is nullable 
or not.
-          default: true
-        versions:
-          type: array
-          description: Versioned implementations of this definition.
-          items:
-            $ref: '#/components/schemas/FunctionDefinitionVersion'
-        current-version-id:
-          type: integer
-          description: Identifier of the current version for this definition.
-        function-type:
-          type: string
-          description: Function type. When set to "udtf", "return-type" must 
be a struct describing the output schema.
-          enum: ["udf", "udtf"]
-        doc:
+          description: Unique ID of this event. Clients should perform 
deduplication based on this ID.
+        request-id:
+          description: >
+            Opaque ID of the request this change belongs to.
+            This ID can be used to identify events that were part of the same 
request.
           type: string
-          description: Documentation string.
-
-    FunctionParameter:
+        request-event-count:
+          type: integer
+          description: >
+            Number of events produced by the originating catalog request (e.g. 
updateTable
+            or commitTransaction) that generated this event. Such requests can 
apply multiple
+            updates atomically, each surfaced as a separate event sharing the 
same `request-id`;
+            this count reports how many events that originating request 
produced in total.
+        timestamp-ms:
+          type: integer
+          format: int64
+          description: >
+            Timestamp when this event occurred (epoch milliseconds).
+            Timestamps are not guaranteed to be unique. Typically all events in
+            a transaction will have the same timestamp.
+        actor:
+          type: object
+          description: >
+            The actor who performed the operation, such as a user or service 
account.
+            Implementations may add arbitrary additional fields; the optional 
`id` field is
+            recommended as a portable identifier that consumers can render and 
key on.
+          properties:
+            id:
+              type: string
+              description: >
+                Recommended, optional. Stable identifier of the actor (e.g. a 
user or
+                service account). Provided when the server can attribute the 
operation.
+          additionalProperties: true
+        operation:
+          description: >
+            The operation that was performed, such as creating or updating a 
table.
+            The concrete type is selected by the `operation-type` 
discriminator defined on
+            `BaseOperation`. Clients should discard events with unknown 
operation types.

Review Comment:
   Same over-specification concern as on `BaseOperation`. Apologies — my 
earlier comments pushed this in.
   
   `ReportMetricsRequest` and `ViewRepresentation` (the two existing schemas 
with parser-side graceful fallback) carry no "clients should discard" prose in 
their spec descriptions; the contract is conveyed by typing the discriminator 
field as `type: string` and stating it on the type definition. The same is true 
here — `OperationType.description` already says clients should discard unknown 
values; duplicating it on this property is redundant and prescribes parser 
behavior the spec doesn't need to dictate.
   
   Suggest trimming to:
   
   ```
   The operation that was performed, such as creating or updating a table. The
   concrete type is selected by the `operation-type` discriminator. Operations 
are
   emitted only when the underlying change is committed; staged changes are not
   surfaced.
   ```
   
   Keeps the committed-only emission rule (a real wire-level contract) and 
drops the parser-implementation prescription.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3708,6 +3778,74 @@ components:
       allOf:
         - $ref: '#/components/schemas/ScanTasks'
 
+    QueryEventsRequest:
+      type: object
+      properties:
+        continuation-token:
+          type: string
+          description: >
+            A continuation token returned by a previous response. Clients 
should treat the
+            token as an opaque value and pass it unmodified. If not provided, 
events are
+            returned from the beginning of the event log subject to other 
filters.
+        page-size:
+          type: integer
+          format: int32
+          minimum: 1
+          description: >
+            The maximum number of events to return in a single response.
+            If not provided, the server may choose a default page size.
+            Servers may return less results than requested for various 
reasons, such as
+            server side limits, payload size or processing time.
+        since-timestamp-ms:
+          type: integer
+          format: int64
+          description: >
+            Optional starting timestamp (seek-to-timestamp) for the initial 
request, in
+            milliseconds (inclusive). Lets clients begin consumption from a 
rough point in time
+            without iterating the full history. If not provided, no filtering 
based on timestamp
+            values is applied.
+        operation-types:
+          type: array
+          items:
+            $ref: "#/components/schemas/OperationType"
+          description: >
+            Filter events by the type of operation.
+            If not provided, all types are returned.
+        catalog-objects-by-name:
+          type: array
+          items:
+            $ref: "#/components/schemas/CatalogObjectIdentifier"

Review Comment:
   The kind-agnostic, prefix-matching design is the right call — composing 
`catalog-objects-by-name` with `object-types` is symmetric with the 
`catalog-objects-by-uuid` + `object-types` pattern one field below.
   
   The current description carries the load-bearing principles but stacks them 
into one dense paragraph: matching rule, kind-independence, subtree-expansion, 
leaf-vs-namespace, sibling-exclusion, and (implicitly) cross-kind collision are 
interleaved across successive clauses. A reordered version that leads with the 
matching rule, then separates leaf-vs-namespace cases, then handles the 
collision corner:
   
   ```
   Each filter entry has the same shape as a `CatalogObjectIdentifier`: an array
   of name levels — e.g. `["a","b","t1"]` is the table `a.b.t1`. The filter
   matches an event when the entry's levels equal the event object's path, or
   form a leading prefix of it. Comparison is level-by-level: each array element
   is one path segment, so `["acc"]` is a prefix of `["acc","tax"]` but does
   not match `["accounting"]`.
   
   Matching is by name only, independent of object kind. Tables and views are
   leaves of the name tree, so a leaf entry matches a single object —
   `["a","b","t1"]` matches only the table or view `a.b.t1`. A namespace entry
   expands recursively: `["a","b"]` matches the namespace object `a.b` itself
   plus every descendant (`a.b.t1`, `a.b.v1`, sub-namespace `a.b.c`, and every
   object under it). It does not match the sibling namespace `a.c`.
   
   If a path exists as more than one kind (e.g. a namespace and a same-named
   table), the filter matches events for all of them; combine with 
`object-types`
   to narrow. If not provided, events for all objects must be returned subject
   to other filters.
   ```
   
   Three structural moves vs. the current text:
   
   1. Lead with the matching rule as a self-contained block — 
`CatalogObjectIdentifier` shape, exact-or-prefix match, level-by-level, with a 
negative example. One rule before any cases.
   2. Separate leaf entries from namespace entries explicitly — makes the 
implicit subtree expansion visible (a reader filtering by `["accounting"]` to 
"watch the namespace" will realize they're subscribing to the entire subtree).
   3. Surface the cross-kind name-collision case (which the current description 
doesn't mention but the discussion above this comment establishes as 
design-intentional) right next to the `object-types` guidance, so the 
resolution path is in the same paragraph as the problem.
   
   Same semantic content, ordered to be skimmable on first read.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to