c-thiel commented on code in PR #12584:
URL: https://github.com/apache/iceberg/pull/12584#discussion_r3369286461


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3708,6 +3778,74 @@ components:
       allOf:
         - $ref: '#/components/schemas/ScanTasks'
 
+    QueryEventsRequest:
+      type: object
+      properties:
+        continuation-token:
+          type: string
+          description: >
+            A continuation token returned by a previous response. Clients should treat the
+            token as an opaque value and pass it unmodified. If not provided, events are
+            returned from the beginning of the event log subject to other filters.
+        page-size:
+          type: integer
+          format: int32
+          minimum: 1
+          description: >
+            The maximum number of events to return in a single response.
+            If not provided, the server may choose a default page size.
+            Servers may return less results than requested for various reasons, such as
+            server side limits, payload size or processing time.
+        since-timestamp-ms:
+          type: integer
+          format: int64
+          description: >
+            Optional starting timestamp (seek-to-timestamp) for the initial request, in
+            milliseconds (inclusive). Lets clients begin consumption from a rough point in time
+            without iterating the full history. If not provided, no filtering based on timestamp
+            values is applied.
+        operation-types:
+          type: array
+          items:
+            $ref: "#/components/schemas/OperationType"
+          description: >
+            Filter events by the type of operation.
+            If not provided, all types are returned.
+        catalog-objects-by-name:
+          type: array
+          items:
+            $ref: "#/components/schemas/CatalogObjectIdentifier"

Review Comment:
   My take on this:
   We deliberately *don't* resolve the kind from the array: `catalog-objects-by-name` matches purely by **name path**, independent of object kind.
   
   An entry matches any object whose path equals the entry or starts with it, compared level by level. So `["a","b"]` matches the namespace `a.b` and everything beneath it (`a.b.t1`, the view `a.b.v1`, the sub-namespace `a.b.c` and its contents), without the server needing to decide whether `["a","b"]` "is" a namespace or a table. Tables and views are leaves, so a full path like `["a","b","t1"]` matches just that object; only namespace paths expand to a subtree.
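
As a rough illustration of the matching rule described above, here is a minimal Python sketch. It is not taken from the PR: the function name `matches_by_name` is hypothetical, and paths are assumed to be plain lists of name levels.

```python
from typing import Sequence


def matches_by_name(filter_path: Sequence[str], object_path: Sequence[str]) -> bool:
    """Hypothetical helper: True if filter_path equals object_path or is a
    leading prefix of it, compared level by level (not character by character).
    Object kind (namespace, table, view) is deliberately not consulted."""
    if len(filter_path) > len(object_path):
        return False
    return list(object_path[: len(filter_path)]) == list(filter_path)


# ["a", "b"] matches the namespace itself and everything beneath it.
print(matches_by_name(["a", "b"], ["a", "b"]))
print(matches_by_name(["a", "b"], ["a", "b", "t1"]))
print(matches_by_name(["a", "b"], ["a", "b", "c", "t2"]))
# Levels are compared whole, so "bc" is not matched by "b".
print(matches_by_name(["a", "b"], ["a", "bc"]))
```

Under these assumptions the first three calls return `True` and the last returns `False`. A full path such as `["a", "b", "t1"]` only matches that exact path when the object is a leaf, because nothing exists beneath it.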
   
   That's also why the identifier can stay a plain array of strings: kind isn't encoded in it by design (it can even change over time: drop a view, then create a table with the same name). If you want to narrow to a specific kind, use the orthogonal `object-types` filter.
   
   One consequence worth noting: because matching is by name and names aren't unique across kinds, a path that exists as both a namespace and a table/view will match both. That's an accepted trade-off (consistent with the dev@ alignment that identifiers aren't unique across object types); clients deduplicate by `event-id` and can use `object-types` to narrow. I've reworded the field description to make the prefix semantics explicit.
   
   @stevenzwu please also have a look at this. Because this is a semantic rephrasing that needs some discussion, I have separated it into a commit of its own:
https://github.com/apache/iceberg/pull/12584/commits/b62b1d02ae1f7dd06933c58590b6761f48c7735e



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]