Fokko commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1689996232


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -537,6 +537,113 @@ paths:
         5XX:
           $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:

Review Comment:
   I'm not sure we should force the client down one particular path. 
For example, I think PyIceberg will mostly call `plan` directly, since it is 
more likely to read smaller tables than a typical ETL workload would.
   
   > I think no, a server can decide to only implement one API. The server can 
require the client to go through the preplan API by indicating a server 
capability (e.g. preplanning), and in that case if the client still directly 
hits the plan endpoint without a plan task, then a 421 Misdirected Request 
should be thrown.
   
   So then a redirect would be sufficient? 
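
To make the quoted behavior concrete, here is a minimal sketch of the check being described. It is not the spec or any existing implementation: `PlanRequest`, `handle_plan`, `choose_endpoint`, and the `"preplanning"` capability string are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MISDIRECTED_REQUEST = 421  # HTTP 421 Misdirected Request
OK = 200


@dataclass
class PlanRequest:
    """Hypothetical body of a call to the plan endpoint."""
    table: str
    plan_task: Optional[str] = None  # opaque token a preplan call would return


def handle_plan(request: PlanRequest, preplanning_required: bool) -> int:
    """Hypothetical server-side check: return the HTTP status for a plan call."""
    # A server that requires preplanning rejects plan calls without a plan task.
    if preplanning_required and request.plan_task is None:
        return MISDIRECTED_REQUEST
    return OK


def choose_endpoint(server_capabilities: set) -> str:
    """Hypothetical client-side choice driven by the advertised capabilities."""
    return "preplan" if "preplanning" in server_capabilities else "plan"
```

With this shape, a client that reads the capability up front never sees the 421. The open question above is whether answering with a redirect would serve the remaining clients just as well.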
   
   > And did we align for scan-planning capability that a server can choose to 
implement none of these endpoints, can implement both preplan and plan, or just 
implement plan?
   
   I think there are situations, like the one Amogh described, where you don't 
want to accumulate too much state on the server side (for example, data sharing 
servers). In that case I would expect the server to expose only the plan 
endpoint. Alternatively, preplan could be a no-op in such a case, as Ryan 
suggested.
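
One possible reading of a no-op preplan, sketched under assumptions: the server stores nothing and hands back a single self-describing plan task. `noop_preplan`, `plan`, and the token layout are hypothetical and do not come from the spec or this PR.

```python
import base64
import json


def noop_preplan(namespace: str, table: str) -> list:
    """Return one self-describing plan task; nothing is stored server-side."""
    payload = json.dumps({"namespace": namespace, "table": table}, sort_keys=True)
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return [token]


def plan(plan_task: str) -> dict:
    """Decode the token back into the scan target; no server-side lookup needed."""
    payload = base64.urlsafe_b64decode(plan_task.encode("ascii"))
    return json.loads(payload)
```

Because the token carries everything the plan call needs, the server keeps no per-scan state between the two requests.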



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
