Thanks Jack and Ryan. Here is an overview of the current design in the
PR [1].
1. To avoid duplication between the endpoints, we now have a single endpoint,
planTableScan, which accepts column projections, a filter, etc. and initiates a
plan. The server will respond to the client in one of two ways:
* The server has completed planning and returns plan-tasks,
file-scan-tasks.
* The server has not completed planning and returns a plan-id.
2. If the client receives a plan-id, it will call the fetchPlanningResult
endpoint, which takes the plan-id as input. The server will then return a
status indicating whether planning has completed. The client will continue to
call this endpoint until the server returns a completed status with plan-tasks,
file-scan-tasks.
3. If a client wants to release resources for a given plan, it can call
cancelPlanning, a new delete endpoint that takes a plan-id as input.
4. A new endpoint, fetchScanTasks, lets a client get the file-scan-tasks
associated with a plan-task by providing that plan-task as input.
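
To make the flow above concrete, here is a rough client-side sketch in Python.
It is only an illustration under assumptions: the FakeCatalog class, the
PlanResponse fields, the "completed" / "in-progress" status strings, and the
polling limits are all hypothetical and stand in for the real REST calls. The
actual request and response shapes are defined in the PR, not here.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanResponse:
    """Hypothetical response shape; field names are illustrative only."""
    status: str                       # assumed values: "completed", "in-progress"
    plan_id: Optional[str] = None
    plan_tasks: List[str] = field(default_factory=list)
    file_scan_tasks: List[str] = field(default_factory=list)


class FakeCatalog:
    """In-memory stand-in for the server; planning finishes after N polls."""

    def __init__(self, polls_needed: int = 2):
        self.polls_needed = polls_needed
        self.polls = {}
        self.cancelled = set()

    def plan_table_scan(self, columns, row_filter) -> PlanResponse:
        if self.polls_needed == 0:
            # Planning finished synchronously: results come back right away.
            return PlanResponse("completed", plan_tasks=["task-a"],
                                file_scan_tasks=["f0.parquet"])
        self.polls["plan-1"] = 0
        return PlanResponse("in-progress", plan_id="plan-1")

    def fetch_planning_result(self, plan_id: str) -> PlanResponse:
        self.polls[plan_id] += 1
        if self.polls[plan_id] < self.polls_needed:
            return PlanResponse("in-progress", plan_id=plan_id)
        return PlanResponse("completed", plan_id=plan_id,
                            plan_tasks=["task-a", "task-b"],
                            file_scan_tasks=["f0.parquet"])

    def fetch_scan_tasks(self, plan_task: str) -> List[str]:
        return [f"{plan_task}-file.parquet"]

    def cancel_planning(self, plan_id: str) -> None:
        self.cancelled.add(plan_id)


def plan_scan(catalog, columns, row_filter, poll_interval=0.0, max_polls=10):
    """Run the four-step flow and return every file-scan-task found."""
    response = catalog.plan_table_scan(columns, row_filter)      # step 1
    plan_id = response.plan_id
    polls = 0
    while response.status != "completed":                        # step 2
        if polls >= max_polls:
            catalog.cancel_planning(plan_id)                     # step 3
            raise TimeoutError(f"planning {plan_id} did not complete")
        time.sleep(poll_interval)
        response = catalog.fetch_planning_result(plan_id)
        polls += 1
    files = list(response.file_scan_tasks)
    for plan_task in response.plan_tasks:                        # step 4
        files.extend(catalog.fetch_scan_tasks(plan_task))
    return files


if __name__ == "__main__":
    print(plan_scan(FakeCatalog(polls_needed=2), ["id"], "id > 5"))
```

The max_polls cutoff followed by cancel_planning is a choice made for this
sketch, not something the proposal mandates; a real client might use a
deadline or backoff instead.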
[1] https://github.com/apache/iceberg/pull/9695
Regards,
Rahil Chertara
From: "[email protected]" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, September 3, 2024 at 1:24 PM
To: "[email protected]" <[email protected]>
Subject: RE: [VOTE] Merge REST Spec Change To Add New Scan Planning APIs
+1
I think it would be good to give an overview of the current proposal since it
has evolved quite a bit from the original like Jack said.
On Tue, Sep 3, 2024 at 9:09 AM Jack Ye <[email protected]> wrote:
Thanks for continuing to push for this, Rahil. Personally, I am +1 (binding)
on this, with just some minor comments on the latest PR.
But I think the initial DISCUSS thread [1] was quite a while ago, and a lot has
changed through the comments and reviews since then. Should we start another
DISCUSS thread before voting, to make sure people are aware of the latest
design and to address any additional comments?
Best,
Jack Ye
[1] https://lists.apache.org/thread/qq13468x6gk0vxnsckzc5xd02tjlvpkm
On Mon, Sep 2, 2024 at 9:22 PM Chertara, Rahil <[email protected]>
wrote:
Hi all,
I've opened a PR [1] to add REST spec changes for a new protocol around table
scan planning. For context on the design discussions, see the original
Google Doc proposal [2], the dev list discussion thread [3], and the
discussion that has happened on the spec change PR.
Please vote on merging this change. The vote will remain open for at least 72
hours.
[] +1
[] +0
[] -1, do not merge because ...
[1] https://github.com/apache/iceberg/pull/9695
[2] https://docs.google.com/document/d/1FdjCnFZM1fNtgyb9-v9fU4FwOX4An-pqEwSaJe8RgUg/edit#heading=h.cftjlkb2wh4h
[3] https://lists.apache.org/thread/qq13468x6gk0vxnsckzc5xd02tjlvpkm
Thanks,
Rahil Chertara