MisterRaindrop commented on PR #1571:
URL: https://github.com/apache/cloudberry/pull/1571#issuecomment-3949872613

   Thank you for the detailed review comments. Regarding the core issues raised 
(the security impact of partial paths on FDWs that do not support parallel 
callbacks, the mixing of Gather and CBDB gang models, locus conversion, and the 
scope of changes in execMain.c), I agree that these are all issues that need to 
be addressed seriously.
   
   After reconsideration, I am inclined to withdraw the kernel-side 
modifications and adopt a pure FDW-layer solution instead. The core idea is:
   
   1. Do not modify the kernel's partial path generation, execMain.c, or locus 
logic, which avoids all the risks mentioned above.
   2. The FDW directly uses CBDB's existing parallel variables 
(ParallelWorkerNumberOfSlice / TotalParallelWorkerNumberOfSlice) to obtain the 
current worker number and total count.
   3. During execution, the FDW calculates a virtual segment ID from these two 
values and sets it in the HTTP header sent to PXF, so that PXF's round-robin 
sharding mechanism distributes data evenly among all gang workers.
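
   To make step 3 concrete, here is a minimal sketch of one possible mapping. 
It is not the actual pxf_fdw code: the helper names, the formula 
(segment_id * total_workers + worker_no), the modulo model of PXF's round-robin 
sharding, and the header names in the format string are all assumptions for 
illustration. In the FDW, worker_no and total_workers would come from 
ParallelWorkerNumberOfSlice and TotalParallelWorkerNumberOfSlice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical type, not part of CBDB or pxf_fdw. */
typedef struct VirtualSegment
{
    int id;     /* virtual segment ID reported to PXF */
    int count;  /* virtual segment count reported to PXF */
} VirtualSegment;

/*
 * Map (segment, worker) to a unique virtual segment.
 * Assumption: a total of 0 or 1 means the slice is not parallel,
 * in which case the mapping degenerates to the real segment ID.
 */
static VirtualSegment
compute_virtual_segment(int segment_id, int segment_count,
                        int worker_no, int total_workers)
{
    VirtualSegment v;

    if (total_workers <= 1)
    {
        worker_no = 0;
        total_workers = 1;
    }
    assert(segment_id >= 0 && segment_id < segment_count);
    assert(worker_no >= 0 && worker_no < total_workers);

    v.id = segment_id * total_workers + worker_no;
    v.count = segment_count * total_workers;
    return v;
}

/*
 * Assumed model of round-robin sharding: fragment i is read by the
 * virtual segment whose ID equals i modulo the virtual segment count.
 */
static int
owns_fragment(VirtualSegment v, int fragment_index)
{
    return fragment_index % v.count == v.id;
}

/* Format the two header lines; the header names are an assumption. */
static int
format_segment_headers(VirtualSegment v, char *buf, size_t buflen)
{
    return snprintf(buf, buflen,
                    "X-GP-SEGMENT-ID: %d\r\nX-GP-SEGMENT-COUNT: %d\r\n",
                    v.id, v.count);
}
```

   Under these assumptions, with 3 segments and 4 workers per segment, worker 
2 on segment 1 would report virtual segment 6 of 12, and every fragment would 
be claimed by exactly one worker across all gangs.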
   
   This solution requires no kernel modifications and will not affect other 
FDWs.
   
   I would like to confirm: Is this direction reasonable? Are the variables 
ParallelWorkerNumberOfSlice and TotalParallelWorkerNumberOfSlice stable and 
reliable under the current CBDB parallel framework? Or is there a preferred 
way for the FDW to obtain gang parallelism information?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]