MisterRaindrop commented on PR #1571: URL: https://github.com/apache/cloudberry/pull/1571#issuecomment-3949872613
Thank you for the detailed review comments. Regarding the core issues raised (the security impact of partial paths on FDWs that do not support parallel callbacks, the mixing of Gather and CBDB gang models, locus conversion, and the scope of changes in execMain.c), I agree that these are all issues that need to be addressed seriously.

After reconsideration, I am inclined to withdraw the kernel-side modifications and adopt a pure FDW-layer solution instead. The core idea is:

1. Do not modify the kernel's partial path generation, execMain.c, or locus logic, which avoids all the risks mentioned above.
2. The FDW directly uses CBDB's existing parallel variables (ParallelWorkerNumberOfSlice / TotalParallelWorkerNumberOfSlice) to obtain the current worker number and the total worker count.
3. During execution, the FDW calculates a virtual segment ID from these two values, modifies the HTTP header sent to PXF, and lets PXF's round-robin sharding mechanism distribute the data evenly among all gang workers.

This solution requires no kernel modifications and will not affect other FDWs.

I would like to confirm: Is this direction reasonable? Are the variables ParallelWorkerNumberOfSlice and TotalParallelWorkerNumberOfSlice stable and reliable under the current CBDB parallel framework? Or is there a more recommended way for the FDW to perceive gang parallel information?
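To make step 3 concrete, here is a minimal sketch of one way the virtual segment ID could be derived. It is only an illustration under stated assumptions, not the code in the PR: the struct `VirtualSegment` and the functions `compute_virtual_segment` and `format_segment_headers` are hypothetical names, the mapping `segment_id * total_workers + worker_no` is one possible scheme, the worker number is assumed to be 0-based, a total of 0 or 1 is assumed to mean "not parallel", and the header names `X-GP-SEGMENT-ID` / `X-GP-SEGMENT-COUNT` are assumed to be the ones PXF reads. In a real FDW the two worker arguments would come from ParallelWorkerNumberOfSlice and TotalParallelWorkerNumberOfSlice.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type for this sketch; not part of CBDB or PXF. */
typedef struct VirtualSegment
{
    int id;     /* value to send as the segment ID header */
    int count;  /* value to send as the segment count header */
} VirtualSegment;

/*
 * Map (real segment, worker within the slice) to a virtual segment.
 * Assumptions: worker_no is 0-based, and total_workers <= 1 means the
 * scan is not parallel, so the real segment ID and count are kept.
 */
static VirtualSegment
compute_virtual_segment(int segment_id, int segment_count,
                        int worker_no, int total_workers)
{
    VirtualSegment v;

    assert(segment_count > 0);
    assert(segment_id >= 0 && segment_id < segment_count);

    if (total_workers <= 1)
    {
        /* Non-parallel fallback: behave exactly like today. */
        v.id = segment_id;
        v.count = segment_count;
        return v;
    }

    assert(worker_no >= 0 && worker_no < total_workers);

    /* Each real segment is split into total_workers virtual segments. */
    v.id = segment_id * total_workers + worker_no;
    v.count = segment_count * total_workers;
    return v;
}

/*
 * Render the two headers as text. The header names are an assumption
 * about what PXF expects; a real FDW would append them through its own
 * HTTP client layer instead of formatting a raw string.
 */
static int
format_segment_headers(char *buf, size_t buflen, VirtualSegment v)
{
    return snprintf(buf, buflen,
                    "X-GP-SEGMENT-ID: %d\r\nX-GP-SEGMENT-COUNT: %d\r\n",
                    v.id, v.count);
}
```

With 3 segments and 4 workers per slice, worker 2 on segment 1 would report itself as virtual segment 6 of 12. Every (segment, worker) pair gets a distinct ID in the range 0..11, which is the property a round-robin assignment on the PXF side would need so that no fragment is read twice or skipped.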
