drin commented on issue #40583: URL: https://github.com/apache/arrow/issues/40583#issuecomment-2008250251
> The hope is that acero's substrait support will continue to improve and that cross-engine plan optimizers will be written against substrait

I am working on enabling cross-engine plan optimizers, so that's not a general issue, just a necessary mechanism to enable specifying arbitrary computation to skyhook.

> If multiple ceph servers are in play, a UnionNode can be used in the client side plan to concatenate their streams

but there must be something which accepts `ceph::bufferlist`. It seems a UnionNode should be an input to the SinkNode and used to concatenate SkyhookSourceNodes.

> The design above does not require serializing custom nodes, since it only requires serialization of the server side plan

Then I may be misunderstanding. If SkyhookSourceNode is a client-only node, then it seems like you're proposing execution of 2 independent ExecPlans, and not the use of SkyhookSourceNode for facilitating network communication in the execution of a single ExecPlan, correct? In which case I suppose I can understand the recommendation of a UnionNode, but I am not sure it would even be needed if execution on the server side and client side are independent.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
