I recently created a small PoC of distributed query execution on Kubernetes using the Rust implementation of Apache Arrow and the DataFusion query engine [1].
This PoC uses gRPC to pass query plans to executor nodes and the proto file [2] is largely based on the Gandiva proto file [3]. The PoC is very basic but I think it demonstrates the power of having query plans as part of the proto file. This would allow distributed applications to be built based on Arrow standards in a way that is not dependent on any particular implementation of Arrow and would even allow mixing and matching query engines. I wanted to start this discussion to see what the appetite is here for accepting PRs to add query plan structures to the Gandiva proto file and also whether we can consider making this an Arrow proto file rather than being Gandiva-specific, over time. Thanks, Andy. [1] https://github.com/andygrove/ballista [2] https://github.com/andygrove/ballista/blob/master/proto/ballista/ballista.proto [3] https://github.com/apache/arrow/blob/master/cpp/src/gandiva/proto/Types.proto