thinkharderdev commented on issue #30: URL: https://github.com/apache/arrow-ballista/issues/30#issuecomment-1263437950
> I also have some thoughts about a unified execution engine, welcome to take a look and comment:
>
> https://www.notion.so/liurenjie1024/A-Cloud-Native-Universal-Execution-Engine-7903dd9eeea143c48049631a2d1cb845
>
> cc @andygrove @mingmwang

Thanks @liurenjie1024. I recently read the F1 paper (https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41344.pdf), which I found very interesting. It's quite a bit larger in scope than Ballista (it's intended to be more of a full DBMS than a query engine like Ballista), but the query processing section stands out. It has some appealing properties:

1. The scheduler/planner can dynamically determine whether a query should be distributed or executed centrally on a single executor, so you can potentially eliminate the scheduling and shuffle overhead on fast OLTP-style queries.
2. Distributed queries can be either batched (similar to Ballista now, where each stage is fully materialized) or pipelined (where data is streamed between the stages).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
