thinkharderdev commented on issue #30:
URL: https://github.com/apache/arrow-ballista/issues/30#issuecomment-1263437950

   > I also have some thoughts about a unified execution engine; feel free to 
take a look and comment:
   > 
   > 
https://www.notion.so/liurenjie1024/A-Cloud-Native-Universal-Execution-Engine-7903dd9eeea143c48049631a2d1cb845
   > 
   > cc @andygrove @mingmwang
   
   Thanks @liurenjie1024. I recently read the F1 paper 
(https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41344.pdf).
 It's quite a bit larger in scope than Ballista 
(it's intended to be more of a full DBMS than a query engine like Ballista), 
but the query processing section is directly relevant here. Two of its 
properties stand out:
   
   1. The scheduler/planner can dynamically determine whether a query should be 
distributed or executed centrally on a single executor, so you can potentially 
eliminate the scheduling and shuffle overhead for fast OLTP-style queries. 
   2. Distributed queries can be either batched (similar to Ballista today, where 
each stage is fully materialized) or pipelined (where data is streamed between 
the stages). 
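
To make the two properties concrete, here is a minimal sketch of how such a planner decision might look. Everything in it is an assumption for illustration: `ExecutionMode`, `QueryEstimate`, `choose_mode`, the row threshold, and the `prefers_stage_retry` flag are hypothetical names and are not part of Ballista or taken from the F1 paper, which only describes the behaviour at a high level.

```rust
// Hypothetical sketch only: none of these types exist in Ballista or F1.

/// The three execution styles described in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecutionMode {
    /// Run the whole plan on a single executor (no scheduling or shuffle).
    Central,
    /// Distributed; each stage is fully materialized before the next starts.
    DistributedBatch,
    /// Distributed; data is streamed between stages.
    DistributedPipelined,
}

/// Illustrative planner inputs (assumed for this sketch).
struct QueryEstimate {
    /// Rough estimate of how many rows the query will read.
    estimated_input_rows: u64,
    /// Whether the caller wants to be able to retry individual stages,
    /// which materialized stage output makes straightforward.
    prefers_stage_retry: bool,
}

/// Assumed cutoff below which distribution overhead is not worth paying.
const CENTRAL_ROW_THRESHOLD: u64 = 100_000;

/// Pick an execution mode from the estimate using a simple heuristic.
fn choose_mode(q: &QueryEstimate) -> ExecutionMode {
    if q.estimated_input_rows <= CENTRAL_ROW_THRESHOLD {
        ExecutionMode::Central
    } else if q.prefers_stage_retry {
        ExecutionMode::DistributedBatch
    } else {
        ExecutionMode::DistributedPipelined
    }
}

fn main() {
    let point_lookup = QueryEstimate {
        estimated_input_rows: 500,
        prefers_stage_retry: false,
    };
    let large_scan = QueryEstimate {
        estimated_input_rows: 50_000_000,
        prefers_stage_retry: true,
    };
    println!("{:?}", choose_mode(&point_lookup));
    println!("{:?}", choose_mode(&large_scan));
}
```

A real planner would presumably base this on cost estimates from the logical plan rather than a single fixed threshold; the point of the sketch is only that the choice is made per query at planning time.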

