andygrove commented on issue #23194:
URL: https://github.com/apache/datafusion/issues/23194#issuecomment-4848793652

   > > Now that AQE/AQP has been fleshed out with an implementation and 
substantive discussion, I'm curious to know if you still prefer:
   > 
   > My personal preference is to see AQP implemented outside the core 
datafusion crate / repo, be validated that it works, and is broadly applicable, 
and then potentially bring it into the core to be maintained along with the 
other code here
   
   It makes sense to prove this out in real distributed consumers first. We are 
already doing that in Ballista, and @gabotechs is doing the same in 
datafusion-distributed.
   
   The friction for Ballista isn't the AQE logic. It is that core DataFusion 
increasingly assumes a plan runs in one process, and that assumption breaks the 
moment a plan is split at a pipeline breaker, serialized, and executed one 
partition per task. Here are three problems we hit just upgrading to DF 54 
(apache/datafusion-ballista#1906):
   
   - `DataSourceExec`'s shared scan work queue is only divided correctly when 
all partitions are polled together. A task polling its one partition drained 
the queue and read the whole table 
(https://github.com/apache/datafusion-ballista/issues/1907).
   - An uncorrelated scalar subquery's `ScalarSubqueryExpr` only decodes inside 
its `ScalarSubqueryExec`. Once stage splitting separates them, the stage no 
longer survives a serialization round trip 
(https://github.com/apache/datafusion-ballista/issues/1909).
   - Runtime dynamic filter pushdown delivers a filter from build side to probe 
scan within one plan instance. Split across a stage boundary it never arrives 
and the probe stalls.
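   
   As a rough illustration of the first item, here is a toy model of a scan 
work queue shared by all partitions. It is a sketch under stated assumptions, 
not DataFusion's actual code: `SharedQueue`, `poll_all_partitions`, and 
`poll_single_partition` are hypothetical names, and the round-robin polling 
only stands in for partitions being polled together in one process.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a scan's shared work queue: a list of file ids
/// that every partition of the same plan instance pulls from.
type SharedQueue = Arc<Mutex<VecDeque<u32>>>;

fn make_queue(num_files: u32) -> SharedQueue {
    Arc::new(Mutex::new((0..num_files).collect()))
}

/// Take one unit of work from the shared queue, if any remains.
fn poll_once(queue: &SharedQueue) -> Option<u32> {
    queue.lock().unwrap().pop_front()
}

/// Single-process model: all partitions of one plan instance are polled
/// together (simplified here to round-robin), so the work gets divided.
fn poll_all_partitions(num_files: u32, num_partitions: usize) -> Vec<Vec<u32>> {
    let queue = make_queue(num_files);
    let mut out = vec![Vec::new(); num_partitions];
    let mut p = 0;
    while let Some(file) = poll_once(&queue) {
        out[p].push(file);
        p = (p + 1) % num_partitions;
    }
    out
}

/// Distributed-style model: a task gets its own copy of the plan (and so its
/// own queue) and polls only its one partition, which drains everything.
fn poll_single_partition(num_files: u32) -> Vec<u32> {
    let queue = make_queue(num_files);
    let mut out = Vec::new();
    while let Some(file) = poll_once(&queue) {
        out.push(file);
    }
    out
}

fn main() {
    let together = poll_all_partitions(8, 4);
    let alone = poll_single_partition(8);
    println!("polled together: {:?}", together);
    println!("one task alone:  {:?}", alone);
}
```

   In this model each of the four partitions polled together sees two files, 
while a task polling alone sees all eight, which mirrors the "read the whole 
table" symptom described above.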
   
   I'll start filing issues in core DF when I hit things like this that are 
regressions from Ballista's point of view, to (hopefully) help with the 
conversation.
   
   My own preference is that modeling some of this in core (eventually) would 
help, because it lets core catch these regressions before a release rather than 
after. All three shipped in DF 54 and only surfaced once we upgraded. Perhaps 
this is a new crate in core that models a distributed-style approach, but still 
in-process, just to catch these kinds of issues.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

