[I] Pull based native execution [arrow-datafusion-comet]

via GitHub Wed, 21 Feb 2024 00:18:16 -0800


viirya opened a new issue, #70:
URL: https://github.com/apache/arrow-datafusion-comet/issues/70


   ### What is the problem the feature request solves?
   
   Comet's native execution scan is driven from the JVM rather than from native 
code, so the scan is push-based instead of pull-based. Although the JVM pulls 
the next input batch from the child operator, that batch is not pulled by the 
native side; it is pushed into native from the JVM side.
   
   For an operator like Expand, one input batch can produce multiple output 
batches, so we cannot simply pull the next batch and push it into native. We 
first need to "peek" into the native side to see whether any output batches 
remain. If so, we take one as the next output; if not, we pull the next input 
batch and push it into the native side for execution.
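   
   The peek-then-push loop described above can be sketched roughly as follows. 
This is a toy model in plain Python, not Comet's actual code: `NativeExpand`, 
`push`, `peek`, `next_output`, and `drive_with_peek` are hypothetical names, and 
the rule that a push is ignored while output is still pending is an assumption 
taken from the description in this issue.

```python
from collections import deque


class NativeExpand:
    """Hypothetical stand-in for a native Expand operator.

    Each accepted input batch produces `factor` output batches.
    """

    def __init__(self, factor):
        self.factor = factor
        self.pending = deque()

    def push(self, batch):
        # Assumption: input pushed while output is still pending is ignored.
        if self.pending:
            return False
        for i in range(self.factor):
            self.pending.append((batch, i))
        return True

    def peek(self):
        # True if the native side still holds unconsumed output batches.
        return bool(self.pending)

    def next_output(self):
        return self.pending.popleft()


def drive_with_peek(child_batches, native):
    """JVM-side driver: drain pending native output before pushing new input."""
    child = iter(child_batches)
    outputs = []
    while True:
        if native.peek():
            outputs.append(native.next_output())
            continue
        batch = next(child, None)
        if batch is None:
            break
        native.push(batch)
    return outputs
```

   Calling `drive_with_peek(["a", "b"], NativeExpand(2))` drains both outputs 
of `"a"` before `"b"` is pushed. Skipping the `peek()` check would instead 
attempt to push `"b"` while output from `"a"` is still pending.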
   
   If we pull the next input from the child operator and push it into native 
without peeking, the new input will be ignored.
   
   Not only do we lack a consistent way to get input for native operators, but 
the code for input/output to native execution is also harder to understand 
because it mixes push-based and pull-based processing modes. This patch tries to 
make native execution fully pull-based.
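   
   For comparison, a fully pull-based pipeline might look like the sketch 
below, again as a toy model in plain Python rather than Comet's implementation. 
`scan`, `expand`, and the `pulled` log are hypothetical names used only for 
illustration. Each operator asks its child for the next batch only when it has 
no more output of its own, so no separate peek step is needed.

```python
def scan(batches, pulled):
    """Hypothetical scan: yields batches on demand and logs each pull."""
    for batch in batches:
        pulled.append(batch)
        yield batch


def expand(child, factor):
    """Hypothetical Expand: pulls one input batch, then yields `factor` outputs.

    The next input is requested from `child` only after all outputs for the
    current input have been consumed.
    """
    for batch in child:
        for i in range(factor):
            yield (batch, i)


pulled = []
plan = expand(scan(["a", "b"], pulled), 2)
first = next(plan)   # pulls "a" from the scan
second = next(plan)  # second output of "a"; the scan is not touched
```

   Here the consumer only ever calls `next(plan)`. The second input batch is 
requested from the scan when the third output is asked for, not before.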
   
   ### Describe the potential solution
   
   _No response_
   
   ### Additional context
   
   _No response_


