iChauster commented on PR #13314:
URL: https://github.com/apache/arrow/pull/13314#issuecomment-1149099821

   > > I did observe a speedup, but it does not exactly match my outputs for 
the ExpressionOverhead testing I have on my laptop. Let me know if I missed 
something in my code.
   > 
   > So I guess the problem is that if we call `InputReceived` manually then we 
do not get any parallelism (that is today handled by the source node). So I 
think we will need to do that manually as well.
   > 
   > We could manually schedule in the benchmark by creating a new 
TaskScheduler. [Here is a rough 
example](https://github.com/apache/arrow/commit/6b0069d97e70394923fcaea5ab468f85eb282d1c)
 that could be cleaned up. It's a bit complex but we could start to share this 
logic if we are going to test all the nodes.
   
   Yes, I had an inkling it had to do with the batch delivery, especially since 
some of the metrics were slower than the source + projection + sink versions.
   
   By the way, I found that the `TaskScheduler` is actually less performant for 
some reason. Does it compete with `ExpressionOverhead` on your setup?
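
   For reference, here is a minimal sketch of the serial vs. parallel delivery 
difference discussed above. It is not Arrow code: `ToyNode`, `DeliverSerially`, 
and `DeliverInParallel` are hypothetical names, and plain `std::thread` stands 
in for whatever scheduler the benchmark ends up using. The assumption is that 
calling `InputReceived` in a loop from one thread handles batches one after 
another, while spreading the calls over several threads approximates the 
parallelism the source node normally provides.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for an exec node; this is NOT the Arrow API.
struct ToyNode {
  std::atomic<long> rows_seen{0};

  // Must be thread-safe, since batches may arrive concurrently.
  void InputReceived(const std::vector<int>& batch) {
    rows_seen.fetch_add(static_cast<long>(batch.size()));
  }
};

// Serial delivery: one thread pushes every batch, so there is no parallelism.
void DeliverSerially(ToyNode& node,
                     const std::vector<std::vector<int>>& batches) {
  for (const auto& batch : batches) {
    node.InputReceived(batch);
  }
}

// Parallel delivery: worker t pushes batches t, t + n, t + 2n, ...
void DeliverInParallel(ToyNode& node,
                       const std::vector<std::vector<int>>& batches,
                       std::size_t num_threads) {
  if (num_threads == 0) num_threads = 1;
  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&node, &batches, t, num_threads] {
      for (std::size_t i = t; i < batches.size(); i += num_threads) {
        node.InputReceived(batches[i]);
      }
    });
  }
  // Wait for all workers before reading any results from the node.
  for (auto& worker : workers) {
    worker.join();
  }
}
```

   Both paths deliver the same rows to the node; only the second can use more 
than one core, which is the piece that goes missing when the benchmark calls 
`InputReceived` directly.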


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
