[GitHub] [arrow-datafusion] Jimexist commented on pull request #546: parallelize window function evaluations

GitBox Thu, 24 Jun 2021 19:30:23 -0700


Jimexist commented on pull request #546:
URL: https://github.com/apache/arrow-datafusion/pull/546#issuecomment-868154982



   > In my view, creating tasks this small is unlikely to be of benefit for 
query execution. Note that spawn blocking will create new threads on demand 
(512 by default), which will lead to higher memory use, and additional CPU 
usage / context switching (which can hurt performance of other parts of the 
query execution). The spawn blocking is only meant for longer running tasks.
   > 
   > I am not sure the customized scheduler will handle tasks this small.
   > 
   > I think much larger gains at this moment can be achieved with more 
efficient implementations and parallelization at a much higher level.
   
   Thanks for the comment. I agree with the assessment above. In addition, in most 
cases the number of window functions within a logically planned phase should be 
1 or 2, so there is little point in parallelizing them.
   
   I think #569 is more promising.
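
To make the point about task size concrete, here is a minimal sketch, not the code from this PR. It assumes two hypothetical stand-in window functions (`running_sum` and `row_number`) and uses plain `std::thread::scope` in place of Tokio's `spawn_blocking`, so that it runs without external crates. With only one or two functions per phase, the threaded path spawns and joins a thread for each very small computation and returns exactly what the sequential path returns.

```rust
use std::thread;

/// Hypothetical stand-in for one window function: a running sum over a column.
fn running_sum(values: &[i64]) -> Vec<i64> {
    let mut acc = 0;
    values
        .iter()
        .map(|v| {
            acc += v;
            acc
        })
        .collect()
}

/// Hypothetical stand-in for a second window function: a 1-based row number.
fn row_number(values: &[i64]) -> Vec<i64> {
    (1..=values.len() as i64).collect()
}

/// Evaluate both window functions one after the other on the calling thread.
fn evaluate_sequential(values: &[i64]) -> Vec<Vec<i64>> {
    vec![running_sum(values), row_number(values)]
}

/// Evaluate each window function on its own scoped thread.
/// Every call pays for a thread spawn and a join, which is a poor trade
/// when the work per function is this small.
fn evaluate_threaded(values: &[i64]) -> Vec<Vec<i64>> {
    thread::scope(|s| {
        let a = s.spawn(|| running_sum(values));
        let b = s.spawn(|| row_number(values));
        vec![a.join().unwrap(), b.join().unwrap()]
    })
}

fn main() {
    let column = vec![3, 1, 4, 1, 5];
    let sequential = evaluate_sequential(&column);
    let threaded = evaluate_threaded(&column);
    // Both strategies produce identical results; only the overhead differs.
    assert_eq!(sequential, threaded);
    println!("{:?}", sequential);
}
```

The sketch only shows that the two strategies are interchangeable in terms of output. Whether the threaded variant is ever faster depends on batch size and would need to be measured.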


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
