Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

via GitHub Sat, 25 Jan 2025 06:01:06 -0800


tustvold commented on PR #14286:
URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2613975126


   In the interests of avoiding confusion, as my objections appear to have 
been somewhat misinterpreted, I'd like to clarify that the non-trivial 
overhead of this approach is **not** what concerns me. Rather, we know from 
experience at InfluxData that this pattern is fragile, easy to mess up, and 
leads to emergent behaviour that is highly non-trivial to reproduce and debug.
   
   That being said, as Andrew says, nobody has emerged who is able and willing 
to resolve this with a more holistic approach, e.g. something closer to what 
Polars/DuckDB/Hyper do to separate IO from compute, and so proceeding with 
something is better than nothing. I had just hoped someone might step up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


