wesm commented on pull request #6744:
URL: https://github.com/apache/arrow/pull/6744#issuecomment-622529037
Thanks @lidavidm! I'm confident we'll be able to devise some solutions to
the resource allocation problem.
This is
wesm commented on pull request #6744:
URL: https://github.com/apache/arrow/pull/6744#issuecomment-621308415
I wrote up a ticket for round-robin task scheduling, which might help with
this: https://issues.apache.org/jira/browse/ARROW-8626
wesm commented on pull request #6744:
URL: https://github.com/apache/arrow/pull/6744#issuecomment-621305031
Yeah, I think one definite thing that needs to happen at minimum is
externalizing the thread pool used for asynchronous IO calls so that the user
is able to set whatever concurrency
wesm commented on pull request #6744:
URL: https://github.com/apache/arrow/pull/6744#issuecomment-621299892
@pitrou I think the problem is the global IO thread pool:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/interfaces.cc#L310
So if you read multiple files
wesm commented on pull request #6744:
URL: https://github.com/apache/arrow/pull/6744#issuecomment-621266083
Yes, we should discuss on the mailing list.
For the record, IO-related tasks should almost certainly not be using the
default global thread pool, which is intended for