[
https://issues.apache.org/jira/browse/ARROW-12097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-12097:
-----------------------------------
Labels: pull-request-available (was: )
> [C++] Modify BackgroundGenerator so it creates fewer threads
> ------------------------------------------------------------
>
> Key: ARROW-12097
> URL: https://issues.apache.org/jira/browse/ARROW-12097
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Weston Pace
> Assignee: Weston Pace
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The current implementation creates a thread per block. In the CSV reader
> this costs a small amount of performance; in the IPC reader the cost is
> larger.
> Instead, the readahead can move inside the background generator. The
> background generator task then keeps running until the queue fills up, and
> restarts once the queue has drained enough for a substantial amount of work
> to be done.
> In my CSV test case this reduced the number of thread tasks created from
> ~2.5k to ~100.
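
The fill-then-restart behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request. It uses plain std::thread instead of Arrow's executors and futures, and every name in it (ReadaheadGenerator, max_q, q_restart, CountingSource, Drain) is hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a background generator with built-in readahead.
// One task keeps reading until the queue holds max_q items, then exits.
// A new task starts only after the consumer drains the queue to q_restart.
template <typename T>
class ReadaheadGenerator {
 public:
  using Source = std::function<std::optional<T>()>;  // nullopt == end of data

  ReadaheadGenerator(Source source, std::size_t max_q, std::size_t q_restart)
      : source_(std::move(source)), max_q_(max_q), q_restart_(q_restart) {
    assert(max_q_ > 0 && q_restart_ < max_q_);
    std::lock_guard<std::mutex> lock(mu_);
    StartTaskLocked();
  }

  ~ReadaheadGenerator() {
    { std::lock_guard<std::mutex> lock(mu_); stop_ = true; }
    if (worker_.joinable()) worker_.join();
  }

  // Blocks until an item is available; returns nullopt once the source ends.
  std::optional<T> Next() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !queue_.empty() || finished_; });
    if (queue_.empty()) return std::nullopt;
    T item = std::move(queue_.front());
    queue_.pop();
    // Restart only when enough room exists for a substantial batch of work.
    if (!finished_ && !running_ && queue_.size() <= q_restart_) StartTaskLocked();
    return item;
  }

  std::size_t tasks_started() const {
    std::lock_guard<std::mutex> lock(mu_);
    return tasks_started_;
  }

 private:
  void StartTaskLocked() {
    // The previous task cleared running_ under the lock as its last locked
    // step, so joining it here cannot deadlock.
    if (worker_.joinable()) worker_.join();
    running_ = true;
    ++tasks_started_;
    worker_ = std::thread([this] { RunTask(); });
  }

  void RunTask() {
    for (;;) {
      std::optional<T> item = source_();  // slow read happens outside the lock
      std::lock_guard<std::mutex> lock(mu_);
      if (!item) {
        finished_ = true;
        running_ = false;
        cv_.notify_all();
        return;
      }
      queue_.push(std::move(*item));
      cv_.notify_all();
      if (queue_.size() >= max_q_ || stop_) {  // queue full: end this task
        running_ = false;
        return;
      }
    }
  }

  Source source_;
  const std::size_t max_q_;
  const std::size_t q_restart_;
  mutable std::mutex mu_;
  std::condition_variable cv_;
  std::queue<T> queue_;
  std::thread worker_;
  bool running_ = false, finished_ = false, stop_ = false;
  std::size_t tasks_started_ = 0;
};

// Hypothetical test source that yields 0, 1, ..., n-1 and then ends.
inline std::function<std::optional<int>()> CountingSource(int n) {
  return [i = 0, n]() mutable -> std::optional<int> {
    if (i >= n) return std::nullopt;
    return i++;
  };
}

// Pulls every item out of the generator, in order.
template <typename T>
std::vector<T> Drain(ReadaheadGenerator<T>& gen) {
  std::vector<T> out;
  while (std::optional<T> item = gen.Next()) out.push_back(std::move(*item));
  return out;
}
```

With max_q = 32 and q_restart = 16, the consumer must pop at least 16 items between one task stopping and the next one starting, so the number of tasks grows with (items / 16) rather than with the number of items. The exact count depends on thread timing.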