[
https://issues.apache.org/jira/browse/ARROW-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384373#comment-17384373
]
Weston Pace commented on ARROW-13386:
-------------------------------------
The challenge is that it isn't as simple as disabling multithreading. The R
tests still set use_threads to False. The CSV reader has always used at least
two threads, one for background reading and one for processing the file. This
is true even when use_threads is false. The flag is only intended to control
whether multiple CPU threads are used for parsing. Since the reading thread
uses barely any CPU power, this still qualifies as "serial". With this latest
PR, that changed slightly to 3 threads when use_threads is false: the calling
thread fetching batches, the reading thread, and a thread in between doing the
decoding.
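To make the three-thread layout concrete, here is a rough sketch in plain
Python. It is not Arrow's implementation; the function names, the queue sizes,
and the naive comma splitting are all assumptions made only for illustration.
It shows a reading thread, a decoding thread in between, and the calling
thread pulling batches.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def read_blocks(raw_chunks, out_q):
    # Reading thread: passes raw blocks downstream (mostly I/O, little CPU).
    for chunk in raw_chunks:
        out_q.put(chunk)
    out_q.put(_DONE)


def decode_blocks(in_q, out_q):
    # Thread in between: decodes raw bytes into rows (hypothetical parser).
    while True:
        chunk = in_q.get()
        if chunk is _DONE:
            out_q.put(_DONE)
            return
        out_q.put([line.split(",") for line in chunk.decode().splitlines()])


def stream_batches(raw_chunks):
    # Calling thread: fetches decoded batches one at a time.
    raw_q = queue.Queue(maxsize=2)
    batch_q = queue.Queue(maxsize=2)
    reader = threading.Thread(
        target=read_blocks, args=(raw_chunks, raw_q), daemon=True)
    decoder = threading.Thread(
        target=decode_blocks, args=(raw_q, batch_q), daemon=True)
    reader.start()
    decoder.start()
    while True:
        batch = batch_q.get()
        if batch is _DONE:
            break
        yield batch
    reader.join()
    decoder.join()
```

Even with "serial" parsing, three threads are alive at once here, which is the
situation described above.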
Moving back to 2 threads is a little tricky (and there may be no guarantee it
will satisfy RTools 3.5). I should have a PR, but I just wanted to explain why
it isn't as simple as turning a flag off.
If we want to degrade functionality, we need to disable CSV dataset scanning
on RTools 3.5 (which I hadn't considered as an alternative, but I will be
happy to do so if need be). At the moment, however, I think I should be able
to move back to 2 threads for use_threads=False, and I'm optimistic that will
help.
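As a rough idea of what a two-thread arrangement could look like, the sketch
below keeps a background reading thread and does the decoding on the calling
thread itself. This is plain Python with hypothetical names and a naive comma
split, not the actual Arrow change.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the stream


def background_read(raw_chunks, out_q):
    # Reading thread: the only extra thread in this layout.
    for chunk in raw_chunks:
        out_q.put(chunk)
    out_q.put(_END)


def stream_batches_two_threads(raw_chunks):
    raw_q = queue.Queue(maxsize=2)
    reader = threading.Thread(
        target=background_read, args=(raw_chunks, raw_q), daemon=True)
    reader.start()
    while True:
        chunk = raw_q.get()
        if chunk is _END:
            break
        # Decoding runs here, on the calling thread, as each batch is fetched.
        yield [line.split(",") for line in chunk.decode().splitlines()]
    reader.join()
```

The trade-off in this sketch is that the caller now pays the decoding cost
when it asks for a batch, instead of having it done ahead of time.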
> [R][C++] CSV streaming changes break Rtools 35 32-bit build
> -----------------------------------------------------------
>
> Key: ARROW-13386
> URL: https://issues.apache.org/jira/browse/ARROW-13386
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Continuous Integration, R
> Reporter: Neal Richardson
> Assignee: Weston Pace
> Priority: Critical
> Fix For: 5.0.0
>
>
> [https://github.com/ursacomputing/crossbow/runs/3106661055] on the commit
> "8ce0c01c3 ARROW-12745: [C++][Compute] Add floor, ceiling, and truncate
> kernels" passes.
>
> [https://github.com/ursacomputing/crossbow/runs/3104398258] crashes on the
> commit "17e6f23cf ARROW-11889: [C++] Add parallelism to streaming CSV reader"