[ 
https://issues.apache.org/jira/browse/ARROW-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-13317:
-----------------------------------
    Labels: documentation good-first-issue pull-request-available  (was: 
documentation good-first-issue)

> [Python] Improve documentation on what 'use_threads' does in 'read_feather'
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-13317
>                 URL: https://issues.apache.org/jira/browse/ARROW-13317
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 4.0.1
>            Reporter: Arun Joseph
>            Assignee: Alenka Frim
>            Priority: Trivial
>              Labels: documentation, good-first-issue, pull-request-available
>             Fix For: 7.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current documentation for 
> [read_feather|https://arrow.apache.org/docs/python/generated/pyarrow.feather.read_feather.html]
>  states the following:
> *use_threads* (_bool__,_ _default True_) – Whether to parallelize reading 
> using multiple threads.
> if the underlying file uses compression, then multiple threads can still be 
> spawned. The verbiage of the *use_threads* is ambiguous on whether the 
> restriction on multiple threads is only for the conversion from pyarrow to 
> the pandas dataframe vs the reading/decompression of the file itself which 
> might spawn additional threads.
> [set_cpu_count|http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count]
>  might be good to mention as a way to actually limit threads spawned



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to