Re: [Python] pyarrow.read_feather use_threads option not respected?

Wes McKinney Mon, 12 Jul 2021 13:40:56 -0700

hi Burke — to remove yourself, you have to e-mail

[email protected]


On Mon, Jul 12, 2021 at 3:11 PM Burke Kaltenberger
<[email protected]> wrote:
>
> Please take me off the mailing list
>
> On Mon, Jul 12, 2021 at 1:08 PM Wes McKinney <[email protected]> wrote:
>>
>> hi Arun — the `use_threads` argument here only toggles whether
>> multiple threads are used in the conversion from the Arrow/Feather
>> representation to pandas. Since you elected to use compression,
>> multiple threads are used when decompressing the data, and this can
>> only be changed by setting the number of threads globally in the
>> pyarrow library [1]
>>
>> This seems a bit misleading to me, so it would be good to open a Jira
>> issue to clarify in the documentation what "use_threads" does
>>
>> [1]: http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count
>>
>> On Mon, Jul 12, 2021 at 3:00 PM Arun Joseph <[email protected]> wrote:
>> >
>> > I'm running the following:
>> >
>> > Python 3.7.4 (default, Aug 13 2019, 20:35:49)
>> > [GCC 7.3.0] :: Anaconda, Inc. on linux
>> > Type "help", "copyright", "credits" or "license" for more information.
>> > >>> import pyarrow
>> > >>> pyarrow.__version__
>> > '4.0.1'
>> >
>> > from pyarrow import feather
>> >
>> > file_path = f'{valid_file_path}'
>> > feather.write_feather(df, dest=file_path, compression='zstd', compression_level=19)
>> > feather.read_feather(file_path, use_threads=False)
>> >
>> > It seems like the use_threads argument does not alter the number of 
>> > threads launched. I've tested with both use_threads=True and 
>> > use_threads=False. Am I misunderstanding what use_threads actually means? 
>> > It seems like it launches ~12 threads.
>> >
>> > Could this be related to the compression strategy of the file itself?
>> >
>> > Thank You,
>> > Arun Joseph
>> >
>
>
>
> --
> First Talent Search & Placement
> Burke Kaltenberger | Founder
> 408.458.0071
