Please take me off the mailing list.

On Mon, Jul 12, 2021 at 1:08 PM Wes McKinney <[email protected]> wrote:
> hi Arun, the `use_threads` argument here only toggles whether
> multiple threads are used in the conversion from the Arrow/Feather
> representation to pandas. Since you elected to use compression,
> multiple threads are used when decompressing the data, and this can
> only be changed by setting the number of threads globally in the
> pyarrow library [1]
>
> This seems a bit misleading to me, so it would be good to open a Jira
> issue to clarify in the documentation what "use_threads" does
>
> [1]:
> http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count
>
> On Mon, Jul 12, 2021 at 3:00 PM Arun Joseph <[email protected]> wrote:
> >
> > I'm running the following:
> >
> > Python 3.7.4 (default, Aug 13 2019, 20:35:49)
> > [GCC 7.3.0] :: Anaconda, Inc. on linux
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import pyarrow
> > >>> pyarrow.__version__
> > '4.0.1'
> >
> > from pyarrow import feather
> >
> > feather.write_feather(df, dest=file_path, compression='zstd', compression_level=19)
> > file_path=f'{valid_file_path}'
> > feather.read_feather(file_path, use_threads=False)
> >
> > It seems like the use_threads argument does not alter the number of
> > threads launched. I've tested with both use_threads=True and
> > use_threads=False. Am I misunderstanding what use_threads actually means?
> > It seems like it launches ~12 threads.
> >
> > Could this be related to the compression strategy of the file itself?
> >
> > Thank You,
> > Arun Joseph
> >

--
*First Talent Search & Placement*
*Burke Kaltenberger <https://www.linkedin.com/in/burke-kaltenberger-3a41731/> | Founder*
*408.458.0071*
