guozhans commented on issue #40738:
URL: https://github.com/apache/arrow/issues/40738#issuecomment-2024456764
Hi @kyle-ip,
I had Arrow 14.0.0 and a 16.0.0 dev version installed in different folders before, and I wasn't aware of the old version until that day. I removed Arrow 14.0.0 completely from my Ubuntu Docker container, rebuilt the C++ library from the main branch with the commands below, and then reinstalled PyArrow 16.0.0 dev (I know I could build it as well, but I'm a bit lazy; a typical install command is shown right after the build commands). Everything looks fine now.
```shell
# configure, build, and install the Arrow C++ library (run from a build directory; ".." is the C++ source dir)
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_BUILD_TYPE=Release \
-DARROW_BUILD_TESTS=ON \
-DARROW_COMPUTE=ON \
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-DARROW_FILESYSTEM=ON \
-DARROW_HDFS=ON \
-DARROW_JSON=ON \
-DARROW_PARQUET=ON \
-DARROW_WITH_BROTLI=ON \
-DARROW_WITH_BZ2=ON \
-DARROW_WITH_LZ4=ON \
-DARROW_WITH_SNAPPY=ON \
-DARROW_WITH_ZLIB=ON \
-DARROW_WITH_ZSTD=ON \
-DPARQUET_REQUIRE_ENCRYPTION=ON \
.. \
&& make -j4 \
&& make install
```
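The PyArrow 16.0.0 dev reinstall used a prebuilt nightly rather than a source build. One way to pull such a wheel is the Arrow nightlies index; this is a representative command rather than necessarily the exact one I ran:

```shell
# Install a PyArrow nightly (dev) wheel; --pre allows pre-release versions.
pip install --upgrade \
    --extra-index-url https://pypi.fury.io/arrow-nightlies/ \
    --prefer-binary --pre pyarrow
```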
Result:
```shell
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    29    395.4 MiB    395.4 MiB           1    @profile
    30                                          def to_parquet(df: pd.DataFrame, filename: str):
    31    372.2 MiB    -23.2 MiB           1        table = Table.from_pandas(df)
    32    372.2 MiB      0.0 MiB           1        pool = pa.default_memory_pool()
    33    396.4 MiB     24.2 MiB           1        parquet.write_table(table, filename, compression="snappy")
    34    396.4 MiB      0.0 MiB           1        del table
    35    396.4 MiB      0.0 MiB           1        pool.release_unused()
```
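The result above comes from memory_profiler. For completeness, here is roughly how such a run can be reproduced end to end; the DataFrame contents, size, and output path below are placeholders I'm assuming for illustration, and only the profiled function body matches the output above:

```shell
# Hypothetical reproduction script; adjust the DataFrame and paths to your case.
cat > repro_to_parquet.py <<'EOF'
import numpy as np
import pandas as pd
import pyarrow as pa
from pyarrow import Table, parquet
from memory_profiler import profile

@profile
def to_parquet(df: pd.DataFrame, filename: str):
    # Convert the DataFrame to an Arrow table and write it as Snappy-compressed Parquet.
    table = Table.from_pandas(df)
    pool = pa.default_memory_pool()
    parquet.write_table(table, filename, compression="snappy")
    del table
    # Ask the default memory pool to return unused memory to the OS.
    pool.release_unused()

if __name__ == "__main__":
    # Placeholder data, not the DataFrame used in the original run.
    df = pd.DataFrame({"x": np.random.rand(1_000_000)})
    to_parquet(df, "/tmp/test.parquet")
EOF
python repro_to_parquet.py
```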