Harshitg opened a new issue #7882: URL: https://github.com/apache/arrow/issues/7882
Wanted to report the performance difference observed between Pandas and Pyarrow.

```
import numpy as np
import pandas as pd
import pyarrow as pa
import pyarrow.compute as pc

df = pd.DataFrame(np.random.randn(100000000))
%timeit -n 5 -r 5 df.multiply(df)

table = pa.Table.from_pandas(df)
%timeit -n 5 -r 5 pc.multiply(table[0],table[0])
```

Results:

```
%timeit -n 5 -r 5 df.multiply(df)
374 ms ± 15.9 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
```

```
%timeit -n 5 -r 5 pc.multiply(table[0],table[0])
698 ms ± 297 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org