Harshitg opened a new issue #7882:
URL: https://github.com/apache/arrow/issues/7882


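   I wanted to report a performance difference I observed between pandas and 
PyArrow: element-wise multiplication of a 100,000,000-row column takes roughly 
twice as long with `pc.multiply` as with `df.multiply`. 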
   
   ```
   import numpy as np
   import pandas as pd
   import pyarrow as pa
   import pyarrow.compute as pc
   
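   # 100 million float64 values in a single column
   df = pd.DataFrame(np.random.randn(100000000))
   %timeit -n 5 -r 5 df.multiply(df)
   
   # Same data as an Arrow table; table[0] is its first column
   table = pa.Table.from_pandas(df)
   %timeit -n 5 -r 5 pc.multiply(table[0], table[0])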
   ```
   
   Results:
   ```
   %timeit -n 5 -r 5 df.multiply(df)
   374 ms ± 15.9 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
   ```
   
   ```
   %timeit -n 5 -r 5 pc.multiply(table[0],table[0])
   698 ms ± 297 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org