amoeba commented on issue #38260:
URL: https://github.com/apache/arrow/issues/38260#issuecomment-1761945671

   Here are two flamegraphs produced with py-spy. The major difference 
between the two appears to be the proportion of time spent in 
`dataframe_to_arrays`.
   
   ```
   python version:  3.11.6
   pyarrow version: 11.0.0
   pandas version:  1.5.3
   numpy version:   1.26.0
   Conversion from pandas to pyarrow took 1.1433022079290822 seconds for 20000 columns
   ```
   
   
![flamegraph-pandas153-20000](https://github.com/apache/arrow/assets/563/3f259475-0a73-4517-b4c0-7671e0f434f3)
   
   ```
   python version:  3.11.6
   pyarrow version: 13.0.0
   pandas version:  2.1.1
   numpy version:   1.26.0
   Conversion from pandas to pyarrow took 3.7711586660007015 seconds for 20000 columns
   ```
   
   
![flamegraph-pandas211-20000](https://github.com/apache/arrow/assets/563/7eba16bc-87b0-4989-8809-25854d4a766d)
   

