Florian Jetter created ARROW-4629:
-------------------------------------
Summary: [Python] Pandas to arrow conversion slowed down by local imports
Key: ARROW-4629
URL: https://issues.apache.org/jira/browse/ARROW-4629
Project: Apache Arrow
Issue Type: Bug
Reporter: Florian Jetter
Assignee: Florian Jetter
Attachments: image-2019-02-19-19-10-46-330.png
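The pandas to Arrow conversion (pa.Table.from_pandas) is currently slowed
down significantly by various function-local import statements that run
again on every call. The following script profiles the conversion: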
{code}
import pandas as pd
import pyarrow as pa
import cProfile
ser = pd.Series(range(10000))
df = pd.DataFrame({col: ser.copy(deep=True) for col in range(50)})
# Simulate a real dataset, i.e. force copy of data
df = df.astype({col: str for col in range(25)})
prof = cProfile.Profile()
prof.enable()
# Run the conversion a few times to collect statistics
for _ in range(100):
    pa.Table.from_pandas(df, nthreads=1)
prof.disable()
prof.dump_stats("array_conversion.prof")
{code}
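The dump written by the script can also be inspected without a GUI viewer. Below is a minimal sketch using the standard library pstats module. Here workload() is a hypothetical stand-in for the pa.Table.from_pandas call, so the snippet runs without pandas or pyarrow, and the temporary file path is only illustrative. For the real profile, pass "array_conversion.prof" to pstats.Stats instead.

```python
import cProfile
import os
import pstats
import tempfile


def workload():
    # Hypothetical stand-in for pa.Table.from_pandas: the import statement
    # below is executed on every call, like a function-local import.
    import json
    return json.dumps({"a": 1})


prof = cProfile.Profile()
prof.enable()
for _ in range(1000):
    workload()
prof.disable()

# Write the profile to a temporary location (illustrative path only).
path = os.path.join(tempfile.mkdtemp(), "example.prof")
prof.dump_stats(path)

# Load the dump, shorten file paths, and list the ten entries with the
# highest cumulative time.
stats = pstats.Stats(path)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
```

Sorting by cumulative time puts the callers that dominate the run at the top, which should make it easier to see where the import-related time is spent.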
!image-2019-02-19-19-10-46-330.png!
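To show the general effect in isolation, here is a rough micro-benchmark sketch. It is not taken from the pyarrow code base: the function names are hypothetical and json merely stands in for whichever module is imported locally. Absolute timings depend on the machine and Python version, so no expected numbers are given.

```python
import timeit

import json as _json  # module-level import, resolved once at load time


def with_local_import():
    # The import statement runs on every call. After the first call it is
    # only a lookup of the already loaded module plus a local name binding,
    # but that work is still repeated each time.
    import json
    return json


def with_module_import():
    # Only a global name lookup; no import statement is executed per call.
    return _json


n = 100_000
local_t = timeit.timeit(with_local_import, number=n)
module_t = timeit.timeit(with_module_import, number=n)

print(f"local import:  {local_t / n * 1e9:.0f} ns per call")
print(f"module import: {module_t / n * 1e9:.0f} ns per call")
```

The per-call overhead is small, but it is paid on every invocation, so it can add up when several such imports sit on a hot path.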
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)