Florian Jetter created ARROW-4629:
-------------------------------------

             Summary: [Python] Pandas to arrow conversion slowed down by local imports
                 Key: ARROW-4629
                 URL: https://issues.apache.org/jira/browse/ARROW-4629
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Florian Jetter
            Assignee: Florian Jetter
         Attachments: image-2019-02-19-19-10-46-330.png

The pandas to arrow conversion is currently slowed down significantly by various local import statements.

{code}
import pandas as pd
import pyarrow as pa
import cProfile

ser = pd.Series(range(10000))
df = pd.DataFrame({col: ser.copy(deep=True) for col in range(50)})
# Simulate a real dataset, i.e. force copy of data
df = df.astype({col: str for col in range(25)})

prof = cProfile.Profile()
prof.enable()
# a few times to collect statistics
for _ in range(100):
    pa.Table.from_pandas(df, nthreads=1)
prof.disable()
prof.dump_stats("array_conversion.prof")
{code}

!image-2019-02-19-19-10-46-330.png!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
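
For readers unfamiliar with why a local import can show up in a profile at all, here is a minimal sketch of the general effect. It is not pyarrow code: the standard-library `math` module stands in for whatever is imported inside the conversion path, and the names `with_local_import`, `with_module_import` and `compare` are hypothetical, chosen only for this illustration. An `import` statement inside a function body is executed on every call; the module is already cached in `sys.modules`, so nothing is reloaded, but the lookup and name binding still take time in a hot loop.

```python
import timeit
import math  # module-level import: executed once, when this file is loaded


def with_local_import(x):
    # Hypothetical stand-in for a hot function containing a local import.
    # The import statement runs on every call; the module itself comes from
    # the sys.modules cache, but the lookup is repeated each time.
    import math
    return math.sqrt(x)


def with_module_import(x):
    # Same work, relying on the module-level import above.
    return math.sqrt(x)


def compare(number=200_000):
    """Return (seconds with local import, seconds with module-level import)."""
    local = timeit.timeit(lambda: with_local_import(2.0), number=number)
    hoisted = timeit.timeit(lambda: with_module_import(2.0), number=number)
    return local, hoisted


if __name__ == "__main__":
    local, hoisted = compare()
    print(f"local import:        {local:.4f} s")
    print(f"module-level import: {hoisted:.4f} s")
```

The absolute numbers depend on the interpreter and machine, so none are quoted here. Whether hoisting the imports is the right fix for the functions in this report is a separate question, since local imports are sometimes used deliberately to avoid circular or optional dependencies.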