[
https://issues.apache.org/jira/browse/ARROW-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17410642#comment-17410642
]
Krisztian Szucs commented on ARROW-13914:
-----------------------------------------
Even though {{make_unions_}} is always false, I get the following results
locally:
{code}
In [29]: %timeit pa.array(data)
1.31 ms ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [30]: %timeit pa.array(data, type=ty)
647 µs ± 9.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [31]: %timeit pa.infer_type(data)
669 µs ± 3.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
{code}
So type inference roughly doubles the conversion time.
> [C++][Python] Optimize type inference when converting from python values
> ------------------------------------------------------------------------
>
> Key: ARROW-13914
> URL: https://issues.apache.org/jira/browse/ARROW-13914
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Krisztian Szucs
> Priority: Minor
>
> Currently we use an extensive set of checks to infer the arrow type from
> python sequences.
> Last time I checked using asv, the inference step had a significant overhead.
> We could try other approaches to speed up the type inference; see the comments:
> https://github.com/apache/arrow/pull/11076#discussion_r702808196
--
This message was sent by Atlassian Jira
(v8.3.4#803005)