[ https://issues.apache.org/jira/browse/ARROW-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426726#comment-16426726 ]
ASF GitHub Bot commented on ARROW-2380:
---------------------------------------

pitrou commented on issue #1835: ARROW-2380: [Python] Streamline conversions
URL: https://github.com/apache/arrow/pull/1835#issuecomment-378891901

Without this PR:
```
[ 28.57%] ··· Running convert_builtins.ConvertPyListToArray.time_convert    ok
[ 28.57%] ····
               ==================== =============
                       type
               -------------------- -------------
                      int32          5.67±0.06ms
                      uint32         5.77±0.06ms
                      int64          5.72±0.1ms
                      uint64         5.09±0.1ms
                     float32         5.16±0.01ms
                     float64         5.32±0.05ms
                       bool          4.51±0.02ms
                     decimal         187±0.2ms
                      binary         8.40±0.1ms
                     binary10        8.35±0.1ms
                      ascii          13.2±0.2ms
                     unicode         28.8±0.7ms
                    int64 list       51.0±0.6ms
                      struct         31.6±1ms
                 struct from tuples  31.0±2ms
               ==================== =============

[ 42.86%] ··· Running convert_builtins.InferPyListToArray.time_infer        ok
[ 42.86%] ····
               ============ =============
                   type
               ------------ -------------
                  int64      11.3±0.1ms
                 float64     10.4±0.02ms
                   bool      9.85±0.04ms
                 decimal     383±1ms
                  binary     14.8±0.1ms
                  ascii      20.3±0.3ms
                 unicode     38.2±0.4ms
                int64 list   102±0.3ms
               ============ =============
```

With this PR:
```
[ 28.57%] ··· Running convert_builtins.ConvertPyListToArray.time_convert    ok
[ 28.57%] ····
               ==================== =============
                       type
               -------------------- -------------
                      int32          5.20±0.05ms
                      uint32         5.03±0.06ms
                      int64          5.69±0.2ms
                      uint64         5.83±0.1ms
                     float32         4.84±0.02ms
                     float64         4.99±0.03ms
                       bool          4.27±0.02ms
                     decimal         179±0.7ms
                      binary         8.61±0.1ms
                     binary10        12.1±0.1ms
                      ascii          10.5±0.04ms
                     unicode         17.2±0.7ms
                    int64 list       49.8±0.7ms
                      struct         32.8±1ms
                 struct from tuples  32.7±2ms
               ==================== =============

[ 42.86%] ··· Running convert_builtins.InferPyListToArray.time_infer        ok
[ 42.86%] ····
               ============ =============
                   type
               ------------ -------------
                  int64      11.1±0.1ms
                 float64     10.2±0.02ms
                   bool      9.27±0.03ms
                 decimal     371±0.5ms
                  binary     15.0±0.1ms
                  ascii      17.2±0.1ms
                 unicode     28.7±0.6ms
                int64 list   97.1±0.4ms
               ============ =============
```

It's mostly a wash, except that converting unicode strings to Arrow became faster (on Python 3).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Python] Correct issues in numpy_to_arrow conversion routines
> -------------------------------------------------------------
>
>                 Key: ARROW-2380
>                 URL: https://issues.apache.org/jira/browse/ARROW-2380
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>            Reporter: Bryan Cutler
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> Following the discussion at https://github.com/apache/arrow/pull/1689,
> there are a few issues with conversion of various types to arrow that are
> incorrect or could be improved:
> * PyBytes_GET_SIZE is being cast to the wrong type, for example
> {{const int32_t length = static_cast<int32_t>(PyBytes_GET_SIZE(obj));}}
> * Handle the possibility with the statement
> {{builder->value_data_length() + length > kBinaryMemoryLimit}}
> if length is larger than kBinaryMemoryLimit
> * Look into using common code for binary object conversion to avoid
> duplication, and allow support for bytes and bytearray objects in other
> places than numpy_to_arrow. (possibly put in src/arrow/python/helpers.h)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
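
For reference, one possible reading of the first two bullets in the quoted issue is that the length should be kept in a 64-bit type and the limit check written so that the addition cannot overflow. The following is only a minimal sketch under that assumption, not the code from the PR: `WouldExceedLimit` and `kIllustrativeLimit` are hypothetical names, and the constant is a stand-in whose value is not claimed to match Arrow's `kBinaryMemoryLimit`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the limit named in the issue. The value is
// chosen for illustration only and is not taken from the Arrow sources.
constexpr int64_t kIllustrativeLimit = std::numeric_limits<int32_t>::max() - 1;

// Returns true if appending `length` bytes to a buffer that already holds
// `current_length` bytes would exceed `limit`.
// Assumes 0 <= current_length <= limit and length >= 0.
constexpr bool WouldExceedLimit(int64_t current_length, int64_t length,
                                int64_t limit = kIllustrativeLimit) {
  // Compare against the remaining capacity instead of computing
  // current_length + length, so the check itself cannot overflow.
  return length > limit - current_length;
}
```

A caller would pass the size as a 64-bit value (for example the `Py_ssize_t` result of `PyBytes_GET_SIZE` widened to `int64_t`) and only narrow it to `int32_t` after this check has passed.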