[ 
https://issues.apache.org/jira/browse/ARROW-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16395150#comment-16395150
 ] 

ASF GitHub Bot commented on ARROW-2141:
---------------------------------------

pitrou commented on a change in pull request #1689: ARROW-2141: [Python] Support variable length binary conversion from Pandas
URL: https://github.com/apache/arrow/pull/1689#discussion_r173777749
 
 

 ##########
 File path: cpp/src/arrow/python/numpy_to_arrow.cc
 ##########
 @@ -164,18 +163,26 @@ static Status AppendObjectBinaries(PyArrayObject* arr, PyArrayObject* mask,
     if ((have_mask && mask_values[offset]) || PandasObjectIsNull(obj)) {
       RETURN_NOT_OK(builder->AppendNull());
       continue;
-    } else if (!PyBytes_Check(obj)) {
+    } else if (PyBytes_Check(obj)) {
+      const int32_t length = static_cast<int32_t>(PyBytes_GET_SIZE(obj));
+      if (ARROW_PREDICT_FALSE(builder->value_data_length() + length >
+                              kBinaryMemoryLimit)) {
+        break;
 
 Review comment:
   After reading the code more carefully, I understand the logic. There is 
still a problem, though: what if `length` alone is larger than `kBinaryMemoryLimit`?
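
   To make the concern concrete, here is a minimal, self-contained sketch of the
chunking pattern with an explicit guard for an oversized value. It does not use
Arrow: `ChunkBuilder`, `AppendResult`, `AppendUntilFull` and the `limit`
parameter are hypothetical stand-ins for the builder, the `break`, and
`kBinaryMemoryLimit` in the diff. It also assumes the caller reacts to a full
chunk by starting a new builder and resuming at the same offset, which the hunk
above does not show. Under that assumption, a value longer than the limit would
never fit in any chunk, so it has to be reported as an error instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical outcome codes; the code in the diff uses Status and `break`.
enum class AppendResult { kDone, kChunkFull, kValueTooLarge };

// Hypothetical stand-in for a binary builder that tracks its data size.
struct ChunkBuilder {
  int64_t value_data_length = 0;
  std::vector<std::string> values;
};

// Appends values starting at *offset until the input is exhausted, the
// chunk is full, or a single value can never fit in any chunk.
// `limit` plays the role of kBinaryMemoryLimit (an assumption).
AppendResult AppendUntilFull(const std::vector<std::string>& values,
                             std::size_t* offset, ChunkBuilder* builder,
                             int64_t limit) {
  for (; *offset < values.size(); ++*offset) {
    // Use a 64-bit length so the comparison below cannot be defeated by
    // truncation of a very large size.
    const int64_t length = static_cast<int64_t>(values[*offset].size());
    if (length > limit) {
      // Starting a fresh chunk would not help: report an error instead.
      return AppendResult::kValueTooLarge;
    }
    if (builder->value_data_length + length > limit) {
      // Caller is expected to start a new chunk and retry at *offset.
      return AppendResult::kChunkFull;
    }
    builder->values.push_back(values[*offset]);
    builder->value_data_length += length;
  }
  return AppendResult::kDone;
}
```

   A caller would loop on `kChunkFull` with a fresh `ChunkBuilder` and stop with
an error on `kValueTooLarge`, so the conversion always makes progress.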



> [Python] Conversion from Numpy object array to varsize binary unimplemented
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-2141
>                 URL: https://issues.apache.org/jira/browse/ARROW-2141
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 0.8.0
>            Reporter: Antoine Pitrou
>            Assignee: Bryan Cutler
>            Priority: Major
>              Labels: pull-request-available
>
> {code:python}
> >>> arr = np.array([b'xx'], dtype=np.object)
> >>> pa.array(arr, type=pa.binary(2))
> <pyarrow.lib.FixedSizeBinaryArray object at 0x7fe1ecaefa98>
> [
>   b'xx'
> ]
> >>> pa.array(arr, type=pa.binary())
> Traceback (most recent call last):
>   File "<ipython-input-12-e40948b94b33>", line 1, in <module>
>     pa.array(arr, type=pa.binary())
>   File "array.pxi", line 177, in pyarrow.lib.array
>   File "error.pxi", line 77, in pyarrow.lib.check_status
>   File "error.pxi", line 85, in pyarrow.lib.check_status
> ArrowNotImplementedError: /home/antoine/arrow/cpp/src/arrow/python/numpy_to_arrow.cc:1585 code: converter.Convert()
> /home/antoine/arrow/cpp/src/arrow/python/numpy_to_arrow.cc:1098 code: compute::Cast(&context, *arr, type_, options, &casted)
> /home/antoine/arrow/cpp/src/arrow/compute/kernels/cast.cc:1022 code: Cast(ctx, Datum(array.data()), out_type, options, &datum_out)
> /home/antoine/arrow/cpp/src/arrow/compute/kernels/cast.cc:1009 code: GetCastFunction(*value.type(), out_type, options, &func)
> No cast implemented from binary to binary
> {code}



