[ https://issues.apache.org/jira/browse/ARROW-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951364#comment-16951364 ]

Joris Van den Bossche edited comment on ARROW-6882 at 10/14/19 9:12 PM:
------------------------------------------------------------------------

That said, this is only a regression because we now validate the resulting 
array automatically. On 0.14.1, manually validating the resulting array gives 
the same error.

You get the same error when validating the indices array directly:

{code}
In [23]: fca.indices.validate()   
...
ArrowInvalid: Unexpected dictionary values in array of type int32
{code}


was (Author: jorisvandenbossche):
Although, it is only a regression because we now validate the resulting array 
automatically. On 0.14.1 manually validating the resulting array gives the same 
error.

> cannot create a chunked_array from dictionary_encoding result
> -------------------------------------------------------------
>
>                 Key: ARROW-6882
>                 URL: https://issues.apache.org/jira/browse/ARROW-6882
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.15.0
>            Reporter: Artem KOZHEVNIKOV
>            Priority: Major
>             Fix For: 0.15.1
>
>
> I hit a strange error when applying `pa.chunked_array` directly to the 
> indices of a dictionary-encoded array (code is below). Creating a new view of 
> the indices works around the problem.
> {code:python}
> import pyarrow as pa
> ca = pa.array(['a', 'a', 'b', 'b', 'c'])
> fca = ca.dictionary_encode()
> fca.indices
> <pyarrow.lib.Int32Array object at 0x1250fb888>
> [
>   0,
>   0,
>   1,
>   1,
>   2
> ]
> pa.chunked_array([fca.indices])
> ---------------------------------------------------------------------------
> ArrowInvalid                              Traceback (most recent call last)
> <ipython-input-44-71ca3b877e1c> in <module>
> ----> 1 pa.chunked_array([fca.indices])
> ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
> ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Unexpected dictionary values in array of type int32
> # with another memory view it's OK
> pa.chunked_array([fca.indices.view(fca.indices.type)])
> Out[45]:
> <pyarrow.lib.ChunkedArray object at 0x12508dc78>
> [
>   [
>     0,
>     0,
>     1,
>     1,
>     2
>   ]
> ]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
