[ https://issues.apache.org/jira/browse/ARROW-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Artem KOZHEVNIKOV updated ARROW-6882:
-------------------------------------
    Description:
I got a strange error when applying `pa.chunked_array` directly to the indices produced by `dictionary_encode` (code is below). Creating a new memory view of the indices solves the problem.
{code:python}
import pyarrow as pa
ca = pa.array(['a', 'a', 'b', 'b', 'c'])
fca = ca.dictionary_encode()
fca.indices
<pyarrow.lib.Int32Array object at 0x1250fb888>
[
  0,
  0,
  1,
  1,
  2
]
pa.chunked_array([fca.indices])
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-44-71ca3b877e1c> in <module>
----> 1 pa.chunked_array([fca.indices])
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowInvalid: Unexpected dictionary values in array of type int32
# with another memory view it's OK
pa.chunked_array([fca.indices.view(fca.indices.type)])
Out[45]:
<pyarrow.lib.ChunkedArray object at 0x12508dc78>
[
  [
    0,
    0,
    1,
    1,
    2
  ]
]
{code}

was:
I got a strange error when applying `pa.chunked_array` directly to the indices produced by `dictionary_encode` (code is below). Creating a new memory view of the indices solves the problem.
{code:python}
import pyarrow as pa
ca = pa.array(['a', 'a', 'b', 'b', 'c'])
fca = ca.dictionary_encode()
fca.indices
<pyarrow.lib.Int32Array object at 0x1250fb888>
[
  0,
  0,
  1,
  1,
  2
]
pa.chunked_array([fca.indices])
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-44-71ca3b877e1c> in <module>
----> 1 pa.chunked_array([fca.indices])
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowInvalid: Unexpected dictionary values in array of type int32
# with another memory view it's OK
pa.chunked_array([pa.Array.from_buffers(type=pa.int32(), length=len(fca.indices), buffers=fca.indices.buffers())])
Out[45]:
<pyarrow.lib.ChunkedArray object at 0x12508dc78>
[
  [
    0,
    0,
    1,
    1,
    2
  ]
]
{code}

> cannot create a chunked_array from dictionary_encoding result
> -------------------------------------------------------------
>
>                 Key: ARROW-6882
>                 URL: https://issues.apache.org/jira/browse/ARROW-6882
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.15.0
>            Reporter: Artem KOZHEVNIKOV
>            Priority: Major
>             Fix For: 0.15.1
>
>
> I got a strange error when applying `pa.chunked_array` directly to the
> indices produced by `dictionary_encode` (code is below). Creating a new
> memory view of the indices solves the problem.
> {code:python}
> import pyarrow as pa
> ca = pa.array(['a', 'a', 'b', 'b', 'c'])
>
> fca = ca.dictionary_encode()
>
> fca.indices
>
> <pyarrow.lib.Int32Array object at 0x1250fb888>
> [
>   0,
>   0,
>   1,
>   1,
>   2
> ]
> pa.chunked_array([fca.indices])
>
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-44-71ca3b877e1c> in <module>
> ----> 1 pa.chunked_array([fca.indices])
> ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
> ~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Unexpected dictionary values in array of type int32
> # with another memory view it's OK
> pa.chunked_array([fca.indices.view(fca.indices.type)])
> Out[45]:
> <pyarrow.lib.ChunkedArray object at 0x12508dc78>
> [
>   [
>     0,
>     0,
>     1,
>     1,
>     2
>   ]
> ]
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)