[ https://issues.apache.org/jira/browse/ARROW-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Artem KOZHEVNIKOV updated ARROW-6882:
-------------------------------------
Description:
I get an unexpected `ArrowInvalid` error when calling `pa.chunked_array`
directly on the indices of a dictionary-encoded array (code is below). Creating
a new view of the indices works around the problem.
{code:python}
import pyarrow as pa
ca = pa.array(['a', 'a', 'b', 'b', 'c'])
fca = ca.dictionary_encode()
fca.indices
<pyarrow.lib.Int32Array object at 0x1250fb888>
[
0,
0,
1,
1,
2
]
pa.chunked_array([fca.indices])
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-44-71ca3b877e1c> in <module>
----> 1 pa.chunked_array([fca.indices])
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowInvalid: Unexpected dictionary values in array of type int32
# with another memory view it's OK
pa.chunked_array([fca.indices.view(fca.indices.type)])
Out[45]:
<pyarrow.lib.ChunkedArray object at 0x12508dc78>
[
[
0,
0,
1,
1,
2
]
]
{code}
was:
I get an unexpected `ArrowInvalid` error when calling `pa.chunked_array`
directly on the indices of a dictionary-encoded array (code is below). Creating
a new array over the same buffers works around the problem.
{code:python}
import pyarrow as pa
ca = pa.array(['a', 'a', 'b', 'b', 'c'])
fca = ca.dictionary_encode()
fca.indices
<pyarrow.lib.Int32Array object at 0x1250fb888>
[
0,
0,
1,
1,
2
]
pa.chunked_array([fca.indices])
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-44-71ca3b877e1c> in <module>
----> 1 pa.chunked_array([fca.indices])
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.chunked_array()
~/Projects/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowInvalid: Unexpected dictionary values in array of type int32
# with another memory view it's OK
pa.chunked_array([pa.Array.from_buffers(type=pa.int32(),
length=len(fca.indices), buffers=fca.indices.buffers())])
Out[45]:
<pyarrow.lib.ChunkedArray object at 0x12508dc78>
[
[
0,
0,
1,
1,
2
]
]
{code}
> cannot create a chunked_array from dictionary_encoding result
> -------------------------------------------------------------
>
> Key: ARROW-6882
> URL: https://issues.apache.org/jira/browse/ARROW-6882
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.15.0
> Reporter: Artem KOZHEVNIKOV
> Priority: Major
> Fix For: 0.15.1