Igor Yastrebov created ARROW-6578:
-------------------------------------
Summary: [Python] Casting int64 to string columns
Key: ARROW-6578
URL: https://issues.apache.org/jira/browse/ARROW-6578
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Affects Versions: 0.14.1
Reporter: Igor Yastrebov
I wanted to cast a list of tables to the same schema so that I could use
concat_tables on them later. However, I encountered an ArrowNotImplementedError:
{code:java}
---------------------------------------------------------------------------
ArrowNotImplementedError Traceback (most recent call last)
<ipython-input-11-bd4916c221bf> in <module>
----> 1 list_tb = [i.cast(mts_schema, safe = True) for i in list_tb]
<ipython-input-11-bd4916c221bf> in <listcomp>(.0)
----> 1 list_tb = [i.cast(mts_schema, safe = True) for i in list_tb]
~\AppData\Local\Continuum\miniconda3\envs\cyclone\lib\site-packages\pyarrow\table.pxi in itercolumns()
~\AppData\Local\Continuum\miniconda3\envs\cyclone\lib\site-packages\pyarrow\table.pxi in pyarrow.lib.Column.cast()
~\AppData\Local\Continuum\miniconda3\envs\cyclone\lib\site-packages\pyarrow\error.pxi in pyarrow.lib.check_status()
ArrowNotImplementedError: No cast implemented from int64 to string
{code}
Some context: I want to read and concatenate a set of CSV files produced by
partitioning a single table. Casting after reading the CSV is usually
significantly faster than specifying column_types in ConvertOptions. Some
string columns are mostly populated with integer-like values, so in any
particular file such a column may contain only integers and be inferred as
int64. This situation is fairly common, so an option to cast an int64 column
to a string column would be helpful.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)