[ https://issues.apache.org/jira/browse/ARROW-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330335#comment-17330335 ]
Antoine Pitrou commented on ARROW-11673:
----------------------------------------
There is no reason to accept unsafe casts for this feature. A dictionary with
invalid indices would not merely contain incorrect data; it would crash as soon
as you accessed it.
Also, we have internal utilities to do this:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/int_util.h
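
To illustrate why the cast should always be checked, here is a minimal
pure-Python sketch of the validation a safe index cast implies. It is not the
Arrow implementation and does not use the utilities in int_util.h; the names
INT_BOUNDS and check_indices_fit are hypothetical and exist only for this
example. The assumption is that every non-null index must both point inside the
dictionary and be representable in the target index type.

```python
# Hypothetical sketch, not Arrow code: validate dictionary indices before
# narrowing the index type, instead of allowing an unsafe cast.

# Representable range of each signed integer index type.
INT_BOUNDS = {
    "int8": (-2**7, 2**7 - 1),
    "int16": (-2**15, 2**15 - 1),
    "int32": (-2**31, 2**31 - 1),
    "int64": (-2**63, 2**63 - 1),
}


def check_indices_fit(indices, target_type, dictionary_length):
    """Return the indices unchanged if the cast to target_type is safe.

    Raises IndexError if an index does not point into the dictionary, and
    OverflowError if an index cannot be represented in target_type.
    None stands for a null slot and is skipped.
    """
    lo, hi = INT_BOUNDS[target_type]
    for position, index in enumerate(indices):
        if index is None:
            continue  # a null slot carries no index to validate
        if not 0 <= index < dictionary_length:
            raise IndexError(
                f"index {index} at position {position} is outside the "
                f"dictionary (length {dictionary_length})"
            )
        if not lo <= index <= hi:
            raise OverflowError(
                f"index {index} at position {position} does not fit in "
                f"{target_type}"
            )
    return list(indices)
```

With a check like this, a failed cast surfaces as an error at cast time rather
than as a crash when the dictionary array is later accessed.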
> [C++] Casting dictionary type to use different index type
> ---------------------------------------------------------
>
> Key: ARROW-11673
> URL: https://issues.apache.org/jira/browse/ARROW-11673
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Joris Van den Bossche
> Assignee: Eduardo Ponce
> Priority: Major
>
> Casting from one dictionary type to another dictionary type in order to
> change the index type is currently not implemented.
> Small example:
> {code}
> In [2]: arr = pa.array(['a', 'b', 'a']).dictionary_encode()
> In [3]: arr.type
> Out[3]: DictionaryType(dictionary<values=string, indices=int32, ordered=0>)
> In [5]: arr.cast(pa.dictionary(pa.int8(), pa.string()))
> ...
> ArrowNotImplementedError: Unsupported cast from dictionary<values=string,
> indices=int32, ordered=0> to dictionary<values=string, indices=int8,
> ordered=0> (no available cast function for target type)
> ../src/arrow/compute/cast.cc:112
> GetCastFunctionInternal(cast_options->to_type, args[0].type().get())
> {code}
> From
> https://stackoverflow.com/questions/66223730/how-to-change-column-datatype-with-pyarrow