[ 
https://issues.apache.org/jira/browse/ARROW-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-4131:
----------------------------------
    Fix Version/s:     (was: 0.14.0)
                   0.15.0

> [Python] Coerce mixed columns to String
> ---------------------------------------
>
>                 Key: ARROW-4131
>                 URL: https://issues.apache.org/jira/browse/ARROW-4131
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Leo Meyerovich
>            Priority: Major
>             Fix For: 0.15.0
>
>
> Continuing [https://github.com/apache/arrow/issues/3280] 
>  
> ===
>  
> I'm seeing variants of this elsewhere (e.g., 
> [wesm/feather#349|https://github.com/wesm/feather/issues/349] ): 
> not all pandas DataFrames convert to Arrow tables, and when conversion 
> fails, it does not fail in a way that is conducive to automation.
> Sample:
> {{mixed_df = pd.DataFrame({'mixed': [1, 'b']})}}
> {{pa.Table.from_pandas(mixed_df)}}
> raises:
> {{ArrowInvalid: ('Could not convert b with type str: tried to convert to 
> double', 'Conversion failed for column mixed with type object')}}
> I would have expected behavior more like the following:
>  * Coerce mixed columns to strings by default, with a default-off option to 
> disallow string coercions
>  * Provide a default-off option to {{from_pandas}} to auto-coerce
>  * Name the exception so it is clear that this is a column coercion failure, 
> and include the column name(s), so the failure is predictable and can be 
> handled cleanly by both library writers and users
> I lean towards:
>  * Auto-coerce by default, improving life for early users: 
> {{coerce_mixed_columns_to_strings=True}}
>  * For less frequent yet more advanced library implementors, allow 
> overriding it to {{False}}
>  * In that case, raise a predictable, machine-readable exception: 
> {{MixedColumnException(mixed_columns=['a', 'b', ...], msg="....")}}
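
The policy proposed above can be sketched without any library, as a rough illustration only. Everything here is hypothetical: {{MixedColumnException}}, {{find_mixed_columns}}, and {{coerce_mixed_columns}} are not part of pyarrow or pandas, and the flag name is simply taken from the proposal. The sketch treats a column as "mixed" when its non-null values have more than one Python type, which is a simplification (it would also flag int/float mixes that Arrow can already handle).

```python
class MixedColumnException(ValueError):
    """Hypothetical exception from the proposal; not a pyarrow class."""

    def __init__(self, mixed_columns, msg=None):
        self.mixed_columns = list(mixed_columns)
        super().__init__(
            msg or f"Columns with mixed value types: {self.mixed_columns}"
        )


def find_mixed_columns(columns):
    """Return names of columns whose non-null values have more than one type.

    `columns` is a mapping of column name -> iterable of values.
    """
    mixed = []
    for name, values in columns.items():
        kinds = {type(v) for v in values if v is not None}
        if len(kinds) > 1:
            mixed.append(name)
    return mixed


def coerce_mixed_columns(columns, coerce_mixed_columns_to_strings=True):
    """Stringify mixed columns, or raise if coercion is disabled."""
    mixed = find_mixed_columns(columns)
    if mixed and not coerce_mixed_columns_to_strings:
        raise MixedColumnException(mixed_columns=mixed)
    result = {}
    for name, values in columns.items():
        if name in mixed:
            # Keep nulls as None; convert everything else to str.
            result[name] = [None if v is None else str(v) for v in values]
        else:
            result[name] = list(values)
    return result


# Default behavior: the mixed column is coerced to strings.
coerced = coerce_mixed_columns({'mixed': [1, 'b']})
# coerced == {'mixed': ['1', 'b']}

# Opt-out behavior: callers get the offending column names programmatically.
try:
    coerce_mixed_columns({'mixed': [1, 'b']},
                         coerce_mixed_columns_to_strings=False)
except MixedColumnException as exc:
    failed = exc.mixed_columns  # ['mixed']
```

The coerced mapping could then be wrapped in {{pd.DataFrame(...)}} and passed to {{pa.Table.from_pandas}}; a real implementation would presumably live inside the conversion itself rather than in a pre-processing step like this.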



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
