[ 
https://issues.apache.org/jira/browse/ARROW-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942841#comment-16942841
 ] 

Joris Van den Bossche commented on ARROW-5655:
----------------------------------------------

[~kszucs] I think this might already have been fixed in the meantime. Wes and I 
did some work related to schema handling over the last month.

> [Python] Table.from_pydict/from_arrays not using types in specified schema 
> correctly 
> -------------------------------------------------------------------------------------
>
>                 Key: ARROW-5655
>                 URL: https://issues.apache.org/jira/browse/ARROW-5655
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Assignee: Krisztian Szucs
>            Priority: Major
>             Fix For: 1.0.0
>
>
> Example with {{from_pydict}} (from 
> https://github.com/apache/arrow/pull/4601#issuecomment-503676534):
> {code:python}
> In [15]: table = pa.Table.from_pydict(
>     ...:     {'a': [1, 2, 3], 'b': [3, 4, 5]},
>     ...:     schema=pa.schema([('a', pa.int64()), ('c', pa.int32())]))
> In [16]: table
> Out[16]: 
> pyarrow.Table
> a: int64
> c: int32
> In [17]: table.to_pandas()
> Out[17]: 
>    a  c
> 0  1  3
> 1  2  0
> 2  3  4
> {code}
> Note that the specified schema 1) has different column names and 2) has a 
> non-default type (int32 vs int64), which leads to corrupted values.
> This is partly due to {{Table.from_pydict}} not using the type information in 
> the schema to convert the dictionary items to pyarrow arrays. In addition, 
> {{Table.from_arrays}} does not correctly cast the arrays to another dtype 
> when the schema specifies one.
> An additional question for {{Table.from_pydict}} is whether it should 
> actually override the 'b' key from the dictionary as column 'c' as defined in 
> the schema (this behaviour depends on the order of the dictionary, which is 
> not guaranteed below Python 3.6).
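
The corrupted column in the example above (3, 0, 4 instead of 3, 4, 5) is consistent with int64 data being read as int32 without a cast. The following is a minimal sketch of that effect using only the Python standard library. It is an illustration of one plausible explanation, not pyarrow's actual code path, and it assumes little-endian storage; the variable names are made up for the example.

```python
import struct

# Column 'b' from the report, stored as three little-endian 64-bit integers
# (the layout assumed here for a default int64 array).
values = (3, 4, 5)
int64_bytes = struct.pack("<3q", *values)

# Reading the first three 32-bit slots of that buffer without casting:
# each int64 occupies two int32 slots (low half, then high half).
reinterpreted = struct.unpack_from("<3i", int64_bytes)

# A value-preserving cast instead re-encodes each value as a 32-bit integer.
int32_bytes = struct.pack("<3i", *values)
cast = struct.unpack("<3i", int32_bytes)

print(reinterpreted)  # (3, 0, 4), matching column 'c' in the report
print(cast)           # (3, 4, 5)
```

The sketch only shows why the values look the way they do; a fix in pyarrow would need to convert or cast the data to the type given in the schema rather than relabel the existing buffer.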



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
