[
https://issues.apache.org/jira/browse/BEAM-12379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363286#comment-17363286
]
Brian Hulette commented on BEAM-12379:
--------------------------------------
The failure in the [insert
test|https://github.com/apache/beam/blob/c0b8e6531f6ade6a9c9e50222542e041954ba911/sdks/python/apache_beam/dataframe/frames_test.py#L796]
seems to be an issue with implicitly converting int64 columns to float64 when
a NaN is needed.
Interestingly, every dtype in the proxy is flipped relative to the actual
result: float64 -> int64, int64 -> float64.
{code}
(Pdb) print(actual.dtypes)
foo    float64
A        int64
B        int64
dtype: object
(Pdb) print(actual)
   foo  A  B
0  NaN  1  4
1  8.0  2  5
2  NaN  3  6
(Pdb) print(proxy.dtypes)
foo      int64
A      float64
B      float64
dtype: object
{code}
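For reference, a minimal standalone sketch of the upcast in plain pandas, outside of Beam. The input frame and the partially-indexed Series are assumptions chosen to match the output above, not copied from the test. The proxy is built by hand from the dtypes printed above, and dtype_mismatches is a hypothetical helper, not part of Beam or pandas.

```python
import pandas as pd

# Assumed input: two int64 columns, matching the A/B values printed above.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# An int64 Series that only covers row 1. insert() aligns it on the index,
# so rows 0 and 2 need NaN and the new column is upcast to float64.
partial = pd.Series([8], index=[1])

actual = df.copy()
actual.insert(0, 'foo', partial)

# Hand-built empty frame carrying the proxy dtypes reported in the comment.
proxy = pd.DataFrame({
    'foo': pd.Series(dtype='int64'),
    'A': pd.Series(dtype='float64'),
    'B': pd.Series(dtype='float64'),
})


def dtype_mismatches(proxy, actual):
    """Hypothetical helper: map column -> (proxy dtype, actual dtype) where they differ."""
    return {
        col: (str(proxy[col].dtype), str(actual[col].dtype))
        for col in actual.columns
        if proxy[col].dtype != actual[col].dtype
    }


print(actual.dtypes)
print(dtype_mismatches(proxy, actual))
```

A check along these lines could back the kind of proxy-versus-result test the issue asks for, though how it would be wired into the existing test harness is left open here.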
> Some DataFrame operations yield incorrect proxies
> -------------------------------------------------
>
> Key: BEAM-12379
> URL: https://issues.apache.org/jira/browse/BEAM-12379
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: P2
> Labels: dataframe-api
> Time Spent: 5h
> Remaining Estimate: 0h
>
> There are some operations that yield proxies which do not match the data they
> produce at runtime. We should add tests that verify proxies match, and fix
> the operations where they don't.