[
https://issues.apache.org/jira/browse/ARROW-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17196573#comment-17196573
]
Krisztian Szucs edited comment on ARROW-9997 at 9/15/20, 10:03 PM:
-------------------------------------------------------------------
My issue is that StructScalar is an *arrow object* which implements a Python mapping
interface. Once we have duplicate keys the object stops working: we cannot
do anything with it, since every operation will raise a KeyError (not just a
call to {{.as_py()}}).
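
A minimal pure-Python sketch of that failure mode. {{StructLike}} is a hypothetical stand-in written for illustration, not pyarrow's implementation; it assumes that a lookup by an ambiguous field name raises a KeyError, as described above.

```python
from collections.abc import Mapping


class StructLike(Mapping):
    """Hypothetical stand-in for a struct scalar; not pyarrow's implementation."""

    def __init__(self, fields):
        # fields: list of (name, value) pairs; names may repeat.
        self._fields = list(fields)

    def __len__(self):
        return len(self._fields)

    def __iter__(self):
        # Yield field names in order, duplicates included.
        return (name for name, _ in self._fields)

    def __getitem__(self, key):
        matches = [value for name, value in self._fields if name == key]
        if len(matches) != 1:
            # Missing or ambiguous name: there is no single value to return.
            raise KeyError(key)
        return matches[0]


ok = StructLike([("a", 1), ("b", 2)])
dup = StructLike([("a", 1), ("a", 2)])

print(dict(ok))  # {'a': 1, 'b': 2}

try:
    dict(dup)  # items(), values() and dict() all go through __getitem__
except KeyError as exc:
    print("KeyError:", exc)
```

Because the Mapping mixin methods are all built on {{__getitem__}}, a single ambiguous name is enough to make the whole interface unusable in this model.
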
My other problem is that the struct array/scalar is the only type where we
fail to roundtrip between Arrow and Python (at least according to a hypothesis
test):
{code:python}
pa.array(arr.to_pylist(), type=arr.type)
pa.scalar(scalar.as_py(), type=scalar.type)
{code}
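
The roundtrip concern can be illustrated in plain Python. This is only a sketch: {{roundtrips}} is a hypothetical helper, and a struct is modelled here as a list of (name, value) pairs rather than as a real Arrow object.

```python
def roundtrips(fields):
    """Return True if (name, value) pairs survive a trip through a dict."""
    as_mapping = dict(fields)            # later duplicates overwrite earlier ones
    restored = list(as_mapping.items())  # rebuild the pairs from the dict
    return restored == list(fields)


unique = [("a", 1), ("b", 2)]
duplicated = [("a", 1), ("a", 2)]

print(roundtrips(unique))      # True
print(roundtrips(duplicated))  # False
```

With unique names nothing is lost, but with duplicate names a dict cannot hold every field, so the original value cannot be rebuilt from it.
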
If we want convenient Pythonic access to StructScalar, I'd rather add a
best-effort {{.as_dict()}} method.
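
What "best effort" could mean is open. One possible reading, sketched below as a free function over (name, value) pairs, is to keep the first value seen for each field name. The function name, its signature, and the first-wins policy are all assumptions made for illustration; none of this is an existing pyarrow API.

```python
def as_dict(fields):
    """Best-effort conversion: keep the first value seen for each field name."""
    result = {}
    for name, value in fields:
        # setdefault leaves an existing entry untouched, so the first value wins.
        result.setdefault(name, value)
    return result


print(as_dict([("a", 1), ("b", 2), ("a", 3)]))  # {'a': 1, 'b': 2}
```

Such a method never raises on duplicate names, at the cost of silently dropping the shadowed fields, which is why it would sit next to the Arrow object rather than define its mapping interface.
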
> [Python] StructScalar.as_py() fails if the type has duplicate field names
> -------------------------------------------------------------------------
>
> Key: ARROW-9997
> URL: https://issues.apache.org/jira/browse/ARROW-9997
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Krisztian Szucs
> Assignee: Krisztian Szucs
> Priority: Major
> Fix For: 2.0.0
>
>
> {{StructScalar}} currently extends an abstract Mapping interface. Since the
> type allows duplicate field names, we cannot provide that API.