[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098522#comment-16098522 ]
Leif Walsh commented on SPARK-21187:
------------------------------------
[~rxin] [~bryanc], pandas does support array and map columns; it represents
each value as a Python {{list}} or {{dict}} (with {{object}} dtype):
{code}
>>> pd.DataFrame({'x': [[1, 2, 3], [4, 5]],
...               'y': [{'hello': 1}, {'world': 2, ('fizz', 'buzz'): 3}]})
           x                                  y
0  [1, 2, 3]                       {'hello': 1}
1     [4, 5]  {'world': 2, ('fizz', 'buzz'): 3}
{code}
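On the Arrow side, an {{object}} column of lists like this already converts via
type inference; here's a minimal sketch, assuming pyarrow is installed (this
shows pyarrow's own inference, not the Spark converter):
{code}
# A minimal sketch (assuming pyarrow): an object column holding Python
# lists can be converted to an Arrow list array by type inference.
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({'x': [[1, 2, 3], [4, 5]]})
arr = pa.array(df['x'])
print(arr.type)  # list<item: int64>
{code}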
You could also model structs as namedtuples:
{code}
>>> import collections
>>> person = collections.namedtuple('person', ['first', 'last'])
>>> pd.DataFrame({'participants': [person('Reynold', 'Xin'),
...                                person('Bryan', 'Cutler')]})
      participants
0   (Reynold, Xin)
1  (Bryan, Cutler)
{code}
This would also have {{object}} dtype.
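You can confirm the dtype directly:
{code}
>>> pd.DataFrame({'participants': [person('Reynold', 'Xin'),
...                                person('Bryan', 'Cutler')]}).dtypes
participants    object
dtype: object
{code}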
Another choice, for structs at least, is to model the struct as a hierarchical
index on the columns:
{code}
>>> pd.DataFrame(data=[['Reynold', 'Xin'], ['Bryan', 'Cutler']],
...              columns=pd.MultiIndex(levels=[['participant'], ['first', 'last']],
...                                    codes=[[0, 0], [0, 1]]))
  participant
        first    last
0     Reynold     Xin
1       Bryan  Cutler
{code}
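One benefit of the hierarchical form is that the struct's fields select
naturally as sub-columns (continuing the example above):
{code}
>>> df = pd.DataFrame(data=[['Reynold', 'Xin'], ['Bryan', 'Cutler']],
...                   columns=pd.MultiIndex(levels=[['participant'], ['first', 'last']],
...                                         codes=[[0, 0], [0, 1]]))
>>> df['participant']
     first    last
0  Reynold     Xin
1    Bryan  Cutler
{code}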
Let me know if this is unclear and you'd like me to elaborate.
> Complete support for remaining Spark data types in Arrow Converters
> -------------------------------------------------------------------
>
> Key: SPARK-21187
> URL: https://issues.apache.org/jira/browse/SPARK-21187
> Project: Spark
> Issue Type: Umbrella
> Components: PySpark, SQL
> Affects Versions: 2.3.0
> Reporter: Bryan Cutler
>
> This is to track adding the remaining type support in Arrow Converters.
> Currently, only primitive data types are supported.
> Remaining types:
> * *Date*
> * *Timestamp*
> * *Complex*: Struct, Array, Map
> * *Decimal*
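All of the remaining types have Arrow counterparts; a minimal sketch of the
target schema in pyarrow (illustrative only, with made-up field names, not the
converter's actual implementation):
{code}
# Illustrative pyarrow schema covering the remaining Spark types;
# the field names here are made up for the example.
import pyarrow as pa

schema = pa.schema([
    ('d',   pa.date32()),                          # Date
    ('ts',  pa.timestamp('us')),                   # Timestamp
    ('s',   pa.struct([('first', pa.string()),
                       ('last', pa.string())])),   # Struct
    ('xs',  pa.list_(pa.int64())),                 # Array
    ('m',   pa.map_(pa.string(), pa.int64())),     # Map
    ('dec', pa.decimal128(38, 18)),                # Decimal
])
print(schema)
{code}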