[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098522#comment-16098522 ]
Leif Walsh commented on SPARK-21187:
------------------------------------

[~rxin] [~bryanc], pandas does support array and map columns; it represents each value as a Python {{list}} or {{dict}} (with {{object}} dtype):
{code}
>>> pd.DataFrame({'x': [[1, 2, 3], [4, 5]],
...               'y': [{'hello': 1}, {'world': 2, ('fizz', 'buzz'): 3}]})
           x                                  y
0  [1, 2, 3]                       {'hello': 1}
1     [4, 5]  {'world': 2, ('fizz', 'buzz'): 3}
{code}
You could also model structs as namedtuples:
{code}
>>> import collections
>>> person = collections.namedtuple('person', ['first', 'last'])
>>> pd.DataFrame({'participants': [person('Reynold', 'Xin'),
...                                person('Bryan', 'Cutler')]})
     participants
0  (Reynold, Xin)
1  (Bryan, Cutler)
{code}
This column would also have {{object}} dtype. Another choice, for structs at least, is to model them as a hierarchical index on the columns:
{code}
>>> pd.DataFrame(data=[['Reynold', 'Xin'], ['Bryan', 'Cutler']],
...              columns=pd.MultiIndex(levels=[['participant'], ['first', 'last']],
...                                    labels=[[0, 0], [0, 1]]))
  participant
        first    last
0     Reynold     Xin
1       Bryan  Cutler
{code}
Let me know if this is unclear and I should elaborate.

> Complete support for remaining Spark data types in Arrow Converters
> -------------------------------------------------------------------
>
>                 Key: SPARK-21187
>                 URL: https://issues.apache.org/jira/browse/SPARK-21187
>             Project: Spark
>          Issue Type: Umbrella
>          Components: PySpark, SQL
>    Affects Versions: 2.3.0
>            Reporter: Bryan Cutler
>
> This is to track adding the remaining type support in Arrow Converters.
> Currently, only primitive data types are supported.
> Remaining types:
> * *Date*
> * *Timestamp*
> * *Complex*: Struct, Array, Map
> * *Decimal*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
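[Editorial note] The three representations from the comment can be combined into one runnable sketch. This is an illustration, not part of the original thread; note that the {{pd.MultiIndex(levels=..., labels=...)}} constructor shown above is the older spelling (newer pandas uses {{codes=}}), so the sketch below builds the column index with {{MultiIndex.from_tuples}}, which behaves the same either way:

```python
import collections

import pandas as pd

# 1. Array and map columns: each cell holds a Python list or dict,
#    and the column dtype is object.
df = pd.DataFrame({'x': [[1, 2, 3], [4, 5]],
                   'y': [{'hello': 1}, {'world': 2, ('fizz', 'buzz'): 3}]})
assert df['x'].dtype == object and df['y'].dtype == object

# 2. Structs as namedtuples: cells are tuples with named fields,
#    again stored with object dtype.
Person = collections.namedtuple('Person', ['first', 'last'])
people = pd.DataFrame({'participants': [Person('Reynold', 'Xin'),
                                        Person('Bryan', 'Cutler')]})
assert people['participants'].dtype == object
assert people['participants'][0].first == 'Reynold'

# 3. Structs as a hierarchical column index: one top-level name
#    ('participant') fanning out to its fields.
cols = pd.MultiIndex.from_tuples([('participant', 'first'),
                                  ('participant', 'last')])
structs = pd.DataFrame([['Reynold', 'Xin'], ['Bryan', 'Cutler']],
                       columns=cols)
# Selecting the top level yields a plain DataFrame of the fields.
assert list(structs['participant']['last']) == ['Xin', 'Cutler']
```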