[ 
https://issues.apache.org/jira/browse/SPARK-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222310#comment-14222310
 ] 

Davies Liu commented on SPARK-4561:
-----------------------------------

I tried to do it, but found that it's not easy, because Row() could be nested 
in MapType and ArrayType (even UDT). It could also be expensive.

Maybe we should make it optional, using recursive=True?
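
A rough sketch of what an optional recursive conversion could look like. This 
is only an illustration, not the actual patch: the Row class below is a 
hypothetical stand-in for pyspark.sql.Row so the example runs without Spark, 
the recursive keyword is the assumed name from the suggestion above, and UDTs 
are not handled.

```python
class Row(tuple):
    """Hypothetical stand-in for pyspark.sql.Row: a tuple with named fields."""

    def __new__(cls, **kwargs):
        row = tuple.__new__(cls, kwargs.values())
        row.__fields__ = list(kwargs.keys())
        return row

    def asDict(self, recursive=False):
        """Return the fields as a dict.

        With recursive=True (assumed keyword), Rows nested inside lists
        (ArrayType) and dict values (MapType) are converted as well.
        """
        if not recursive:
            # Cheap default: nested Rows are left untouched.
            return dict(zip(self.__fields__, self))

        def convert(obj):
            if isinstance(obj, Row):
                return obj.asDict(recursive=True)
            if isinstance(obj, list):
                return [convert(item) for item in obj]
            if isinstance(obj, dict):
                return {key: convert(value) for key, value in obj.items()}
            return obj

        return {name: convert(value)
                for name, value in zip(self.__fields__, self)}


row = Row(results=[Row(time=1), Row(time=2)])
shallow = row.asDict()               # nested values are still Row objects
deep = row.asDict(recursive=True)    # nested Rows become dictionaries
```

Keeping recursive=False as the default means callers who don't need the 
nested conversion don't pay for walking every value.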

> PySparkSQL's Row.asDict() should convert nested rows to dictionaries
> --------------------------------------------------------------------
>
>                 Key: SPARK-4561
>                 URL: https://issues.apache.org/jira/browse/SPARK-4561
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 1.2.0
>            Reporter: Josh Rosen
>            Assignee: Davies Liu
>
> In PySpark, you can call {{.asDict()}} on a SparkSQL {{Row}} to convert it to 
> a dictionary.  Unfortunately, 
> though, this does not convert nested rows to dictionaries.  For example:
> {code}
> >>> sqlContext.sql("select results from results").first()
> Row(results=[Row(time=3.762), Row(time=3.47), Row(time=3.559), 
> Row(time=3.458), Row(time=3.229), Row(time=3.21), Row(time=3.166), 
> Row(time=3.276), Row(time=3.239), Row(time=3.149)])
> >>> sqlContext.sql("select results from results").first().asDict()
> {u'results': [(3.762,),
>   (3.47,),
>   (3.559,),
>   (3.458,),
>   (3.229,),
>   (3.21,),
>   (3.166,),
>   (3.276,),
>   (3.239,),
>   (3.149,)]}
> {code}
> Actually, it looks like the nested fields are just left as Rows (IPython's 
> fancy display logic obscured this in my first example):
> {code}
> >>> Row(results=[Row(time=1), Row(time=2)]).asDict()
> {'results': [Row(time=1), Row(time=2)]}
> {code}
> Here's the output I'd expect:
> {code}
> >>> Row(results=[Row(time=1), Row(time=2)]).asDict()
> {'results': [{'time': 1}, {'time': 2}]}
> {code}
> I ran into this issue when trying to use Pandas dataframes to display nested 
> data that I queried from Spark SQL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
