[jira] [Commented] (SPARK-8007) Support resolving virtual columns in DataFrames

Joseph Batchik (JIRA) Fri, 17 Jul 2015 10:59:55 -0700

    [ https://issues.apache.org/jira/browse/SPARK-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631673#comment-14631673 ]


Joseph Batchik commented on SPARK-8007:
---------------------------------------

Reynold, thanks for pointing that out. I updated the commit to use what you 
suggested. This should also make it easy to add the other virtual columns 
described in the parent ticket: the only changes needed are to the resolver 
in the logical plan and to the new virtual column rule.

https://github.com/JDrit/spark/commit/7b46e7de6f98df98480fa34c85248aa2d90bc635#diff-d74f782d414a74eee09a4b6b9994be87R34
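
To make the resolver idea concrete, here is a toy model in plain Python. It is 
not the code in the commit and it does not use Spark: the registry 
`VIRTUAL_COLUMNS` and the helpers `resolve` and `group_by_count` are 
hypothetical names, and partitions are modeled as plain lists of dicts. It only 
illustrates the shape of the approach, where a column name is looked up among 
the virtual columns first and otherwise read from the row, applied to the 
skew use case from the issue description.

```python
from collections import Counter

# Hypothetical registry: virtual column name -> function of (partition_index, row).
# The column name mirrors the ticket; the structure is an assumption, not Spark's code.
VIRTUAL_COLUMNS = {
    "SPARK__PARTITION__ID": lambda partition_index, row: partition_index,
}


def resolve(name):
    """Return a function (partition_index, row) -> value for a column name."""
    if name in VIRTUAL_COLUMNS:
        return VIRTUAL_COLUMNS[name]
    # Ordinary columns are read from the row itself.
    return lambda partition_index, row: row[name]


def group_by_count(partitions, name):
    """Count rows per value of the named column across all partitions."""
    column = resolve(name)
    counts = Counter()
    for partition_index, rows in enumerate(partitions):
        for row in rows:
            counts[column(partition_index, row)] += 1
    return dict(counts)


# Three toy partitions with deliberately uneven sizes.
partitions = [
    [{"user": "a"}, {"user": "b"}, {"user": "a"}, {"user": "c"}],
    [{"user": "a"}],
    [],
]

skew = group_by_count(partitions, "SPARK__PARTITION__ID")
by_user = group_by_count(partitions, "user")
```

In this model, adding another virtual column amounts to one more entry in the 
registry; `group_by_count` and the rest of the lookup path stay unchanged. Note 
that the empty third partition contributes no rows, so it does not show up in 
the counts at all.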

> Support resolving virtual columns in DataFrames
> -----------------------------------------------
>
>                 Key: SPARK-8007
>                 URL: https://issues.apache.org/jira/browse/SPARK-8007
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>
> Create the infrastructure so we can resolve df("SPARK__PARTITION__ID") to 
> a SparkPartitionID expression.
> A cool use case is to understand physical data skew:
> {code}
> df.groupBy("SPARK__PARTITION__ID").count()
> {code}



