[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-5302:
-----------------------------
Assignee: Cheng Lian
> Add support for SQLContext "partition" columns
> ----------------------------------------------
>
> Key: SPARK-5302
> URL: https://issues.apache.org/jira/browse/SPARK-5302
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Reporter: Bob Tiernay
> Assignee: Cheng Lian
> Fix For: 1.4.0
>
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}
> where {{dt}} is a column of type {{TEXT}}).
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide a
> nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.
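
To make the request concrete, here is a minimal sketch in plain Python of the idea described above: deriving typed virtual columns from Hive-style {{key=value}} path segments, and using them to skip directories before any file is read. This is not Spark's API or its implementation. The names {{parse_partition_values}}, {{prune_partitions}} and {{casts}} are hypothetical, chosen only for illustration, and the extra example paths are assumed.

```python
from datetime import date
from typing import Any, Callable, Dict, List, Optional

# Hypothetical helper: maps a column name to a function that converts the
# raw path string into a typed value (a stand-in for a DataType).
Casts = Dict[str, Callable[[str], Any]]


def parse_partition_values(path: str, casts: Optional[Casts] = None) -> Dict[str, Any]:
    """Extract key=value path segments as virtual column values."""
    casts = casts or {}
    values: Dict[str, Any] = {}
    for segment in path.strip("/").split("/"):
        key, sep, raw = segment.partition("=")
        if sep and key:
            # Columns without a registered cast stay as plain strings.
            values[key] = casts.get(key, str)(raw)
    return values


def prune_partitions(
    paths: List[str],
    predicate: Callable[[Dict[str, Any]], bool],
    casts: Optional[Casts] = None,
) -> List[str]:
    """Keep only the directories whose partition values satisfy the predicate."""
    return [p for p in paths if predicate(parse_partition_values(p, casts))]


# Assumed example layout, following the path shape from the description.
paths = [
    "/data/clicks/dt=2015-01-01/",
    "/data/clicks/dt=2015-01-02/",
    "/data/clicks/dt=2015-01-03/",
]

# Type the dt column as a date instead of a string.
casts = {"dt": date.fromisoformat}

# Roughly what a filter such as dt >= '2015-01-02' could reduce to.
selected = prune_partitions(paths, lambda row: row["dt"] >= date(2015, 1, 2), casts)
```

Because the predicate is evaluated against values parsed from the directory names alone, the files under the rejected directories never need to be opened, and the {{dt}} value does not have to be stored inside each file.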