[ https://issues.apache.org/jira/browse/PIG-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1693:
-------------------------------
Attachment: PIG-1693.1.patch
PIG-1693.1.patch
Highlights:
- ProjectExpression in the logical plan now supports project-range.
- LogicalPlanBuilder calls ProjectStarExpander while building foreach, group,
join, or sort expression plans, to expand the project-range expression.
- ProjectStarExpander expands all project-range expressions except
project-to-end (e.g. $5 ..) when the input schema is null. This is the only
case in which a project-range expression is seen by the logical optimizers or
the physical plan.
- Some of the logical optimizer rules have changed to handle project-to-end
use cases.
- POProject supports the project-to-end expression, and project-star is a
special case of project-to-end.
- MRCompiler and some MR optimizer rules have changed to handle the
project-to-end case of POProject.
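
To illustrate the expansion rule described in the highlights, here is a minimal Python sketch. It is not the code in the patch (which is Java); the names ProjectRange and expand_project_range are hypothetical, and it assumes ranges are given as zero-based column positions, with a missing end standing for project-to-end (as in $5 ..).

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class ProjectRange:
    """Hypothetical model of a project-range over column positions.

    end=None models project-to-end, e.g. "$5 .." is ProjectRange(5).
    """
    start: int
    end: Optional[int] = None


def expand_project_range(
    rng: ProjectRange, schema: Optional[List[str]]
) -> Union[List[int], ProjectRange]:
    """Expand a range into explicit column positions where possible.

    Assumption: a bounded positional range can be expanded without a schema,
    while project-to-end needs the schema to know where the input ends.
    """
    if rng.end is not None:
        # Bounded range: both endpoints are known, so expand directly.
        return list(range(rng.start, rng.end + 1))
    if schema is None:
        # Project-to-end with a null schema: leave it unexpanded so that
        # later stages (optimizer, physical plan) have to handle it.
        return rng
    # Project-to-end with a known schema: expand up to the last column.
    return list(range(rng.start, len(schema)))


if __name__ == "__main__":
    schema = ["a", "b", "c", "d"]
    print(expand_project_range(ProjectRange(1, 2), None))   # bounded, no schema
    print(expand_project_range(ProjectRange(1), schema))    # to-end, schema known
    print(expand_project_range(ProjectRange(5), None))      # to-end, schema null
    print(expand_project_range(ProjectRange(0), schema))    # behaves like project-star
```

In this model, ProjectRange(0) plays the role of project-star, which matches the note above that project-star is a special case of project-to-end.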
> support project-range expression. (was: There needs to be a way in foreach to
> indicate "and all the rest of the fields" )
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: PIG-1693
> URL: https://issues.apache.org/jira/browse/PIG-1693
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Reporter: Alan Gates
> Assignee: Thejas M Nair
> Fix For: 0.9.0
>
> Attachments: PIG-1693.1.patch
>
>
> A common use case we see in Pig is that people have many columns in their
> data but only want to operate on a few of them. For example, suppose that
> before storing data with ten columns, the user wants to cast one column:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, secondcol, thirdcol, fourthcol,
> fifthcol, sixthcol, seventhcol, eighthcol, ninthcol, tenthcol;
> store Z into 'output';
> {code}
> Obviously this only gets worse as the user has more columns. Ideally the
> above could be transformed to something like:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, "and all the rest";
> store Z into 'output';
> {code}