[
https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Gates reassigned PIG-299:
------------------------------
Assignee: Santhosh Srinivasan
> Filter operator not included in the main predecessor plan structure
> -------------------------------------------------------------------
>
> Key: PIG-299
> URL: https://issues.apache.org/jira/browse/PIG-299
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: types_branch
> Environment: N/A
> Reporter: Tyson Condie
> Assignee: Santhosh Srinivasan
> Priority: Blocker
> Fix For: types_branch
>
>
> Take the following query, which can be found in the testQuery80() method of
> TestLogicalPlanBuilder.java:
> a = load 'input1' as (name, age, gpa);
> b = filter a by age < '20';
> c = group b by (name,age);
> d = foreach c {
> cf = filter b by gpa < '3.0';
> cp = cf.gpa;
> cd = distinct cp;
> co = order cd by gpa;
> generate group, flatten(co);
> };
> The filter statement "cf = filter b by gpa < '3.0';" is not accessible via the
> LogicalPlan::getPredecessor method. Here is the explain plan printout of the
> inner foreach plan:
> |---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag
> | |
> | Project Test-Plan-Builder-16 Projections: [0] Overloaded: false
> FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> | Input: Distinct Test-Plan-Builder-1
> |
> |---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag
> |
> |---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false
> FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> Input: Project Test-Plan-Builder-13 Projections: [*]
> Overloaded: false
> |
> |---Project Test-Plan-Builder-13 Projections: [*] Overloaded:
> false FieldSchema: cf: tuple({name: bytearray,age: bytearray,gpa: bytearray})
> Type: tuple
> Input: Filter Test-Plan-Builder-12
> OPERATOR PROJECT SCHEMA
> {name: bytearray,age: bytearray,gpa: bytearray}
> As you can see, the filter is only accessible via the
> LOProject::getExpression() method. It does not show up as an input operator.
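
The symptom described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `PlanSketch` and `Op` classes and the `reachableViaPredecessors` and `buildInnerPlan` helpers are hypothetical names invented for illustration, and they are not Pig's LogicalPlan or LOProject classes. The sketch shows how an operator that is held only as an expression reference, and never added as a predecessor edge, is invisible to any walk that follows predecessor edges.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model for illustration; NOT Pig's actual plan classes.
final class PlanSketch {

    static final class Op {
        final String name;
        // Edges recorded in the plan structure (what a predecessor walk follows).
        final List<Op> predecessors = new ArrayList<>();
        // An operand held only by reference, outside the plan's edge set.
        Op expression;

        Op(String name) {
            this.name = name;
        }
    }

    // Collects the names of all operators reachable from root via predecessor edges only.
    static Set<String> reachableViaPredecessors(Op root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<Op> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if (seen.add(op.name)) {
                for (Op p : op.predecessors) {
                    stack.push(p);
                }
            }
        }
        return seen;
    }

    // Builds Sort <- Distinct <- Project[gpa] <- Project[*], with the filter always
    // referenced as Project[*]'s expression and optionally wired in as a predecessor.
    static Op buildInnerPlan(boolean connectFilter) {
        Op filter = new Op("Filter");
        Op projectStar = new Op("Project[*]");
        Op projectGpa = new Op("Project[gpa]");
        Op distinct = new Op("Distinct");
        Op sort = new Op("Sort");

        projectStar.expression = filter;
        if (connectFilter) {
            projectStar.predecessors.add(filter);
        }
        projectGpa.predecessors.add(projectStar);
        distinct.predecessors.add(projectGpa);
        sort.predecessors.add(distinct);
        return sort;
    }

    public static void main(String[] args) {
        // Filter referenced only through the expression field: the walk misses it.
        System.out.println(reachableViaPredecessors(buildInnerPlan(false)));
        // Filter also connected as a predecessor: the walk finds it.
        System.out.println(reachableViaPredecessors(buildInnerPlan(true)));
    }
}
```

In this toy model the first walk never visits the filter, while the second does. Whether the real fix wires the filter in directly or goes through an additional operator is a separate question that the sketch does not try to answer.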
> Focus on the projection immediately following the filter. If I remove this
> projection then I get a correct plan. For example, let the inner foreach plan
> be as follows:
> d = foreach c {
> cf = filter b by gpa < '3.0';
> cd = distinct cf;
> co = order cd by gpa;
> generate group, flatten(co);
> };
> Then I get the following (correct) explain plan output:
> |---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa:
> bytearray} Type: bag
> | |
> | Project Test-Plan-Builder-14 Projections: [2] Overloaded: false
> FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> | Input: Distinct Test-Plan-Builder-1
> |
> |---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age:
> bytearray,gpa: bytearray} Type: bag
> |
> |---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age:
> bytearray,gpa: bytearray} Type: bag
> | |
> | LesserThan Test-Plan-Builder-11 FieldSchema: null Type:
> Unknown
> | |
> | |---Project Test-Plan-Builder-9 Projections: [2] Overloaded:
> false FieldSchema: Type: Unknown
> | | Input: CoGroup Test-Plan-Builder-7
> | |
> | |---Const Test-Plan-Builder-10 FieldSchema: chararray Type:
> chararray
> |
> |---Project Test-Plan-Builder-8 Projections: [1] Overloaded:
> false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray})
> Type: bag
> Input: CoGroup Test-Plan-Builder-7
> OPERATOR PROJECT SCHEMA
> {name: bytearray,age: bytearray,gpa: bytearray}
> Alan said that the problem is we don't generate a foreach operator for the
> 'cp = cf.gpa' statement. Please let me know if this can be resolved.
> Thanks,
> Tyson
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.