[
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746387#action_12746387
]
Santhosh Srinivasan commented on PIG-922:
-----------------------------------------
Review comments for patch p1_1
I have not reviewed the test cases. I have reviewed all the sources.
Index: src/org/apache/pig/impl/logicalLayer/RelationalOperator.java
===================================================================
In the second example, why are A.a0 and B.b0 relevant input columns for C.$0 ?
I don't see this logic in LOJoin.getRelevantInputs()
{code}
+ * eg2:
+ * A = load 'a' AS (a0, a1);
+ * B = load 'b' AS (b0, b1);
+ * C = join A by a0, B by b0;
+ *
+ * Relevant input columns for C.$0 is A.a0, B.b0. Relevant input columns
for C.$1 is A.a1.
{code}
Index: src/org/apache/pig/impl/logicalLayer/LOForEach.java
===================================================================
I am not sure about the logic for the computation of the inner plan number that
produces the output column in getRelevantInputs. I would recommend that you
cache the schema generated by the inner plan (as part of getSchema()) and use
that information here.
{code}
+ // find the index of foreach inner plan for this particular output
column
+ LogicalOperator pOp = null;
+ int planIndex = 0;
+ try {
+ pOp =
mSchema.getField(0).getReverseCanonicalMap().keySet().iterator().next();
+
+ for (int i=1;i<=column;i++)
+ {
+ if
(mSchema.getField(i).getReverseCanonicalMap().keySet().iterator().next()!=pOp)
+ {
+ planIndex++;
+ pOp =
mSchema.getField(i).getReverseCanonicalMap().keySet().iterator().next();
+ }
+ }
+ } catch (FrontendException e) {
+ log.warn("Cannot retrieve field schema from "+mSchema.toString());
+ return null;
+ }
{code}
Index: src/org/apache/pig/impl/logicalLayer/LOCogroup.java
===================================================================
Why are we adding null to the list of required fields while iterating over the
inputs?
{code}
+ if(inputNum == column-1) {
+ result.add(new RequiredFields(true));
+ } else {
+ result.add(null);
+ }
{code}
Index: src/org/apache/pig/impl/plan/RequiredFields.java
===================================================================
Where are the following methods used? I did not see any calls to them.
{code}
+
+ // return true if this merge modify the object itself
+ public boolean merge(RequiredFields r2)
+ {
+ boolean newRequiredFields = false;
+ if (r2==null)
+ return newRequiredFields;
+ if (r2.getNeedAllFields())
+ {
+ mNeedAllFields = true;
+ }
+ if (!r2.getNeedNoFields())
+ {
+ mNeedNoFields = false;
+ }
+ if (r2.getFields()==null)
+ return newRequiredFields;
+ for (Pair<Integer, Integer> f:r2.getFields())
+ {
+ if (mFields==null)
+ mFields = new ArrayList<Pair<Integer, Integer>>();
+ if (!mFields.contains(f))
+ {
+ mFields.add(f);
+ mNeedNoFields = false;
+ newRequiredFields = true;
+ }
+ }
+ return newRequiredFields;
+ }
+
+ public void reIndex(int i)
+ {
+ for (Pair<Integer, Integer> p:mFields)
+ {
+ p.first = i;
+ }
+ }
{code}
> Logical optimizer: push up project
> ----------------------------------
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.4.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch
>
>
> This is a continuation work of
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add
> another rule to the logical optimizer: Push up project, ie, prune columns as
> early as possible.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.