[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746387#action_12746387
 ] 

Santhosh Srinivasan commented on PIG-922:
-----------------------------------------

Review comments for patch p1_1

I have not reviewed the test cases. I have reviewed all the sources.

Index: src/org/apache/pig/impl/logicalLayer/RelationalOperator.java
===================================================================

In the second example, why are A.a0 and B.b0 relevant input columns for C.$0 ? 
I don't see this logic in LOJoin.getRelevantInputs()

{code}
+     * eg2:
+     * A = load 'a' AS (a0, a1);
+     * B = load 'b' AS (b0, b1);
+     * C = join A by a0, B by b0;
+     * 
+     * Relevant input columns for C.$0 is A.a0, B.b0. Relevant input columns 
for C.$1 is A.a1.
{code}

Index: src/org/apache/pig/impl/logicalLayer/LOForEach.java
===================================================================

I am not sure about the logic for the computation of the inner plan number that 
produces the output column in getRelevantInputs. I would recommend that you 
cache the schema generated by the inner plan (as part of getSchema()) and use 
that information here.

{code}
+        // find the index of foreach inner plan for this particular output 
column
+        LogicalOperator pOp = null;
+        int planIndex = 0;
+        try {
+            pOp = 
mSchema.getField(0).getReverseCanonicalMap().keySet().iterator().next();
+         
+            for (int i=1;i<=column;i++)
+            {
+                if 
(mSchema.getField(i).getReverseCanonicalMap().keySet().iterator().next()!=pOp)
+                {
+                    planIndex++;
+                    pOp = 
mSchema.getField(i).getReverseCanonicalMap().keySet().iterator().next();
+                }
+            }
+        } catch (FrontendException e) {
+            log.warn("Cannot retrieve field schema from "+mSchema.toString());
+            return null;
+        }
{code}

Index: src/org/apache/pig/impl/logicalLayer/LOCogroup.java
===================================================================

Why are we adding null to the list of required fields while iterating over the 
inputs?

{code}
+            if(inputNum == column-1) {
+                result.add(new RequiredFields(true));
+            } else {
+                result.add(null);
+            }
{code}


Index: src/org/apache/pig/impl/plan/RequiredFields.java
===================================================================

Where are the following methods used? I did not see any calls to them.

{code}
+    
+    // return true if this merge modify the object itself 
+    public boolean merge(RequiredFields r2)
+    {
+        boolean newRequiredFields = false;
+        if (r2==null)
+            return newRequiredFields;
+        if (r2.getNeedAllFields())
+        {
+            mNeedAllFields = true;
+        }
+        if (!r2.getNeedNoFields())
+        {
+            mNeedNoFields = false;
+        }
+        if (r2.getFields()==null)
+            return newRequiredFields;
+        for (Pair<Integer, Integer> f:r2.getFields())
+        {
+            if (mFields==null)
+                mFields = new ArrayList<Pair<Integer, Integer>>(); 
+            if (!mFields.contains(f))
+            {
+                mFields.add(f);
+                mNeedNoFields = false;
+                newRequiredFields = true;
+            }
+        }
+        return newRequiredFields;
+    }
+
+    public void reIndex(int i)
+    {
+        for (Pair<Integer, Integer> p:mFields)
+        {
+            p.first = i;
+        }
+    }
{code}

> Logical optimizer: push up project
> ----------------------------------
>
>                 Key: PIG-922
>                 URL: https://issues.apache.org/jira/browse/PIG-922
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.4.0
>
>         Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch
>
>
> This is a continuation work of 
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
> another rule to the logical optimizer: Push up project, ie, prune columns as 
> early as possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to