[ 
https://issues.apache.org/jira/browse/PIG-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681913#action_12681913
 ] 

Santhosh Srinivasan commented on PIG-693:
-----------------------------------------

Review comments on 693.patch

Overall comment:

Instead of removing the private member variables and the associated 
constructors, its better to mark the constructor and the other methods as 
deprecated. If there are developers who are using these sources then compiler 
warning messages will help them move the code instead of forcing them to make 
immediate changes.

Specific comments:

Index: 
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/LogToPhyTranslationVisitor.java
=====================================================================================

Is it required to have the complete package name?

{code}
+        List<org.apache.pig.impl.logicalLayer.ExpressionOperator> fromList = 
func.getArguments();
{code}

Index: src/org/apache/pig/impl/logicalLayer/RemoveRedundantOperators.java
=============================================================

1. Can you add an error code and an appropriate message to the VisitorException 
?
{code}
+                            throw new VisitorException(pe.getMessage(),pe);
{code}


Index: src/org/apache/pig/impl/logicalLayer/validators/TypeCheckingVisitor.java
===================================================================
1. Instead of node use node.getClass().getSimpleName(). Otherwise, users will 
see a huge string with contains the operator key and scope. This is useful for 
developers but not for users.
{code}
+            String msg = "Problem with inserting cast operator for " + node + 
" in plan.";
{code}

Index: src/org/apache/pig/impl/plan/OperatorPlan.java
===========================================
1.  Use Integer instead of string. Integer, Long, etc. can handle null and 
toString will ensure that null is printed as "null".

{code}
+            String size = " null ";
+            if(preds != null)
+                size = ((Integer)preds.size()).toString();
{code}

2. I see that you are following the convention of the source for throwing 
exceptions. It will be good if you can include an error code, not log the error 
and just throw the exception.

{code}
+            PlanException pe = new PlanException("Attempt to remove " +
+                    " and reconnect for node with  " + size +
+            " predecessors.");
+            log.error(pe.getMessage());
+            throw pe;
{code}

{code}
+        //checking pre-requisite conditions
+        if (C == null || C.size() == 0) {
+            PlanException pe = new PlanException("Attempt to remove " +
+            " and reconnect for node with no successors.");
+            log.error(pe.getMessage());
+            throw pe;
+        }   

{code}

3. OperatorPlan is a generic class and should not hold any references to 
LogicalOperator, ExpressionOperator or LOProject. In addition, the last section 
of removeAndReconnectMultiSucc checks for LOProject. This should instead go 
into the patchInputReference method in RemoveRedundantOperators.java.

{code}
+import org.apache.pig.impl.logicalLayer.ExpressionOperator;
+import org.apache.pig.impl.logicalLayer.LOProject;
+import org.apache.pig.impl.logicalLayer.LogicalOperator;
{code}

{code}
+            //special handling of LOProject because its getExpression() does
+            // need not be same as getPredecessors(LOProject)
+            if(c instanceof LOProject){
+                LOProject lop = (LOProject)c;
+                if(B == lop.getExpression()){
+                    lop.setExpression((LogicalOperator)B);
+                }
+            }
{code}

4. The removeAndReconnectMultiSucc does not take into account successors of A. 
If A has other successors apart from B then the following code will remove the 
edge from A to the other successors.

{code}
+        // Do A.succesors = B.successors(ie C)
+        mFromEdges.removeKey(A);
+        mFromEdges.put(A,C);
{code}

5. Minor comment, the use of variables A, B, C should be changed to nodeA, 
nodeB, nodeC to distinguish between types (like E) and instances (like A, B, 
C). Its a little bit confusing.

> Parameter to UDF which is an alias returned in another UDF in nested foreach 
> causes incorrect results
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PIG-693
>                 URL: https://issues.apache.org/jira/browse/PIG-693
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 1.0.0
>            Reporter: Viraj Bhat
>            Assignee: Thejas M Nair
>         Attachments: 693.patch, 693.utest.patch, one.txt, REPLACEALL.java, 
> URLDECODE.java
>
>
> Consider the following Pig Script
> {code}
> register myudf.jar;
> A = load 'one.txt' using PigStorage() as ( one: int ); --use this dummy file 
> to start execution
> B = foreach A {
>         dec = myudf.URLDECODE('hello');
>         str1 = myudf.REPLACEALL(dec, '[\\u0000-\\u0020]', ' '); -- ERROR
>         str2 = myudf.REPLACEALL('hello', '[\\u0000-\\u0020]', ' ');
>         generate
>                 dec,
>                 str1,
>                 str2;
>         };
> describe B;
> dump B;
> {code}
> where one.txt is a file which contains number one (1) for starting execution 
> of the Pig script!!
> {code}
> describe B; 
> {code}
>  returns the following:
> B: {urldecode_9: chararray,replaceall_urldecode_10_11: 
> chararray,replaceall_12: chararray}
> {code}
> dump B;
> {code}
> returns 
> (hello,[\u0000-\u0020],hello)
> The result should be:
> There is a workaround for the same, 
> {code}
> register myudf.jar;
> A = load 'one.txt' using PigStorage() as ( one: int );
> B = foreach A {
>         dec = myudf.URLDECODE('hello');
>         generate
>                 dec as dec,
>                 myudf.REPLACEALL(dec, '[\\u0000-\\u0020]', ' ') as str1,
>                 myudf.REPLACEALL('hello', '[\\u0000-\\u0020]', ' ') as str2;
>         };
> describe B;
> dump B;
> {code}
> where 
> {code}
> dump B;
> {code}
> returns (hello,hello,hello)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to