[ https://issues.apache.org/jira/browse/PIG-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681913#action_12681913 ]
Santhosh Srinivasan commented on PIG-693: ----------------------------------------- Review comments on 693.patch Overall comment: Instead of removing the private member variables and the associated constructors, its better to mark the constructor and the other methods as deprecated. If there are developers who are using these sources then compiler warning messages will help them move the code instead of forcing them to make immediate changes. Specific comments: Index: src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/LogToPhyTranslationVisitor.java ===================================================================================== Is it required to have the complete package name? {code} + List<org.apache.pig.impl.logicalLayer.ExpressionOperator> fromList = func.getArguments(); {code} Index: src/org/apache/pig/impl/logicalLayer/RemoveRedundantOperators.java ============================================================= 1. Can you add an error code and an appropriate message to the VisitorException ? {code} + throw new VisitorException(pe.getMessage(),pe); {code} Index: src/org/apache/pig/impl/logicalLayer/validators/TypeCheckingVisitor.java =================================================================== 1. Instead of node use node.getClass().getSimpleName(). Otherwise, users will see a huge string with contains the operator key and scope. This is useful for developers but not for users. {code} + String msg = "Problem with inserting cast operator for " + node + " in plan."; {code} Index: src/org/apache/pig/impl/plan/OperatorPlan.java =========================================== 1. Use Integer instead of string. Integer, Long, etc. can handle null and toString will ensure that null is printed as "null". {code} + String size = " null "; + if(preds != null) + size = ((Integer)preds.size()).toString(); {code} 2. I see that you are following the convention of the source for throwing exceptions. It will be good if you can include an error code, not log the error and just throw the exception. {code} + PlanException pe = new PlanException("Attempt to remove " + + " and reconnect for node with " + size + + " predecessors."); + log.error(pe.getMessage()); + throw pe; {code} {code} + //checking pre-requisite conditions + if (C == null || C.size() == 0) { + PlanException pe = new PlanException("Attempt to remove " + + " and reconnect for node with no successors."); + log.error(pe.getMessage()); + throw pe; + } {code} 3. OperatorPlan is a generic class and should not hold any references to LogicalOperator, ExpressionOperator or LOProject. In addition, the last section of removeAndReconnectMultiSucc checks for LOProject. This should instead go into the patchInputReference method in RemoveRedundantOperators.java. {code} +import org.apache.pig.impl.logicalLayer.ExpressionOperator; +import org.apache.pig.impl.logicalLayer.LOProject; +import org.apache.pig.impl.logicalLayer.LogicalOperator; {code} {code} + //special handling of LOProject because its getExpression() does + // need not be same as getPredecessors(LOProject) + if(c instanceof LOProject){ + LOProject lop = (LOProject)c; + if(B == lop.getExpression()){ + lop.setExpression((LogicalOperator)B); + } + } {code} 4. The removeAndReconnectMultiSucc does not take into account successors of A. If A has other successors apart from B then the following code will remove the edge from A to the other successors. {code} + // Do A.succesors = B.successors(ie C) + mFromEdges.removeKey(A); + mFromEdges.put(A,C); {code} 5. Minor comment, the use of variables A, B, C should be changed to nodeA, nodeB, nodeC to distinguish between types (like E) and instances (like A, B, C). Its a little bit confusing. > Parameter to UDF which is an alias returned in another UDF in nested foreach > causes incorrect results > ----------------------------------------------------------------------------------------------------- > > Key: PIG-693 > URL: https://issues.apache.org/jira/browse/PIG-693 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: 1.0.0 > Reporter: Viraj Bhat > Assignee: Thejas M Nair > Attachments: 693.patch, 693.utest.patch, one.txt, REPLACEALL.java, > URLDECODE.java > > > Consider the following Pig Script > {code} > register myudf.jar; > A = load 'one.txt' using PigStorage() as ( one: int ); --use this dummy file > to start execution > B = foreach A { > dec = myudf.URLDECODE('hello'); > str1 = myudf.REPLACEALL(dec, '[\\u0000-\\u0020]', ' '); -- ERROR > str2 = myudf.REPLACEALL('hello', '[\\u0000-\\u0020]', ' '); > generate > dec, > str1, > str2; > }; > describe B; > dump B; > {code} > where one.txt is a file which contains number one (1) for starting execution > of the Pig script!! > {code} > describe B; > {code} > returns the following: > B: {urldecode_9: chararray,replaceall_urldecode_10_11: > chararray,replaceall_12: chararray} > {code} > dump B; > {code} > returns > (hello,[\u0000-\u0020],hello) > The result should be: > There is a workaround for the same, > {code} > register myudf.jar; > A = load 'one.txt' using PigStorage() as ( one: int ); > B = foreach A { > dec = myudf.URLDECODE('hello'); > generate > dec as dec, > myudf.REPLACEALL(dec, '[\\u0000-\\u0020]', ' ') as str1, > myudf.REPLACEALL('hello', '[\\u0000-\\u0020]', ' ') as str2; > }; > describe B; > dump B; > {code} > where > {code} > dump B; > {code} > returns (hello,hello,hello) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.