[jira] Updated: (PIG-1647) Logical simplifier throws a NPE
[ https://issues.apache.org/jira/browse/PIG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1647:
--
Attachment: PIG-1647.patch

passes test-core. test-patch results:

[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

Logical simplifier throws a NPE
---
Key: PIG-1647
URL: https://issues.apache.org/jira/browse/PIG-1647
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Fix For: 0.8.0
Attachments: PIG-1647.patch, PIG-1647.patch

A query like:

A = load 'd.txt' as (a:chararray, b:long, c:map[], d:chararray, e:chararray);
B = filter A by a == 'v' and b == 117L and c#'p1' == 'h' and c#'p2' == 'to' and ((d is not null and d != '') or (e is not null and e != ''));

will cause the logical expression simplifier to throw a NPE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1647) Logical simplifier throws a NPE
[ https://issues.apache.org/jira/browse/PIG-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1647:
--
Status: Patch Available (was: Open)
[jira] Commented: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915037#action_12915037 ]

Daniel Dai commented on PIG-1643:
-

[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

All tests pass.

join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
---
Key: PIG-1643
URL: https://issues.apache.org/jira/browse/PIG-1643
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 0.8.0
Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch, PIG-1643.4.patch

{code}
l1 = load 'std.txt';
l2 = load 'std.txt';
f1 = foreach l1 generate $0 as abc, $1 as def;
-- j = join f1 by $0, l2 by $0 using 'replicated';
-- j = join l2 by $0, f1 by $0 using 'replicated';
j = join l2 by $0, f1 by $0 ;
dump j;
{code}

the error -
{code}
2010-09-22 16:24:48,584 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2044: The type null cannot be collected as a Key type
{code}

The MR plan from explain -
{code}
#--
# Map Reduce Plan
#--
MapReduce node scope-21
Map Plan
Union[tuple] - scope-22
|
|---j: Local Rearrange[tuple]{bytearray}(false) - scope-11
|   |   |
|   |   Project[bytearray][0] - scope-12
|   |
|   |---l2: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-0
|
|---j: Local Rearrange[tuple]{NULL}(false) - scope-13
    |   |
    |   Project[NULL][0] - scope-14
    |
    |---f1: New For Each(false,false)[bag] - scope-6
        |   |
        |   Project[bytearray][0] - scope-2
        |   |
        |   Project[bytearray][1] - scope-4
        |
        |---l1: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-1
Reduce Plan
j: Store(/tmp/x:org.apache.pig.builtin.PigStorage) - scope-18
|
|---POJoinPackage(true,true)[tuple] - scope-23
Global sort: false
{code}
[jira] Resolved: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1643.
-
Release Note: PIG-1643.4.patch committed to both trunk and 0.8 branch.
Resolution: Fixed
[jira] Updated: (PIG-1644) New logical plan: Plan.connect with position is misused in some places
[ https://issues.apache.org/jira/browse/PIG-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1644:

Attachment: PIG-1644-4.patch

PIG-1644-4.patch fixes Findbugs warnings and additional unit test failures.

New logical plan: Plan.connect with position is misused in some places
--
Key: PIG-1644
URL: https://issues.apache.org/jira/browse/PIG-1644
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.8.0
Attachments: PIG-1644-1.patch, PIG-1644-2.patch, PIG-1644-3.patch, PIG-1644-4.patch

When we replace/remove/insert a node, we use the disconnect/connect methods of OperatorPlan. When we disconnect an edge, we should save the position of the edge at both its origin and its destination, and reuse those positions when connecting to the new predecessor/successor. Some of the patterns are:

Insert a new node:
{code}
Pair<Integer, Integer> pos = plan.disconnect(pred, succ);
plan.connect(pred, pos.first, newnode, 0);
plan.connect(newnode, 0, succ, pos.second);
{code}

Remove a node:
{code}
Pair<Integer, Integer> pos1 = plan.disconnect(pred, nodeToRemove);
Pair<Integer, Integer> pos2 = plan.disconnect(nodeToRemove, succ);
plan.connect(pred, pos1.first, succ, pos2.second);
{code}

Replace a node:
{code}
Pair<Integer, Integer> pos1 = plan.disconnect(pred, nodeToReplace);
Pair<Integer, Integer> pos2 = plan.disconnect(nodeToReplace, succ);
plan.connect(pred, pos1.first, newNode, pos1.second);
plan.connect(newNode, pos2.first, succ, pos2.second);
{code}

There are a couple of places where we do not follow this pattern, which results in errors. For example, the following script fails:
{code}
a = load '1.txt' as (a0, a1, a2, a3);
b = foreach a generate a0, a1, a2;
store b into 'aaa';
c = order b by a2;
d = foreach c generate a2;
store d into 'bbb';
{code}
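The position-preserving insert and remove patterns described in PIG-1644 can be sketched outside of Pig as follows. This is only a minimal illustration under stated assumptions: `PlanSketch`, `MiniPlan`, `Pos`, `insertBetween` and `removeBetween` are hypothetical names invented here, nodes are plain strings, and nothing below is Pig's actual `OperatorPlan` implementation. Only the shape of the `disconnect`/`connect` calls is taken from the issue text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PlanSketch {
    /** Slot of a removed edge: index among the source's successors, index among the target's predecessors. */
    static final class Pos {
        final int first;
        final int second;
        Pos(int first, int second) { this.first = first; this.second = second; }
    }

    /** Hypothetical stand-in for a plan graph (NOT Pig's OperatorPlan); edge order matters. */
    static final class MiniPlan {
        private final Map<String, List<String>> succs = new HashMap<>();
        private final Map<String, List<String>> preds = new HashMap<>();

        List<String> successors(String n) { return succs.computeIfAbsent(n, k -> new ArrayList<>()); }
        List<String> predecessors(String n) { return preds.computeIfAbsent(n, k -> new ArrayList<>()); }

        /** Append an edge at the end of both adjacency lists. */
        void connect(String from, String to) {
            successors(from).add(to);
            predecessors(to).add(from);
        }

        /** Add an edge at explicit positions in both adjacency lists. */
        void connect(String from, int fromPos, String to, int toPos) {
            successors(from).add(fromPos, to);
            predecessors(to).add(toPos, from);
        }

        /** Remove an edge and report the slot it occupied on each side. */
        Pos disconnect(String from, String to) {
            int first = successors(from).indexOf(to);
            int second = predecessors(to).indexOf(from);
            if (first < 0 || second < 0) {
                throw new IllegalArgumentException("no edge " + from + " -> " + to);
            }
            successors(from).remove(first);   // remove(int index)
            predecessors(to).remove(second);  // remove(int index)
            return new Pos(first, second);
        }

        /** "Insert a new node" pattern: newNode takes over the slots of the old edge. */
        void insertBetween(String pred, String newNode, String succ) {
            Pos pos = disconnect(pred, succ);
            connect(pred, pos.first, newNode, 0);
            connect(newNode, 0, succ, pos.second);
        }

        /** "Remove a node" pattern: pred and succ are reconnected in the saved slots. */
        void removeBetween(String pred, String node, String succ) {
            Pos pos1 = disconnect(pred, node);
            Pos pos2 = disconnect(node, succ);
            connect(pred, pos1.first, succ, pos2.second);
        }
    }

    public static void main(String[] args) {
        MiniPlan plan = new MiniPlan();
        plan.connect("b", "store_aaa"); // b has two successors, as in the failing script
        plan.connect("b", "c");

        plan.insertBetween("b", "newnode", "store_aaa");
        System.out.println(plan.successors("b")); // [newnode, c]

        plan.removeBetween("b", "newnode", "store_aaa");
        System.out.println(plan.successors("b")); // [store_aaa, c]
    }
}
```

In this sketch, reconnecting with the appending `connect(from, to)` instead of the saved positions would leave `b` with successors `[c, newnode]`, silently reordering the branches. That kind of reordering is the misuse the issue describes.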
[jira] Resolved: (PIG-1644) New logical plan: Plan.connect with position is misused in some places
[ https://issues.apache.org/jira/browse/PIG-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1644.
-
Hadoop Flags: [Reviewed]
Resolution: Fixed

[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 6 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

All tests pass.

Patch committed to both trunk and 0.8 branch.