[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1643:
---

    Attachment: PIG-1643.4.patch

PIG-1643.4.patch is PIG-1643.3.patch + test case

> join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
> ---
>
>                 Key: PIG-1643
>                 URL: https://issues.apache.org/jira/browse/PIG-1643
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch, PIG-1643.4.patch
>
>
> {code}
> l1 = load 'std.txt';
> l2 = load 'std.txt';
> f1 = foreach l1 generate $0 as abc, $1 as def;
> -- j = join f1 by $0, l2 by $0 using 'replicated';
> -- j = join l2 by $0, f1 by $0 using 'replicated';
> j = join l2 by $0, f1 by $0 ;
> dump j;
> {code}
> the error -
> {code}
> 2010-09-22 16:24:48,584 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2044: The type null cannot be collected as a Key type
> {code}
> The MR plan from explain -
> {code}
> #--
> # Map Reduce Plan
> #--
> MapReduce node scope-21
> Map Plan
> Union[tuple] - scope-22
> |
> |---j: Local Rearrange[tuple]{bytearray}(false) - scope-11
> |   |   |
> |   |   Project[bytearray][0] - scope-12
> |   |
> |   |---l2: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-0
> |
> |---j: Local Rearrange[tuple]{NULL}(false) - scope-13
>     |   |
>     |   Project[NULL][0] - scope-14
>     |
>     |---f1: New For Each(false,false)[bag] - scope-6
>         |   |
>         |   Project[bytearray][0] - scope-2
>         |   |
>         |   Project[bytearray][1] - scope-4
>         |
>         |---l1: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-1
> Reduce Plan
> j: Store(/tmp/x:org.apache.pig.builtin.PigStorage) - scope-18
> |
> |---POJoinPackage(true,true)[tuple] - scope-23
> Global sort: false
>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1643:

    Attachment: PIG-1643.3.patch

PIG-1643.3.patch is more general than PIG-1643.2.patch. It solves this null schema issue for all expressions.

[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1643:

    Attachment: PIG-1643.2.patch

Attach a fix.

[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1643:
---

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Tests passed. Patch committed to 0.8 branch and trunk.

[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1643:
---

    Status: Patch Available  (was: Open)

[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'
[ https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1643:
---

    Attachment: PIG-1643.1.patch

PIG-1643.1.patch
There was a code path that led to fields having the NULL datatype instead of the default datatype of BYTEARRAY. That was causing these failures.
Test-patch has succeeded; unit tests are running.

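The comment on PIG-1643.1.patch attributes the failure to fields ending up with the NULL datatype instead of the default BYTEARRAY. The following is only an illustrative sketch of that idea, not the actual patch: `FieldType`, `KeyTypeSketch`, `orDefault`, `normalize` and `isUsableAsKey` are hypothetical names invented here and are not part of Pig's API. It shows how falling back to BYTEARRAY for untyped fields would let a join key pass a key-type check of the kind that reported ERROR 2044.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT Pig's real classes.
enum FieldType { NULL, BYTEARRAY, INTEGER, CHARARRAY }

final class KeyTypeSketch {

    // A field with no declared type falls back to BYTEARRAY, the default
    // datatype named in the issue comment.
    static FieldType orDefault(FieldType declared) {
        if (declared == null || declared == FieldType.NULL) {
            return FieldType.BYTEARRAY;
        }
        return declared;
    }

    // Apply the fallback to every field of a (hypothetical) schema.
    static List<FieldType> normalize(List<FieldType> fields) {
        List<FieldType> out = new ArrayList<>();
        for (FieldType f : fields) {
            out.add(orDefault(f));
        }
        return out;
    }

    // Stand-in for a key-type check like the one behind ERROR 2044:
    // a NULL-typed field cannot serve as a join key.
    static boolean isUsableAsKey(FieldType t) {
        return t != null && t != FieldType.NULL;
    }

    public static void main(String[] args) {
        // Assumption: f1's two fields as the buggy path left them, typed NULL.
        List<FieldType> f1 = Arrays.asList(FieldType.NULL, FieldType.NULL);
        System.out.println(isUsableAsKey(f1.get(0)));            // prints false
        System.out.println(isUsableAsKey(normalize(f1).get(0))); // prints true
    }
}
```

In the explain output quoted in the issue, the failing branch shows `Project[NULL][0]` under `f1`, while the `l2` branch shows `Project[bytearray][0]`; the sketch models the second behaviour being applied to both.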