[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Attachment: PIG-1643.4.patch

PIG-1643.4.patch is PIG-1643.3.patch plus a test case.

> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch, 
> PIG-1643.4.patch
>
>
> {code}
> l1 = load 'std.txt';
> l2 = load 'std.txt'; 
> f1 = foreach l1 generate $0 as abc, $1 as  def;
> -- j =  join f1 by $0, l2 by $0 using 'replicated';
> -- j =  join l2 by $0, f1 by $0 using 'replicated';
> j =  join l2 by $0, f1 by $0 ;
> dump j;
> {code}
> the error -
> {code}
> 2010-09-22 16:24:48,584 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2044: The type null cannot be collected as a Key type
> {code}
> The MR plan from explain  -
> {code}
> #--
> # Map Reduce Plan  
> #--
> MapReduce node scope-21
> Map Plan
> Union[tuple] - scope-22
> |
> |---j: Local Rearrange[tuple]{bytearray}(false) - scope-11
> |   |   |
> |   |   Project[bytearray][0] - scope-12
> |   |
> |   |---l2: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-0
> |
> |---j: Local Rearrange[tuple]{NULL}(false) - scope-13
> |   |
> |   Project[NULL][0] - scope-14
> |
> |---f1: New For Each(false,false)[bag] - scope-6
> |   |
> |   Project[bytearray][0] - scope-2
> |   |
> |   Project[bytearray][1] - scope-4
> |
> |---l1: Load(file:///Users/tejas/pig_obyfail/trunk/std.txt:org.apache.pig.builtin.PigStorage) - scope-1
> Reduce Plan
> j: Store(/tmp/x:org.apache.pig.builtin.PigStorage) - scope-18
> |
> |---POJoinPackage(true,true)[tuple] - scope-23
> Global sort: false
> 
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1643:


Attachment: PIG-1643.3.patch

PIG-1643.3.patch is more general than PIG-1643.2.patch. It solves this null 
schema issue for all expressions.
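
As a rough illustration of what "for all expressions" could mean, here is a
minimal Python sketch. It is not the code in the patch: the Expr class, the
type tags, and both helper functions are hypothetical names made up for this
example. It only shows the idea of applying the NULL-to-bytearray default to
every node of an expression tree rather than to one kind of node.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical type tags for this sketch; they only echo the names in the plan output.
NULL = "NULL"
BYTEARRAY = "bytearray"


@dataclass
class Expr:
    """A toy expression node; a node built without a type starts out as NULL."""
    name: str
    dtype: str = NULL
    children: List["Expr"] = field(default_factory=list)


def default_null_types(expr: Expr) -> Expr:
    """Rebuild the tree, replacing every NULL type with BYTEARRAY."""
    dtype = BYTEARRAY if expr.dtype == NULL else expr.dtype
    return Expr(expr.name, dtype, [default_null_types(c) for c in expr.children])


def all_types(expr: Expr) -> List[str]:
    """Collect the type of every node, depth first."""
    out = [expr.dtype]
    for child in expr.children:
        out.extend(all_types(child))
    return out


# An untyped parent with one untyped child and one child that already has a type.
tree = Expr("parent", NULL, [Expr("project $0"), Expr("const", "int")])
fixed = default_null_types(tree)
```

The walk leaves already-typed nodes alone and returns a new tree, so the
original can still be inspected for comparison.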

> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch, PIG-1643.2.patch, PIG-1643.3.patch



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1643:


Attachment: PIG-1643.2.patch

Attached a fix.

> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch, PIG-1643.2.patch



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Tests passed.
Patch committed to 0.8 branch and trunk.


> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Status: Patch Available  (was: Open)

> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch



[jira] Updated: (PIG-1643) join fails for a query with input having 'load using pigstorage without schema' + 'foreach'

2010-09-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1643:
---

Attachment: PIG-1643.1.patch

PIG-1643.1.patch
There was a code path that led to fields having the NULL datatype instead of the
default datatype of BYTEARRAY, which caused these failures.
Test-patch has succeeded; unit tests are running.
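
To make the failure mode above concrete, here is a minimal Python sketch. It
is a toy model, not Pig's implementation: the Field class, the type tags, and
both functions are hypothetical names made up for this example. It shows a
field that keeps a NULL type being rejected as a join key, and the same field
passing once the type falls back to bytearray.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical type tags for this sketch; they only echo the names in the plan output.
NULL = "NULL"
BYTEARRAY = "bytearray"


@dataclass
class Field:
    """A toy schema field; a field built without a type starts out as NULL."""
    alias: Optional[str]
    dtype: str = NULL


def with_default_type(field: Field) -> Field:
    """Return a copy whose type falls back to BYTEARRAY when none was set."""
    if field.dtype == NULL:
        return Field(field.alias, BYTEARRAY)
    return field


def check_join_key(field: Field) -> str:
    """Reject NULL-typed keys, loosely modelled on the ERROR 2044 message."""
    if field.dtype == NULL:
        raise TypeError("The type null cannot be collected as a Key type")
    return field.dtype


# Stands in for `$0 as abc` over a load that declares no schema.
abc = Field("abc")
fixed = with_default_type(abc)
```

Calling check_join_key(abc) raises, while check_join_key(fixed) returns the
bytearray tag, which matches the difference between the two Local Rearrange
operators in the plan quoted in the issue.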


> join fails for a query with input having 'load using pigstorage without 
> schema' + 'foreach'
> ---
>
> Key: PIG-1643
> URL: https://issues.apache.org/jira/browse/PIG-1643
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1643.1.patch