[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-25 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759789#action_12759789
 ] 

Raghu Angadi commented on PIG-949:
--

I just committed this. Thanks Yan for the fix and Jing for the test!

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
 Environment: linux
Reporter: Alok Singh
Assignee: Yan Zhou
 Fix For: 0.5.0

 Attachments: Pig_949.patch, Pig_949.patch, Pig_949.patch


 Hi,
 The storage hint specification determines whether the output table is
 readable. Say we have a map column named 'map'.
 One can split some of its keys into a column group using
 [map#{k1}, map#{k2}...]; the remaining map fields are then automatically
 added to the default group.
 If the user instead creates a separate column group for the remaining
 fields, as in [map#{k1}, map#{k2}, ..][map], the table writer still creates
 the table.
 However, when the created table is loaded via Pig or via MapReduce using
 TableInputFormat, the reader fails to read the map.
 We get the following stack trace:
 09/09/09 00:09:45 INFO mapred.JobClient: Task Id : attempt_200908191538_33939_m_21_2, Status : FAILED
 java.io.IOException: getValue() failed: null
 at org.apache.hadoop.zebra.io.BasicTable$Reader$BTScanner.getValue(BasicTable.java:775)
 at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:717)
 at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:651)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.Child.main(Child.java:170)
 Alok

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-22 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758387#action_12758387
 ] 

Yan Zhou commented on PIG-949:
--

Test case added.

Thanks,

Yan



 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
 Environment: linux
Reporter: Alok Singh
Assignee: Yan Zhou
 Attachments: Pig_949.patch, Pig_949.patch





[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758535#action_12758535
 ] 

Hadoop QA commented on PIG-949:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420313/Pig_949.patch
  against trunk revision 817739.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/9/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/9/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/9/console

This message is automatically generated.

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
 Environment: linux
Reporter: Alok Singh
Assignee: Yan Zhou
 Fix For: 0.4.0, 0.5.0

 Attachments: Pig_949.patch, Pig_949.patch, Pig_949.patch





[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757996#action_12757996
 ] 

Hadoop QA commented on PIG-949:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420202/Pig_949.patch
  against trunk revision 816832.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/40/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/40/console

This message is automatically generated.

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
 Environment: linux
Reporter: Alok Singh
Assignee: Yan Zhou
 Attachments: Pig_949.patch





[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-21 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758074#action_12758074
 ] 

Yan Zhou commented on PIG-949:
--

The test case is 

contrib/zebra/src/test/org/apache/hadoop/zebra/io/TestJira949.java

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
 Environment: linux
Reporter: Alok Singh
Assignee: Yan Zhou
 Attachments: Pig_949.patch





[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-14 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755094#action_12755094
 ] 

Yan Zhou commented on PIG-949:
--

The problem arises because the ColumnMappingEntrys from the key-split specs
in the storage info are not added to an explicitly specified MAP item in the
storage info, so the CGs (column groups) required by the key-split specs are
missing. Everything falls apart thereafter. I will create a patch for the R1
patch release soon.
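
One way to picture the behaviour described above is a small lookup table that resolves a projection such as m1#{a} to the column group holding it. The sketch below is a toy model only, not Zebra's code: the class ColumnGroupIndex, its methods, and the idea of numbering column groups in hint order are all assumptions made for illustration. It shows why a map that is named explicitly in the hint still needs its key-split entries registered alongside it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical names); Zebra's real partition logic is more involved.
class ColumnGroupIndex {
    // map column -> (key -> column group that stores that key)
    private final Map<String, Map<String, Integer>> keySplits = new HashMap<>();
    // map column -> column group that stores all keys not split out
    private final Map<String, Integer> remainder = new HashMap<>();

    void addKeySplit(String map, String key, int group) {
        keySplits.computeIfAbsent(map, m -> new HashMap<>()).put(key, group);
    }

    void setRemainderGroup(String map, int group) {
        remainder.put(map, group);
    }

    // Split-out keys resolve to their own group; any other key falls back
    // to the group that holds the rest of the map.
    int resolve(String map, String key) {
        Map<String, Integer> splits = keySplits.get(map);
        Integer group = (splits == null) ? null : splits.get(key);
        if (group != null) {
            return group;
        }
        Integer rest = remainder.get(map);
        if (rest == null) {
            throw new IllegalStateException("no column group for " + map);
        }
        return rest;
    }

    public static void main(String[] args) {
        // Assumed layout for the hint [m1#{a}];[m2#{x|y}];[m1#{b}, m2#{z}];[m1],
        // with groups numbered 0..3 in the order they appear.
        ColumnGroupIndex index = new ColumnGroupIndex();
        index.addKeySplit("m1", "a", 0);
        index.addKeySplit("m1", "b", 2);
        index.setRemainderGroup("m1", 3);

        System.out.println(index.resolve("m1", "a")); // 0
        System.out.println(index.resolve("m1", "b")); // 2
        System.out.println(index.resolve("m1", "c")); // 3
    }
}
```

In this model, dropping the two addKeySplit calls once the explicit [m1] group is registered would send every lookup for m1 to group 3, which does not hold keys a or b. That is only an analogy for the missing ColumnMappingEntrys, not a description of the actual patch.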

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
 Environment: linux
Reporter: Alok Singh




[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-11 Thread Jing Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754353#action_12754353
 ] 

Jing Huang commented on PIG-949:


Thanks Alok. 
I am able to reproduce the problem. 
I used only the I/O layer (not the Pig loader) to test the map split. 
This is what I did:
  final static String STR_SCHEMA = "m1:map(string),m2:map(map(int))";
  final static String STR_STORAGE = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}];[m1]";
  ... create table and insert data ...

  load:  String projection = new String("m1#{a}");

I only got null returned. 

Without the storage hint [m1], everything works fine, i.e. 
  final static String STR_STORAGE = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}]";
  ... create table and insert data ...
  load:  String projection = new String("m1#{a}");
I am able to get the value of m1#{a}. 

The Zebra team is working on the fix.
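
The difference between the two storage hints in this reproduction can be checked mechanically. The following is a minimal sketch that only inspects the hint string; StorageHintCheck and its methods are hypothetical helpers, not part of Zebra, and the parsing assumes that entries within a group are separated by commas and that keys inside #{...} are separated by '|', as in the strings above. It reports the maps that appear both key-split and as a whole column, which is the combination that returned null here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Zebra: it never touches a table.
class StorageHintCheck {

    // Split "[a];[b, c]" into column groups, each a list of trimmed entries.
    static List<List<String>> groups(String hint) {
        List<List<String>> result = new ArrayList<>();
        int open = hint.indexOf('[');
        while (open >= 0) {
            int close = hint.indexOf(']', open);
            if (close < 0) {
                throw new IllegalArgumentException("unbalanced '[' in: " + hint);
            }
            List<String> entries = new ArrayList<>();
            for (String entry : hint.substring(open + 1, close).split(",")) {
                if (!entry.trim().isEmpty()) {
                    entries.add(entry.trim());
                }
            }
            result.add(entries);
            open = hint.indexOf('[', close);
        }
        return result;
    }

    // Names that occur both as a key split (m#{k}) and as a whole column (m).
    static Set<String> mapsSplitAndWhole(String hint) {
        Set<String> split = new LinkedHashSet<>();
        Set<String> whole = new LinkedHashSet<>();
        for (List<String> group : groups(hint)) {
            for (String entry : group) {
                int hash = entry.indexOf("#{");
                if (hash >= 0) {
                    split.add(entry.substring(0, hash));
                } else {
                    whole.add(entry);
                }
            }
        }
        split.retainAll(whole);
        return split;
    }

    public static void main(String[] args) {
        String failing = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}];[m1]";
        String working = "[m1#{a}];[m2#{x|y}]; [m1#{b}, m2#{z}]";
        System.out.println(mapsSplitAndWhole(failing)); // [m1]
        System.out.println(mapsSplitAndWhole(working)); // []
    }
}
```

A check like this could serve as a guard in a test or a job driver on an affected release, though it says nothing about how the reader itself resolves column groups.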



 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
 Environment: linux
Reporter: Alok Singh
