[jira] Commented: (PIG-1207) [zebra] Data sanity check should be performed at the end of writing instead of later at query time

2010-03-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843638#action_12843638
 ] 

Hadoop QA commented on PIG-1207:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438300/PIG-1207.patch
  against trunk revision 921185.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/238/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/238/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/238/console

This message is automatically generated.

 [zebra] Data sanity check should be performed at the end of writing instead 
 of later at query time
 ---

 Key: PIG-1207
 URL: https://issues.apache.org/jira/browse/PIG-1207
 Project: Pig
 Issue Type: Improvement
 Reporter: Yan Zhou
 Assignee: Yan Zhou
 Attachments: PIG-1207.patch, PIG-1207.patch


 Currently the check that all column groups hold an equal number of rows is 
 performed only at query time. The error information is sketchy: the query 
 either emits only a "Column groups are not evenly distributed" message or, 
 worse, throws an IndexOutOfBound exception from CGScanner.getCGValue. This 
 happens because BasicTable.atEnd and BasicTable.getKey, which are called just 
 before BasicTable.getValue, check only the first column group in the 
 projection. If the number of rows per file differs across the column groups 
 in the projection, BasicTable.atEnd can return false and BasicTable.getKey 
 can return a key normally even though another column group has already 
 exhausted its current file, so the call to its CGScanner.getCGValue throws 
 the exception. 
 This check should also be performed at the end of writing, and the error 
 information should be more informative.
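
 For illustration only, here is a minimal sketch of what a write-time row 
 count check with a more informative error could look like. It assumes the 
 writer can collect one row count per column group once writing finishes. The 
 class RowCountSanityCheck, the method verifyRowCounts, and the column group 
 names CG0 and CG1 are hypothetical; they are not taken from the Zebra API or 
 from the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowCountSanityCheck {

    /**
     * Hypothetical helper: throws if the column groups do not all report
     * the same number of rows. The map key is a column group name and the
     * value is the number of rows written to that column group.
     */
    public static void verifyRowCounts(Map<String, Long> rowsPerColumnGroup) {
        String firstName = null;
        Long expected = null;
        for (Map.Entry<String, Long> entry : rowsPerColumnGroup.entrySet()) {
            if (expected == null) {
                // Use the first column group as the reference count.
                firstName = entry.getKey();
                expected = entry.getValue();
            } else if (!expected.equals(entry.getValue())) {
                // Name both column groups and both counts in the message.
                throw new IllegalStateException(
                    "Row count mismatch: column group " + firstName
                    + " has " + expected + " rows but column group "
                    + entry.getKey() + " has " + entry.getValue() + " rows");
            }
        }
    }

    public static void main(String[] args) {
        // Assumed counts, as if gathered when the writer is closed.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("CG0", 100L);
        counts.put("CG1", 99L);
        try {
            verifyRowCounts(counts);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

 Failing at the end of writing, with both column group names and counts in 
 the message, would surface the problem before any query reaches 
 CGScanner.getCGValue.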

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1207) [zebra] Data sanity check should be performed at the end of writing instead of later at query time

2010-03-10 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843653#action_12843653
 ] 

Yan Zhou commented on PIG-1207:
---

This is a sanity check at the end of writing. The existing write tests already 
provide good coverage, so no new tests need to be introduced.
