[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784167#action_12784167 ]

Hadoop QA commented on PIG-1098:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426477/PIG-1098.patch
against trunk revision 885465.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/console

This message is automatically generated.

[zebra] Zebra Performance Optimizations
---

Key: PIG-1098
URL: https://issues.apache.org/jira/browse/PIG-1098
Project: Pig
Issue Type: Improvement
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
Fix For: 0.6.0, 0.7.0
Attachments: PIG-1098.patch

Many in-core performance optimization opportunities exist in Zebra, such as removing redundant precautionary checks, using better collection types to reduce levels of indirection to in-memory objects, and ordering input splits by descending rather than ascending size. Prototyped improvements are around 10% in wall clock time.
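As an illustration of the last optimization named in the description, here is a minimal sketch of ordering input splits from largest to smallest. The `Split` class and the `largestFirst` helper are hypothetical stand-ins, not Zebra or Hadoop APIs, and the issue does not say how the patch itself implements the reordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SplitOrdering {
    /** Hypothetical stand-in for an input split; only its length matters here. */
    static final class Split {
        final String name;
        final long length;

        Split(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    /** Returns a new list ordered from the largest split to the smallest. */
    static List<Split> largestFirst(List<Split> splits) {
        List<Split> sorted = new ArrayList<>(splits);
        // Compare by length, then reverse so the biggest split comes first.
        sorted.sort(Comparator.comparingLong((Split s) -> s.length).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Split> splits = Arrays.asList(
                new Split("a", 64), new Split("b", 512), new Split("c", 128));
        for (Split s : largestFirst(splits)) {
            System.out.println(s.name + " " + s.length);
        }
    }
}
```

A common rationale for this ordering, which the issue itself does not spell out, is that starting the longest tasks first leaves the short ones to fill in the tail of the job.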
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784468#action_12784468 ]

Chao Wang commented on PIG-1098:

Ideally, there should be a better structure for methods such as advance(), advanceCG(), getKey(), getCGKey(), getValue(), and getCGValue() (ColumnGroup.java). The only difference in the new *CG* methods is that they skip the if (atEnd()) check. This gives some performance gain while degrading code readability a bit.

This is the first cut at performance improvement, and all of the above changes are inside the ColumnGroup class, which is package-private. They are therefore Zebra internal implementation details that we can safely improve in the future. Overall +1.

[zebra] Zebra Performance Optimizations
---

Key: PIG-1098
URL: https://issues.apache.org/jira/browse/PIG-1098
Project: Pig
Issue Type: Improvement
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
Fix For: 0.6.0, 0.7.0
Attachments: PIG-1098.patch

Many in-core performance optimization opportunities exist in Zebra, such as removing redundant precautionary checks, using better collection types to reduce levels of indirection to in-memory objects, and ordering input splits by descending rather than ascending size. Observed wall clock time improvements for some Pig LOAD queries are around 10%.
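The trade-off Chao Wang describes, checked accessors alongside unchecked *CG* variants, can be sketched roughly as follows. The method names come from the comment, but the `ScannerSketch` class, its fields, the signatures, and the exception type are all assumptions made for illustration. This is not the code in ColumnGroup.java.

```java
import java.util.NoSuchElementException;

public class ScannerSketch {
    private final String[] keys; // hypothetical in-memory rows
    private int pos = 0;

    ScannerSketch(String[] keys) {
        this.keys = keys;
    }

    boolean atEnd() {
        return pos >= keys.length;
    }

    /** Checked accessor: safe to call at any time. */
    String getKey() {
        if (atEnd()) {
            throw new NoSuchElementException("scanner is at end");
        }
        return keys[pos];
    }

    /** Unchecked variant: the caller must already know that !atEnd(). */
    String getCGKey() {
        return keys[pos];
    }

    /** Checked advance. */
    void advance() {
        if (atEnd()) {
            throw new NoSuchElementException("scanner is at end");
        }
        pos++;
    }

    /** Unchecked advance: again relies on the caller's own atEnd() test. */
    void advanceCG() {
        pos++;
    }

    /** Inner loop that tests atEnd() once per row instead of three times. */
    static int countKeys(ScannerSketch s) {
        int n = 0;
        while (!s.atEnd()) {
            s.getCGKey();
            s.advanceCG();
            n++;
        }
        return n;
    }
}
```

The cost is the one the comment points out: two near-identical methods per operation, with the unchecked one correct only when the caller upholds the precondition. Keeping such methods package-private limits who can misuse them.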
[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784565#action_12784565 ]

Alan Gates commented on PIG-1098:

Committed patch to 0.6 branch as well.
[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783976#action_12783976 ]

Yan Zhou commented on PIG-1098:

As this patch is purely for performance improvements, no new test case was created, so Hudson's complaint in that regard should be ignored.
[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780744#action_12780744 ]

Yan Zhou commented on PIG-1098:

This patch is also targeted for the 0.6 release, so it needs to be on the 0.6 branch too.