Re: [VOTE] Release Pig 0.4.0 (candidate 1)
Is anyone else getting javac errors running "ant test"?

compile-sources:
    [javac] Compiling 484 source files to /Users/ndaley/hadoop/verify/pig-0.4.0/build/classes
    [javac] /Users/ndaley/hadoop/verify/pig-0.4.0/src/org/apache/pig/ComparisonFunc.java:22: package org.apache.hadoop.io does not exist
    [javac] import org.apache.hadoop.io.WritableComparable;
    [javac] ^
    ...

Nige

On Sep 17, 2009, at 12:09 PM, Olga Natkovich wrote:

Hi,

I have fixed the issue causing the failure that Alan reported.

Please test the new release: http://people.apache.org/~olga/pig-0.4.0-candidate-1/.

Vote closes on Tuesday, 9/22.

Olga

-----Original Message-----
From: Olga Natkovich [mailto:ol...@yahoo-inc.com]
Sent: Monday, September 14, 2009 2:06 PM
To: pig-dev@hadoop.apache.org; priv...@hadoop.apache.org
Subject: [VOTE] Release Pig 0.4.0 (candidate 0)

Hi,

I created a candidate build for Pig 0.4.0 release. The highlights of this release are

- Performance improvements especially in the area of JOIN support where we introduced two new join types: skew join to deal with data skew and sort merge join to take advantage of the sorted data sets.
- Support for Outer join.
- Works with Hadoop 18

I ran the release audit and rat report looked fine. The relevant part is attached below.

Keys used to sign the release are available at http://svn.apache.org/viewvc/hadoop/pig/trunk/KEYS?view=markup.

Please download the release and try it out: http://people.apache.org/~olga/pig-0.4.0-candidate-0.

Should we release this? Vote closes on Thursday, 9/17.

Olga

[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/zebra/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/broken-links.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/cookbook.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/index.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/linkmap.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_reference.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_users.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/setup.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/tutorial.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/udf.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/api/package-list
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/missingSinces.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/user_comments_for_pig_0.3.1_to_pig_0.5.0-dev.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/changes-summary.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_all.html
[java] !? /home/
[jira] Commented: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756970#action_12756970 ]

Olga Natkovich commented on PIG-964:
------------------------------------

patch committed to branch-0.5

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756958#action_12756958 ]

Olga Natkovich commented on PIG-964:
------------------------------------

+1 on the code

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Commented: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756891#action_12756891 ]

Hadoop QA commented on PIG-964:
-------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12419938/skewedjoinnull.patch
  against trunk revision 816339.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/37/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/37/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/37/console

This message is automatically generated.

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:
------------------------------------

    Attachment:     (was: skjoin2b.patch)

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:
------------------------------------

    Attachment: skewedjoinnull.patch

Cleared end-end tests and added a new unit test to check for nulls in the dataset.

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:
------------------------------------

    Status: Patch Available  (was: Open)

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skewedjoinnull.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:
------------------------------------

    Status: Open  (was: Patch Available)

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skjoin2b.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-960) Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
[ https://issues.apache.org/jira/browse/PIG-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Modi updated PIG-960:
---------------------------

    Status: Open  (was: Patch Available)

This patch failed in release audit

> Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
> ---------------------------------------------------------------------------
>
>                 Key: PIG-960
>                 URL: https://issues.apache.org/jira/browse/PIG-960
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Ankit Modi
>
>
> PigStorage's reading of Tuples (lines) can be optimized using Hadoop's {{LineRecordReader}}.
> This can help in the following areas
> - Improving the performance of reading Tuples (lines) in {{PigStorage}}
> - Any future improvements to line reading in Hadoop's {{LineRecordReader}} are automatically carried over to Pig
> Issues that are handled by this patch
> - BZip uses internal buffers and positioning for determining the number of bytes read. Hence buffering done by {{LineRecordReader}} has to be turned off
> - The current implementation of {{LocalSeekableInputStream}} does not implement the {{available}} method. This method has to be implemented.
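The PIG-960 description notes that a local seekable stream needs an `available` method. As a rough sketch only, assuming a stream backed by `java.io.RandomAccessFile`, this is one way such a method could report the bytes left before end of file. The class `SeekableFileStream` and its `seek` and `tell` helpers are hypothetical names for illustration; they are not Pig's `LocalSeekableInputStream` and not the code in the attached patch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// Hypothetical stand-in for a local seekable stream; not Pig's class.
final class SeekableFileStream extends InputStream {
    private final RandomAccessFile file;

    SeekableFileStream(RandomAccessFile file) {
        this.file = file;
    }

    @Override
    public int read() throws IOException {
        return file.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return file.read(buf, off, len);
    }

    // Moves the read position to an absolute byte offset.
    void seek(long pos) throws IOException {
        file.seek(pos);
    }

    // Returns the current absolute byte offset.
    long tell() throws IOException {
        return file.getFilePointer();
    }

    // Bytes between the current position and end of file, clamped to
    // the int range that InputStream.available() can return.
    @Override
    public int available() throws IOException {
        long remaining = file.length() - file.getFilePointer();
        return (int) Math.max(0L, Math.min((long) Integer.MAX_VALUE, remaining));
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

A reader layered on top of such a stream can then ask how much data is left without tracking the position itself.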
[jira] Updated: (PIG-960) Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
[ https://issues.apache.org/jira/browse/PIG-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Modi updated PIG-960:
---------------------------

    Attachment:     (was: pig_rlr.patch)

> Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
> ---------------------------------------------------------------------------
>
>                 Key: PIG-960
>                 URL: https://issues.apache.org/jira/browse/PIG-960
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Ankit Modi
>
>
> PigStorage's reading of Tuples (lines) can be optimized using Hadoop's {{LineRecordReader}}.
> This can help in the following areas
> - Improving the performance of reading Tuples (lines) in {{PigStorage}}
> - Any future improvements to line reading in Hadoop's {{LineRecordReader}} are automatically carried over to Pig
> Issues that are handled by this patch
> - BZip uses internal buffers and positioning for determining the number of bytes read. Hence buffering done by {{LineRecordReader}} has to be turned off
> - The current implementation of {{LocalSeekableInputStream}} does not implement the {{available}} method. This method has to be implemented.
Re: [VOTE] Release Pig 0.4.0 (candidate 1)
Now the code won't build because there's no hadoop jar in the lib directory.

Alan.

On Sep 17, 2009, at 12:09 PM, Olga Natkovich wrote:

Hi,

I have fixed the issue causing the failure that Alan reported.

Please test the new release: http://people.apache.org/~olga/pig-0.4.0-candidate-1/.

Vote closes on Tuesday, 9/22.

Olga

-----Original Message-----
From: Olga Natkovich [mailto:ol...@yahoo-inc.com]
Sent: Monday, September 14, 2009 2:06 PM
To: pig-dev@hadoop.apache.org; priv...@hadoop.apache.org
Subject: [VOTE] Release Pig 0.4.0 (candidate 0)

Hi,

I created a candidate build for Pig 0.4.0 release. The highlights of this release are

- Performance improvements especially in the area of JOIN support where we introduced two new join types: skew join to deal with data skew and sort merge join to take advantage of the sorted data sets.
- Support for Outer join.
- Works with Hadoop 18

I ran the release audit and rat report looked fine. The relevant part is attached below.

Keys used to sign the release are available at http://svn.apache.org/viewvc/hadoop/pig/trunk/KEYS?view=markup.

Please download the release and try it out: http://people.apache.org/~olga/pig-0.4.0-candidate-0.

Should we release this? Vote closes on Thursday, 9/17.

Olga

[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/zebra/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/broken-links.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/cookbook.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/index.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/linkmap.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_reference.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_users.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/setup.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/tutorial.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/udf.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/api/package-list
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/missingSinces.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/user_comments_for_pig_0.3.1_to_pig_0.5.0-dev.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/changes-summary.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_help.html
[j
[VOTE] Release Pig 0.4.0 (candidate 1)
Hi,

I have fixed the issue causing the failure that Alan reported.

Please test the new release: http://people.apache.org/~olga/pig-0.4.0-candidate-1/.

Vote closes on Tuesday, 9/22.

Olga

-----Original Message-----
From: Olga Natkovich [mailto:ol...@yahoo-inc.com]
Sent: Monday, September 14, 2009 2:06 PM
To: pig-dev@hadoop.apache.org; priv...@hadoop.apache.org
Subject: [VOTE] Release Pig 0.4.0 (candidate 0)

Hi,

I created a candidate build for Pig 0.4.0 release. The highlights of this release are

- Performance improvements especially in the area of JOIN support where we introduced two new join types: skew join to deal with data skew and sort merge join to take advantage of the sorted data sets.
- Support for Outer join.
- Works with Hadoop 18

I ran the release audit and rat report looked fine. The relevant part is attached below.

Keys used to sign the release are available at http://svn.apache.org/viewvc/hadoop/pig/trunk/KEYS?view=markup.

Please download the release and try it out: http://people.apache.org/~olga/pig-0.4.0-candidate-0.

Should we release this? Vote closes on Thursday, 9/17.

Olga

[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/zebra/CHANGES.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/broken-links.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/cookbook.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/index.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/linkmap.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_reference.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_users.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/setup.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/tutorial.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/udf.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/api/package-list
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/missingSinces.txt
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/user_comments_for_pig_0.3.1_to_pig_0.5.0-dev.xml
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/changes-summary.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_additions.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_all.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_changes.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_removals.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_help.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_statistics.html
[java] !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff
[jira] Commented: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756700#action_12756700 ]

Olga Natkovich commented on PIG-964:
------------------------------------

The patch needs unit tests

> Handling null in skewed join
> ----------------------------
>
>                 Key: PIG-964
>                 URL: https://issues.apache.org/jira/browse/PIG-964
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: skjoin2b.patch
>
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
[jira] Commented: (PIG-965) PERFORMANCE: optimize common case in matches (PORegex)
[ https://issues.apache.org/jira/browse/PIG-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756673#action_12756673 ]

Thejas M Nair commented on PIG-965:
-----------------------------------

The Hive LIKE clause implementation is here - http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java?revision=802066&view=markup

I ran simple tests with a simple java program to see the impact of these optimizations. Optimization 1 reduces runtime to 1/2, optimization 2 reduces runtime to 1/4.

{code}
int matches = 0;
int tot = 0;
String prefix = "123";
Pattern p = Pattern.compile("123.*");
while ((str = in.readLine()) != null) {
    // without proposed optimizations
    // test setups 1 and 2 took 9 secs, 126 secs respectively
    //if(str.matches("123.*"))
    //    matches++;

    // with optimization 1
    // test setups 1, 2 took 4, 57 secs respectively
    //if((p.matcher(str).matches()))
    //    matches++;

    // with optimization 2
    // test setups 1, 2 took 2.5, 25 secs respectively
    //int len = prefix.length();
    //boolean matched = true;
    //for(int i=0; i [...]
{code}

> PERFORMANCE: optimize common case in matches (PORegex)
> ------------------------------------------------------
>
>                 Key: PIG-965
>                 URL: https://issues.apache.org/jira/browse/PIG-965
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Thejas M Nair
>
>
> Some frequently seen use cases of the 'matches' comparison operator have the following properties -
> 1. The rhs is a constant string, e.g. "c1 matches 'abc%'".
> 2. Regexes that look for a matching prefix, suffix, etc. are very common, e.g. 'abc%', '%abc', '%abc%'.
> To optimize for these common cases, PORegex.java can be changed to -
> 1. Compile the pattern (rhs of matches) and re-use it if the pattern string has not changed.
> 2. Use string comparisons for simple common regexes (in 2 above).
> The implementation of the Hive LIKE clause uses similar optimizations.
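The three variants compared in the comment above can be written out as a small self-contained program. This is only a sketch of the idea and not the test program used for the reported timings: the class name `PrefixMatchComparison`, its method names, and the in-memory sample lines are assumptions made for illustration. It shows that the three approaches agree on the result for single-line input, which is what makes the cheaper ones usable as replacements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical comparison harness; names are illustrative only.
final class PrefixMatchComparison {
    // Baseline: String.matches compiles the regex again on every call.
    static int countWithStringMatches(Iterable<String> lines, String regex) {
        int n = 0;
        for (String s : lines) {
            if (s.matches(regex)) n++;
        }
        return n;
    }

    // Optimization 1: compile the pattern once and reuse it for every line.
    static int countWithCompiledPattern(Iterable<String> lines, String regex) {
        Pattern p = Pattern.compile(regex);
        int n = 0;
        for (String s : lines) {
            if (p.matcher(s).matches()) n++;
        }
        return n;
    }

    // Optimization 2: a plain prefix comparison, no regex engine involved.
    // Equivalent to "prefix.*" only for lines without line terminators,
    // which holds for strings returned by BufferedReader.readLine().
    static int countWithPrefix(Iterable<String> lines, String prefix) {
        int n = 0;
        for (String s : lines) {
            if (s.startsWith(prefix)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("123abc", "12", "x123", "123");
        System.out.println(countWithStringMatches(lines, "123.*"));
        System.out.println(countWithCompiledPattern(lines, "123.*"));
        System.out.println(countWithPrefix(lines, "123"));
    }
}
```

Timing the three methods over a large input file would be the way to reproduce the kind of comparison described in the comment; no timings are claimed for this sketch.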
[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756672#action_12756672 ]

patrick o'leary commented on PIG-366:
-------------------------------------

I'm guessing the 2008-11-12 12:25 AM patch isn't up to date? The tar doesn't contain the src

> PigPen - Eclipse plugin for a graphical PigLatin editor
> -------------------------------------------------------
>
>                 Key: PIG-366
>                 URL: https://issues.apache.org/jira/browse/PIG-366
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Shubham Chopra
>            Assignee: Shubham Chopra
>            Priority: Minor
>         Attachments: org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz
>
>
> This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters.
[jira] Created: (PIG-965) PERFORMANCE: optimize common case in matches (PORegex)
PERFORMANCE: optimize common case in matches (PORegex)
------------------------------------------------------

                 Key: PIG-965
                 URL: https://issues.apache.org/jira/browse/PIG-965
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: Thejas M Nair


Some frequently seen use cases of the 'matches' comparison operator have the following properties -
1. The rhs is a constant string, e.g. "c1 matches 'abc%'".
2. Regexes that look for a matching prefix, suffix, etc. are very common, e.g. 'abc%', '%abc', '%abc%'.

To optimize for these common cases, PORegex.java can be changed to -
1. Compile the pattern (rhs of matches) and re-use it if the pattern string has not changed.
2. Use string comparisons for simple common regexes (in 2 above).

The implementation of the Hive LIKE clause uses similar optimizations.
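The two proposed changes could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the PORegex implementation: the class `CachedMatcher` and all of its members are hypothetical, and the sketch recognizes Java regex forms such as `abc.*`, `.*abc` and `.*abc.*` rather than the `%` shorthand used in the issue text. It caches the compiled pattern until the pattern string changes, and falls back to the regex engine whenever the pattern is not a simple literal or the input contains a line terminator (which `.` does not match by default).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not Pig's PORegex.
final class CachedMatcher {
    private enum Kind { PREFIX, SUFFIX, CONTAINS, REGEX }

    private String lastRegex;  // pattern string seen on the previous call
    private Pattern compiled;  // compiled form of lastRegex
    private Kind kind;         // how lastRegex was classified
    private String literal;    // literal text for the non-REGEX kinds

    boolean matches(String input, String regex) {
        if (!regex.equals(lastRegex)) {
            prepare(regex);  // recompile only when the pattern string changes
        }
        // The string shortcuts are only equivalent for single-line input.
        if (kind == Kind.REGEX || hasLineTerminator(input)) {
            return compiled.matcher(input).matches();
        }
        switch (kind) {
            case PREFIX: return input.startsWith(literal);
            case SUFFIX: return input.endsWith(literal);
            default:     return input.contains(literal);
        }
    }

    // Compiles the regex and checks whether it is a plain literal with a
    // leading and/or trailing ".*".
    private void prepare(String regex) {
        lastRegex = regex;
        compiled = Pattern.compile(regex);
        boolean lead = regex.startsWith(".*");
        boolean trail = regex.endsWith(".*") && regex.length() >= (lead ? 4 : 2);
        String core = regex.substring(lead ? 2 : 0, regex.length() - (trail ? 2 : 0));
        if ((lead || trail) && isPlainText(core)) {
            literal = core;
            kind = lead && trail ? Kind.CONTAINS : (lead ? Kind.SUFFIX : Kind.PREFIX);
        } else {
            literal = null;
            kind = Kind.REGEX;
        }
    }

    // True if the text has no characters that could be regex syntax.
    private static boolean isPlainText(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isLetterOrDigit(c) && c != ' ' && c != '_') return false;
        }
        return true;
    }

    private static boolean hasLineTerminator(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\n' || c == '\r' || c == '\u0085' || c == '\u2028' || c == '\u2029') {
                return true;
            }
        }
        return false;
    }
}
```

The conservative fallback is a deliberate design choice in this sketch: any pattern it cannot prove to be a simple prefix, suffix or substring test still goes through the compiled `Pattern`, so results stay the same as with plain regex matching.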
[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin
[ https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756654#action_12756654 ]

Alan Gates commented on PIG-951:
--------------------------------

I'll be reviewing this patch.

> Reset parallelism to 1 for indexing job in MergeJoin
> ----------------------------------------------------
>
>                 Key: PIG-951
>                 URL: https://issues.apache.org/jira/browse/PIG-951
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>         Attachments: pig-951.patch
>
>
> After sampling one tuple from every block, one reducer is used to sort the index entries in the reduce phase to produce the sorted index to be used in the actual join job. Thus, the parallelism of the index job should be explicitly set to 1. Currently, it is not.
> Currently, this is a non-issue, since we don't allow any blocking operators in the pipeline before merge-join. However, later when we do allow blocking operators, the parallelism of the indexing job will be that of the preceding blocking operator. Even then, the job will complete successfully because all tuples will go to only one reducer, since we are grouping on only one key, "all". However, it will waste cluster resources by starting all the extra reducers, which get no data and thus do nothing.
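The point about wasted reducers in the PIG-951 description can be illustrated with a short sketch. It assumes hash partitioning of the form "non-negative hash of the key modulo the number of reducers", which is the arithmetic Hadoop's default `HashPartitioner` is documented to use. The class `SingleKeyPartitionDemo` and its methods are hypothetical names for illustration and are not part of Pig or Hadoop.

```java
// Hypothetical demo of why a single grouping key keeps only one reducer busy.
final class SingleKeyPartitionDemo {
    // Non-negative hash of the key, modulo the number of reducers.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Counts how many records each reducer would receive.
    static int[] recordsPerReducer(String[] keys, int numReducers) {
        int[] counts = new int[numReducers];
        for (String key : keys) {
            counts[partition(key, numReducers)]++;
        }
        return counts;
    }

    // Number of reducers that receive at least one record.
    static int busyReducers(int[] counts) {
        int busy = 0;
        for (int c : counts) {
            if (c > 0) busy++;
        }
        return busy;
    }

    public static void main(String[] args) {
        String[] keys = new String[1000];
        java.util.Arrays.fill(keys, "all");  // every record is grouped on "all"
        int[] counts = recordsPerReducer(keys, 10);
        // With one distinct key, only one of the ten reducers gets data.
        System.out.println("busy reducers: " + busyReducers(counts));
    }
}
```

Because every record carries the same key, the remaining reducers start up, receive nothing, and exit, which is the waste that setting the parallelism to 1 would avoid.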
[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756636#action_12756636 ]

Alan Gates commented on PIG-366:
--------------------------------

At this point no one has picked up PigPen recently and kept it up to date. I know it worked with Pig 0.2.0, but it has not been updated since then.

> PigPen - Eclipse plugin for a graphical PigLatin editor
> -------------------------------------------------------
>
>                 Key: PIG-366
>                 URL: https://issues.apache.org/jira/browse/PIG-366
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Shubham Chopra
>            Assignee: Shubham Chopra
>            Priority: Minor
>         Attachments: org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz
>
>
> This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters.
[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756619#action_12756619 ]

patrick o'leary commented on PIG-366:
-------------------------------------

What version of hadoop is PigPen designed to use? I am getting the following error:

Caused by: org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.mapred.JobSubmissionProtocol version mismatch. (client = 11, server = 10)

Currently using pigpen_0.0.4.jar and hadoop 0.18.3.

The wiki should contain version numbers and be updated to point to the new tar ball.

> PigPen - Eclipse plugin for a graphical PigLatin editor
> -------------------------------------------------------
>
>                 Key: PIG-366
>                 URL: https://issues.apache.org/jira/browse/PIG-366
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Shubham Chopra
>            Assignee: Shubham Chopra
>            Priority: Minor
>         Attachments: org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz
>
>
> This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters.
[jira] Commented: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756454#action_12756454 ]

Hadoop QA commented on PIG-964:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12419855/skjoin2b.patch
against trunk revision 816012.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/36/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/36/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/36/console

This message is automatically generated.

> Handling null in skewed join
> ----------------------------
>
> Key: PIG-964
> URL: https://issues.apache.org/jira/browse/PIG-964
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Assignee: Sriranjan Manjunath
> Attachments: skjoin2b.patch
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed
> join ends up expecting a large number of reducers. Further, skewed join
> should not bail out after the second job if the number of reducers specified
> by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:

Description: For null tuples, the tuple size is calculated incorrectly and thus skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
(was: The tuple size is calculated incorrectly and thus the skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.)

Summary: Handling null in skewed join (was: Handling null keys in skewed join)

> Handling null in skewed join
> ----------------------------
>
> Key: PIG-964
> URL: https://issues.apache.org/jira/browse/PIG-964
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Attachments: skjoin2b.patch
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed
> join ends up expecting a large number of reducers. Further, skewed join
> should not bail out after the second job if the number of reducers specified
> by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:

Assignee: Sriranjan Manjunath
Status: Patch Available (was: Open)

> Handling null in skewed join
> ----------------------------
>
> Key: PIG-964
> URL: https://issues.apache.org/jira/browse/PIG-964
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Assignee: Sriranjan Manjunath
> Attachments: skjoin2b.patch
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed
> join ends up expecting a large number of reducers. Further, skewed join
> should not bail out after the second job if the number of reducers specified
> by the user is low. It should print a warning message and continue execution.
[jira] Updated: (PIG-964) Handling null in skewed join
[ https://issues.apache.org/jira/browse/PIG-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriranjan Manjunath updated PIG-964:

Attachment: skjoin2b.patch

Attached patch solves both the issues.

> Handling null in skewed join
> ----------------------------
>
> Key: PIG-964
> URL: https://issues.apache.org/jira/browse/PIG-964
> Project: Pig
> Issue Type: Bug
> Reporter: Sriranjan Manjunath
> Attachments: skjoin2b.patch
>
> For null tuples, the tuple size is calculated incorrectly and thus skewed
> join ends up expecting a large number of reducers. Further, skewed join
> should not bail out after the second job if the number of reducers specified
> by the user is low. It should print a warning message and continue execution.
[jira] Created: (PIG-964) Handling null keys in skewed join
Handling null keys in skewed join
---------------------------------

Key: PIG-964
URL: https://issues.apache.org/jira/browse/PIG-964
Project: Pig
Issue Type: Bug
Reporter: Sriranjan Manjunath

The tuple size is calculated incorrectly and thus the skewed join ends up expecting a large number of reducers. Further, skewed join should not bail out after the second job if the number of reducers specified by the user is low. It should print a warning message and continue execution.
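
The two behaviors PIG-964 asks for can be illustrated with a small sketch. This is not Pig's implementation and does not reflect the contents of skjoin2b.patch; the function names (tuple_size, reducers_for_key, plan_reducers), the one-byte cost assigned to a null, and the string-length size estimate are all assumptions made up for illustration. It shows (1) a size estimate that gives a null tuple a small fixed cost so that nulls do not inflate the reducer count, and (2) a planner that prints a warning and continues when the user has configured fewer reducers than the estimate calls for.

```python
import math
import warnings

# Assumed placeholder cost, in bytes, for a null tuple or a null field.
NULL_COST_BYTES = 1


def tuple_size(tup):
    """Hypothetical size estimate in bytes; None stands in for a null tuple."""
    if tup is None:
        return NULL_COST_BYTES
    # Crude per-field estimate: length of the field's string form.
    return sum(NULL_COST_BYTES if f is None else len(str(f)) for f in tup)


def reducers_for_key(sample, key_count, memory_per_reducer):
    """Reducers needed so each one can hold its share of one key's tuples."""
    if not sample:
        return 1
    avg_size = sum(tuple_size(t) for t in sample) / len(sample)
    return max(1, math.ceil(avg_size * key_count / memory_per_reducer))


def plan_reducers(needed, configured):
    """Warn and continue, rather than fail, when too few reducers are configured."""
    if needed > configured:
        warnings.warn(
            f"skewed join estimate wants {needed} reducers but only "
            f"{configured} are configured; continuing with {configured}"
        )
        return configured
    return needed
```

With a sample of [("abcd",), None], 1000 tuples for the key, and 500 bytes available per reducer, the average size is 2.5 bytes and reducers_for_key returns 5. Calling plan_reducers(5, 2) then emits a warning and returns 2 instead of raising an error.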