[jira] Commented: (PIG-1048) inner join using 'skewed' produces multiple rows for keys with single row in both input relations
[ https://issues.apache.org/jira/browse/PIG-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771249#action_12771249 ] Hadoop QA commented on PIG-1048: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423389/pig_1048.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/124/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/124/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/124/console This message is automatically generated. 
> inner join using 'skewed' produces multiple rows for keys with single row in both input relations
> -
>
> Key: PIG-1048
> URL: https://issues.apache.org/jira/browse/PIG-1048
> Project: Pig
> Issue Type: Bug
> Reporter: Thejas M Nair
> Assignee: Sriranjan Manjunath
> Attachments: pig_1048.patch
>
>
> ${code}
> grunt> cat students.txt
> asdfxc M 23 12.44
> qwerF 21 14.44
> uhsdf M 34 12.11
> zxldf M 21 12.56
> qwerF 23 145.5
> oiueM 54 23.33
> l1 = load 'students.txt';
> l2 = load 'students.txt';
> j = join l1 by $0, l2 by $0 ;
> store j into 'tmp.txt'
> grunt> cat tmp.txt
> oiueM 54 23.33 oiueM 54 23.33
> oiueM 54 23.33 oiueM 54 23.33
> qwerF 21 14.44 qwerF 21 14.44
> qwerF 21 14.44 qwerF 23 145.5
> qwerF 23 145.5 qwerF 21 14.44
> qwerF 23 145.5 qwerF 23 145.5
> uhsdf M 34 12.11 uhsdf M 34 12.11
> uhsdf M 34 12.11 uhsdf M 34 12.11
> zxldf M 21 12.56 zxldf M 21 12.56
> zxldf M 21 12.56 zxldf M 21 12.56
> asdfxc M 23 12.44 asdfxc M 23 12.44
> asdfxc M 23 12.44 asdfxc M 23 12.44$
> ${code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
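As a cross-check on the PIG-1048 report, here is a minimal Python sketch (not Pig code, and not taken from pig_1048.patch) of what an inner join on the first field should return. It assumes the first column of students.txt is a name whose separator was lost in this archive, so that `qwerF` is read as key `qwer` and `oiueM` as key `oiue`. The function `inner_join` and the sample tuples are illustrative names only.

```python
from collections import defaultdict


def inner_join(left, right, key=0):
    """Hash-based inner join: one output row per matching (left, right) pair."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # A key that appears once on each side contributes exactly one output row.
    return [l + r for l in left for r in index.get(l[key], ())]


# Assumed reading of students.txt: name, gender, age, gpa (hypothetical parse).
students = [
    ("asdfxc", "M", 23, 12.44),
    ("qwer", "F", 21, 14.44),
    ("uhsdf", "M", 34, 12.11),
    ("zxldf", "M", 21, 12.56),
    ("qwer", "F", 23, 145.5),
    ("oiue", "M", 54, 23.33),
]

joined = inner_join(students, students)

# Count output rows per join key.
rows_per_key = defaultdict(int)
for row in joined:
    rows_per_key[row[0]] += 1
```

Under that reading the self-join should yield 8 rows (4 for `qwer`, 1 for each other key), whereas the tmp.txt listing in the report shows 12, with every single-row key duplicated.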
[jira] Commented: (PIG-598) Parameter substitution ($PARAMETER) should not be performed in comments
[ https://issues.apache.org/jira/browse/PIG-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771319#action_12771319 ] Hadoop QA commented on PIG-598: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423034/PIG-598.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 48 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 197 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 318 release audit warnings (more than the trunk's current 313 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/console This message is automatically generated.
> Parameter substitution ($PARAMETER) should not be performed in comments
> ---
>
> Key: PIG-598
> URL: https://issues.apache.org/jira/browse/PIG-598
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Reporter: David Ciemiewicz
> Assignee: Thejas M Nair
> Attachments: PIG-598.patch
>
>
> Compiling the following code example will generate an error that
> $NOT_A_PARAMETER is an Undefined Parameter.
> This is problematic because sometimes you want to comment out parts of your code,
> including parameters, so that you don't have to define them.
> I think it would be really good if parameter substitution were not
> performed in comments.
> {code}
> -- $NOT_A_PARAMETER
> {code}
> {code}
> -bash-3.00$ pig -exectype local -latest comment.pig
> USING: /grid/0/gs/pig/current
> java.lang.RuntimeException: Undefined parameter : NOT_A_PARAMETER
>     at org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:221)
>     at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106)
>     at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86)
>     at org.apache.pig.Main.runParamPreprocessor(Main.java:394)
>     at org.apache.pig.Main.main(Main.java:296)
> {code}
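To illustrate the behaviour requested in PIG-598, here is a minimal Python sketch of a substitution pass that skips single-line `--` comments. It is not the logic of PIG-598.patch or of Pig's ParameterSubstitutionPreprocessor. The name `substitute` and the regular expression are assumptions made for illustration, and the sketch deliberately ignores `/* ... */` block comments and `--` occurring inside quoted strings.

```python
import re

# $NAME where NAME starts with a letter or underscore, so positional
# references such as $0 are left alone (an assumption of this sketch).
PARAM = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(line, params):
    """Replace $NAME in the code part of a line; leave a trailing -- comment untouched."""
    code, sep, comment = line.partition("--")

    def lookup(match):
        name = match.group(1)
        if name not in params:
            # Mirrors the wording of the error quoted in the report.
            raise KeyError("Undefined parameter : " + name)
        return str(params[name])

    return PARAM.sub(lookup, code) + sep + comment
```

With this approach the line `-- $NOT_A_PARAMETER` passes through unchanged even when no parameters are defined, while an undefined parameter outside a comment still raises an error.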
[jira] Commented: (PIG-1030) explain and dump not working with two UDFs inside inner plan of foreach
[ https://issues.apache.org/jira/browse/PIG-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771385#action_12771385 ] Hadoop QA commented on PIG-1030: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423476/PIG-1030.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/126/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/126/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/126/console This message is automatically generated.
> explain and dump not working with two UDFs inside inner plan of foreach
> ---
>
> Key: PIG-1030
> URL: https://issues.apache.org/jira/browse/PIG-1030
> Project: Pig
> Issue Type: Bug
> Reporter: Ying He
> Assignee: Richard Ding
> Attachments: PIG-1030.patch
>
>
> This script does not work:
> register /homes/yinghe/owl/string.jar;
> a = load '/user/yinghe/a.txt' as (id, color);
> b = group a all;
> c = foreach b {
>     d = distinct a.color;
>     generate group, string.BagCount2(d), string.ColumnLen2(d, 0);
> }
> The UDFs are regular, not algebraic.
> Then if I call "dump c;" or "explain c", I get this error message:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2019: Expected to find plan with single leaf. Found 2 leaves.
> The error only occurs the first time. After getting this error, if I call
> "dump c" or "explain c" again, it succeeds.
[jira] Commented: (PIG-746) Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should never be serialized
[ https://issues.apache.org/jira/browse/PIG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771425#action_12771425 ] Hadoop QA commented on PIG-746: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423482/PIG-746.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/127/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/127/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/127/console This message is automatically generated.
> Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should never be serialized
>
>
> Key: PIG-746
> URL: https://issues.apache.org/jira/browse/PIG-746
> Project: Pig
> Issue Type: Bug
> Reporter: David Ciemiewicz
> Assignee: Richard Ding
> Attachments: PIG-746.patch
>
>
> The script below works on Pig 2.0 local mode but fails when I run the same
> program on the grid.
> I was attempting to create a workaround for PIG-710.
> Here's the error:
> {code}
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2113: SingleTupleBag should never be serialized or serialized.
>     at org.apache.pig.data.SingleTupleBag.write(SingleTupleBag.java:129)
>     at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:147)
>     at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>     at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:83)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:439)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:101)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:219)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:86)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> {code}
> Here's the program:
> {code}
> A = load 'filterbug.data' using PigStorage() as ( id, str );
> A = foreach A generate
>     id,
>     str,
>     (
>         str matches 'hello' or
>         str matches 'hello'
>         ? 1 : 0
>     ) as matched;
> describe A;
> B = group A by ( id );
> describe B;
> D = foreach B generate
>     group,
>     SUM(A.matched) as matchedcount,
>     A;
> describe D;
> E = filter D by matchedcount > 0;
> describe E;
> F = foreach E generate
>     FLATTEN(A);
> describe F;
> dump F;
> {code}
> Here's the data filterbug.data
> {code}
> a hello
> a goodbye
> b goodbye
> c hello
> c hello
> c hello
> e what
> {code}
>
[jira] Commented: (PIG-1055) FINDBUGS: remaining "Dodgy Warnings"
[ https://issues.apache.org/jira/browse/PIG-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771433#action_12771433 ] Hadoop QA commented on PIG-1055: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423487/PIG-1055.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/29/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/29/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/29/console This message is automatically generated. 
> FINDBUGS: remaining "Dodgy Warnings"
>
>
> Key: PIG-1055
> URL: https://issues.apache.org/jira/browse/PIG-1055
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Olga Natkovich
> Attachments: PIG-1055.patch
>
>
> BC  Questionable cast from java.util.List to java.util.ArrayList in new org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit(PigContext, FileSystem, Path, String, List, long, long)
> Eq  org.apache.pig.data.AmendableTuple doesn't override DefaultTuple.equals(Object)
> Eq  org.apache.pig.data.TimestampedTuple doesn't override DefaultTuple.equals(Object)
> IA  Ambiguous invocation of either an outer or inherited method org.apache.pig.impl.plan.DotPlanDumper.getName(Operator) in org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.DotMRPrinter$InnerPrinter.getAttributes(DotMRPrinter$InnerOperator)
> IM  Computation of average could overflow in org.apache.tools.bzip2r.CBZip2OutputStream.qSort3(int, int, int)
> IM  Check for oddness that won't work for negative numbers in org.apache.tools.bzip2r.CBZip2OutputStream.sendMTFValues()
> REC Exception is caught when Exception is not thrown in org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.doHod(String, Properties)
> REC Exception is caught when Exception is not thrown in org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer.visitMROp(MapReduceOper)
> REC Exception is caught when Exception is not thrown in org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitDistinct(PODistinct)
> REC Exception is caught when Exception is not thrown in org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(POFRJoin)
> REC Exception is caught when Exception is not thrown in org.apache.pig.impl.logicalLayer.optimizer.OpLimitOptimizer.processNode(LOLimit)
> REC Exception is caught when Exception is not thrown in org.apache.pig.tools.streams.StreamGenerator.actionPerformed(ActionEvent)
> ST  Write to static field org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner.sJobConf from instance method org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.configure(JobConf)
> ST  Write to static field org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.activeSplit from instance method org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getRecordReader(InputSplit, JobConf, Reporter)
> ST  Write to static field org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.sJob from instance method org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getRecordReader(InputSplit, JobConf, Reporter)
> ST  Write to static field org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce.sJobConf from instance method org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.configure(JobConf)
> ST  Write to static field org.apache.pig.data.BagFactory.gMemMgr from instance method new org.apache.pig.data.BagFactory()
> ST  Write to static field org.apache.pig.impl.logicalLayer.LogicalPlanCloneHelper.mOpToCloneMap from instance method new org.
[jira] Commented: (PIG-1036) Fragment-replicate left outer join
[ https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771436#action_12771436 ] Hadoop QA commented on PIG-1036: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423507/LeftOuterFRJoin.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/30/console This message is automatically generated.
> Fragment-replicate left outer join
> --
>
> Key: PIG-1036
> URL: https://issues.apache.org/jira/browse/PIG-1036
> Project: Pig
> Issue Type: New Feature
> Reporter: Olga Natkovich
> Assignee: Ankit Modi
> Attachments: LeftOuterFRJoin.patch
>
>
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771484#action_12771484 ] Hadoop QA commented on PIG-1016: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422575/PIG-1016.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/128/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/128/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/128/console This message is automatically generated.
> Reading in map data seems broken
>
>
> Key: PIG-1016
> URL: https://issues.apache.org/jira/browse/PIG-1016
> Project: Pig
> Issue Type: Improvement
> Components: data
> Affects Versions: 0.4.0
> Reporter: hc busy
> Fix For: 0.5.0
>
> Attachments: PIG-1016.patch
>
>
> Hi, I'm trying to load a map that has a tuple as its value. The read fails in
> 0.4.0 because of a misconfiguration in the parser, whereas almost all the
> documentation states that the value of a map can be any type.
> I've attached a patch that allows us to read in complex objects as values, as
> documented. I've done simple verification of loading maps with tuple/map
> values and writing them back out using LOAD and STORE. All seems to work fine.
[jira] Commented: (PIG-1001) Generate more meaningful error message when one input file does not exist
[ https://issues.apache.org/jira/browse/PIG-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771519#action_12771519 ] Hadoop QA commented on PIG-1001: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423510/PIG-1001-2.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/31/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/31/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/31/console This message is automatically generated. 
> Generate more meaningful error message when one input file does not exist
> -
>
> Key: PIG-1001
> URL: https://issues.apache.org/jira/browse/PIG-1001
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.4.0
> Reporter: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-1001-1.patch, PIG-1001-2.patch
>
>
> In the following query, if 1.txt does not exist,
> a = load '1.txt';
> b = group a by $0;
> c = group b all;
> dump c;
> Pig throws the error message "ERROR 2100: file:/tmp/temp155054664/tmp1144108421 does not exist."
> Pig should report "Input file 1.txt not exist" instead of this confusing message.
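The behaviour requested in PIG-1001 amounts to checking the user's load paths up front, so that the error names the original input rather than an intermediate temp file. Below is a minimal Python sketch of that idea. It is not the approach of PIG-1001-2.patch; the function name `check_inputs` is hypothetical, and it assumes a local filesystem rather than Hadoop's FileSystem API.

```python
import os


def check_inputs(paths):
    """Fail fast, naming every user-supplied input path that is missing."""
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        # Report the path the user wrote, not a derived temp location.
        raise FileNotFoundError(
            "Input file(s) do not exist: " + ", ".join(missing)
        )
```

Calling such a check on the path from `load '1.txt'` before any job is launched would surface `1.txt` in the message directly.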
[jira] Commented: (PIG-1036) Fragment-replicate left outer join
[ https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771622#action_12771622 ] Hadoop QA commented on PIG-1036: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423594/LeftOuterFRJoin.patch against trunk revision 831051. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/129/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/129/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/129/console This message is automatically generated.
> Fragment-replicate left outer join
> --
>
> Key: PIG-1036
> URL: https://issues.apache.org/jira/browse/PIG-1036
> Project: Pig
> Issue Type: New Feature
> Reporter: Olga Natkovich
> Assignee: Ankit Modi
> Attachments: LeftOuterFRJoin.patch
>
>
[jira] Commented: (PIG-1059) FINDBUGS: remaining Bad practice + Multithreaded correctness Warning
[ https://issues.apache.org/jira/browse/PIG-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771665#action_12771665 ] Hadoop QA commented on PIG-1059: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423606/PIG-1059.patch against trunk revision 831051. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. -1 release audit. The applied patch generated 308 release audit warnings (more than the trunk's current 301 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/32/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/32/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/32/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/32/console This message is automatically generated. 
> FINDBUGS: remaining Bad practice + Multithreaded correctness Warning
>
>
> Key: PIG-1059
> URL: https://issues.apache.org/jira/browse/PIG-1059
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Olga Natkovich
> Attachments: PIG-1059.patch
>
>
> IS  Inconsistent synchronization of org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.hodConfDir; locked 66% of time
> IS  Inconsistent synchronization of org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.hodProcess; locked 80% of time
> IS  Inconsistent synchronization of org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.remoteHodConfDir; locked 88% of time
> IS  Inconsistent synchronization of org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream.initialized; locked 50% of time
> UG  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger.getAggregate() is unsynchronized, org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger.setAggregate(boolean) is synchronized
> UG  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger.getReporter() is unsynchronized, org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger.setReporter(Reporter) is synchronized
> BC  Equals method for org.apache.pig.builtin.PigStorage assumes the argument is of type PigStorage
> BC  Equals method for org.apache.pig.impl.streaming.StreamingCommand$HandleSpec assumes the argument is of type StreamingCommand$HandleSpec
> DP  org.apache.pig.data.BagFactory.getInstance() creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block
> DP  org.apache.pig.data.TupleFactory.getInstance() creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block
> DP  org.apache.pig.impl.PigContext.createCl(String) creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block
> DP  org.apache.pig.impl.util.JarManager.createCl(String, PigContext) creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block
> Eq  org.apache.pig.data.DistinctDataBag$DistinctDataBagIterator$TContainer defines compareTo(DistinctDataBag$DistinctDataBagIterator$TContainer) and uses Object.equals()
> Eq  org.apache.pig.data.SingleTupleBag defines compareTo(Object) and uses Object.equals()
> Eq  org.apache.pig.data.SortedDataBag$SortedDataBagIterator$PQContainer defines compareTo(SortedDataBag$SortedDataBagIterator$PQContainer) and uses Object.equals()
> Eq  org.apache.pig.data.TargetedTuple defines compareTo(Object) and uses Object.equals()
> HE  org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan defines equals and uses Object.hashCode()
> HE  org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup$groupComparator defines equals and uses Object.hashCode()
> HE  org.apache.p
[jira] Commented: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.
[ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771714#action_12771714 ] Hadoop QA commented on PIG-1057: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423611/patch_1057 against trunk revision 831051. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/130/console This message is automatically generated. > [Zebra] Zebra does not support concurrent deletions of column groups now. > - > > Key: PIG-1057 > URL: https://issues.apache.org/jira/browse/PIG-1057 > Project: Pig > Issue Type: Bug >Affects Versions: 0.4.0 >Reporter: Chao Wang >Assignee: Chao Wang > Fix For: 0.6.0 > > Attachments: patch_1057 > > > Zebra does not support concurrent deletions of column groups now. As a > result, the TestDropColumnGroup testcase can sometimes fail due to this. > In this testcase, multiple threads will be launched together, with each one > deleting one particular column group. The following exception can be thrown > (with callstack): > /*/ > ... 
> java.io.FileNotFoundException: File > /.../pig-trunk/build/contrib/zebra/test/data/DropCGTest/CG02 does not exist. > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361) > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:290) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:716) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:741) > at > org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:465) > at > org.apache.hadoop.zebra.io.BasicTable$SchemaFile.setCGDeletedFlags(BasicTable.java:1610) > at > org.apache.hadoop.zebra.io.BasicTable$SchemaFile.readSchemaFile(BasicTable.java:1593) > at > org.apache.hadoop.zebra.io.BasicTable$SchemaFile.(BasicTable.java:1416) > at > org.apache.hadoop.zebra.io.BasicTable.dropColumnGroup(BasicTable.java:133) > at > org.apache.hadoop.zebra.io.TestDropColumnGroup$DropThread.run(TestDropColumnGroup.java:772) > ... > /*/ > We plan to fix this in Zebra to support concurrent deletions of column > groups. The root cause is that a thread or process reads in some stale file > system information (e.g., it sees /CG0 first) and then can fail later on (it > tries to access /CG0, however /CG0 may be deleted by another thread or > process). Therefore, we plan to adopt a retry logic to resolve this issue. > More detailed, we allow a dropping column group thread to retry n times when > doing its deleting job - n is the total number of column groups. > Note that here we do NOT try to resolve the more general concurrent column > group deletions + reads issue. If a process is reading some data that could > be deleted by another process, it can fail as we expect. > Here we only try to resolve the concurrent column group deletions issue. If > you have multiple threads or processes to delete column groups, they should > succeed. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
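The retry approach described in the PIG-1057 report can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in patch_1057: the class `DropRetrySketch`, the interface `DropAttempt`, and the method `dropWithRetry` are hypothetical names, and the "drop" is simulated by a lambda rather than by Zebra's `BasicTable.dropColumnGroup`. The idea is that an attempt which fails with `FileNotFoundException` (because another thread removed a column group after it was listed) is simply run again, up to a bounded number of retries.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch of bounded retry; not Zebra's BasicTable code. */
final class DropRetrySketch {

    /** One attempt at dropping a column group; it may observe stale directory state. */
    interface DropAttempt {
        void run() throws IOException;
    }

    /**
     * Runs the attempt, retrying when a path vanished underneath it.
     * Returns the number of retries that were needed (0 means the first try succeeded).
     */
    static int dropWithRetry(DropAttempt attempt, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int tries = 0; tries <= maxRetries; tries++) {
            try {
                attempt.run();
                return tries;
            } catch (FileNotFoundException e) {
                // Another thread deleted a column group after it was listed:
                // remember the failure and try again with fresh state.
                last = e;
            } catch (IOException e) {
                // Any other I/O failure is not a stale-listing problem; give up.
                throw new UncheckedIOException(e);
            }
        }
        // All attempts hit a vanished path; report the last failure.
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated drop: the first two attempts see a stale listing, the third succeeds.
        int retries = dropWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new FileNotFoundException("CG02 does not exist");
            }
        }, 3);
        System.out.println("succeeded after " + retries + " retries");
    }
}
```

In the report, the retry bound n is the total number of column groups; here it is passed in as `maxRetries`. Only the vanished-path failure is retried, so unrelated I/O errors still surface immediately.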
[jira] Commented: (PIG-1040) FINDBUGS: MS_SHOULD_BE_FINAL: Field isn't final but should be
[ https://issues.apache.org/jira/browse/PIG-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771736#action_12771736 ] Hadoop QA commented on PIG-1040: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423618/PIG-1040.patch against trunk revision 831051. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 311 release audit warnings (more than the trunk's current 301 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/33/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/33/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/33/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/33/console This message is automatically generated. 
> FINDBUGS: MS_SHOULD_BE_FINAL: Field isn't final but should be > - > > Key: PIG-1040 > URL: https://issues.apache.org/jira/browse/PIG-1040 > Project: Pig > Issue Type: Improvement >Reporter: Olga Natkovich >Assignee: Olga Natkovich > Attachments: PIG-1040.patch > > > MS > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.USER_COMPARATOR_MARKER > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.weightedParts > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce.sJobConf > isn't final and can't be protected from malicious code > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.bagFactory > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.reporter > isn't final and can't be protected from malicious code > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.pigLogger > should be package protected > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyBag > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyBool > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyDBA > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyDouble > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyFloat > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyInt > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyLong > isn't final but should be > MS > 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyMap > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyString > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.dummyTuple > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.mTupleFactory > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.mTupleFactory > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.mBagFactory > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.mTupleFactory > isn't final but should be > MS > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPreCombinerLocalRearrange.mTupleFactory > isn't final but should be > MSorg.apache.pig.builtin.PigDump.recordDelimiter isn't final but should be > MSor
[jira] Commented: (PIG-1048) inner join using 'skewed' produces multiple rows for keys with single row in both input relations
[ https://issues.apache.org/jira/browse/PIG-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771807#action_12771807 ] Hadoop QA commented on PIG-1048: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423641/pig_1048.patch against trunk revision 831169. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/34/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/34/console This message is automatically generated. 
> inner join using 'skewed' produces multiple rows for keys with single row in > both input relations > - > > Key: PIG-1048 > URL: https://issues.apache.org/jira/browse/PIG-1048 > Project: Pig > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Sriranjan Manjunath > Attachments: pig_1048.patch > > > ${code} > grunt> cat students.txt > asdfxc M 23 12.44 > qwerF 21 14.44 > uhsdf M 34 12.11 > zxldf M 21 12.56 > qwerF 23 145.5 > oiueM 54 23.33 > l1 = load 'students.txt'; > l2 = load 'students.txt'; > j = join l1 by $0, l2 by $0 ; > store j into 'tmp.txt' > grunt> cat tmp.txt > oiueM 54 23.33 oiueM 54 23.33 > oiueM 54 23.33 oiueM 54 23.33 > qwerF 21 14.44 qwerF 21 14.44 > qwerF 21 14.44 qwerF 23 145.5 > qwerF 23 145.5 qwerF 21 14.44 > qwerF 23 145.5 qwerF 23 145.5 > uhsdf M 34 12.11 uhsdf M 34 12.11 > uhsdf M 34 12.11 uhsdf M 34 12.11 > zxldf M 21 12.56 zxldf M 21 12.56 > zxldf M 21 12.56 zxldf M 21 12.56 > asdfxc M 23 12.44 asdfxc M 23 12.44 > asdfxc M 23 12.44 asdfxc M 23 12.44$ > ${code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1063) Pig does not call checkOutSpecs() on OutputFormat provided by StoreFunc in the multistore case
[ https://issues.apache.org/jira/browse/PIG-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771846#action_12771846 ] Hadoop QA commented on PIG-1063: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423638/PIG-1063.patch against trunk revision 831169. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 198 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/131/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/131/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/131/console This message is automatically generated. > Pig does not call checkOutSpecs() on OutputFormat provided by StoreFunc in > the multistore case > -- > > Key: PIG-1063 > URL: https://issues.apache.org/jira/browse/PIG-1063 > Project: Pig > Issue Type: Bug >Affects Versions: 0.4.0 >Reporter: Pradeep Kamath >Assignee: Pradeep Kamath > Attachments: PIG-1063.patch > > > A StoreFunc implementation can inform pig of an OutputFormat it uses through > the getStoragePreparationClass() method. In a query with multiple stores > which gets optimized into a single mapred job, Pig does not call the > checkOutputSpecs() method on the outputformat. 
An example of such a script is: > {noformat} > a = load 'input.txt'; > b = filter a by $0 < 10; > store b into 'output1' using StoreWithOutputFormat(); > c = group a by $0; > d = foreach c generate group, COUNT(a.$0); > store d into 'output2' using StoreWithOutputFormat(); > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1048) inner join using 'skewed' produces multiple rows for keys with single row in both input relations
[ https://issues.apache.org/jira/browse/PIG-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771878#action_12771878 ] Hadoop QA commented on PIG-1048: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423658/pig_1048.patch against trunk revision 831169. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/35/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/35/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/35/console This message is automatically generated. 
> inner join using 'skewed' produces multiple rows for keys with single row in > both input relations > - > > Key: PIG-1048 > URL: https://issues.apache.org/jira/browse/PIG-1048 > Project: Pig > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Sriranjan Manjunath > Attachments: pig_1048.patch > > > ${code} > grunt> cat students.txt > asdfxc M 23 12.44 > qwerF 21 14.44 > uhsdf M 34 12.11 > zxldf M 21 12.56 > qwerF 23 145.5 > oiueM 54 23.33 > l1 = load 'students.txt'; > l2 = load 'students.txt'; > j = join l1 by $0, l2 by $0 ; > store j into 'tmp.txt' > grunt> cat tmp.txt > oiueM 54 23.33 oiueM 54 23.33 > oiueM 54 23.33 oiueM 54 23.33 > qwerF 21 14.44 qwerF 21 14.44 > qwerF 21 14.44 qwerF 23 145.5 > qwerF 23 145.5 qwerF 21 14.44 > qwerF 23 145.5 qwerF 23 145.5 > uhsdf M 34 12.11 uhsdf M 34 12.11 > uhsdf M 34 12.11 uhsdf M 34 12.11 > zxldf M 21 12.56 zxldf M 21 12.56 > zxldf M 21 12.56 zxldf M 21 12.56 > asdfxc M 23 12.44 asdfxc M 23 12.44 > asdfxc M 23 12.44 asdfxc M 23 12.44$ > ${code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1035) support for skewed outer join
[ https://issues.apache.org/jira/browse/PIG-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771913#action_12771913 ] Hadoop QA commented on PIG-1035: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423670/1035.patch against trunk revision 831169. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/132/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/132/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/132/console This message is automatically generated. > support for skewed outer join > - > > Key: PIG-1035 > URL: https://issues.apache.org/jira/browse/PIG-1035 > Project: Pig > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Sriranjan Manjunath > Attachments: 1035.patch > > > Similarly to skewed inner join, skewed outer join will help to scale in the > presence of join keys that don't fit into memory -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1063) Pig does not call checkOutSpecs() on OutputFormat provided by StoreFunc in the multistore case
[ https://issues.apache.org/jira/browse/PIG-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772068#action_12772068 ] Hadoop QA commented on PIG-1063: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423638/PIG-1063.patch against trunk revision 831169. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 198 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/133/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/133/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/133/console This message is automatically generated. > Pig does not call checkOutSpecs() on OutputFormat provided by StoreFunc in > the multistore case > -- > > Key: PIG-1063 > URL: https://issues.apache.org/jira/browse/PIG-1063 > Project: Pig > Issue Type: Bug >Affects Versions: 0.4.0 >Reporter: Pradeep Kamath >Assignee: Pradeep Kamath > Attachments: PIG-1063.patch > > > A StoreFunc implementation can inform pig of an OutputFormat it uses through > the getStoragePreparationClass() method. In a query with multiple stores > which gets optimized into a single mapred job, Pig does not call the > checkOutputSpecs() method on the outputformat. 
An example of such a script is: > {noformat} > a = load 'input.txt'; > b = filter a by $0 < 10; > store b into 'output1' using StoreWithOutputFormat(); > c = group a by $0; > d = foreach c generate group, COUNT(a.$0); > store d into 'output2' using StoreWithOutputFormat(); > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1030) explain and dump not working with two UDFs inside inner plan of foreach
[ https://issues.apache.org/jira/browse/PIG-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772123#action_12772123 ] Hadoop QA commented on PIG-1030: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423708/PIG-1030.patch against trunk revision 831402. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/36/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/36/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/36/console This message is automatically generated. > explain and dump not working with two UDFs inside inner plan of foreach > --- > > Key: PIG-1030 > URL: https://issues.apache.org/jira/browse/PIG-1030 > Project: Pig > Issue Type: Bug >Reporter: Ying He >Assignee: Richard Ding > Attachments: PIG-1030.patch, PIG-1030.patch > > > This script does not work: > register /homes/yinghe/owl/string.jar; > a = load '/user/yinghe/a.txt' as (id, color); > b = group a all; > c = foreach b { > d = distinct a.color; > generate group, string.BagCount2(d), string.ColumnLen2(d, 0); > } > The UDFs are regular, not algebraic. > If I then call "dump c;" or "explain c", I get this error message: 
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2019: Expected to find plan > with single leaf. Found 2 leaves. > The error occurs only the first time. After getting this error, if I call > "dump c" or "explain c" again, it succeeds. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1058) FINDBUGS: remaining "Correctness Warnings"
[ https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772210#action_12772210 ] Hadoop QA commented on PIG-1058: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423734/PIG-1058.patch against trunk revision 831481. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/37/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/37/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/37/console This message is automatically generated. 
> FINDBUGS: remaining "Correctness Warnings" > -- > > Key: PIG-1058 > URL: https://issues.apache.org/jira/browse/PIG-1058 > Project: Pig > Issue Type: Improvement >Reporter: Olga Natkovich >Assignee: Olga Natkovich > Attachments: PIG-1058.patch > > > BCImpossible cast from java.lang.Object[] to java.lang.String[] in > org.apache.pig.PigServer.listPaths(String) > ECCall to equals() comparing different types in > org.apache.pig.impl.plan.Operator.equals(Object) > GCjava.lang.Byte is incompatible with expected argument type > java.lang.Integer in > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.POPackageAnnotator$LoRearrangeDiscoverer.visitLocalRearrange(POLocalRearrange) > ILThere is an apparent infinite recursive loop in > org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup$groupComparator.equals(Object) > INT Bad comparison of nonnegative value with -1 in > org.apache.tools.bzip2r.CBZip2InputStream.bsR(int) > INT Bad comparison of nonnegative value with -1 in > org.apache.tools.bzip2r.CBZip2InputStream.getAndMoveToFrontDecode() > INT Bad comparison of nonnegative value with -1 in > org.apache.tools.bzip2r.CBZip2InputStream.getAndMoveToFrontDecode() > MFField ConstantExpression.res masks field in superclass > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator > Nm > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitSplit(POSplit) > doesn't override method in superclass because parameter type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit > doesn't match superclass parameter type > org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POSplit > Nm > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.NoopStoreRemover$PhysicalRemover.visitSplit(POSplit) > doesn't override method in superclass because parameter type > 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit > doesn't match superclass parameter type > org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POSplit > NPPossible null pointer dereference of ? in > org.apache.pig.impl.logicalLayer.optimizer.PushDownForeachFlatten.check(List) > NPPossible null pointer dereference of lo in > org.apache.pig.impl.logicalLayer.optimizer.StreamOptimizer.transform(List) > NPPossible null pointer dereference of > Schema$FieldSchema.Schema$FieldSchema.alias in > org.apache.pig.impl.logicalLayer.schema.Schema.equals(Schema, Schema, > boolean, boolean) > NPPossible null pointer dereference of Schema$FieldSchema.alias in > org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.equals(Schema$FieldSchema, > Schema$FieldSchema, boolean, boolean) > NPPossible null pointer dereference of inp in > org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run() > RCN Nullcheck of pigContext at line 123 of value previously dereferenced in > org.apache.pig.impl.util.JarManager.createJar(OutputStream, List, PigContext) > RV > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.fixUpDomain(String, > Properties) ignores return value of java.net.InetAddress.getByName(String) > RVBad attempt to compute abso
[jira] Commented: (PIG-997) [zebra] Sorted Table Support by Zebra
[ https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772219#action_12772219 ] Hadoop QA commented on PIG-997: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423724/SortedTable.patch against trunk revision 831481. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 173 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 355 release audit warnings (more than the trunk's current 337 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/134/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/134/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/134/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/134/console This message is automatically generated. > [zebra] Sorted Table Support by Zebra > - > > Key: PIG-997 > URL: https://issues.apache.org/jira/browse/PIG-997 > Project: Pig > Issue Type: New Feature >Reporter: Yan Zhou >Assignee: Yan Zhou > Fix For: 0.6.0 > > Attachments: SortedTable.patch, SortedTable.patch > > > This new feature is for Zebra to support sorted data in storage. As a storage > library, Zebra will not sort the data by itself. 
But it will support creation > and use of sorted data either through PIG or through map/reduce tasks that > use Zebra as storage format. > The sorted table keeps the data in a "totally sorted" manner across all > TFiles created by potentially all mappers or reducers. > For sorted data creation through PIG's STORE operator, if the input data is > sorted through "ORDER BY", the new Zebra table will be marked as sorted on > the sorted columns; > For sorted data creation through Map/Reduce tasks, three new static methods > of the BasicTableOutput class will be provided to allow or help the user to > achieve the goal. "setSortInfo" allows the user to specify the sorted columns > of the input tuple to be stored; "getSortKeyGenerator" and "getSortKey" help > the user to generate the key acceptable by Zebra as a sorted key based upon > the schema, sorted columns and the input tuple. > For sorted data read through PIG's LOAD operator, pass string "sorted" as an > extra argument to the TableLoader constructor to ask for sorted table to be > loaded; > For sorted data read through Map/Reduce tasks, a new static method of > TableInputFormat class, requireSortedTable, can be called to ask for a sorted > table to be read. Additionally, an overloaded version of the new method can > be called to ask for a sorted table on specified sort columns and comparator. > For this release, sorted tables support sorting only in ascending order, not > in descending order. In addition, the sort keys must be of simple types, not > complex types such as RECORD, COLLECTION and MAP. > Multiple-key sorting is supported. But the ordering of the multiple sort keys > is significant, with the first sort column being the primary sort key, the > second being the secondary sort key, etc. > In this release, the sort keys are stored along with the sort columns from which > the keys were originally created, resulting in some data storage > redundancy. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
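The remark in the PIG-997 description that the order of multiple sort keys is significant can be illustrated with a plain Java comparator. This is a sketch under stated assumptions and does not use Zebra's API: the class `SortKeySketch`, its `Row` type, and the column names `age` and `name` are hypothetical. It sorts in ascending order only, with `age` as the primary key and `name` as the secondary key that breaks ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of primary/secondary sort keys; not Zebra code. */
final class SortKeySketch {

    /** A row with two simple-typed columns. */
    static final class Row {
        final String name;
        final int age;

        Row(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Ascending by age (primary key); ties are broken by name (secondary key). */
    static final Comparator<Row> BY_AGE_THEN_NAME =
            Comparator.comparingInt((Row r) -> r.age).thenComparing(r -> r.name);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Row> sortByKeys(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(BY_AGE_THEN_NAME);
        return sorted;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("b", 21), new Row("a", 23), new Row("a", 21));
        for (Row r : sortByKeys(rows)) {
            System.out.println(r.age + " " + r.name);
        }
    }
}
```

Swapping the two keys (name first, then age) would give a different order for the same rows, which is why the description calls the ordering of the sort columns significant.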
[jira] Commented: (PIG-1035) support for skewed outer join
[ https://issues.apache.org/jira/browse/PIG-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772227#action_12772227 ] Hadoop QA commented on PIG-1035: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423750/1035new.patch against trunk revision 831481. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/135/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/135/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/135/console This message is automatically generated. > support for skewed outer join > - > > Key: PIG-1035 > URL: https://issues.apache.org/jira/browse/PIG-1035 > Project: Pig > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Sriranjan Manjunath > Attachments: 1035new.patch > > > Similarly to skewed inner join, skewed outer join will help to scale in the > presence of join keys that don't fit into memory -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-997) [zebra] Sorted Table Support by Zebra
[ https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772228#action_12772228 ] Hadoop QA commented on PIG-997: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423753/SortedTable.patch against trunk revision 831481. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 173 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/38/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/38/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/38/console This message is automatically generated. > [zebra] Sorted Table Support by Zebra > - > > Key: PIG-997 > URL: https://issues.apache.org/jira/browse/PIG-997 > Project: Pig > Issue Type: New Feature >Reporter: Yan Zhou >Assignee: Yan Zhou > Fix For: 0.6.0 > > Attachments: SortedTable.patch, SortedTable.patch, SortedTable.patch > > > This new feature is for Zebra to support sorted data in storage. As a storage > library, Zebra will not sort the data by itself. But it will support creation > and use of sorted data either through PIG or through map/reduce tasks that > use Zebra as storage format. 
> The sorted table keeps the data in a "totally sorted" manner across all > TFiles created by potentially all mappers or reducers. > For sorted data creation through PIG's STORE operator, if the input data is > sorted through "ORDER BY", the new Zebra table will be marked as sorted on > the sorted columns; > For sorted data creation through Map/Reduce tasks, three new static methods > of the BasicTableOutput class will be provided to allow or help the user to > achieve the goal. "setSortInfo" allows the user to specify the sorted columns > of the input tuple to be stored; "getSortKeyGenerator" and "getSortKey" help > the user to generate the key acceptable by Zebra as a sorted key based upon > the schema, sorted columns and the input tuple. > For sorted data read through PIG's LOAD operator, pass string "sorted" as an > extra argument to the TableLoader constructor to ask for sorted table to be > loaded; > For sorted data read through Map/Reduce tasks, a new static method of > TableInputFormat class, requireSortedTable, can be called to ask for a sorted > table to be read. Additionally, an overloaded version of the new method can > be called to ask for a sorted table on specified sort columns and comparator. > For this release, sorted tables support sorting only in ascending order, not > in descending order. In addition, the sort keys must be of simple types, not > complex types such as RECORD, COLLECTION and MAP. > Multiple-key sorting is supported. But the ordering of the multiple sort keys > is significant, with the first sort column being the primary sort key, the > second being the secondary sort key, etc. > In this release, the sort keys are stored along with the sort columns from which > the keys were originally created, resulting in some data storage > redundancy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-970) Support of HBase 0.20.0
[ https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772464#action_12772464 ] Hadoop QA commented on PIG-970: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423811/zookeeper-hbase-1329.jar against trunk revision 831481. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 92 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/136/console This message is automatically generated. > Support of HBase 0.20.0 > --- > > Key: PIG-970 > URL: https://issues.apache.org/jira/browse/PIG-970 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.3.0 >Reporter: Vincent BARAT >Assignee: Jeff Zhang > Fix For: 0.5.0 > > Attachments: build.xml.path, hbase-0.20.0-test.jar, hbase-0.20.0.jar, > pig-hbase-0.20.0-support.patch, pig-hbase-20-v2.patch, > Pig_HBase_0.20.0.patch, TEST-org.apache.pig.test.TestHBaseStorage.txt, > TEST-org.apache.pig.test.TestHBaseStorage.txt, zookeeper-hbase-1329.jar > > > The support of HBase is currently very limited and restricted to HBase 0.18.0. > Because the next releases of PIG will support Hadoop 0.20.0, they should also > support HBase 0.20.0. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1058) FINDBUGS: remaining "Correctness Warnings"
[ https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773341#action_12773341 ] Hadoop QA commented on PIG-1058: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423961/PIG-1058_v2.patch against trunk revision 832086. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/39/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/39/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/39/console This message is automatically generated. 
> FINDBUGS: remaining "Correctness Warnings"
> --
>
> Key: PIG-1058
> URL: https://issues.apache.org/jira/browse/PIG-1058
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Olga Natkovich
> Attachments: PIG-1058.patch, PIG-1058_v2.patch
>
> BC  Impossible cast from java.lang.Object[] to java.lang.String[] in org.apache.pig.PigServer.listPaths(String)
> EC  Call to equals() comparing different types in org.apache.pig.impl.plan.Operator.equals(Object)
> GC  java.lang.Byte is incompatible with expected argument type java.lang.Integer in org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.POPackageAnnotator$LoRearrangeDiscoverer.visitLocalRearrange(POLocalRearrange)
> IL  There is an apparent infinite recursive loop in org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup$groupComparator.equals(Object)
> INT Bad comparison of nonnegative value with -1 in org.apache.tools.bzip2r.CBZip2InputStream.bsR(int)
> INT Bad comparison of nonnegative value with -1 in org.apache.tools.bzip2r.CBZip2InputStream.getAndMoveToFrontDecode()
> INT Bad comparison of nonnegative value with -1 in org.apache.tools.bzip2r.CBZip2InputStream.getAndMoveToFrontDecode()
> MF  Field ConstantExpression.res masks field in superclass org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator
> Nm  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitSplit(POSplit) doesn't override method in superclass because parameter type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit doesn't match superclass parameter type org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POSplit
> Nm  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.NoopStoreRemover$PhysicalRemover.visitSplit(POSplit) doesn't override method in superclass because parameter type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit doesn't match superclass parameter type org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POSplit
> NP  Possible null pointer dereference of ? in org.apache.pig.impl.logicalLayer.optimizer.PushDownForeachFlatten.check(List)
> NP  Possible null pointer dereference of lo in org.apache.pig.impl.logicalLayer.optimizer.StreamOptimizer.transform(List)
> NP  Possible null pointer dereference of Schema$FieldSchema.Schema$FieldSchema.alias in org.apache.pig.impl.logicalLayer.schema.Schema.equals(Schema, Schema, boolean, boolean)
> NP  Possible null pointer dereference of Schema$FieldSchema.alias in org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.equals(Schema$FieldSchema, Schema$FieldSchema, boolean, boolean)
> NP  Possible null pointer dereference of inp in org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run()
> RCN Nullcheck of pigContext at line 123 of value previously dereferenced in org.apache.pig.impl.util.JarManager.createJar(OutputStream, List, PigContext)
> RV  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.fixUpDomain(String, Properties) ignores return value of java.net.InetAddress.getByName(String)
> RV  Bad a
[jira] Commented: (PIG-1036) Fragment-replicate left outer join
[ https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773273#action_12773273 ] Hadoop QA commented on PIG-1036: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423944/LeftOuterFRJoin.patch against trunk revision 832086. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/137/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/137/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/137/console This message is automatically generated. > Fragment-replicate left outer join > -- > > Key: PIG-1036 > URL: https://issues.apache.org/jira/browse/PIG-1036 > Project: Pig > Issue Type: New Feature >Reporter: Olga Natkovich >Assignee: Ankit Modi > Attachments: LeftOuterFRJoin.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-997) [zebra] Sorted Table Support by Zebra
[ https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773453#action_12773453 ] Hadoop QA commented on PIG-997: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423995/SortedTable.patch against trunk revision 832599. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 177 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/138/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/138/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/138/console This message is automatically generated. > [zebra] Sorted Table Support by Zebra > - > > Key: PIG-997 > URL: https://issues.apache.org/jira/browse/PIG-997 > Project: Pig > Issue Type: New Feature >Reporter: Yan Zhou >Assignee: Yan Zhou > Fix For: 0.6.0 > > Attachments: SortedTable.patch, SortedTable.patch, SortedTable.patch, > SortedTable.patch > > > This new feature is for Zebra to support sorted data in storage. As a storage > library, Zebra will not sort the data by itself. But it will support creation > and use of sorted data either through PIG or through map/reduce tasks that > use Zebra as storage format. 
> The sorted table keeps the data in a "totally sorted" manner across all TFiles created by potentially all mappers or reducers.
> For sorted data creation through PIG's STORE operator, if the input data is sorted through "ORDER BY", the new Zebra table will be marked as sorted on the sorted columns.
> For sorted data creation through Map/Reduce tasks, three new static methods of the BasicTableOutput class will be provided to help the user achieve this. "setSortInfo" allows the user to specify the sorted columns of the input tuple to be stored; "getSortKeyGenerator" and "getSortKey" help the user generate a key acceptable to Zebra as a sort key, based upon the schema, the sorted columns and the input tuple.
> For sorted data read through PIG's LOAD operator, pass the string "sorted" as an extra argument to the TableLoader constructor to ask for a sorted table to be loaded.
> For sorted data read through Map/Reduce tasks, a new static method of the TableInputFormat class, requireSortedTable, can be called to ask for a sorted table to be read. Additionally, an overloaded version of the new method can be called to ask for a sorted table on specified sort columns and comparator.
> For this release, sorted tables support sorting in ascending order only, not in descending order. In addition, the sort keys must be of simple types, not complex types such as RECORD, COLLECTION and MAP.
> Multiple-key sorting is supported, but the ordering of the sort keys is significant: the first sort column is the primary sort key, the second is the secondary sort key, etc.
> In this release, the sort keys are stored along with the sort columns from which the keys were originally created, resulting in some data storage redundancy.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1071) Support comma separated file/directory names in load statements
[ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773750#action_12773750 ] Hadoop QA commented on PIG-1071: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424056/PIG-1071.patch against trunk revision 832804. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/console This message is automatically generated. > Support comma separated file/directory names in load statements > --- > > Key: PIG-1071 > URL: https://issues.apache.org/jira/browse/PIG-1071 > Project: Pig > Issue Type: New Feature >Reporter: Richard Ding >Assignee: Richard Ding > Attachments: PIG-1071.patch > > > Currently Pig Latin supports the following LOAD syntax: > {code} > LOAD 'data' [USING loader function] [AS schema]; > {code} > where data is the name of the file or directory, including files specified with Hadoop-supported globbing syntax. This name is passed to the loader function. 
> This feature is to support loaders that can load multiple files from different directories, allowing users to pass in the file names as a comma-separated string.
> For example, these will be valid load statements:
> {code}
> LOAD '/usr/pig/test1/a,/usr/pig/test2/b' USING someloader();
> {code}
> and
> {code}
> LOAD '/usr/pig/test1/{a,c},/usr/pig/test2/b' USING someloader();
> {code}
> This comma-separated string is passed to the loader.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
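The second LOAD example above mixes commas that separate paths with a comma inside a `{a,c}` glob, so a loader cannot simply split on every comma. The issue only says that the string is passed to the loader; how a loader splits it is not stated. The following is a minimal Python sketch of one possible approach, with a hypothetical `split_load_paths` helper that is not part of Pig.

```python
from typing import List


def split_load_paths(location: str) -> List[str]:
    """Split on commas that are not inside a {...} glob alternation."""
    paths: List[str] = []
    current: List[str] = []
    depth = 0  # nesting level of curly braces seen so far
    for ch in location:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            # A top-level comma ends the current path.
            paths.append("".join(current))
            current = []
        else:
            current.append(ch)
    paths.append("".join(current))
    return paths


plain = split_load_paths("/usr/pig/test1/a,/usr/pig/test2/b")
globbed = split_load_paths("/usr/pig/test1/{a,c},/usr/pig/test2/b")
```

Tracking brace depth keeps the glob `/usr/pig/test1/{a,c}` intact as a single path, which can then be handed to whatever glob expansion the file system provides.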
[jira] Commented: (PIG-1071) Support comma separated file/directory names in load statements
[ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774061#action_12774061 ] Hadoop QA commented on PIG-1071: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424056/PIG-1071.patch against trunk revision 832804. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/console This message is automatically generated. > Support comma separated file/directory names in load statements > --- > > Key: PIG-1071 > URL: https://issues.apache.org/jira/browse/PIG-1071 > Project: Pig > Issue Type: New Feature >Reporter: Richard Ding >Assignee: Richard Ding > Attachments: PIG-1071.patch > > > Currently Pig Latin supports the following LOAD syntax: > {code} > LOAD 'data' [USING loader function] [AS schema]; > {code} > where data is the name of the file or directory, including files specified with Hadoop-supported globbing syntax. This name is passed to the loader function. 
> This feature is to support loaders that can load multiple files from different directories, allowing users to pass in the file names as a comma-separated string.
> For example, these will be valid load statements:
> {code}
> LOAD '/usr/pig/test1/a,/usr/pig/test2/b' USING someloader();
> {code}
> and
> {code}
> LOAD '/usr/pig/test1/{a,c},/usr/pig/test2/b' USING someloader();
> {code}
> This comma-separated string is passed to the loader.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1060) MultiQuery optimization throws error for multi-level splits
[ https://issues.apache.org/jira/browse/PIG-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774115#action_12774115 ] Hadoop QA commented on PIG-1060: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424143/PIG-1060.patch against trunk revision 833126. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 319 release audit warnings (more than the trunk's current 318 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/41/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/41/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/41/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/41/console This message is automatically generated. > MultiQuery optimization throws error for multi-level splits > --- > > Key: PIG-1060 > URL: https://issues.apache.org/jira/browse/PIG-1060 > Project: Pig > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Ankur >Assignee: Richard Ding > Attachments: PIG-1060.patch > > > Consider the following scenario :- > 1. Multi-level splits in the map plan. > 2. Each split branch further progressing across a local-global rearrange. > 3. Output of each of these finally merged via a UNION. 
> MultiQuery optimizer throws the following error in such a case: > "ERROR 2146: Internal Error. Inconsistency in key index found during > optimization." -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1026) [zebra] map split returns null
[ https://issues.apache.org/jira/browse/PIG-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774214#action_12774214 ] Hadoop QA commented on PIG-1026: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423738/PIG_1026.patch against trunk revision 833266. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 10 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/141/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/141/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/141/console This message is automatically generated. > [zebra] map split returns null > -- > > Key: PIG-1026 > URL: https://issues.apache.org/jira/browse/PIG-1026 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Jing Huang >Assignee: Yan Zhou > Fix For: 0.6.0 > > Attachments: PIG_1026.patch > > > Here is the test scenario: > final static String STR_SCHEMA = "m1:map(string),m2:map(map(int))"; > //final static String STR_STORAGE = "[m1#{a}];[m2#{x|y}]; [m1#{b}, > m2#{z}];[m1]"; > final static String STR_STORAGE = "[m1#{a}, m2#{x}];[m2#{x|y}]; [m1#{b}, > m2#{z}];[m1,m2]"; > projection: String projection2 = new String("m1#{b}, m2#{x|z}"); > User got null pointer exception on reading m1#{b}. 
> Yan, please refer to the test class: > TestNonDefaultWholeMapSplit.java -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1069) [zebra] Order Preserving Sorted Table Union
[ https://issues.apache.org/jira/browse/PIG-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774644#action_12774644 ] Hadoop QA commented on PIG-1069: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424259/OrderPreservingSortedTableUnion_svn.patch against trunk revision 833549. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 22 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/142/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/142/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/142/console This message is automatically generated. 
> [zebra] Order Preserving Sorted Table Union
> ---
>
> Key: PIG-1069
> URL: https://issues.apache.org/jira/browse/PIG-1069
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.6.0
> Reporter: Yan Zhou
> Assignee: Yan Zhou
> Attachments: OrderPreservingSortedTableUnion_svn.patch
>
> The output schema will adopt "schema union" semantics. If an output column appears in only one component table, the result rows will have the values of the column if the rows are from that component table and null otherwise. If an output column appears in multiple component tables, the types of the column in all the component tables must be identical; otherwise, an exception will be thrown. The result rows will have the values of the column if the rows are from component tables that have the column, and null otherwise.
> The order-preserving sort-unioned results can be further indexed by component table if the projection contains column(s) named "source_table". If so specified, the component table index will be output at the position(s) specified in the projection list. If the underlying table is not a union of sorted tables, use of the special column name in the projection will cause an exception to be thrown.
> If an attempt is made to create a table with a column named "source_table", an exception will be thrown, as the name is reserved by Zebra as a virtual column name.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
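The "schema union" rule in the PIG-1069 description above can be sketched in a few lines. This is a Python illustration under stated assumptions, not Zebra code: schemas are modeled as plain dictionaries from column name to type name, and `union_schema` and `project_row` are hypothetical helpers.

```python
from typing import Any, Dict, List

Schema = Dict[str, str]  # column name -> type name
Row = Dict[str, Any]


def union_schema(schemas: List[Schema]) -> Schema:
    """Merge component schemas; a column shared by several tables must keep one type."""
    merged: Schema = {}
    for schema in schemas:
        for column, col_type in schema.items():
            if column in merged and merged[column] != col_type:
                raise ValueError(
                    f"column {column!r} has conflicting types: "
                    f"{merged[column]} vs {col_type}"
                )
            merged[column] = col_type
    return merged


def project_row(row: Row, merged: Schema) -> Row:
    # Columns absent from the row's own component table come back as None (null).
    return {column: row.get(column) for column in merged}


table1 = {"a": "int", "b": "string"}
table2 = {"a": "int", "c": "double"}
merged = union_schema([table1, table2])
row_from_table1 = project_row({"a": 1, "b": "x"}, merged)
```

A row from the first table gets a null for column `c`, and declaring `a` as `int` in one table and `string` in another would raise instead of silently picking a type.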
[jira] Commented: (PIG-1038) Optimize nested distinct/sort to use secondary key
[ https://issues.apache.org/jira/browse/PIG-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774753#action_12774753 ] Hadoop QA commented on PIG-1038: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424289/PIG-1038-1.patch against trunk revision 833549. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 207 javac compiler warnings (more than the trunk's current 199 warnings). -1 findbugs. The patch appears to introduce 3 new Findbugs warnings. -1 release audit. The applied patch generated 319 release audit warnings (more than the trunk's current 317 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/143/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/143/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/143/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/143/console This message is automatically generated. 
> Optimize nested distinct/sort to use secondary key
> --
>
> Key: PIG-1038
> URL: https://issues.apache.org/jira/browse/PIG-1038
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.4.0
> Reporter: Olga Natkovich
> Assignee: Daniel Dai
> Fix For: 0.6.0
> Attachments: PIG-1038-1.patch
>
> If a nested foreach plan contains sort/distinct, it is possible to use Hadoop secondary sort instead of SortedDataBag and DistinctDataBag to optimize the query.
> Eg1:
> A = load 'mydata';
> B = group A by $0;
> C = foreach B {
>     D = order A by $1;
>     generate group, D;
> }
> store C into 'myresult';
> We can specify a secondary sort on A.$1, and drop "order A by $1".
> Eg2:
> A = load 'mydata';
> B = group A by $0;
> C = foreach B {
>     D = A.$1;
>     E = distinct D;
>     generate group, E;
> }
> store C into 'myresult';
> We can specify a secondary sort key on A.$1, and simplify "D=A.$1; E=distinct D" to a special version of distinct, which does not do the sorting.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
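The idea behind Eg1 and Eg2 above can be sketched outside Pig. The Python below is only an illustration: a single `sorted` call stands in for a shuffle that orders records by the group key and then by the secondary key, and the function names are hypothetical. It does not show how Pig or Hadoop implement secondary sort.

```python
from itertools import groupby
from operator import itemgetter
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, int]  # stands in for ($0, $1)


def shuffle_with_secondary_sort(records: Iterable[Record]) -> List[Record]:
    # Stand-in for the shuffle: order by the group key, then by the secondary key.
    return sorted(records, key=itemgetter(0, 1))


def nested_order(records: Iterable[Record]) -> Iterator[Tuple[str, List[Record]]]:
    """Eg1: each group's bag already arrives ordered by $1, so no per-group sort runs."""
    ordered = shuffle_with_secondary_sort(records)
    for key, bag in groupby(ordered, key=itemgetter(0)):
        yield key, list(bag)


def nested_distinct(records: Iterable[Record]) -> Iterator[Tuple[str, List[int]]]:
    """Eg2: duplicates are adjacent after the shuffle, so distinct only compares neighbours."""
    ordered = shuffle_with_secondary_sort(records)
    for key, bag in groupby(ordered, key=itemgetter(0)):
        values: List[int] = []
        for _, value in bag:
            if not values or values[-1] != value:
                values.append(value)
        yield key, values


data = [("x", 3), ("y", 1), ("x", 1), ("x", 3), ("y", 2)]
ordered_groups = list(nested_order(data))
distinct_groups = list(nested_distinct(data))
```

In both functions the per-group work is a single pass over values that are already in order, which is the saving the issue attributes to dropping SortedDataBag and DistinctDataBag.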
[jira] Commented: (PIG-979) Acummulator Interface for UDFs
[ https://issues.apache.org/jira/browse/PIG-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774815#action_12774815 ] Hadoop QA commented on PIG-979: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424249/PIG-979.patch against trunk revision 833549. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 20 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/144/testReport/ Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/144/console This message is automatically generated. > Acummulator Interface for UDFs > -- > > Key: PIG-979 > URL: https://issues.apache.org/jira/browse/PIG-979 > Project: Pig > Issue Type: New Feature >Reporter: Alan Gates >Assignee: Ying He > Attachments: PIG-979.patch > > > Add an accumulator interface for UDFs that would allow them to take a set > number of records at a time instead of the entire bag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
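The PIG-979 description above is brief, so a small sketch may help show what "a set number of records at a time instead of the entire bag" could look like. This is a Python illustration only: the class, the method names (`accumulate`, `get_value`, `cleanup`) and the batching helper are assumptions made for this example and are not taken from the attached patch.

```python
from typing import Iterable, Iterator, List


class SumAccumulator:
    """Hypothetical UDF that sums values without holding the whole bag in memory."""

    def __init__(self) -> None:
        self.total = 0

    def accumulate(self, batch: List[int]) -> None:
        # Called once per batch; only the running total is kept between calls.
        self.total += sum(batch)

    def get_value(self) -> int:
        return self.total

    def cleanup(self) -> None:
        # Reset state so the same instance can serve the next group.
        self.total = 0


def batches(values: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` values."""
    batch: List[int] = []
    for value in values:
        batch.append(value)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run(acc: SumAccumulator, values: Iterable[int], batch_size: int) -> int:
    for batch in batches(values, batch_size):
        acc.accumulate(batch)
    result = acc.get_value()
    acc.cleanup()
    return result


total = run(SumAccumulator(), range(10), 3)
```

The point of the design is that peak memory depends on the batch size rather than on the size of the group.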
[jira] Commented: (PIG-1038) Optimize nested distinct/sort to use secondary key
[ https://issues.apache.org/jira/browse/PIG-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774926#action_12774926 ] Hadoop QA commented on PIG-1038: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424332/PIG-1038-2.patch against trunk revision 833549. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 205 javac compiler warnings (more than the trunk's current 199 warnings). -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. -1 release audit. The applied patch generated 319 release audit warnings (more than the trunk's current 317 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/145/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/145/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/145/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/145/console This message is automatically generated. 
> Optimize nested distinct/sort to use secondary key
> --
>
> Key: PIG-1038
> URL: https://issues.apache.org/jira/browse/PIG-1038
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.4.0
> Reporter: Olga Natkovich
> Assignee: Daniel Dai
> Fix For: 0.6.0
> Attachments: PIG-1038-1.patch, PIG-1038-2.patch
>
> If a nested foreach plan contains sort/distinct, it is possible to use Hadoop secondary sort instead of SortedDataBag and DistinctDataBag to optimize the query.
> Eg1:
> A = load 'mydata';
> B = group A by $0;
> C = foreach B {
>     D = order A by $1;
>     generate group, D;
> }
> store C into 'myresult';
> We can specify a secondary sort on A.$1, and drop "order A by $1".
> Eg2:
> A = load 'mydata';
> B = group A by $0;
> C = foreach B {
>     D = A.$1;
>     E = distinct D;
>     generate group, E;
> }
> store C into 'myresult';
> We can specify a secondary sort key on A.$1, and simplify "D=A.$1; E=distinct D" to a special version of distinct, which does not do the sorting.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1080) PigStorage may miss records when loading a file
[ https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776205#action_12776205 ] Hadoop QA commented on PIG-1080: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424510/PIG-1080.patch against trunk revision 834285. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/146/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/146/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/146/console This message is automatically generated. > PigStorage may miss records when loading a file > --- > > Key: PIG-1080 > URL: https://issues.apache.org/jira/browse/PIG-1080 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Richard Ding >Assignee: Richard Ding > Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch > > > When a file is assigned to multiple mappers (one block per mapper), the > blocks may not end at the exact record boundary. Special care is taken to > ensure that all records are loaded by mappers (and exactly once), even for > records that cross the block boundary. 
> PigStorage, however, does not correctly handle the case where a block ends > exactly at a record boundary, which results in missing records. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
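The boundary rule described in PIG-1080 can be illustrated with a small sketch. This is not PigStorage's code and not the attached patch; it is one common convention for newline-delimited records, written in Python with a hypothetical function name. The assumption is that a split [start, end) owns every record whose first byte lies inside it. A reader that does not start at offset 0 backs up one byte before discarding through the next newline, so a record that begins exactly at the split start is kept rather than skipped. A reader may read past its end offset to finish the last record it owns.

```python
def read_split(data, start, end):
    """Return the newline-delimited records owned by the split [start, end).

    Hypothetical helper for illustration; data is a bytes object holding
    the whole file.
    """
    pos = start
    if start != 0:
        # Look at the byte before the split. If it is a newline, the split
        # starts exactly on a record boundary and nothing is discarded.
        # Otherwise the partial record belongs to the previous split.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []  # no record begins inside this split
        pos = newline + 1

    records = []
    # Own every record that begins before `end`, even if it finishes later.
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline
        records.append(data[pos:stop])
        pos = stop + 1
    return records
```

With this rule, cutting a file at any offset, including an offset that falls exactly after a newline, yields each record exactly once across the two splits. Discarding from start instead of start - 1 would drop the first record of a split that begins on a record boundary, which is the kind of loss the issue describes.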
[jira] Commented: (PIG-1080) PigStorage may miss records when loading a file
[ https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776218#action_12776218 ] Hadoop QA commented on PIG-1080: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424515/PIG-1080.patch against trunk revision 834285. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/42/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/42/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/42/console This message is automatically generated. > PigStorage may miss records when loading a file > --- > > Key: PIG-1080 > URL: https://issues.apache.org/jira/browse/PIG-1080 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Richard Ding >Assignee: Richard Ding > Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch > > > When a file is assigned to multiple mappers (one block per mapper), the > blocks may not end at the exact record boundary. Special care is taken to > ensure that all records are loaded by mappers (and exactly once), even for > records that cross the block boundary. 
> PigStorage, however, does not correctly handle the case where a block ends > exactly at a record boundary, which results in missing records. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1080) PigStorage may miss records when loading a file
[ https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776267#action_12776267 ] Hadoop QA commented on PIG-1080: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424531/PIG-1080.patch against trunk revision 834285. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/147/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/147/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/147/console This message is automatically generated. > PigStorage may miss records when loading a file > --- > > Key: PIG-1080 > URL: https://issues.apache.org/jira/browse/PIG-1080 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Richard Ding >Assignee: Richard Ding > Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch > > > When a file is assigned to multiple mappers (one block per mapper), the > blocks may not end at the exact record boundary. Special care is taken to > ensure that all records are loaded by mappers (and exactly once), even for > records that cross the block boundary. 
> PigStorage, however, does not correctly handle the case where a block ends > exactly at a record boundary, which results in missing records. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1085) Pass JobConf and UDF specific configuration information to UDFs
[ https://issues.apache.org/jira/browse/PIG-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776285#action_12776285 ] Hadoop QA commented on PIG-1085: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424546/udfconf.patch against trunk revision 834285. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 204 javac compiler warnings (more than the trunk's current 199 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 323 release audit warnings (more than the trunk's current 318 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/43/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/43/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/43/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/43/console This message is automatically generated. > Pass JobConf and UDF specific configuration information to UDFs > --- > > Key: PIG-1085 > URL: https://issues.apache.org/jira/browse/PIG-1085 > Project: Pig > Issue Type: New Feature > Components: impl >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: udfconf.patch > > > Users have long asked for a way to get the JobConf structure in their UDFs. 
> It would also be nice to have a way to pass properties between the front end > and back end so that UDFs can store state during parse time and use it at > runtime. > This patch does part of what is proposed in PIG-602, but not all of it. It > does not provide a way to give user specified configuration files to UDFs. > So I will mark 602 as depending on this bug, but it isn't a duplicate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1038) Optimize nested distinct/sort to use secondary key
[ https://issues.apache.org/jira/browse/PIG-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776423#action_12776423 ] Hadoop QA commented on PIG-1038: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424580/PIG-1038-3.patch against trunk revision 834285. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 209 javac compiler warnings (more than the trunk's current 199 warnings). -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. -1 release audit. The applied patch generated 320 release audit warnings (more than the trunk's current 317 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/148/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/148/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/148/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/148/console This message is automatically generated. 
> Optimize nested distinct/sort to use secondary key > -- > > Key: PIG-1038 > URL: https://issues.apache.org/jira/browse/PIG-1038 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.4.0 >Reporter: Olga Natkovich >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-1038-1.patch, PIG-1038-2.patch, PIG-1038-3.patch > > > If nested foreach plan contains sort/distinct, it is possible to use hadoop > secondary sort instead of SortedDataBag and DistinctDataBag to optimize the > query. > Eg1: > A = load 'mydata'; > B = group A by $0; > C = foreach B { > D = order A by $1; > generate group, D; > } > store C into 'myresult'; > We can specify a secondary sort on A.$1, and drop "order A by $1". > Eg2: > A = load 'mydata'; > B = group A by $0; > C = foreach B { > D = A.$1; > E = distinct D; > generate group, E; > } > store C into 'myresult'; > We can specify a secondary sort key on A.$1, and simplify "D=A.$1; E=distinct > D" to a special version of distinct, which does not do the sorting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1038) Optimize nested distinct/sort to use secondary key
[ https://issues.apache.org/jira/browse/PIG-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776821#action_12776821 ] Hadoop QA commented on PIG-1038: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424677/PIG-1038-4.patch against trunk revision 835005. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 209 javac compiler warnings (more than the trunk's current 199 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 320 release audit warnings (more than the trunk's current 318 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/44/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/44/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/44/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/44/console This message is automatically generated. 
> Optimize nested distinct/sort to use secondary key > -- > > Key: PIG-1038 > URL: https://issues.apache.org/jira/browse/PIG-1038 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.4.0 >Reporter: Olga Natkovich >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-1038-1.patch, PIG-1038-2.patch, PIG-1038-3.patch, > PIG-1038-4.patch, PIG-1038-5.patch > > > If nested foreach plan contains sort/distinct, it is possible to use hadoop > secondary sort instead of SortedDataBag and DistinctDataBag to optimize the > query. > Eg1: > A = load 'mydata'; > B = group A by $0; > C = foreach B { > D = order A by $1; > generate group, D; > } > store C into 'myresult'; > We can specify a secondary sort on A.$1, and drop "order A by $1". > Eg2: > A = load 'mydata'; > B = group A by $0; > C = foreach B { > D = A.$1; > E = distinct D; > generate group, E; > } > store C into 'myresult'; > We can specify a secondary sort key on A.$1, and simplify "D=A.$1; E=distinct > D" to a special version of distinct, which does not do the sorting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator
[ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776832#action_12776832 ] Hadoop QA commented on PIG-1064: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424676/PIG-1064.patch against trunk revision 835005. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/149/console This message is automatically generated. 
> Behvaiour of COGROUP with and without schema when using "*" operator > > > Key: PIG-1064 > URL: https://issues.apache.org/jira/browse/PIG-1064 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Pradeep Kamath > Fix For: 0.6.0 > > Attachments: PIG-1064.patch > > > I have 2 tab separated files, "1.txt" and "2.txt" > $ cat 1.txt > > 1 2 > 2 3 > > $ cat 2.txt > 1 2 > 2 3 > I use COGROUP feature of Pig in the following way: > $java -cp pig.jar:$HADOOP_HOME org.apache.pig.Main > {code} > grunt> A = load '1.txt'; > grunt> B = load '2.txt' as (b0, b1); > grunt> C = cogroup A by *, B by *; > {code} > 2009-10-29 12:46:04,150 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR > 1012: Each COGroup input has to have the same number of inner plans > Details at logfile: pig_1256845224752.log > == > If I reverse, the order of the schema's > {code} > grunt> A = load '1.txt' as (a0, a1); > grunt> B = load '2.txt'; > grunt> C = cogroup A by *, B by *; > {code} > 2009-10-29 12:49:27,869 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR > 1013: Grouping attributes can either be star (*) or a list of expressions, > but not both. > Details at logfile: pig_1256845224752.log > == > Now running without schema?? > {code} > grunt> A = load '1.txt'; > grunt> B = load '2.txt'; > grunt> C = cogroup A by *, B by *; > grunt> dump C; > {code} > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully > stored result in: "file:/tmp/temp-319926700/tmp-1990275961" > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records > written : 2 > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written > : 154 > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete! 
> 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!! > ((1,2),{(1,2)},{(1,2)}) > ((2,3),{(2,3)},{(2,3)}) > == > Is this a bug or a feature? > Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
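For readers unfamiliar with the output shape in the last case of PIG-1064, the following Python sketch mimics what "cogroup A by *, B by *" produced there: the whole tuple is the grouping key, and each key is paired with the matching tuples from each input. The function name is hypothetical, and this models only the result shown in the report, not Pig's planner or the schema handling that triggers errors 1012 and 1013.

```python
def cogroup_by_star(a, b):
    """Group two lists of tuples by the entire tuple.

    Returns a list of (key, bag_from_a, bag_from_b), where each bag is the
    list of tuples in that input equal to the key.
    """
    keys = sorted(set(a) | set(b))
    return [
        (key, [t for t in a if t == key], [t for t in b if t == key])
        for key in keys
    ]
```

Calling cogroup_by_star([(1, 2), (2, 3)], [(1, 2), (2, 3)]) returns two groups, one per distinct tuple, matching the two records in the dump above.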
[jira] Commented: (PIG-1085) Pass JobConf and UDF specific configuration information to UDFs
[ https://issues.apache.org/jira/browse/PIG-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776897#action_12776897 ] Hadoop QA commented on PIG-1085: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424686/udfconf-2.patch against trunk revision 835189. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 204 javac compiler warnings (more than the trunk's current 199 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/150/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/150/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/150/console This message is automatically generated. > Pass JobConf and UDF specific configuration information to UDFs > --- > > Key: PIG-1085 > URL: https://issues.apache.org/jira/browse/PIG-1085 > Project: Pig > Issue Type: New Feature > Components: impl >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: udfconf-2.patch, udfconf.patch > > > Users have long asked for a way to get the JobConf structure in their UDFs. > It would also be nice to have a way to pass properties between the front end > and back end so that UDFs can store state during parse time and use it at > runtime. > This patch does part of what is proposed in PIG-602, but not all of it. 
It > does not provide a way to give user specified configuration files to UDFs. > So I will mark 602 as depending on this bug, but it isn't a duplicate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-979) Acummulator Interface for UDFs
[ https://issues.apache.org/jira/browse/PIG-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776960#action_12776960 ] Hadoop QA commented on PIG-979: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424621/PIG-979.patch against trunk revision 835284. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 350 release audit warnings (more than the trunk's current 318 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/151/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/151/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/151/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/151/console This message is automatically generated. > Acummulator Interface for UDFs > -- > > Key: PIG-979 > URL: https://issues.apache.org/jira/browse/PIG-979 > Project: Pig > Issue Type: New Feature >Affects Versions: 0.4.0 >Reporter: Alan Gates >Assignee: Ying He > Fix For: 0.6.0 > > Attachments: PIG-979.patch, PIG-979.patch > > > Add an accumulator interface for UDFs that would allow them to take a set > number of records at a time instead of the entire bag. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
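The request in PIG-979 can be pictured with a small Python sketch. It is not the interface from the attached patch; the class, method, and function names below are assumptions chosen for illustration. The point is that the UDF keeps running state and is fed the bag in fixed-size batches, so the full bag never has to be held by the function at once.

```python
class CountAccumulator:
    """Hypothetical accumulator-style UDF that counts records."""

    def __init__(self):
        self.count = 0

    def accumulate(self, batch):
        # Called once per batch; only the running total is kept.
        self.count += len(batch)

    def get_value(self):
        # Called after the last batch for the current key.
        return self.count

    def cleanup(self):
        # Reset state before the next key is processed.
        self.count = 0


def run_in_batches(udf, bag, batch_size):
    """Feed `bag` to `udf` in slices of `batch_size` and return the result."""
    for i in range(0, len(bag), batch_size):
        udf.accumulate(bag[i:i + batch_size])
    result = udf.get_value()
    udf.cleanup()
    return result
```

For instance, run_in_batches(CountAccumulator(), list(range(10)), 3) passes four batches to the UDF and returns the total count of 10.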
[jira] Commented: (PIG-1089) Pig 0.6.0 Documentation
[ https://issues.apache.org/jira/browse/PIG-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777113#action_12777113 ] Hadoop QA commented on PIG-1089: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424687/Pig-6-Beta.patch against trunk revision 835284. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/152/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/152/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/152/console This message is automatically generated. > Pig 0.6.0 Documentation > --- > > Key: PIG-1089 > URL: https://issues.apache.org/jira/browse/PIG-1089 > Project: Pig > Issue Type: Task > Components: documentation >Affects Versions: 0.6.0 >Reporter: Corinne Chandel >Assignee: Corinne Chandel >Priority: Blocker > Fix For: 0.6.0 > > Attachments: Pig-6-Beta-2.patch, Pig-6-Beta.patch > > > Pig 0.6.0 documentation: > > Ability to use Hadoop dfs commands from Pig > > Replicated left outer join > > Skewed outer join > > Map-side group > > Accumulate Interface for UDFs > > Improved Memory Mgt > > Integration with Zebra -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator
[ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777265#action_12777265 ] Hadoop QA commented on PIG-1064: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424755/PIG-1064-2.patch against trunk revision 835499. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/153/console This message is automatically generated. 
> Behvaiour of COGROUP with and without schema when using "*" operator > > > Key: PIG-1064 > URL: https://issues.apache.org/jira/browse/PIG-1064 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Pradeep Kamath > Fix For: 0.6.0 > > Attachments: PIG-1064-2.patch, PIG-1064.patch > > > I have 2 tab separated files, "1.txt" and "2.txt" > $ cat 1.txt > > 1 2 > 2 3 > > $ cat 2.txt > 1 2 > 2 3 > I use COGROUP feature of Pig in the following way: > $java -cp pig.jar:$HADOOP_HOME org.apache.pig.Main > {code} > grunt> A = load '1.txt'; > grunt> B = load '2.txt' as (b0, b1); > grunt> C = cogroup A by *, B by *; > {code} > 2009-10-29 12:46:04,150 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR > 1012: Each COGroup input has to have the same number of inner plans > Details at logfile: pig_1256845224752.log > == > If I reverse, the order of the schema's > {code} > grunt> A = load '1.txt' as (a0, a1); > grunt> B = load '2.txt'; > grunt> C = cogroup A by *, B by *; > {code} > 2009-10-29 12:49:27,869 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR > 1013: Grouping attributes can either be star (*) or a list of expressions, > but not both. > Details at logfile: pig_1256845224752.log > == > Now running without schema?? > {code} > grunt> A = load '1.txt'; > grunt> B = load '2.txt'; > grunt> C = cogroup A by *, B by *; > grunt> dump C; > {code} > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully > stored result in: "file:/tmp/temp-319926700/tmp-1990275961" > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records > written : 2 > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written > : 154 > 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete! 
> 2009-10-29 12:55:37,202 [main] INFO > org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!! > ((1,2),{(1,2)},{(1,2)}) > ((2,3),{(2,3)},{(2,3)}) > == > Is this a bug or a feature? > Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1090) Update sources to reflect recent changes in load-store interfaces
[ https://issues.apache.org/jira/browse/PIG-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777293#action_12777293 ] Hadoop QA commented on PIG-1090: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424793/PIG-1090.patch against trunk revision 835499. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/47/console This message is automatically generated. > Update sources to reflect recent changes in load-store interfaces > - > > Key: PIG-1090 > URL: https://issues.apache.org/jira/browse/PIG-1090 > Project: Pig > Issue Type: Sub-task >Reporter: Pradeep Kamath >Assignee: Pradeep Kamath > Attachments: PIG-1090.patch > > > There have been some changes (as recorded in the Changes Section, Nov 2 2009 > sub section of http://wiki.apache.org/pig/LoadStoreRedesignProposal) in the > load/store interfaces - this jira is to track the task of making those > changes under src. Changes under test will be addresses in a different jira. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1062) load-store-redesign branch: change SampleLoader and subclasses to work with new LoadFunc interface
[ https://issues.apache.org/jira/browse/PIG-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777352#action_12777352 ] Hadoop QA commented on PIG-1062: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424797/PIG-1062.patch against trunk revision 835499. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/48/console This message is automatically generated. > load-store-redesign branch: change SampleLoader and subclasses to work with > new LoadFunc interface > --- > > Key: PIG-1062 > URL: https://issues.apache.org/jira/browse/PIG-1062 > Project: Pig > Issue Type: Sub-task >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: PIG-1062.patch > > > This is part of the effort to implement new load store interfaces as laid out > in http://wiki.apache.org/pig/LoadStoreRedesignProposal . > PigStorage and BinStorage are now working. > SampleLoader and subclasses -RandomSampleLoader, PoissonSampleLoader need to > be changed to work with new LoadFunc interface. > Fixing SampleLoader and RandomSampleLoader will get order-by queries working. > PoissonSampleLoader is used by skew join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator
[ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777392#action_12777392 ] Hadoop QA commented on PIG-1064: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424792/PIG-1064-3.patch against trunk revision 835499. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/154/console This message is automatically generated. 
> Behvaiour of COGROUP with and without schema when using "*" operator
>
>                 Key: PIG-1064
>                 URL: https://issues.apache.org/jira/browse/PIG-1064
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064.patch
>
>
> I have 2 tab separated files, "1.txt" and "2.txt"
> $ cat 1.txt
> 1 2
> 2 3
>
> $ cat 2.txt
> 1 2
> 2 3
> I use the COGROUP feature of Pig in the following way:
> $java -cp pig.jar:$HADOOP_HOME org.apache.pig.Main
> {code}
> grunt> A = load '1.txt';
> grunt> B = load '2.txt' as (b0, b1);
> grunt> C = cogroup A by *, B by *;
> {code}
> 2009-10-29 12:46:04,150 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1012: Each COGroup input has to have the same number of inner plans
> Details at logfile: pig_1256845224752.log
> ==
> If I reverse the order of the schemas:
> {code}
> grunt> A = load '1.txt' as (a0, a1);
> grunt> B = load '2.txt';
> grunt> C = cogroup A by *, B by *;
> {code}
> 2009-10-29 12:49:27,869 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1013: Grouping attributes can either be star (*) or a list of expressions, but not both.
> Details at logfile: pig_1256845224752.log
> ==
> Now running without any schema:
> {code}
> grunt> A = load '1.txt';
> grunt> B = load '2.txt';
> grunt> C = cogroup A by *, B by *;
> grunt> dump C;
> {code}
> 2009-10-29 12:55:37,202 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully stored result in: "file:/tmp/temp-319926700/tmp-1990275961"
> 2009-10-29 12:55:37,202 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records written : 2
> 2009-10-29 12:55:37,202 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written : 154
> 2009-10-29 12:55:37,202 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
> 2009-10-29 12:55:37,202 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
> ((1,2),{(1,2)},{(1,2)})
> ((2,3),{(2,3)},{(2,3)})
> ==
> Is this a bug or a feature?
> Viraj
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
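To make the successful schema-less result above easier to read, here is a minimal Java sketch of what grouping two inputs by the whole tuple amounts to. It is not Pig's implementation; the class `CogroupStarSketch` and the method `cogroupByStar` are hypothetical names introduced only for illustration, and tuples are modeled as plain lists of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CogroupStarSketch {
    /**
     * Groups two relations by the entire tuple, loosely mimicking
     * "cogroup A by *, B by *". Each key maps to two bags:
     * index 0 holds matching tuples from a, index 1 those from b.
     */
    static Map<List<String>, List<List<List<String>>>> cogroupByStar(
            List<List<String>> a, List<List<String>> b) {
        Map<List<String>, List<List<List<String>>>> out = new LinkedHashMap<>();
        for (List<String> t : a) {
            out.computeIfAbsent(t, k -> newBags()).get(0).add(t);
        }
        for (List<String> t : b) {
            out.computeIfAbsent(t, k -> newBags()).get(1).add(t);
        }
        return out;
    }

    /** Creates the pair of empty bags (one per input relation). */
    private static List<List<List<String>>> newBags() {
        List<List<List<String>>> bags = new ArrayList<>();
        bags.add(new ArrayList<>());
        bags.add(new ArrayList<>());
        return bags;
    }

    public static void main(String[] args) {
        // Same two rows as 1.txt and 2.txt in the report.
        List<List<String>> rows = List.of(List.of("1", "2"), List.of("2", "3"));
        System.out.println(cogroupByStar(rows, rows));
    }
}
```

With both inputs holding the rows (1,2) and (2,3), every key ends up with one tuple in each bag, which matches the shape of the dumped output in the report.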
[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator
[ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777699#action_12777699 ] Hadoop QA commented on PIG-1064: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424878/PIG-1064-4.patch against trunk revision 835499. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/155/console This message is automatically generated. 
> Behvaiour of COGROUP with and without schema when using "*" operator > > > Key: PIG-1064 > URL: https://issues.apache.org/jira/browse/PIG-1064 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Pradeep Kamath > Fix For: 0.6.0 > > Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, > PIG-1064.patch
[jira] Commented: (PIG-1062) load-store-redesign branch: change SampleLoader and subclasses to work with new LoadFunc interface
[ https://issues.apache.org/jira/browse/PIG-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1294#action_1294 ] Hadoop QA commented on PIG-1062: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424927/PIG-1062.patch.3 against trunk revision 835499. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/156/console This message is automatically generated. > load-store-redesign branch: change SampleLoader and subclasses to work with > new LoadFunc interface > --- > > Key: PIG-1062 > URL: https://issues.apache.org/jira/browse/PIG-1062 > Project: Pig > Issue Type: Sub-task >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: PIG-1062.patch, PIG-1062.patch.3 > > > This is part of the effort to implement new load store interfaces as laid out > in http://wiki.apache.org/pig/LoadStoreRedesignProposal . > PigStorage and BinStorage are now working. > SampleLoader and subclasses -RandomSampleLoader, PoissonSampleLoader need to > be changed to work with new LoadFunc interface. > Fixing SampleLoader and RandomSampleLoader will get order-by queries working. > PoissonSampleLoader is used by skew join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra's TableInputFormat
[ https://issues.apache.org/jira/browse/PIG-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1293#action_1293 ] Hadoop QA commented on PIG-1077: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424874/patch_Pig1077 against trunk revision 835499. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 104 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/49/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/49/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/49/console This message is automatically generated. > [Zebra] to support record(row)-based file split in Zebra's TableInputFormat > --- > > Key: PIG-1077 > URL: https://issues.apache.org/jira/browse/PIG-1077 > Project: Pig > Issue Type: New Feature >Affects Versions: 0.4.0 >Reporter: Chao Wang >Assignee: Chao Wang > Fix For: 0.6.0 > > Attachments: patch_Pig1077 > > > TFile currently supports split by record sequence number (see Jira > HADOOP-6218). We want to utilize this to provide record(row)-based input > split support in Zebra. 
> One prominent benefit is that, in cases where we have very large data files,
> we can create much more fine-grained input splits than before, when we could
> only create one big split for one big file.
> In more detail, the new row-based getSplits() works by default (that is, when
> the user does not specify the number of splits to be generated) as follows:
> 1) Select the biggest column group in terms of data size, split all of its
> TFiles according to the hdfs block size (64 MB or 128 MB) and get a list of
> physical byte offsets as the output per TFile. For example, let us assume for
> the 1st TFile we get offset1, offset2, ..., offset10;
> 2) Invoke TFile.getRecordNumNear(long offset) to get the RecordNum of a
> key-value pair near a byte offset. For the example above, say we get
> recordNum1, recordNum2, ..., recordNum10;
> 3) Stitch the [0, recordNum1], [recordNum1+1, recordNum2], ..., [recordNum9+1,
> recordNum10], [recordNum10+1, lastRecordNum] splits of all column groups,
> respectively, to form 11 record-based input splits for the 1st TFile.
> 4) For each input split, we need to create a TFile scanner through
> TFile.createScannerByRecordNum(long beginRecNum, long endRecNum).
> Note: conversion from byte offset to record number will be done by each
> mapper, rather than at the job initialization phase. This is due to
> performance concerns, since the conversion incurs some TFile reading
> overhead.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
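Step 3 of the description above can be sketched as follows. This is only an illustration of the range arithmetic, not Zebra code: the class `RowSplitSketch` and the method `toRanges` are hypothetical, and the boundary record numbers are assumed to have been obtained already (in the proposal, via TFile.getRecordNumNear).

```java
import java.util.ArrayList;
import java.util.List;

public class RowSplitSketch {
    /**
     * Turns boundary record numbers r1..rn into closed ranges
     * [0, r1], [r1+1, r2], ..., [rn+1, lastRecordNum].
     * Each range is returned as a two-element array {begin, end}.
     */
    static List<long[]> toRanges(long[] boundaryRecordNums, long lastRecordNum) {
        List<long[]> ranges = new ArrayList<>();
        long begin = 0;
        for (long boundary : boundaryRecordNums) {
            if (boundary < begin) {
                continue; // assumption: skip boundaries that do not advance
            }
            ranges.add(new long[] {begin, boundary});
            begin = boundary + 1;
        }
        // The tail range covers whatever remains after the last boundary.
        if (begin <= lastRecordNum) {
            ranges.add(new long[] {begin, lastRecordNum});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three made-up boundaries yield four ranges, just as ten yield eleven.
        for (long[] r : toRanges(new long[] {99, 199, 299}, 349)) {
            System.out.println("[" + r[0] + ", " + r[1] + "]");
        }
    }
}
```

Each resulting {begin, end} pair corresponds to the arguments that step 4 would pass to TFile.createScannerByRecordNum.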
[jira] Commented: (PIG-872) use distributed cache for the replicated data set in FR join
[ https://issues.apache.org/jira/browse/PIG-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778732#action_12778732 ] Hadoop QA commented on PIG-872: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425174/PIG_872.patch against trunk revision 881008. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/157/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/157/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/157/console This message is automatically generated. > use distributed cache for the replicated data set in FR join > > > Key: PIG-872 > URL: https://issues.apache.org/jira/browse/PIG-872 > Project: Pig > Issue Type: Improvement >Reporter: Olga Natkovich >Assignee: Sriranjan Manjunath > Attachments: PIG_872.patch > > > Currently, the replicated file is read directly from DFS by all maps. If the > number of concurrent maps is huge, we can overwhelm the NameNode with > open calls. > Using the distributed cache will address the issue and might also give a > performance boost since the file will be copied locally once and then reused > by all tasks running on the same machine. 
> The basic approach would be to use cacheArchive to place the file into the > cache on the frontend; on the backend, the tasks would need to refer to > the data using the path from the cache. > Note that cacheArchive does not work in Hadoop local mode. (Not a problem for > us right now as we don't use it.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
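The benefit described above (one copy per machine, reused by every task on it) can be illustrated with a small simulation. This sketch does not use Hadoop's distributed cache API; `NodeLocalCacheSketch`, `localPathFor`, and the `/tmp/cache/` prefix are all hypothetical, and "remote opens" simply counts how often a path would have to be fetched from DFS.

```java
import java.util.HashMap;
import java.util.Map;

public class NodeLocalCacheSketch {
    // Maps a DFS path to the (pretend) local copy on this machine.
    private final Map<String, String> localCopies = new HashMap<>();
    private int remoteOpens = 0;

    /** Returns the local path for a DFS file, fetching it only on first use. */
    String localPathFor(String dfsPath) {
        return localCopies.computeIfAbsent(dfsPath, p -> {
            remoteOpens++; // stands in for one open call against the NameNode
            return "/tmp/cache/" + p.replace('/', '_');
        });
    }

    int getRemoteOpens() {
        return remoteOpens;
    }

    public static void main(String[] args) {
        NodeLocalCacheSketch node = new NodeLocalCacheSketch();
        // Five map tasks on the same machine ask for the same replicated file.
        for (int task = 0; task < 5; task++) {
            node.localPathFor("/user/data/replicated.txt");
        }
        System.out.println("remote opens: " + node.getRemoteOpens());
    }
}
```

Without the cache, each of the five tasks would open the file on DFS; with it, only the first request does, which is the reduction in NameNode load the issue is after.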
[jira] Commented: (PIG-1053) Consider moving to Hadoop for local mode
[ https://issues.apache.org/jira/browse/PIG-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779249#action_12779249 ] Hadoop QA commented on PIG-1053: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425265/hadoopLocal.patch against trunk revision 881008. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 22 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 356 release audit warnings (more than the trunk's current 354 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/158/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/158/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/158/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/158/console This message is automatically generated. > Consider moving to Hadoop for local mode > > > Key: PIG-1053 > URL: https://issues.apache.org/jira/browse/PIG-1053 > Project: Pig > Issue Type: Improvement >Reporter: Alan Gates >Assignee: Ankit Modi > Attachments: hadoopLocal.patch > > > We need to consider moving Pig to use Hadoop's local mode instead of its own. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1053) Consider moving to Hadoop for local mode
[ https://issues.apache.org/jira/browse/PIG-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779344#action_12779344 ] Hadoop QA commented on PIG-1053: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425289/hadoopLocal.patch against trunk revision 881008. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 22 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 356 release audit warnings (more than the trunk's current 354 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/159/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/159/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/159/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/159/console This message is automatically generated. > Consider moving to Hadoop for local mode > > > Key: PIG-1053 > URL: https://issues.apache.org/jira/browse/PIG-1053 > Project: Pig > Issue Type: Improvement >Reporter: Alan Gates >Assignee: Ankit Modi > Attachments: hadoopLocal.patch > > > We need to consider moving Pig to use Hadoop's local mode instead of its own. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1064) Behvaiour of COGROUP with and without schema when using "*" operator
[ https://issues.apache.org/jira/browse/PIG-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779693#action_12779693 ] Hadoop QA commented on PIG-1064: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425360/PIG-1064-5.patch against trunk revision 881008. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/160/console This message is automatically generated. 
> Behvaiour of COGROUP with and without schema when using "*" operator > > > Key: PIG-1064 > URL: https://issues.apache.org/jira/browse/PIG-1064 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Pradeep Kamath > Fix For: 0.6.0 > > Attachments: PIG-1064-2.patch, PIG-1064-3.patch, PIG-1064-4.patch, > PIG-1064-5.patch, PIG-1064.patch
[jira] Commented: (PIG-1099) [zebra] version on APACHE trunk should be 0.7.0 to be in pace with PIG
[ https://issues.apache.org/jira/browse/PIG-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779839#action_12779839 ] Hadoop QA commented on PIG-1099: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425431/PIG_1099.patch against trunk revision 881937. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/161/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/161/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/161/console This message is automatically generated. > [zebra] version on APACHE trunk should be 0.7.0 to be in pace with PIG > -- > > Key: PIG-1099 > URL: https://issues.apache.org/jira/browse/PIG-1099 > Project: Pig > Issue Type: Bug >Reporter: Yan Zhou >Assignee: Yan Zhou >Priority: Trivial > Attachments: PIG_1099.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1088) change merge join and merge join indexer to work with new LoadFunc interface
[ https://issues.apache.org/jira/browse/PIG-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780384#action_12780384 ] Hadoop QA commented on PIG-1088: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425554/PIG-1088.patch against trunk revision 882340. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/50/console This message is automatically generated. > change merge join and merge join indexer to work with new LoadFunc interface > > > Key: PIG-1088 > URL: https://issues.apache.org/jira/browse/PIG-1088 > Project: Pig > Issue Type: Sub-task >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: PIG-1088.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1091) [zebra] Exception when load with projection of map keys on a map column that is not map split
[ https://issues.apache.org/jira/browse/PIG-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780456#action_12780456 ] Hadoop QA commented on PIG-1091: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425542/PIG-1091.patch against trunk revision 882340. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/162/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/162/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/162/console This message is automatically generated. > [zebra] Exception when load with projection of map keys on a map column that > is not map split > -- > > Key: PIG-1091 > URL: https://issues.apache.org/jira/browse/PIG-1091 > Project: Pig > Issue Type: Bug >Reporter: Yan Zhou >Assignee: Yan Zhou >Priority: Minor > Attachments: PIG-1091.patch > > > With schema of "f1:string, f2:map", storage info of "[f1]; [f2]", a > projection of "f2#{a}" will see exception. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1078) [zebra] merge join with empty table failed
[ https://issues.apache.org/jira/browse/PIG-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780762#action_12780762 ] Hadoop QA commented on PIG-1078: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425632/PIG-1078.patch against trunk revision 882340. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/163/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/163/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/163/console This message is automatically generated. > [zebra] merge join with empty table failed > -- > > Key: PIG-1078 > URL: https://issues.apache.org/jira/browse/PIG-1078 > Project: Pig > Issue Type: Bug >Reporter: Jing Huang > Fix For: 0.6.0 > > Attachments: PIG-1078.patch > > > Got indexOutOfBound exception. 
> Here is the pig script:
> register /grid/0/dev/hadoopqa/jars/zebra.jar;
> --a1 = load '1.txt' as (a:int, b:float,c:long,d:double,e:chararray,f:bytearray,r1(f1:chararray,f2:chararray),m1:map[]);
> --a2 = load 'empty.txt' as (a:int, b:float,c:long,d:double,e:chararray,f:bytearray,r1(f1:chararray,f2:chararray),m1:map[]);
> --dump a1;
> --a1order = order a1 by a;
> --a2order = order a2 by a;
> --store a1order into 'a1' using org.apache.hadoop.zebra.pig.TableStorer('[a,b,c];[d,e,f,r1,m1]');
> --store a2order into 'empty' using org.apache.hadoop.zebra.pig.TableStorer('[a,b,c];[d,e,f,r1,m1]');
> rec1 = load 'a1' using org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load 'empty' using org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by a, rec2 by a using "merge" ;
> dump joina;
> ==
> please note that table "a1" and "empty" are created correctly.
> Here is the stack trace:
> Backend error message
> -
> java.lang.ArrayIndexOutOfBoundsException: 0
> at org.apache.hadoop.zebra.mapred.TableInputFormat.getTableRecordReader(TableInputFormat.java:478)
> at org.apache.hadoop.zebra.pig.TableLoader.bindTo(TableLoader.java:166)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.seekInRightStream(POMergeJoin.java:400)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:181)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:247)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:238)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at org.apache.hadoop.mapred.Child.main(Child.java:159)
> Pig Stack Trace > --- > ERROR 6015: During execution, encountered a Hadoop error. > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias joina > at org.apache.pig.PigServer.openIterator(PigServer.java:481) > at > org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89) > at org.apache.pig.Main.main(Main.java:386) > Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6015: > During execution, encountered a Hadoop error. > at > .apache.hadoop.zebra.mapred.TableInputFormat.getTableRecordReader(TableInputFormat.java:478) > at .apache.hadoop.zebra.pig
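In Java, "ArrayIndexOutOfBoundsException: 0" means element 0 was read from an array of length zero. The report does not say which array is empty or how the attached patch fixes it, so the following is only a generic sketch of the defensive pattern, with hypothetical names (`EmptyTableGuardSketch`, `firstSplit`) and splits modeled as strings.

```java
import java.util.Optional;

public class EmptyTableGuardSketch {
    /**
     * Returns the first split if there is one. An empty (or null) array
     * yields Optional.empty() instead of throwing on splits[0].
     */
    static Optional<String> firstSplit(String[] splits) {
        if (splits == null || splits.length == 0) {
            return Optional.empty(); // an empty table has nothing to read
        }
        return Optional.of(splits[0]);
    }

    public static void main(String[] args) {
        System.out.println(firstSplit(new String[] {}).isPresent());
        System.out.println(firstSplit(new String[] {"split-0"}).orElse("none"));
    }
}
```

A caller such as a merge join could then treat the empty case as "no matching rows" rather than failing the map task; whether that is the right semantics for Zebra is a design decision the report leaves open.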
[jira] Commented: (PIG-1101) Pig parser does not recognize its own data type in LIMIT statement
[ https://issues.apache.org/jira/browse/PIG-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780925#action_12780925 ] Hadoop QA commented on PIG-1101: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425702/pig-1101.patch against trunk revision 882818. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/164/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/164/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/164/console This message is automatically generated. > Pig parser does not recognize its own data type in LIMIT statement > -- > > Key: PIG-1101 > URL: https://issues.apache.org/jira/browse/PIG-1101 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Ashutosh Chauhan >Priority: Minor > Fix For: 0.7.0 > > Attachments: pig-1101.patch > > > I have a Pig script in which I specify the number of records to limit as a > long type. 
> {code}
> A = LOAD '/user/viraj/echo.txt' AS (txt:chararray);
> B = LIMIT A 10L;
> DUMP B;
> {code}
> I get a parser error:
> 2009-11-21 02:25:51,100 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " "10L "" at line 3, column 13.
> Was expecting:
> ...
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.generateParseException(QueryParser.java:8963)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.jj_consume_token(QueryParser.java:8839)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.LimitClause(QueryParser.java:1656)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1280)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
> at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
> at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
> at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1017)
> In fact 10L seems to work in the foreach generate construct.
> Viraj
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
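The behaviour the reporter expects, a LIMIT count that accepts both 10 and 10L, can be sketched in isolation. This is not Pig's QueryParser (which is generated from a grammar) and says nothing about how the attached patch works; `LimitLiteralSketch` and `parseLimit` are hypothetical names for a standalone helper.

```java
public class LimitLiteralSketch {
    /**
     * Parses a LIMIT count written either as an integer literal ("10")
     * or as a long literal with an L suffix ("10L" or "10l").
     */
    static long parseLimit(String token) {
        String t = token.trim();
        if (t.endsWith("L") || t.endsWith("l")) {
            t = t.substring(0, t.length() - 1); // drop the long suffix
        }
        long n = Long.parseLong(t); // throws NumberFormatException on bad input
        if (n < 0) {
            throw new IllegalArgumentException("LIMIT must be non-negative: " + token);
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(parseLimit("10"));
        System.out.println(parseLimit("10L"));
    }
}
```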
[jira] Commented: (PIG-872) use distributed cache for the replicated data set in FR join
[ https://issues.apache.org/jira/browse/PIG-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781415#action_12781415 ] Hadoop QA commented on PIG-872: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425805/PIG_872.patch.1 against trunk revision 882818. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/165/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/165/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/165/console This message is automatically generated. > use distributed cache for the replicated data set in FR join > > > Key: PIG-872 > URL: https://issues.apache.org/jira/browse/PIG-872 > Project: Pig > Issue Type: Improvement >Reporter: Olga Natkovich >Assignee: Sriranjan Manjunath > Attachments: PIG_872.patch.1 > > > Currently, the replicated file is read directly from DFS by all maps. If the > number of the concurrent maps is huge, we can overwhelm the NameNode with > open calls. 
> Using distributed cache will address the issue and might also give a > performance boost since the file will be copied locally once and then reused > by all tasks running on the same machine. > The basic approach would be to use cacheArchive to place the file into the > cache on the frontend; on the backend, the tasks would need to refer to > the data using the path from the cache. > Note that cacheArchive does not work in Hadoop local mode. (Not a problem for > us right now as we don't use it.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
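The "copied locally once and reused by all tasks on the same machine" idea can be sketched in a few lines. This is a hedged Python illustration, not Hadoop's distributed cache API: `localize` and the directory layout are hypothetical, and a real implementation would also need locking between concurrent tasks.

```python
import shutil
import tempfile
from pathlib import Path


def localize(source: Path, cache_dir: Path) -> Path:
    """Copy `source` into `cache_dir` the first time; later calls reuse the copy."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    local_copy = cache_dir / source.name
    if not local_copy.exists():
        shutil.copyfile(source, local_copy)  # the only read of the shared source
    return local_copy


# Demo: temporary directories stand in for DFS and a node-local cache (assumption).
with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp) / "dfs" / "replicated.txt"
    shared.parent.mkdir()
    shared.write_text("k1\tv1\n")
    cache = Path(tmp) / "node-cache"
    first = localize(shared, cache)
    second = localize(shared, cache)  # reuses the existing local copy
    print(first == second)
```

Only the first call on a machine touches the shared file, which is the property that would take the open-call load off the NameNode.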
[jira] Commented: (PIG-872) use distributed cache for the replicated data set in FR join
[ https://issues.apache.org/jira/browse/PIG-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781560#action_12781560 ] Hadoop QA commented on PIG-872: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425805/PIG_872.patch.1 against trunk revision 882818. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/51/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/51/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/51/console This message is automatically generated. > use distributed cache for the replicated data set in FR join > > > Key: PIG-872 > URL: https://issues.apache.org/jira/browse/PIG-872 > Project: Pig > Issue Type: Improvement >Reporter: Olga Natkovich >Assignee: Sriranjan Manjunath > Attachments: PIG_872.patch.1 > > > Currently, the replicated file is read directly from DFS by all maps. If the > number of the concurrent maps is huge, we can overwhelm the NameNode with > open calls. 
> Using distributed cache will address the issue and might also give a > performance boost since the file will be copied locally once and then reused > by all tasks running on the same machine. > The basic approach would be to use cacheArchive to place the file into the > cache on the frontend; on the backend, the tasks would need to refer to > the data using the path from the cache. > Note that cacheArchive does not work in Hadoop local mode. (Not a problem for > us right now as we don't use it.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-598) Parameter substitution ($PARAMETER) should not be performed in comments
[ https://issues.apache.org/jira/browse/PIG-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781705#action_12781705 ] Hadoop QA commented on PIG-598: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425862/PIG-598.1.patch against trunk revision 882818. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 48 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 213 javac compiler warnings (more than the trunk's current 211 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 361 release audit warnings (more than the trunk's current 356 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/console This message is automatically generated. > Parameter substitution ($PARAMETER) should not be performed in comments > --- > > Key: PIG-598 > URL: https://issues.apache.org/jira/browse/PIG-598 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.2.0 >Reporter: David Ciemiewicz >Assignee: Thejas M Nair > Attachments: PIG-598.1.patch, PIG-598.patch > > > Compiling the following code example will generate an error that > $NOT_A_PARAMETER is an Undefined Parameter. 
> This is problematic as sometimes you want to comment out parts of your code, > including parameters, so that you don't have to define them. > Thus, I think it would be really good if parameter substitution were not > performed in comments. > {code} > -- $NOT_A_PARAMETER > {code} > {code} > -bash-3.00$ pig -exectype local -latest comment.pig > USING: /grid/0/gs/pig/current > java.lang.RuntimeException: Undefined parameter : NOT_A_PARAMETER > at > org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:221) > at > org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106) > at > org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86) > at org.apache.pig.Main.runParamPreprocessor(Main.java:394) > at org.apache.pig.Main.main(Main.java:296) > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
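The requested behaviour, substituting `$NAME` in code but not in comments, can be sketched as follows. This is a minimal Python illustration, not Pig's ParameterSubstitutionPreprocessor: `substitute` is a hypothetical function, it only understands single-line `--` comments, and it would misread a `--` that appears inside a quoted string.

```python
import re

# $NAME where NAME starts with a letter or underscore, so positional
# references such as $0 are left alone.
_PARAM = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(script: str, params: dict) -> str:
    """Replace $NAME in code, leaving text after a `--` comment marker untouched."""

    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"Undefined parameter : {name}")
        return params[name]

    out = []
    for line in script.splitlines():
        # Everything from the first "--" onward is treated as a comment.
        code, marker, comment = line.partition("--")
        out.append(_PARAM.sub(replace, code) + marker + comment)
    return "\n".join(out)


script = "A = load '$INPUT'; -- $NOT_A_PARAMETER\nB = limit A 10;"
print(substitute(script, {"INPUT": "students.txt"}))
```

With this approach an undefined name inside a comment is ignored, while an undefined name in code still raises an error.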
[jira] Commented: (PIG-1095) [zebra] Schema support of anonymous fields in COLECTION fails
[ https://issues.apache.org/jira/browse/PIG-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781781#action_12781781 ] Hadoop QA commented on PIG-1095: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425897/PIG-1095.patch against trunk revision 883515. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/53/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/53/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/53/console This message is automatically generated. > [zebra] Schema support of anonymous fields in COLECTION fails > - > > Key: PIG-1095 > URL: https://issues.apache.org/jira/browse/PIG-1095 > Project: Pig > Issue Type: Bug >Reporter: Yan Zhou >Assignee: Yan Zhou >Priority: Minor > Fix For: 0.6.0, 0.7.0 > > Attachments: PIG-1095.patch > > > The schema parser fails on schemas of COLLECTION columns like > c:collection(int). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1074) Zebra store function should allow '::' in column names in output schema
[ https://issues.apache.org/jira/browse/PIG-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782317#action_12782317 ] Hadoop QA commented on PIG-1074: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426051/PIG-1074.patch against trunk revision 883903. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/54/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/54/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/54/console This message is automatically generated. 
> Zebra store function should allow '::' in column names in output schema > --- > > Key: PIG-1074 > URL: https://issues.apache.org/jira/browse/PIG-1074 > Project: Pig > Issue Type: Bug >Reporter: Pradeep Kamath >Assignee: Yan Zhou > Fix For: 0.6.0, 0.7.0 > > Attachments: PIG-1074.patch > > > the following script fails: > {noformat} > a = load '/zebra/singlefile/studenttab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, gpa); > b = load '/zebra/singlefile/votertab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, registration, > contributions); > c = filter a by age < 20; > d = filter b by age < 20; > store c into > '/user/pig/out//ZebraMultiQuery_30.out.1' using > org.apache.hadoop.zebra.pig.TableStorer(''); > store d into > '/user/pig/out//ZebraMultiQuery_30.out.2' using > org.apache.hadoop.zebra.pig.TableStorer(''); > e = cogroup c by name, d by name; > f = foreach e generate flatten(c), flatten(d); > store f into '/user/pig//ZebraMultiQuery_30.out.3' > using org.apache.hadoop.zebra.pig.TableStorer(''); > {noformat} > Here the schema of f has names like c::name and it looks like zebra storefunc > does not allow '::' in column name > The stack trace is > > ERROR 2997: Unable to recreate exception from backend error: > java.io.IOException: ColumnGroup.Writer constructor failed : Partition > constructor failed :Encountered " ":" ": "" at line 1, column 3. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-990) Provide a way to pin LogicalOperator Options
[ https://issues.apache.org/jira/browse/PIG-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782693#action_12782693 ] Hadoop QA commented on PIG-990: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426160/pinned_options_3.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/55/testReport/ Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/55/console This message is automatically generated. > Provide a way to pin LogicalOperator Options > > > Key: PIG-990 > URL: https://issues.apache.org/jira/browse/PIG-990 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Dmitriy V. Ryaboy >Assignee: Dmitriy V. Ryaboy >Priority: Minor > Fix For: 0.7.0 > > Attachments: pinned_options.patch, pinned_options_2.patch, > pinned_options_3.patch > > > This is a proactive patch, setting up the groundwork for adding an optimizer. > Some of the LogicalOperators have options. For example, LOJoin has a variety > of join types (regular, fr, skewed, merge), which can be set by the user or > chosen by a hypothetical optimizer. If a user selects a join type, Pig > philosophy guides us to always respect the user's choice and not explore > alternatives.
Therefore, we need a way to "pin" options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1074) Zebra store function should allow '::' in column names in output schema
[ https://issues.apache.org/jira/browse/PIG-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782750#action_12782750 ] Hadoop QA commented on PIG-1074: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426166/PIG-1074.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/56/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/56/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/56/console This message is automatically generated. 
> Zebra store function should allow '::' in column names in output schema > --- > > Key: PIG-1074 > URL: https://issues.apache.org/jira/browse/PIG-1074 > Project: Pig > Issue Type: Bug >Reporter: Pradeep Kamath >Assignee: Yan Zhou > Fix For: 0.6.0, 0.7.0 > > Attachments: PIG-1074.patch, PIG-1074.patch, PIG-1074.patch > > > the following script fails: > {noformat} > a = load '/zebra/singlefile/studenttab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, gpa); > b = load '/zebra/singlefile/votertab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, registration, > contributions); > c = filter a by age < 20; > d = filter b by age < 20; > store c into > '/user/pig/out//ZebraMultiQuery_30.out.1' using > org.apache.hadoop.zebra.pig.TableStorer(''); > store d into > '/user/pig/out//ZebraMultiQuery_30.out.2' using > org.apache.hadoop.zebra.pig.TableStorer(''); > e = cogroup c by name, d by name; > f = foreach e generate flatten(c), flatten(d); > store f into '/user/pig//ZebraMultiQuery_30.out.3' > using org.apache.hadoop.zebra.pig.TableStorer(''); > {noformat} > Here the schema of f has names like c::name and it looks like zebra storefunc > does not allow '::' in column name > The stack trace is > > ERROR 2997: Unable to recreate exception from backend error: > java.io.IOException: ColumnGroup.Writer constructor failed : Partition > constructor failed :Encountered " ":" ": "" at line 1, column 3. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1107) PigLineRecordReader bails out on an empty line for compressed data
[ https://issues.apache.org/jira/browse/PIG-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782806#action_12782806 ] Hadoop QA commented on PIG-1107: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426168/pig_piglinerecordreader_bug.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/57/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/57/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/57/console This message is automatically generated. 
> PigLineRecordReader bails out on an empty line for compressed data > -- > > Key: PIG-1107 > URL: https://issues.apache.org/jira/browse/PIG-1107 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Ankit Modi >Assignee: Ankit Modi > Fix For: 0.6.0 > > Attachments: pig_piglinerecordreader_bug.patch > > > PigLineRecordReader bails out with an exception when it encounters an empty > line in a compressed file > java.lang.ArrayIndexOutOfBoundsException: -1 >at > org.apache.pig.impl.io.PigLineRecordReader$LineReader.getNext(PigLineRecordReader.java:136) > at > org.apache.pig.impl.io.PigLineRecordReader.next(PigLineRecordReader.java:57) > at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:121) > at > org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:139) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:164) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:140) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
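An index of -1 is consistent with code that inspects position `length - 1` of an empty buffer, for example when checking for a trailing carriage return. That is an assumption about the cause, not something the report states. The Python sketch below uses hypothetical helper names and shows the length guard that makes empty lines safe; it is not the PigLineRecordReader code.

```python
def strip_line_ending(buffer: bytes) -> bytes:
    """Drop a trailing CR, checking the length first so empty lines are safe."""
    if len(buffer) > 0 and buffer[-1] == 0x0D:  # 0x0D is '\r'
        return buffer[:-1]
    return buffer


def read_lines(data: bytes):
    """Yield each newline-separated record, including empty ones."""
    if not data:
        return
    if data.endswith(b"\n"):
        data = data[:-1]  # avoid a spurious empty record after the final newline
    for raw in data.split(b"\n"):
        yield strip_line_ending(raw)


print(list(read_lines(b"a\n\nb\n")))  # [b'a', b'', b'b']
```

The empty record in the middle is passed through as an empty line instead of aborting the read.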
[jira] Commented: (PIG-1074) Zebra store function should allow '::' in column names in output schema
[ https://issues.apache.org/jira/browse/PIG-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782884#action_12782884 ] Hadoop QA commented on PIG-1074: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426177/PIG-1074.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/58/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/58/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/58/console This message is automatically generated. 
> Zebra store function should allow '::' in column names in output schema > --- > > Key: PIG-1074 > URL: https://issues.apache.org/jira/browse/PIG-1074 > Project: Pig > Issue Type: Bug >Reporter: Pradeep Kamath >Assignee: Yan Zhou > Fix For: 0.6.0, 0.7.0 > > Attachments: PIG-1074.patch, PIG-1074.patch, PIG-1074.patch > > > the following script fails: > {noformat} > a = load '/zebra/singlefile/studenttab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, gpa); > b = load '/zebra/singlefile/votertab10k' using > org.apache.hadoop.zebra.pig.TableLoader() as (name, age, registration, > contributions); > c = filter a by age < 20; > d = filter b by age < 20; > store c into > '/user/pig/out//ZebraMultiQuery_30.out.1' using > org.apache.hadoop.zebra.pig.TableStorer(''); > store d into > '/user/pig/out//ZebraMultiQuery_30.out.2' using > org.apache.hadoop.zebra.pig.TableStorer(''); > e = cogroup c by name, d by name; > f = foreach e generate flatten(c), flatten(d); > store f into '/user/pig//ZebraMultiQuery_30.out.3' > using org.apache.hadoop.zebra.pig.TableStorer(''); > {noformat} > Here the schema of f has names like c::name and it looks like zebra storefunc > does not allow '::' in column name > The stack trace is > > ERROR 2997: Unable to recreate exception from backend error: > java.io.IOException: ColumnGroup.Writer constructor failed : Partition > constructor failed :Encountered " ":" ": "" at line 1, column 3. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-990) Provide a way to pin LogicalOperator Options
[ https://issues.apache.org/jira/browse/PIG-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782941#action_12782941 ] Hadoop QA commented on PIG-990: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426178/pinned_options_4.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/59/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/59/console This message is automatically generated. > Provide a way to pin LogicalOperator Options > > > Key: PIG-990 > URL: https://issues.apache.org/jira/browse/PIG-990 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Dmitriy V. Ryaboy >Assignee: Dmitriy V. Ryaboy >Priority: Minor > Fix For: 0.7.0 > > Attachments: pinned_options.patch, pinned_options_2.patch, > pinned_options_3.patch, pinned_options_4.patch > > > This is a proactive patch, setting up the groundwork for adding an optimizer. > Some of the LogicalOperators have options. 
For example, LOJoin has a variety > of join types (regular, fr, skewed, merge), which can be set by the user or > chosen by a hypothetical optimizer. If a user selects a join type, Pig > philosophy guides us to always respect the user's choice and not explore > alternatives. Therefore, we need a way to "pin" options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782976#action_12782976 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426180/PIG-922-p3_10.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 60 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. -1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/60/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/60/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/60/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/60/console This message is automatically generated. 
> Logical optimizer: push up project > -- > > Key: PIG-922 > URL: https://issues.apache.org/jira/browse/PIG-922 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: 0.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, > PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, > PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, > PIG-922-p3_10.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, > PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, > PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch > > > This is a continuation of > [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add > another rule to the logical optimizer: push up project, i.e., prune columns as > early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782978#action_12782978 ] Hadoop QA commented on PIG-760: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426186/pigstorageschema_3.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to cause Findbugs to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/61/testReport/ Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/61/console This message is automatically generated. > Serialize schemas for PigStorage() and other storage types. > --- > > Key: PIG-760 > URL: https://issues.apache.org/jira/browse/PIG-760 > Project: Pig > Issue Type: New Feature >Reporter: David Ciemiewicz >Assignee: Dmitriy V. Ryaboy > Fix For: 0.7.0 > > Attachments: pigstorageschema-2.patch, pigstorageschema.patch, > pigstorageschema_3.patch > > > I'm finding PigStorage() really convenient for storage and data interchange > because it compresses well and imports into Excel and other analysis > environments well. > However, it is a pain when it comes to maintenance because the columns are in > fixed locations and I'd like to add columns in some cases. 
> It would be great if load PigStorage() could read a default schema from a > .schema file stored with the data and if store PigStorage() could store a > .schema file with the data. > I have tested this out and both Hadoop HDFS and Pig in -exectype local mode > will ignore a file called .schema in a directory of part files. > So, for example, if I have a chain of Pig scripts I execute such as: > A = load 'data-1' using PigStorage() as ( a: int , b: int ); > store A into 'data-2' using PigStorage(); > B = load 'data-2' using PigStorage(); > describe B; > describe B should output something like { a: int, b: int } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
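The store-then-load round trip described above can be sketched as a sidecar file. This is a hedged Python illustration, not the attached patch: the `.schema` file name comes from the request, while the JSON layout and the function names `store_schema`, `load_schema`, and `describe` are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path

SCHEMA_FILE = ".schema"  # name taken from the request; the JSON layout is an assumption


def store_schema(data_dir: Path, fields: list) -> None:
    """Write [(name, type), ...] next to the part files."""
    data_dir.mkdir(parents=True, exist_ok=True)
    payload = [{"name": name, "type": type_} for name, type_ in fields]
    (data_dir / SCHEMA_FILE).write_text(json.dumps(payload))


def load_schema(data_dir: Path):
    """Return [(name, type), ...], or None when no schema was stored."""
    path = data_dir / SCHEMA_FILE
    if not path.exists():
        return None
    return [(f["name"], f["type"]) for f in json.loads(path.read_text())]


def describe(fields) -> str:
    """Format a schema the way the request expects `describe B` to print it."""
    return "{ " + ", ".join(f"{name}: {type_}" for name, type_ in fields) + " }"


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "data-2"
    store_schema(out, [("a", "int"), ("b", "int")])
    print(describe(load_schema(out)))  # { a: int, b: int }
```

Returning `None` when the file is absent keeps existing data directories loadable without a schema, which matches the idea of a default that is only used when present.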
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783008#action_12783008 ] Hadoop QA commented on PIG-760: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426245/pigstorageschema_3.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/62/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/62/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/62/console This message is automatically generated. > Serialize schemas for PigStorage() and other storage types. > --- > > Key: PIG-760 > URL: https://issues.apache.org/jira/browse/PIG-760 > Project: Pig > Issue Type: New Feature >Reporter: David Ciemiewicz >Assignee: Dmitriy V. Ryaboy > Fix For: 0.7.0 > > Attachments: pigstorageschema-2.patch, pigstorageschema.patch, > pigstorageschema_3.patch > > > I'm finding PigStorage() really convenient for storage and data interchange > because it compresses well and imports into Excel and other analysis > environments well. 
> However, it is a pain when it comes to maintenance because the columns are in > fixed locations and I'd like to add columns in some cases. > It would be great if load PigStorage() could read a default schema from a > .schema file stored with the data and if store PigStorage() could store a > .schema file with the data. > I have tested this out and both Hadoop HDFS and Pig in -exectype local mode > will ignore a file called .schema in a directory of part files. > So, for example, if I have a chain of Pig scripts I execute such as: > A = load 'data-1' using PigStorage() as ( a: int , b: int ); > store A into 'data-2' using PigStorage(); > B = load 'data-2' using PigStorage(); > describe B; > describe B should output something like { a: int, b: int } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-990) Provide a way to pin LogicalOperator Options
[ https://issues.apache.org/jira/browse/PIG-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783032#action_12783032 ] Hadoop QA commented on PIG-990: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426246/pinned_options_5.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 363 release audit warnings (more than the trunk's current 362 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/63/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/63/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/63/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/63/console This message is automatically generated. > Provide a way to pin LogicalOperator Options > > > Key: PIG-990 > URL: https://issues.apache.org/jira/browse/PIG-990 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Dmitriy V. Ryaboy >Assignee: Dmitriy V. 
Ryaboy >Priority: Minor > Fix For: 0.7.0 > > Attachments: pinned_options.patch, pinned_options_2.patch, > pinned_options_3.patch, pinned_options_4.patch, pinned_options_5.patch > > > This is a proactive patch, setting up the groundwork for adding an optimizer. > Some of the LogicalOperators have options. For example, LOJoin has a variety > of join types (regular, fr, skewed, merge), which can be set by the user or > chosen by a hypothetical optimizer. If a user selects a join type, Pig > philosophy guides us to always respect the user's choice and not explore > alternatives. Therefore, we need a way to "pin" options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783188#action_12783188 ] Hadoop QA commented on PIG-760: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426297/pigstorageschema_4.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/64/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/64/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/64/console This message is automatically generated. > Serialize schemas for PigStorage() and other storage types. > --- > > Key: PIG-760 > URL: https://issues.apache.org/jira/browse/PIG-760 > Project: Pig > Issue Type: New Feature >Reporter: David Ciemiewicz >Assignee: Dmitriy V. Ryaboy > Fix For: 0.7.0 > > Attachments: pigstorageschema-2.patch, pigstorageschema.patch, > pigstorageschema_3.patch, pigstorageschema_4.patch > > > I'm finding PigStorage() really convenient for storage and data interchange > because it compresses well and imports into Excel and other analysis > environments well. 
> However, it is a pain when it comes to maintenance because the columns are in > fixed locations and I'd like to add columns in some cases. > It would be great if load PigStorage() could read a default schema from a > .schema file stored with the data and if store PigStorage() could store a > .schema file with the data. > I have tested this out and both Hadoop HDFS and Pig in -exectype local mode > will ignore a file called .schema in a directory of part files. > So, for example, if I have a chain of Pig scripts I execute such as: > A = load 'data-1' using PigStorage() as ( a: int , b: int ); > store A into 'data-2' using PigStorage(); > B = load 'data-2' using PigStorage(); > describe B; > describe B should output something like { a: int, b: int } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-990) Provide a way to pin LogicalOperator Options
[ https://issues.apache.org/jira/browse/PIG-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783210#action_12783210 ] Hadoop QA commented on PIG-990: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426301/pinned_options_6.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/65/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/65/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/65/console This message is automatically generated. > Provide a way to pin LogicalOperator Options > > > Key: PIG-990 > URL: https://issues.apache.org/jira/browse/PIG-990 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Dmitriy V. Ryaboy >Assignee: Dmitriy V. Ryaboy >Priority: Minor > Fix For: 0.7.0 > > Attachments: pinned_options.patch, pinned_options_2.patch, > pinned_options_3.patch, pinned_options_4.patch, pinned_options_5.patch, > pinned_options_6.patch > > > This is a proactive patch, setting up the groundwork for adding an optimizer. > Some of the LogicalOperators have options. 
For example, LOJoin has a variety > of join types (regular, fr, skewed, merge), which can be set by the user or > chosen by a hypothetical optimizer. If a user selects a join type, Pig > philosophy guides us to always respect the user's choice and not explore > alternatives. Therefore, we need a way to "pin" options. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783608#action_12783608 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426389/PIG-922-p3_11.patch against trunk revision 884235. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 60 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/66/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/66/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/66/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/66/console This message is automatically generated. 
> Logical optimizer: push up project > -- > > Key: PIG-922 > URL: https://issues.apache.org/jira/browse/PIG-922 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: 0.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, > PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, > PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, > PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_2.patch, > PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, > PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch > > > This is a continuation of > [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add > another rule to the logical optimizer: push up project, i.e., prune columns as > early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1108) Incorrect map output key type in MultiQuery optimization
[ https://issues.apache.org/jira/browse/PIG-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783876#action_12783876 ] Hadoop QA commented on PIG-1108: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426434/PIG-1108.patch against trunk revision 885465. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/67/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/67/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/67/console This message is automatically generated. 
> Incorrect map output key type in MultiQuery optimization > > > Key: PIG-1108 > URL: https://issues.apache.org/jira/browse/PIG-1108 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Ankur >Assignee: Richard Ding > Fix For: 0.6.0 > > Attachments: PIG-1108.patch > > > When trying to merge 2 split plans, one of which never progresses along an > M/R boundary, Pig sets the map-output key type incorrectly, resulting in the > following error: > java.io.IOException: Type mismatch in key from map: expected > org.apache.pig.impl.io.NullableText, recieved > org.apache.pig.impl.io.NullableTuple > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807) > at > org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:238) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) > at org.apache.hadoop.mapred.Child.main(Child.java:159) > Here is a small script to be used as a reproducible test case: > rmf plan1 > rmf plan2 > A = LOAD 'data' USING PigStorage() as (a: int, b: chararray); > SPLIT A into plan1 IF (a>5), plan2 IF (a<5); > B = GROUP plan1 BY b; > C = FOREACH B { > tmp = ORDER plan1 BY a desc; > GENERATE FLATTEN(group) as b, tmp; > }; > D = FILTER C BY b is not null; > STORE D into 'plan1'; > STORE plan2 into 'plan2'; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1084) Pig CookBook documentation "Take Advantage of Join Optimization" additions:Merge and Skewed Join
[ https://issues.apache.org/jira/browse/PIG-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784007#action_12784007 ] Hadoop QA commented on PIG-1084: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426453/cookbook.patch against trunk revision 885465. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/68/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/68/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/68/console This message is automatically generated. > Pig CookBook documentation "Take Advantage of Join Optimization" > additions:Merge and Skewed Join > > > Key: PIG-1084 > URL: https://issues.apache.org/jira/browse/PIG-1084 > Project: Pig > Issue Type: Bug > Components: documentation >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Corinne Chandel > Fix For: 0.6.0 > > Attachments: cookbook.patch > > > Hi all, > We have a host of Join optimizations that have been implemented recently in > Pig to improve performance. 
These include: > http://hadoop.apache.org/pig/docs/r0.5.0/piglatin_reference.html#JOIN > 1) Merge Join > 2) Skewed Join > It would be nice to mention the Merge Join and Skewed Join in the following > section of the Pig CookBook: > http://hadoop.apache.org/pig/docs/r0.5.0/cookbook.html#Take+Advantage+of+Join+Optimization > Can we update this for release 0.6? > Thanks > Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization
[ https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784087#action_12784087 ] Hadoop QA commented on PIG-978: --- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426454/pig-latin-users-guide.patch against trunk revision 885465. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/69/console This message is automatically generated. > ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) > and ERROR 2999: (Unexpected internal error. null) when using Multi-Query > optimization > --- > > Key: PIG-978 > URL: https://issues.apache.org/jira/browse/PIG-978 > Project: Pig > Issue Type: Bug > Components: documentation >Affects Versions: 0.6.0 >Reporter: Viraj Bhat >Assignee: Corinne Chandel > Fix For: 0.6.0 > > Attachments: pig-latin-users-guide.patch > > > I have Pig script of this form.. which I execute using Multi-query > optimization. 
> {code} > A = load '/user/viraj/firstinput' using PigStorage(); > B = group > C = ..aggregation function > store C into '/user/viraj/firstinputtempresult/days1'; > .. > Atab = load '/user/viraj/secondinput' using PigStorage(); > Btab = group > Ctab = ..aggregation function > store Ctab into '/user/viraj/secondinputtempresult/days1'; > .. > E = load '/user/viraj/firstinputtempresult/' using PigStorage(); > F = group > G = aggregation function > store G into '/user/viraj/finalresult1'; > Etab = load '/user/viraj/secondinputtempresult/' using PigStorage(); > Ftab = group > Gtab = aggregation function > store Gtab into '/user/viraj/finalresult2'; > {code} > 2009-07-20 22:05:44,507 [main] ERROR org.apache.pig.tools.grunt.GruntParser - > ERROR 2100: hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist. > Details at logfile: /homes/viraj/pigscripts/pig_1248127173601.log > This error is due to the mismatch of store/load commands. The script first stores files > into the 'days1' directory (store C into > '/user/viraj/firstinputtempresult/days1' using PigStorage();), but it later > loads from the top level directory (E = load > '/user/viraj/firstinputtempresult/' using PigStorage()) instead of the > original directory (/user/viraj/firstinputtempresult/days1). > The current multi-query optimizer can't resolve the dependency between these > two commands because they have different file paths. So the jobs will run > concurrently and result in the errors. > The solution is to add an 'exec' or 'run' command after the first two stores. > This will force the first two store commands to run before the remaining commands. > It would be nice to see this fixed as a part of an enhancement to the > Multi-query. We could either disable the Multi-query or throw a warning/error > message, so that the user can correct his load/store statements. > Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
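Editor's illustration for PIG-978: the workaround described in the report (forcing the first stores to finish before the dependent loads) might look roughly like the sketch below. The paths are taken from the report; the schemas, the grouping key and the SUM aggregation are hypothetical stand-ins for the elided "group" and "aggregation function" steps, and only the first of the two pipelines is shown.

```pig
-- Sketch only: schema (k, v) and the SUM aggregation are assumptions,
-- not part of the original report.
A = LOAD '/user/viraj/firstinput' USING PigStorage() AS (k:chararray, v:int);
B = GROUP A BY k;
C = FOREACH B GENERATE group AS k, SUM(A.v) AS total;
STORE C INTO '/user/viraj/firstinputtempresult/days1';

-- Run everything queued so far before the statements below are planned.
-- Without this, the load below uses a different path than the store above,
-- so the multi-query optimizer cannot see that it depends on that store.
EXEC;

E = LOAD '/user/viraj/firstinputtempresult/' USING PigStorage() AS (k:chararray, total:long);
F = GROUP E BY k;
G = FOREACH F GENERATE group AS k, SUM(E.total) AS total;
STORE G INTO '/user/viraj/finalresult1';
```

Loading from the exact directory that was stored ('.../days1') would presumably let the optimizer detect the dependency on its own; the explicit EXEC is for cases where the paths legitimately differ.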
[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784167#action_12784167 ] Hadoop QA commented on PIG-1098: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426477/PIG-1098.patch against trunk revision 885465. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/70/console This message is automatically generated. > [zebra] Zebra Performance Optimizations > --- > > Key: PIG-1098 > URL: https://issues.apache.org/jira/browse/PIG-1098 > Project: Pig > Issue Type: Improvement >Reporter: Yan Zhou >Assignee: Yan Zhou >Priority: Minor > Fix For: 0.6.0, 0.7.0 > > Attachments: PIG-1098.patch > > > Many in-core performance optimization opportunities exist in zebra, such as > removal of redundant precautionary checks, use of better collection types to > reduce levels of indirection to the memory objects, changing of input splits > in ascending sizes to descending sizes. 
Observed prototype improvements are > around 10% in wall clock time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1113) Diamond query optimization throws error in JOIN
[ https://issues.apache.org/jira/browse/PIG-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784515#action_12784515 ] Hadoop QA commented on PIG-1113: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426566/PIG-1113.patch against trunk revision 885858. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/console This message is automatically generated. > Diamond query optimization throws error in JOIN > --- > > Key: PIG-1113 > URL: https://issues.apache.org/jira/browse/PIG-1113 > Project: Pig > Issue Type: Bug >Reporter: Ankur >Assignee: Richard Ding > Fix For: 0.6.0 > > Attachments: PIG-1113.patch > > > The following script results in 1 M/R job as a result of diamond query > optimization but the script fails. > set1 = LOAD 'set1' USING PigStorage as (a:chararray, b:chararray, > c:chararray); > set2 = LOAD 'set2' USING PigStorage as (a: chararray, b:chararray, c:bag{}); > set2_1 = FOREACH set2 GENERATE a as f1, b as f2, (chararray) 0 as f3; > set2_2 = FOREACH set2 GENERATE a as f1, FLATTEN((IsEmpty(c) ? 
null : c)) as > f2, (chararray) 1 as f3; > all_set2 = UNION set2_1, set2_2; > joined_sets = JOIN set1 BY (a,b), all_set2 BY (f2,f3); > dump joined_sets; > And here is the error > org.apache.pig.backend.executionengine.ExecException: ERROR 1071: Cannot > convert a bag to a String > at org.apache.pig.data.DataType.toString(DataType.java:739) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:625) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:364) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:288) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:162) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:247) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:238) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) > at org.apache.hadoop.mapred.Child.main(Child.java:159) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1114) MultiQuery optimization throws error when merging 2 level splits
[ https://issues.apache.org/jira/browse/PIG-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784600#action_12784600 ] Hadoop QA commented on PIG-1114: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426576/PIG-1114.patch against trunk revision 885953. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/72/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/72/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/72/console This message is automatically generated. > MultiQuery optimization throws error when merging 2 level splits > > > Key: PIG-1114 > URL: https://issues.apache.org/jira/browse/PIG-1114 > Project: Pig > Issue Type: Bug >Reporter: Ankur >Assignee: Richard Ding >Priority: Critical > Fix For: 0.6.0 > > Attachments: PIG-1114.patch, Pig_1114_Client.log > > > Multi-query optimization throws an error when merging 2 level splits. 
> Following is the script to reproduce the error > data = LOAD 'data' USING PigStorage() AS (id:int, name:chararray); > ids = FOREACH data GENERATE id; > allId = GROUP ids all; > allIdCount = FOREACH allId GENERATE group as allId, COUNT(ids) as total; > idGroup = GROUP ids by id; > idGroupCount = FOREACH idGroup GENERATE group as id, COUNT(ids) as count; > countTotal = cross idGroupCount, allIdCount; > idCountTotal = foreach countTotal generate > id, > count, > total, > (double)count / (double)total as proportion; > orderedCounts = order idCountTotal by count desc; > STORE orderedCounts INTO 'mq_problem/ids'; > names = FOREACH data GENERATE name; > allNames = GROUP names all; > allNamesCount = FOREACH allNames GENERATE group as namesAll, COUNT(names) as > total; > nameGroup = GROUP names by name; > nameGroupCount = FOREACH nameGroup GENERATE group as name, COUNT(names) as > count; > namesCrossed = cross nameGroupCount, allNamesCount; > nameCountTotal = foreach namesCrossed generate > name, > count, > total, > (double)count / (double)total as proportion; > nameCountsOrdered = order nameCountTotal by count desc; > STORE nameCountsOrdered INTO 'mq_problem/names'; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784676#action_12784676 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426615/PIG-922-p3_12.patch against trunk revision 886015. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 60 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/73/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/73/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/73/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/73/console This message is automatically generated. 
> Logical optimizer: push up project > -- > > Key: PIG-922 > URL: https://issues.apache.org/jira/browse/PIG-922 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: 0.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, > PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, > PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, > PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, > PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, > PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, > PIG-922-p3_8.patch, PIG-922-p3_9.patch > > > This is a continuation work of > [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add > another rule to the logical optimizer: Push up project, ie, prune columns as > early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1116) Remove redundant map-reduce job for merge join
[ https://issues.apache.org/jira/browse/PIG-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784763#action_12784763 ]

Hadoop QA commented on PIG-1116:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426637/PIG-1116.patch
against trunk revision 886015.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/console

This message is automatically generated.

> Remove redundant map-reduce job for merge join
> --
>
> Key: PIG-1116
> URL: https://issues.apache.org/jira/browse/PIG-1116
> Project: Pig
> Issue Type: Bug
> Reporter: Daniel Dai
> Assignee: Pradeep Kamath
> Fix For: 0.6.0
>
> Attachments: PIG-1116.patch
>
>
> In merge join, when we convert the right-hand-side input into a side file, we
> do not remove it from the map-reduce plan; we only disconnect it from the
> plan. When the query runs, the redundant load still reads the data but does
> nothing with it. This operation should be removed entirely.
> Eg:
> a = load '/user/pig/tests/data/zebra/singlefile/studentsortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, gpa);
> b = load '/user/pig/tests/data/zebra/singlefile/votersortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, registration, contributions);
> c = join a by name, b by name using "merge";
> explain c;
> {code}
> #--
> # Map Reduce Plan
> #--
> MapReduce node 1-21
> Map Plan
> Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/votersortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-13
> Global sort: false
>
> MapReduce node 1-20
> Map Plan
> Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-19
> |
> |---MergeJoin[tuple] - 1-16
>     |
>     |---Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/studentsortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-12
> Global sort: false
>
> {code}
> 1-21 should be removed.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784879#action_12784879 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426641/PIG-922-p3_13.patch against trunk revision 886015. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 60 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/75/console This message is automatically generated. 
> Logical optimizer: push up project > -- > > Key: PIG-922 > URL: https://issues.apache.org/jira/browse/PIG-922 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: 0.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, > PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, > PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, > PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, > PIG-922-p3_13.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch, > PIG-922-p3_4.patch, PIG-922-p3_5.patch, PIG-922-p3_6.patch, > PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch > > > This is a continuation work of > [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add > another rule to the logical optimizer: Push up project, ie, prune columns as > early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785029#action_12785029 ] Hadoop QA commented on PIG-1068: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426691/PIG-1068.patch against trunk revision 886015. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/console This message is automatically generated. 
> COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
> ---
>
> Key: PIG-1068
> URL: https://issues.apache.org/jira/browse/PIG-1068
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.4.0
> Reporter: Vikram Oberoi
> Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: cogroup-bug.pig, log, PIG-1068.patch
>
>
> The COGROUP in the following script fails in its map:
> {code}
> logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);
>
> SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';
>
> -- Project login clients and count them by ID.
> login_info = FOREACH logins {
>     GENERATE id as id,
>     comments AS client;
> };
>
> logins_grouped = GROUP login_info BY (id, client);
>
> count_logins_by_client = FOREACH logins_grouped {
>     generate group.id AS id, group.
[jira] Commented: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785166#action_12785166 ] Hadoop QA commented on PIG-1118: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426698/PIG_1118.patch against trunk revision 886015. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/79/console This message is automatically generated. > expression with aggregate functions returning null, with accumulate interface > - > > Key: PIG-1118 > URL: https://issues.apache.org/jira/browse/PIG-1118 > Project: Pig > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Ying He > Fix For: 0.7.0 > > Attachments: PIG_1118.patch > > > The problem is in trunk . It works fine in 0.6 branch. 
> l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
> grunt> g = group l by 1;
> grunt> dump g;
> (1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
> grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
> grunt> dump f;
> (176L,)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
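For reference, the arithmetic that the PIG-1118 expression should produce can be checked outside Pig. The following is a minimal Python sketch, assuming the `c` column holds exactly the six values shown in the `dump g` output of the report; the variable names are illustrative and none of this is Pig code.

```python
# Values of column c, copied from the dump g output in the report.
c_values = [23, 21, 34, 21, 23, 54]

total = sum(c_values)            # corresponds to SUM(l.c)
expression = 1 + total + total   # corresponds to 1 + SUM(l.c) + SUM(l.c)

# The report shows (176L,), i.e. the second field came back null.
# A non-null result would carry the value computed here.
expected_row = (total, expression)
print(expected_row)
```

With these inputs the first field matches the reported 176, and the second field would be 353 rather than null.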
[jira] Commented: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785218#action_12785218 ] Hadoop QA commented on PIG-1122: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426734/PIG-1122.patch against trunk revision 886650. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/80/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/80/console This message is automatically generated. > [zebra] Zebra build.xml still uses 0.6 version > -- > > Key: PIG-1122 > URL: https://issues.apache.org/jira/browse/PIG-1122 > Project: Pig > Issue Type: Bug >Affects Versions: 0.7.0 >Reporter: Yan Zhou >Assignee: Yan Zhou > Fix For: 0.7.0 > > Attachments: PIG-1122.patch > > > Zebra still uses pig-0.6.0-dev-core.jar in build-contrib.xml. It should be > changed to pig-0.7.0-dev-core.jar on APACHE trunk only. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785293#action_12785293 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426740/PIG-922-p3_14.patch against trunk revision 886650. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 60 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. -1 release audit. The applied patch generated 368 release audit warnings (more than the trunk's current 362 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/81/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/81/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/81/console This message is automatically generated. 
> Logical optimizer: push up project > -- > > Key: PIG-922 > URL: https://issues.apache.org/jira/browse/PIG-922 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: 0.3.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.6.0 > > Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, > PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, > PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, > PIG-922-p3_10.patch, PIG-922-p3_11.patch, PIG-922-p3_12.patch, > PIG-922-p3_13.patch, PIG-922-p3_14.patch, PIG-922-p3_2.patch, > PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch, > PIG-922-p3_6.patch, PIG-922-p3_7.patch, PIG-922-p3_8.patch, PIG-922-p3_9.patch > > > This is a continuation work of > [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add > another rule to the logical optimizer: Push up project, ie, prune columns as > early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1111) [Zebra] multiple outputs support
[ https://issues.apache.org/jira/browse/PIG-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785420#action_12785420 ]

Hadoop QA commented on PIG-:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426757/PIG-.patch
against trunk revision 886650.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 13 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/82/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/82/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/82/console

This message is automatically generated.

> [Zebra] multiple outputs support
>
>
> Key: PIG-
> URL: https://issues.apache.org/jira/browse/PIG-
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.6.0, 0.7.0
> Reporter: Gaurav Jain
> Assignee: Gaurav Jain
> Fix For: 0.6.0, 0.7.0
>
> Attachments: PIG-.patch, PIG-.patch
>
>
> Zebra enables an application to stream data into different Zebra table instances.
> New interface added:
> setMultipleOutputs(JobConf jobconf, String commaSeparatedLocation, Class<? extends ZebraOutputPartitioner> theClass)
> Zebra maintains a list of table instances based on commaSeparatedLocation
> (in that order).
> The ZebraOutputPartitioner interface has a getOutputPartition method, which is
> implemented by the application. It returns an index into the list, and Zebra
> writes the row to that table instance.
> We also introduce a new mapred property for setting multiple outputs:
> mapred.lib.table.multi.output.dirs
>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
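The routing rule described in this issue (an ordered list of output locations plus an application-supplied function that returns an index into it) can be sketched as follows. This is a language-neutral illustration in Python, not the Zebra Java API: `route_rows`, `by_region`, the row layout and the example paths are all hypothetical, and the real `getOutputPartition` signature is not given in the message.

```python
def route_rows(rows, comma_separated_locations, get_output_partition):
    """Group rows by output location using an application-supplied partitioner."""
    # Keep the locations in the order given; the partitioner's result indexes this list.
    locations = [loc.strip() for loc in comma_separated_locations.split(",")]
    outputs = {loc: [] for loc in locations}
    for row in rows:
        index = get_output_partition(row)
        if not 0 <= index < len(locations):
            raise IndexError("partition index %d is outside the output list" % index)
        outputs[locations[index]].append(row)
    return outputs


def by_region(row):
    # Hypothetical application logic: US rows go to the first table, the rest to the second.
    return 0 if row["region"] == "us" else 1


rows = [{"id": 1, "region": "us"}, {"id": 2, "region": "eu"}, {"id": 3, "region": "us"}]
routed = route_rows(rows, "/tables/us,/tables/other", by_region)
```

The point of the design is that the application only decides an index; the order of the comma-separated locations fixes which table that index refers to.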
[jira] Commented: (PIG-1119) [zebra] "group" is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785550#action_12785550 ] Hadoop QA commented on PIG-1119: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426761/PIG-1119.patch against trunk revision 886650. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 30 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/83/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/83/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/83/console This message is automatically generated. > [zebra] "group" is a Pig preserved word, zebra needs to use other string for > table's group information > -- > > Key: PIG-1119 > URL: https://issues.apache.org/jira/browse/PIG-1119 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Jing Huang > Fix For: 0.6.0 > > Attachments: PIG-1119.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-915) Load row names in HBase loader
[ https://issues.apache.org/jira/browse/PIG-915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785703#action_12785703 ]

Hadoop QA commented on PIG-915:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426785/Pig_915.Patch
against trunk revision 886875.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 release audit. The applied patch generated 365 release audit warnings (more than the trunk's current 362 warnings).
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/84/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/84/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/84/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/84/console

This message is automatically generated.

> Load row names in HBase loader
> --
>
> Key: PIG-915
> URL: https://issues.apache.org/jira/browse/PIG-915
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.6.0
> Reporter: Alex Newman
> Assignee: Jeff Zhang
> Priority: Minor
> Fix For: 0.7.0
>
> Attachments: Pig_915.Patch
>
>
> Currently there is no way to get the row names when querying HBase.
> We should probably remedy this, as important data may be stored there.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785786#action_12785786 ] Hadoop QA commented on PIG-1118: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426698/PIG_1118.patch against trunk revision 887017. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/86/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/86/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/86/console This message is automatically generated. > expression with aggregate functions returning null, with accumulate interface > - > > Key: PIG-1118 > URL: https://issues.apache.org/jira/browse/PIG-1118 > Project: Pig > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Ying He > Fix For: 0.6.0 > > Attachments: PIG_1118.patch > > > The problem is in trunk . It works fine in 0.6 branch. 
> l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
> grunt> g = group l by 1;
> grunt> dump g;
> (1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
> grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
> grunt> dump f;
> (176L,)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs
[ https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785801#action_12785801 ]

Hadoop QA commented on PIG-480:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426804/PIG_480.patch
against trunk revision 887049.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 217 javac compiler warnings (more than the trunk's current 213 warnings).
-1 findbugs. The patch appears to cause Findbugs to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/88/testReport/
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/88/console

This message is automatically generated.

> PERFORMANCE: Use identity mapper in a chain of M-R jobs
> ---
>
> Key: PIG-480
> URL: https://issues.apache.org/jira/browse/PIG-480
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.2.0
> Reporter: Olga Natkovich
> Assignee: Ying He
> Attachments: PIG_480.patch
>
>
> For scripts that compile into two or more MR jobs, use the identity mapper
> wherever possible in the second and subsequent MR jobs. The identity mapper is
> about 50% faster than Pig's empty map job because it doesn't parse the data.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785928#action_12785928 ] Hadoop QA commented on PIG-653: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426879/PIG-653.patch against trunk revision 887049. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 97 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 395 release audit warnings (more than the trunk's current 368 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/console This message is automatically generated. > Make fieldsToRead work in loader > > > Key: PIG-653 > URL: https://issues.apache.org/jira/browse/PIG-653 > Project: Pig > Issue Type: New Feature >Reporter: Alan Gates >Assignee: Pradeep Kamath > Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch > > > Currently pig does not call the fieldsToRead function in LoadFunc, thus it > does not provide information to load functions on what fields are needed. 
> We need to implement a visitor that determines (where possible) which fields in
> a file will be used and relays that information to the load function.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
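The idea in PIG-653 of walking a plan to work out which fields a loader actually needs can be sketched roughly as below. This is a toy Python model under stated assumptions: the plan is a plain list of dictionaries, `columns` is a made-up key, and `fields_to_read` here is a hypothetical helper, not Pig's visitor classes or the `LoadFunc.fieldsToRead` signature.

```python
def fields_to_read(operators):
    """Return the sorted column names a loader must supply, or None if unknown."""
    needed = set()
    for op in operators:
        columns = op["columns"]
        if columns is None:
            # The operator's column usage cannot be determined (for example a
            # '*' projection), so the loader has to read everything.
            return None
        needed.update(columns)
    return sorted(needed)


# Hypothetical plan: a filter on age followed by a projection of name and gpa.
plan = [
    {"op": "filter", "columns": ["age"]},
    {"op": "foreach", "columns": ["name", "gpa"]},
]
required = fields_to_read(plan)
```

Returning `None` for the undeterminable case mirrors the "where possible" wording in the issue: when usage cannot be established, the safe fallback is to load all fields.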
[jira] Commented: (PIG-1119) [zebra] "group" is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786027#action_12786027 ] Hadoop QA commented on PIG-1119: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426881/PIG-1119.patch against trunk revision 887049. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/console This message is automatically generated. > [zebra] "group" is a Pig preserved word, zebra needs to use other string for > table's group information > -- > > Key: PIG-1119 > URL: https://issues.apache.org/jira/browse/PIG-1119 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Jing Huang > Fix For: 0.6.0 > > Attachments: PIG-1119.patch, PIG-1119.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786031#action_12786031 ]

Hadoop QA commented on PIG-1105:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12426887/PIG-1105.patch
  against trunk revision 887290.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    -1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/91/console

This message is automatically generated.

> COUNT_STAR accumulate interface implementation cases failure
>
> Key: PIG-1105
> URL: https://issues.apache.org/jira/browse/PIG-1105
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Thejas M Nair
> Assignee: Sriranjan Manjunath
> Fix For: 0.6.0
>
> Attachments: PIG-1105.1.patch, PIG-1105.patch
>
> COUNT_STAR.accumulate is calling sum() which is supposed to be used by
> intermediate and final parts of algebraic interface.
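The distinction described in the issue can be illustrated with a minimal sketch. It is not Pig's code: `CountStarAccumulator` and `sum_partial_counts` are hypothetical names, and the sketch assumes only that an accumulate-style call receives raw rows in batches, while the intermediate and final steps of an algebraic function receive partial counts.

```python
from typing import Iterable, List, Tuple


def sum_partial_counts(partials: Iterable[Tuple[int]]) -> int:
    """Intermediate/final step: every input tuple wraps a partial count."""
    return sum(p[0] for p in partials)


class CountStarAccumulator:
    """Hypothetical stand-in for an accumulate-style COUNT_STAR."""

    def __init__(self) -> None:
        self.count = 0

    def accumulate(self, batch: List[tuple]) -> None:
        # Raw rows arrive here, so count them directly.
        # Passing them to sum_partial_counts would misread row data as counts.
        self.count += len(batch)

    def get_value(self) -> int:
        return self.count


acc = CountStarAccumulator()
acc.accumulate([("a", 1), ("b", 2)])
acc.accumulate([("c", 3)])
total_rows = acc.get_value()
```

Under these assumptions the two paths agree on the same data: counting three rows in batches and summing the partial counts `(2,)` and `(1,)` both give 3.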
[jira] Commented: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786188#action_12786188 ]

Hadoop QA commented on PIG-1104:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12426801/PIG1104.patch
  against trunk revision 887290.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 8 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/console

This message is automatically generated.

> [zebra] Provide streaming support in Zebra.
>
> Key: PIG-1104
> URL: https://issues.apache.org/jira/browse/PIG-1104
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.4.0
> Reporter: Chao Wang
> Assignee: Chao Wang
> Fix For: 0.6.0, 0.7.0
>
> Attachments: PIG1104.patch
>
> Hadoop streaming is very popular among Hadoop users. The main attraction is
> the simplicity of use. A user can write the application logic in any language
> and process large amounts of data using Hadoop framework. As more people
> start to use Zebra to store their data, we expect users would like to run
> Hadoop streaming scripts to easily process Zebra tables.
> The following is a simple example of using Hadoop streaming to access
> Zebra data. It loads data from the foo table using Zebra's TableInputFormat
> and then writes the data to output using the default TextOutputFormat.
>
> $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat
>
> In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its
> records. Currently, when Zebra's TableInputFormat is used for input, the user
> script sees each line containing " key_if_any\tTuple.toString() ". We plan to
> generate a CSV representation of our Pig tuples. To this end, we plan to
> do the following:
>
> 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override
> its toString() method to present the data in CSV format.
> 2) On the Zebra side, the tuple factory should be changed to create
> ZebraTuple objects instead of DefaultTuple objects.
>
> Note that we can only support streaming on the input side, that is, the
> ability to use streaming to read data from Zebra tables. On the output side,
> streaming support is not feasible: the streaming mapper or reducer only emits
> "Text\tText", so the output collector has no way of knowing how to convert
> this to (BytesWritable,Tuple).
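The planned line format can be sketched as follows. This is only an illustration of "key, tab, CSV-rendered tuple" and not Zebra's implementation: `tuple_to_csv` and `streaming_line` are hypothetical helpers, the quoting rules are those of Python's `csv` module rather than anything the issue specifies, and nested tuples or bags are not handled.

```python
import csv
import io


def tuple_to_csv(fields):
    """Render a flat tuple as one CSV record (hypothetical helper)."""
    buf = io.StringIO()
    # The csv module quotes fields containing commas and writes None as empty.
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()[:-1]  # drop the trailing line terminator


def streaming_line(key, fields):
    """Build the line a streaming script would see: optional key, tab, CSV."""
    body = tuple_to_csv(fields)
    return f"{key}\t{body}" if key is not None else body


plain = tuple_to_csv(("qwer", "F", 21, 14.44))
quoted = tuple_to_csv(("a,b", None, 1))
keyed = streaming_line("k1", ("x", 2))
```

A streaming mapper could then split each input line on the first tab and parse the remainder with any CSV reader, which is simpler than parsing a generic `Tuple.toString()` rendering.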
[jira] Commented: (PIG-1086) Nested sort by * throw exception
[ https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786300#action_12786300 ]

Hadoop QA commented on PIG-1086:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12426934/PIG-1086.patch
  against trunk revision 887318.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/console

This message is automatically generated.
> Nested sort by * throw exception
>
> Key: PIG-1086
> URL: https://issues.apache.org/jira/browse/PIG-1086
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Daniel Dai
> Assignee: Richard Ding
> Attachments: PIG-1086.patch
>
> The following script fails:
>
> A = load '1.txt' as (a0, a1, a2);
> B = group A by a0;
> C = foreach B { D = order A by *; generate group, D;};
> explain C;
>
> Here is the stack:
>
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
>     at java.util.ArrayList.get(ArrayList.java:324)
>     at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752)
>     at org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365)
>     at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176)
>     at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43)
>     at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274)
>     at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130)
>     at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45)
>     at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234)
>     at org.apache.pig.PigServer.compilePp(PigServer.java:864)
>     at org.apache.pig.PigServer.explain(PigServer.java:583)
>     ... 8 more
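For reference, the result the script is presumably meant to produce can be sketched outside Pig. This is an assumption-laden illustration, not Pig's implementation and not the fix in the attached patch: `nested_order_by_star` is a hypothetical helper, rows are plain Python tuples with mutually comparable fields, and `order A by *` is read as "sort by all fields, left to right".

```python
from itertools import groupby


def nested_order_by_star(rows):
    """Group rows by their first field, then sort each group by all fields."""
    keyed = sorted(rows, key=lambda r: r[0])  # groupby needs sorted input
    return [(k, sorted(g)) for k, g in groupby(keyed, key=lambda r: r[0])]


rows = [("x", 2, "b"), ("y", 1, "a"), ("x", 1, "c")]
result = nested_order_by_star(rows)
```

Each element of `result` pairs a group key with that group's rows in sorted order, which mirrors the `generate group, D` output of the script.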