[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765415#action_12765415 ]

Hadoop QA commented on PIG-1016:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422031/PIG-1016.patch against trunk revision 824980.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/console

This message is automatically generated.

Reading in map data seems broken

Key: PIG-1016
URL: https://issues.apache.org/jira/browse/PIG-1016
Project: Pig
Issue Type: Improvement
Components: data
Affects Versions: 0.4.0
Reporter: hc busy
Attachments: PIG-1016.patch

Hi, I'm trying to load a map that has a tuple for a value. The read fails in 0.4.0 because of a misconfiguration in the parser, whereas almost all of the documentation states that the value of a map can be of any type. I've attached a patch that allows us to read in complex objects as values, as documented. I've done simple verification of loading maps with tuple/map values and writing them back out using LOAD and STORE. All seems to work fine.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-755) Difficult to debug parameter substitution problems based on the error messages when running in local mode
[ https://issues.apache.org/jira/browse/PIG-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765423#action_12765423 ]

Daniel Dai commented on PIG-755:

Now the error message changed to:

ERROR 2999: Unexpected internal error. Can not create a Path from an empty string
java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.massageFilename(QueryParser.java:191)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1440)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1227)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1017)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:967)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:716)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
        at org.apache.pig.Main.main(Main.java:397)

Difficult to debug parameter substitution problems based on the error messages when running in local mode

Key: PIG-755
URL: https://issues.apache.org/jira/browse/PIG-755
Project: Pig
Issue Type: Bug
Components: grunt
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Attachments: inputfile.txt, localparamsub.pig

I have a script in which I do a parameter substitution for the input file. I have a use case where I find it difficult to debug based on the error messages in local mode.
{code}
A = load '$infile' using PigStorage() as
 (
   date: chararray,
   count : long,
   gmean : double
 );
dump A;
{code}
1) I run it in local mode with the input file in the current working directory
{code}
prompt $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main -exectype local -param infile='inputfile.txt' localparamsub.pig
{code}
2009-04-07 00:03:51,967 [main] ERROR org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore - Received error from storer function: org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the load function.
2009-04-07 00:03:51,970 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed jobs!!
2009-04-07 00:03:51,971 [main] INFO org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out of 1 failed!
2009-04-07 00:03:51,974 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
Details at logfile: /home/viraj/pig-svn/trunk/pig_1239062631414.log
ERROR 1066: Unable to open iterator for alias A
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
        at org.apache.pig.PigServer.openIterator(PigServer.java:439)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
        at org.apache.pig.Main.main(Main.java:352)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
        at org.apache.pig.PigServer.openIterator(PigServer.java:433)
        ... 5 more
2) I run it in map reduce mode
{code}
prompt $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main -param infile='inputfile.txt' localparamsub.pig
{code}
2009-04-07 00:07:31,660 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000
2009-04-07 00:07:32,074
[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan
[ https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765601#action_12765601 ]

Alan Gates commented on PIG-858:

Mostly looks straightforward and passes all the tests. You made a number of changes in MRCompiler.visitUnion. I don't understand what exactly you were changing there. Could you give a brief overview of those changes?

Order By followed by replicated join fails while compiling MR-plan from physical plan

Key: PIG-858
URL: https://issues.apache.org/jira/browse/PIG-858
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-858.patch

Consider the query:
{code}
A = load 'a';
B = order A by $0;
C = join A by $0, B by $0;
explain C;
{code}
works. But if replicated join is used instead
{code}
A = load 'a';
B = order A by $0;
C = join A by $0, B by $0 using replicated;
explain C;
{code}
this fails with ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2034: Error compiling operator POFRJoin
relevant stacktrace:
{code}
Caused by: java.lang.RuntimeException: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:306)
        at org.apache.pig.PigServer.explain(PigServer.java:574)
        ... 8 more
Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:942)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:173)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:342)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:327)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:233)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:301)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.explain(MapReduceLauncher.java:278)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:303)
        ... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:901)
        ... 16 more
{code}
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765603#action_12765603 ]

Alan Gates commented on PIG-760:

At this point no one has contributed a PigStorageSchema as suggested above. We remain open to such a contribution if someone has the time.

Serialize schemas for PigStorage() and other storage types.

Key: PIG-760
URL: https://issues.apache.org/jira/browse/PIG-760
Project: Pig
Issue Type: New Feature
Reporter: David Ciemiewicz

I'm finding PigStorage() really convenient for storage and data interchange because it compresses well and imports into Excel and other analysis environments well. However, it is a pain when it comes to maintenance because the columns are in fixed locations and I'd like to add columns in some cases.

It would be great if load PigStorage() could read a default schema from a .schema file stored with the data and if store PigStorage() could store a .schema file with the data. I have tested this out and both Hadoop HDFS and Pig in -exectype local mode will ignore a file called .schema in a directory of part files.

So, for example, if I have a chain of Pig scripts I execute such as:

A = load 'data-1' using PigStorage() as ( a: int , b: int );
store A into 'data-2' using PigStorage();
B = load 'data-2' using PigStorage();
describe B;

describe B should output something like { a: int, b: int }
[jira] Created: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter
optimizer pushes filter before the foreach that generates column used by filter

Key: PIG-1022
URL: https://issues.apache.org/jira/browse/PIG-1022
Project: Pig
Issue Type: Bug
Components: impl
Reporter: Thejas M Nair

grunt> l = load 'students.txt' using PigStorage() as (name:chararray, gender:chararray, age:chararray, score:chararray);
grunt> f = foreach l generate name, gender, age,score, '200' as gid:chararray;
grunt> g = group f by (name, gid);
grunt> f2 = foreach g generate group.name as name: chararray, group.gid as gid: chararray;
grunt> filt = filter f2 by gid == '200';
grunt> explain filt;

In the generated plan, filt is pushed up after the load and before the first foreach, even though the filter is on gid, which is generated in the first foreach.
[jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter
[ https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765612#action_12765612 ] Thejas M Nair commented on PIG-1022: ${code} grunt explain filt; #--- # Logical Plan: #--- Store 1-1162 Schema: {name: chararray,gid: chararray} Type: Unknown | |---ForEach 1-1148 Schema: {name: chararray,gid: chararray} Type: bag | | | Project 1-1144 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray | Input: Project 1-1145 Projections: [0] Overloaded: false| | |---Project 1-1145 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple | Input: CoGroup 1-1138 | | | Project 1-1146 Projections: [1] Overloaded: false FieldSchema: gid: chararray Type: chararray | Input: Project 1-1147 Projections: [0] Overloaded: false| | |---Project 1-1147 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple | Input: CoGroup 1-1138 | |---CoGroup 1-1138 Schema: {group: (name: chararray,gid: chararray),f: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray}} Type: bag | | | Project 1-1136 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray | Input: ForEach 1-1135 | | | Project 1-1137 Projections: [4] Overloaded: false FieldSchema: gid: chararray Type: chararray | Input: ForEach 1-1135 | |---ForEach 1-1135 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray} Type: bag | | | Project 1-1130 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray | Input: Filter 1-1152 | | | Project 1-1131 Projections: [1] Overloaded: false FieldSchema: gender: chararray Type: chararray | Input: Filter 1-1152 | | | Project 1-1132 Projections: [2] Overloaded: false FieldSchema: age: chararray Type: chararray | Input: Filter 1-1152 | | | Project 1-1133 Projections: [3] Overloaded: false FieldSchema: 
score: chararray Type: chararray | Input: Filter 1-1152 | | | Const 1-1134( 200 ) FieldSchema: chararray Type: chararray | |---Filter 1-1152 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag | | | Equal 1-1151 FieldSchema: boolean Type: boolean | | | |---Project 1-1149 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray | | Input: ForEach 1-1161 | | | |---Const 1-1150( 200 ) FieldSchema: chararray Type: chararray | |---ForEach 1-1161 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag | | | Cast 1-1154 FieldSchema: name: chararray Type: chararray | | | |---Project 1-1153 Projections: [0] Overloaded: false FieldSchema: name: bytearray Type: bytearray | Input: Load 1-1123 | | | Cast 1-1156 FieldSchema: gender: chararray Type: chararray | | | |---Project 1-1155 Projections: [1] Overloaded: false FieldSchema: gender: bytearray Type: bytearray | Input: Load 1-1123 | | | Cast 1-1158 FieldSchema: age: chararray Type: chararray | | | |---Project 1-1157 Projections: [2] Overloaded: false FieldSchema: age: bytearray Type: bytearray | Input: Load 1-1123 | | | Cast 1-1160 FieldSchema: score: chararray Type: chararray | | | |---Project 1-1159 Projections: [3] Overloaded: false FieldSchema: score: bytearray Type: bytearray | Input: Load 1-1123 | |---Load 1-1123 Schema: {name: bytearray,gender: bytearray,age: bytearray,score: bytearray} Type: bag ${code} optimizer pushes filter before the foreach that generates column used by filter --- Key: PIG-1022 URL: https://issues.apache.org/jira/browse/PIG-1022 Project: Pig Issue Type: Bug Components: impl
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765626#action_12765626 ] Dmitriy V. Ryaboy commented on PIG-760: This would be a nice proof-of-concept task for the new Load/StoreMetadata interfaces, as it removes the complexity of dealing with something like Owl.
[jira] Assigned: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter
[ https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai reassigned PIG-1022: Assignee: Daniel Dai
[jira] Commented: (PIG-1014) Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records
[ https://issues.apache.org/jira/browse/PIG-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765658#action_12765658 ] Pradeep Kamath commented on PIG-1014: - To achieve 1. above, we would translate COUNT( A ) to COUNT_STAR( A ) during job compilation. Since 3. above has multiple options and does not seem to be a prevalent use case (SQL does not support it), another option is to disable it - thoughts? Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records Key: PIG-1014 URL: https://issues.apache.org/jira/browse/PIG-1014 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Pradeep Kamath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
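For readers following the COUNT versus COUNT_STAR distinction in this thread, the two counting rules can be sketched roughly as below. This is only an illustration under stated assumptions: it takes COUNT to skip a tuple whose first field is null and COUNT_STAR to count every tuple, which is my reading of the documented behaviour. The class and method names are hypothetical, tuples are modelled as plain lists, and none of this is Pig's builtin code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the two counting rules; not Pig's builtin classes.
final class CountSketch {
    // COUNT-style rule (assumed): skip a tuple when its first field is null.
    static long count(List<List<Object>> bag) {
        long n = 0;
        for (List<Object> tuple : bag) {
            if (tuple != null && !tuple.isEmpty() && tuple.get(0) != null) {
                n++;
            }
        }
        return n;
    }

    // COUNT_STAR-style rule (assumed): every tuple counts, whatever its fields hold.
    static long countStar(List<List<Object>> bag) {
        return bag.size();
    }

    public static void main(String[] args) {
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList("alice", 1),
            Arrays.<Object>asList(null, 2),
            Arrays.<Object>asList("carol", null));
        System.out.println(count(bag));      // 2: the tuple with a null first field is skipped
        System.out.println(countStar(bag));  // 3: all tuples are counted
    }
}
```

Under this model, rewriting COUNT(relation) to COUNT_STAR(relation) changes the result only when some records have a null first field.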
[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning
[ https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1000: Status: Patch Available (was: Open) InternalCachedBag.java generates javac warning and findbug warning -- Key: PIG-1000 URL: https://issues.apache.org/jira/browse/PIG-1000 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Ying He Assignee: Ying He Fix For: 0.6.0 Attachments: PIG-1000.patch patch submitted by PIG-975 generates javac warning and findbug warning -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter
[ https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765682#action_12765682 ] Daniel Dai commented on PIG-1022: Actually, we cannot push the filter even before f2. Since we do not keep track of the source of the data inside a tuple, gid should be treated as a generated field of f2. However, the projection map of f2 gives the wrong result, reporting gid as a directly mapped field of group (which is a tuple (name, gid)), and this triggers everything that follows. The fix for this problem is to modify the projection map generation logic for mapped fields. Santhosh, do you have any comments?
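The rule discussed in this comment can be sketched as a small check: a filter may move above a foreach only if every column it reads is passed through unchanged by that foreach. The model below is an assumption for illustration. The projection map is reduced to a plain map from output column index to input column index, generated columns simply have no entry, and `canPushAbove` is a hypothetical name rather than anything in Pig's optimizer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the push-up check; not Pig's optimizer code.
final class FilterPushSketch {
    // mappedFields: output column -> input column, for pass-through columns only.
    // A generated column (such as a constant) has no entry.
    static boolean canPushAbove(Map<Integer, Integer> mappedFields,
                                List<Integer> filterColumns) {
        for (Integer column : filterColumns) {
            if (!mappedFields.containsKey(column)) {
                return false; // the filter reads a column this foreach generates
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Models: f = foreach l generate name, gender, age, score, '200' as gid;
        // Columns 0..3 pass through; column 4 (gid) is generated.
        Map<Integer, Integer> f = new HashMap<Integer, Integer>();
        for (int i = 0; i < 4; i++) {
            f.put(i, i);
        }
        System.out.println(canPushAbove(f, Arrays.asList(0))); // true: name passes through
        System.out.println(canPushAbove(f, Arrays.asList(4))); // false: gid is generated
    }
}
```

In these terms, the bug described here amounts to the map for f2 containing an entry for gid when it should not.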
[jira] Commented: (PIG-1003) FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators
[ https://issues.apache.org/jira/browse/PIG-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765711#action_12765711 ] Olga Natkovich commented on PIG-1003: - Added to exclude file for now FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators Key: PIG-1003 URL: https://issues.apache.org/jira/browse/PIG-1003 Project: Pig Issue Type: Bug Reporter: Olga Natkovich All physical expression operators have this issue. In the clone method, they instantiate a new object rather than call super.clone. This is a major change and for now I am planning to exclude this warning. We will address it once we work on the frontend rewrite.
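For context on what the warning is about, here is a minimal sketch of the clone idiom that Findbugs expects, contrasted in a comment with the flagged pattern. The operator classes are hypothetical and exist only to show the idiom; they are not Pig's physical operators, and how Pig eventually addresses the warning is not covered here.

```java
final class CloneSketch {
    // Hypothetical operator used only to illustrate the idiom.
    static class Operator implements Cloneable {
        String alias;
        int[] projections;

        Operator(String alias, int[] projections) {
            this.alias = alias;
            this.projections = projections;
        }

        // The flagged pattern would be:
        //   return new Operator(alias, projections.clone());
        // A subclass inheriting that clone() would get back a plain Operator.
        @Override
        public Operator clone() {
            try {
                // super.clone() preserves the runtime class of the receiver.
                Operator copy = (Operator) super.clone();
                // Copy mutable state so the clone does not share the array.
                copy.projections = projections.clone();
                return copy;
            } catch (CloneNotSupportedException e) {
                // Not expected: this class implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    // Hypothetical subclass: inherits clone() and keeps its own runtime type.
    static class SortOperator extends Operator {
        SortOperator(String alias, int[] projections) {
            super(alias, projections);
        }
    }

    public static void main(String[] args) {
        Operator original = new SortOperator("B", new int[] {0});
        Operator copy = original.clone();
        System.out.println(copy.getClass().getSimpleName());           // SortOperator
        System.out.println(copy.projections != original.projections);  // true
    }
}
```

The design point is that only `super.clone()` returns an instance of the receiver's actual class, which is why the warning matters mostly for class hierarchies.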
[jira] Commented: (PIG-1004) FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators
[ https://issues.apache.org/jira/browse/PIG-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765712#action_12765712 ]

Olga Natkovich commented on PIG-1004:

Added to exclude file for now

FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators

Key: PIG-1004
URL: https://issues.apache.org/jira/browse/PIG-1004
Project: Pig
Issue Type: Bug
Reporter: Olga Natkovich

Will address this during next cleanup:

CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODistinct.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrangeForIllustrate.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POOptimizedForEach.clone() does not call super.clone()
CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSort.clone() does not call super.clone()
[jira] Commented: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
[ https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765718#action_12765718 ] Olga Natkovich commented on PIG-1023: - This does not have to go through patch test process. Could one of the committers please review FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL Key: PIG-1023 URL: https://issues.apache.org/jira/browse/PIG-1023 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Attachments: PIG-1023.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
[ https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1023: Attachment: PIG-1023.patch
[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan
[ https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765720#action_12765720 ] Ashutosh Chauhan commented on PIG-858: visitUnion has the same changes as the other visit functions; that is, it adds the MR operator corresponding to POUnion to the phyToMROpMap map. The real changes are in visitFRJoin. Earlier, visitFRJoin looked through the compiledInputs array of MROper one by one, trying to match each MROper's leaf PO with POFRJoin using the operator key. Now it doesn't need to do that; it can simply look it up in phyToMROpMap.
[jira] Commented: (PIG-1020) Include an ant target to build pig.jar without hadoop libraries
[ https://issues.apache.org/jira/browse/PIG-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765734#action_12765734 ] Olga Natkovich commented on PIG-1020: - +1, please, commit to trunk and 0.5.0 branch Include an ant target to build pig.jar without hadoop libraries --- Key: PIG-1020 URL: https://issues.apache.org/jira/browse/PIG-1020 Project: Pig Issue Type: New Feature Components: build Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Priority: Minor Fix For: 0.6.0 Attachments: PIG-1020-1.patch, PIG-1020-2.patch Provide an ant target to build pig.jar without all hadoop related libraries. User will provide external hadoop jars in classpath before invoking pig. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan
[ https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765735#action_12765735 ] Ashutosh Chauhan commented on PIG-858: It's been a while since I did that patch, so a bit more clarification: we are interested in finding the PO that corresponds to the fragment PO input of POFRJoin. This PO is already compiled and is in one of the MROpers. Earlier, we would iterate through the compiledInputs array, trying to match this PO with the PO contained in each MROperator. This fails, as discussed in previous comments. With this change, since we keep track of the MR operator for each physical operator, it no longer needs to do that and can simply look up the MROper corresponding to the fragment PO in phyToMROpMap.
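The change described in this comment, replacing a scan over compiled inputs with a direct map lookup, can be sketched as follows. Everything here is a simplified assumption: `PhysOp` and `MROp` are hypothetical stand-ins for Pig's physical and map-reduce operators, and the failed scan is modelled as returning -1. Whether the `ArrayIndexOutOfBoundsException: -1` in the stack trace arose in exactly this way is not confirmed by the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FragmentLookupSketch {
    // Hypothetical stand-ins; the real operator classes are much richer.
    static final class PhysOp {
        final String key;
        PhysOp(String key) { this.key = key; }
    }

    static final class MROp {
        final String name;
        final List<PhysOp> leaves = new ArrayList<PhysOp>();
        MROp(String name) { this.name = name; }
    }

    // Old approach as described: scan compiled inputs and compare leaf keys.
    // Returns -1 when no leaf matches (an assumed way of modelling the failure).
    static int indexByScan(List<MROp> compiledInputs, PhysOp fragment) {
        for (int i = 0; i < compiledInputs.size(); i++) {
            for (PhysOp leaf : compiledInputs.get(i).leaves) {
                if (leaf.key.equals(fragment.key)) {
                    return i;
                }
            }
        }
        return -1;
    }

    // New approach as described: each visit method records which MR operator a
    // physical operator was compiled into, so the lookup is direct.
    static MROp lookup(Map<PhysOp, MROp> phyToMROpMap, PhysOp fragment) {
        return phyToMROpMap.get(fragment);
    }

    public static void main(String[] args) {
        PhysOp fragment = new PhysOp("sort-B");
        MROp mrOp = new MROp("mr-1");
        // Assumed scenario: the MR operator's leaf is no longer the fragment itself.
        mrOp.leaves.add(new PhysOp("store-tmp"));

        List<MROp> compiledInputs = new ArrayList<MROp>();
        compiledInputs.add(mrOp);
        Map<PhysOp, MROp> phyToMROpMap = new HashMap<PhysOp, MROp>();
        phyToMROpMap.put(fragment, mrOp);

        System.out.println(indexByScan(compiledInputs, fragment)); // -1
        System.out.println(lookup(phyToMROpMap, fragment).name);   // mr-1
    }
}
```

The map lookup does not depend on the fragment still being a leaf of its MR operator, which is the property the scan relied on.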
[jira] Commented: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
[ https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765751#action_12765751 ] Daniel Dai commented on PIG-1023: - +1. The targeted Findbugs warnings are suppressed. Findbugs generates 37 fewer warnings. FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL Key: PIG-1023 URL: https://issues.apache.org/jira/browse/PIG-1023 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Attachments: PIG-1023.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765767#action_12765767 ] Yan Zhou commented on PIG-944: -- There was a typo in one of my earlier comments, at 02/Oct/09 10:33 PM. Instead of "This patch must be applied after the patch for Jira PIG-933 has been applied." it should have read "This patch must be applied after the patch for Jira PIG-993 has been applied." Zebra schema is taken from Pig through TableStorer's construct -- Key: PIG-944 URL: https://issues.apache.org/jira/browse/PIG-944 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Yan Zhou Assignee: Yan Zhou Fix For: 0.6.0 Attachments: SchemaConversion.patch, SchemaConversion.patch The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
[ https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich resolved PIG-1023. - Resolution: Fixed patch committed FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL Key: PIG-1023 URL: https://issues.apache.org/jira/browse/PIG-1023 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Attachments: PIG-1023.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning
[ https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1000: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) This patch addresses javac and Findbugs warnings, so no unit test is needed. Patch committed. Thanks Ying! InternalCachedBag.java generates javac warning and findbug warning -- Key: PIG-1000 URL: https://issues.apache.org/jira/browse/PIG-1000 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Ying He Assignee: Ying He Fix For: 0.6.0 Attachments: PIG-1000.patch The patch submitted for PIG-975 generates a javac warning and a Findbugs warning. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1014) Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records
[ https://issues.apache.org/jira/browse/PIG-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765779#action_12765779 ] Santhosh Srinivasan commented on PIG-1014: -- Another option is to change the implementation of COUNT to reflect the proposed semantics. If the underlying UDF is changed then the user should be notified via an information message. If the user checks the explain output then (s)he will notice COUNT_STAR and will be confused. Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records Key: PIG-1014 URL: https://issues.apache.org/jira/browse/PIG-1014 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Pradeep Kamath -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
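The two counting behaviours under discussion can be illustrated with a small Java sketch. It assumes, as the issue title implies, that COUNT skips records on the basis of null fields (simplified here to a null first field) while COUNT_STAR counts every record. The class and method names are hypothetical, and this is not Pig's UDF code.

```java
import java.util.ArrayList;
import java.util.List;

final class CountSemanticsSketch {
    // COUNT_STAR-like behaviour (as assumed here): every record is counted,
    // whether or not any of its fields are null.
    static long countStar(List<Object[]> records) {
        return records.size();
    }

    // Null-skipping COUNT-like behaviour (as assumed here): a record is
    // counted only when its first field is non-null.
    static long countNonNullFirst(List<Object[]> records) {
        long n = 0;
        for (Object[] record : records) {
            if (record.length > 0 && record[0] != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<Object[]> records = new ArrayList<>();
        records.add(new Object[] { 1, "a" });
        records.add(new Object[] { null, "b" }); // null first field
        records.add(new Object[] { 3, null });   // null in a later field

        System.out.println(countStar(records));         // prints 3
        System.out.println(countNonNullFirst(records)); // prints 2
    }
}
```

The gap between the two results is the behaviour a user would need to be told about if COUNT were silently rewritten or reimplemented.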
[jira] Updated: (PIG-1020) Include an ant target to build pig.jar without hadoop libraries
[ https://issues.apache.org/jira/browse/PIG-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1020: Resolution: Fixed Fix Version/s: 0.5.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) No unit test included since it only changes build.xml. Patch committed to both trunk and 0.5 branch. New target for pig.jar without hadoop libs is jar-withouthadoop. Include an ant target to build pig.jar without hadoop libraries --- Key: PIG-1020 URL: https://issues.apache.org/jira/browse/PIG-1020 Project: Pig Issue Type: New Feature Components: build Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Priority: Minor Fix For: 0.5.0, 0.6.0 Attachments: PIG-1020-1.patch, PIG-1020-2.patch Provide an ant target to build pig.jar without all hadoop related libraries. User will provide external hadoop jars in classpath before invoking pig. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1018) FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter
[ https://issues.apache.org/jira/browse/PIG-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1018: Attachment: PIG-1018.patch FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter --- Key: PIG-1018 URL: https://issues.apache.org/jira/browse/PIG-1018 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Attachments: PIG-1018.patch
Nm: The field name org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.LogToPhyMap doesn't start with a lower case letter
Nm: The method name org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.CreateTuple(Object[]) doesn't start with a lower case letter
Nm: The class name org.apache.pig.backend.hadoop.executionengine.physicalLayer.util.operatorHelper doesn't start with an upper case letter
Nm: Class org.apache.pig.impl.util.WrappedIOException is not derived from an Exception, even though it is named as such
Nm: The method name org.apache.pig.pen.EquivalenceClasses.GetEquivalenceClasses(LogicalOperator, Map) doesn't start with a lower case letter
Nm: The field name org.apache.pig.pen.util.DisplayExamples.Result doesn't start with a lower case letter
Nm: The method name org.apache.pig.pen.util.DisplayExamples.PrintSimple(LogicalOperator, Map) doesn't start with a lower case letter
Nm: The method name org.apache.pig.pen.util.DisplayExamples.PrintTabular(LogicalPlan, Map) doesn't start with a lower case letter
Nm: The method name org.apache.pig.tools.parameters.TokenMgrError.LexicalError(boolean, int, int, int, String, char) doesn't start with a lower case letter
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs
Script contains nested limit fail due to LOLimit does not support multiple outputs Key: PIG-1024 URL: https://issues.apache.org/jira/browse/PIG-1024 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Reporter: Daniel Dai Fix For: 0.6.0 The following script fails: a = load '1.txt' as (a0:int, a1:int, a2:int); b = group a by a0; c = foreach b { c1 = limit a 10; c2 = (c1.a0/c1.a1); c3 = (c1.a0/c1.a2); generate c2, c3;} Error message: ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.impl.logicalLayer.LOLimit multiple outputs. This operator does not support multiple outputs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs
[ https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1024: Attachment: PIG-1024-1.patch Patch included. Thanks to Pradeep for the diagnosis. Script contains nested limit fail due to LOLimit does not support multiple outputs Key: PIG-1024 URL: https://issues.apache.org/jira/browse/PIG-1024 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Reporter: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1024-1.patch The following script fails: a = load '1.txt' as (a0:int, a1:int, a2:int); b = group a by a0; c = foreach b { c1 = limit a 10; c2 = (c1.a0/c1.a1); c3 = (c1.a0/c1.a2); generate c2, c3;} Error message: ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.impl.logicalLayer.LOLimit multiple outputs. This operator does not support multiple outputs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs
[ https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1024: Status: Patch Available (was: Open) Script contains nested limit fail due to LOLimit does not support multiple outputs Key: PIG-1024 URL: https://issues.apache.org/jira/browse/PIG-1024 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1024-1.patch The following script fails: a = load '1.txt' as (a0:int, a1:int, a2:int); b = group a by a0; c = foreach b { c1 = limit a 10; c2 = (c1.a0/c1.a1); c3 = (c1.a0/c1.a2); generate c2, c3;} Error message: ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.impl.logicalLayer.LOLimit multiple outputs. This operator does not support multiple outputs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs
[ https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai reassigned PIG-1024: --- Assignee: Daniel Dai Script contains nested limit fail due to LOLimit does not support multiple outputs Key: PIG-1024 URL: https://issues.apache.org/jira/browse/PIG-1024 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1024-1.patch The following script fails: a = load '1.txt' as (a0:int, a1:int, a2:int); b = group a by a0; c = foreach b { c1 = limit a 10; c2 = (c1.a0/c1.a1); c3 = (c1.a0/c1.a2); generate c2, c3;} Error message: ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.impl.logicalLayer.LOLimit multiple outputs. This operator does not support multiple outputs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1008) FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL
[ https://issues.apache.org/jira/browse/PIG-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1008: Status: Patch Available (was: Open) FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL --- Key: PIG-1008 URL: https://issues.apache.org/jira/browse/PIG-1008 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Attachments: PIG-1008.patch
NP: org.apache.pig.data.DataByteArray.toString() may return null
NP: org.apache.pig.impl.streaming.StreamingCommand$HandleSpec.equals(Object) does not check for null argument
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
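For readers unfamiliar with these two NP warnings, the following Java sketch shows the general shape of a fix. It is a generic illustration only: `ByteBox` is a hypothetical class, not Pig's `DataByteArray` or `HandleSpec`, and the attached patch may resolve the warnings differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical value class used only to illustrate the two warnings.
final class ByteBox {
    private final byte[] data; // may legitimately be null

    ByteBox(byte[] data) {
        this.data = data;
    }

    // toString() should never return null; fall back to an empty string.
    @Override
    public String toString() {
        return data == null ? "" : new String(data, StandardCharsets.UTF_8);
    }

    // equals() must tolerate a null argument; instanceof is false for null,
    // so this check covers both the null case and the wrong-type case.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ByteBox)) {
            return false;
        }
        return Arrays.equals(data, ((ByteBox) other).data);
    }

    // Kept consistent with equals(); Arrays.hashCode(null) is 0.
    @Override
    public int hashCode() {
        return Arrays.hashCode(data);
    }
}
```

Returning an empty string is one possible convention for the null case; a caller that needs to distinguish "no data" from "empty data" would want a different placeholder.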
[jira] Updated: (PIG-1009) FINDBUGS: OS_OPEN_STREAM: Method may fail to close stream
[ https://issues.apache.org/jira/browse/PIG-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1009: Attachment: PIG-1009.patch FINDBUGS: OS_OPEN_STREAM: Method may fail to close stream - Key: PIG-1009 URL: https://issues.apache.org/jira/browse/PIG-1009 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Attachments: PIG-1009.patch
OS: org.apache.pig.impl.io.FileLocalizer.parseCygPath(String, int) may fail to close stream
OS: org.apache.pig.impl.logicalLayer.parser.QueryParser.which(String) may fail to close stream
OS: org.apache.pig.impl.util.PropertiesUtil.loadPropertiesFromFile(Properties) may fail to close stream
OS: org.apache.pig.Main.configureLog4J(Properties, PigContext) may fail to close stream
OS: org.apache.pig.tools.parameters.PreprocessorContext.executeShellCommand(String) may fail to close stream
OS: org.apache.pig.tools.parameters.PreprocessorContext.executeShellCommand(String) may fail to close stream
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
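The usual remedy for an OS_OPEN_STREAM warning is to close the stream on every exit path. The sketch below shows the try/finally form, using hypothetical names (`CloseStreamSketch`, `TrackingStream`, `load`) that do not come from Pig or from the attached patch. On Java 7 and later, try-with-resources achieves the same effect more compactly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class CloseStreamSketch {
    // In-memory stream that records whether close() was called, so the
    // behaviour can be observed without touching the file system.
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] buf) {
            super(buf);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Loads properties and closes the stream whether or not load() throws.
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        try {
            props.load(in);
        } finally {
            in.close();
        }
        return props;
    }
}
```

Because the close happens in the finally block, the stream is released even when parsing fails partway through, which is the path Findbugs flags.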
[jira] Commented: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs
[ https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765853#action_12765853 ] Hadoop QA commented on PIG-1024: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422154/PIG-1024-1.patch against trunk revision 825308. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/console This message is automatically generated. Script contains nested limit fail due to LOLimit does not support multiple outputs Key: PIG-1024 URL: https://issues.apache.org/jira/browse/PIG-1024 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1024-1.patch The following script fails: a = load '1.txt' as (a0:int, a1:int, a2:int); b = group a by a0; c = foreach b { c1 = limit a 10; c2 = (c1.a0/c1.a1); c3 = (c1.a0/c1.a2); generate c2, c3;} Error message: ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.impl.logicalLayer.LOLimit multiple outputs. This operator does not support multiple outputs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-921) Strange use case for Join which produces different results in local and map reduce mode
[ https://issues.apache.org/jira/browse/PIG-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765864#action_12765864 ] Pradeep Kamath commented on PIG-921: +1. Minor comment: we can probably remove preds==null || preds.get(0)==null from the if(), since the project should always have a predecessor; if it does not, execution would fail somewhere else. Strange use case for Join which produces different results in local and map reduce mode --- Key: PIG-921 URL: https://issues.apache.org/jira/browse/PIG-921 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Environment: Hadoop 18 and Hadoop 20 Reporter: Viraj Bhat Assignee: Daniel Dai Fix For: 0.6.0 Attachments: A.txt, B.txt, joinusecase.pig, PIG-921-1.patch I have a script of the following form, which loads from two files, A.txt and B.txt: {code} A = LOAD 'A.txt' as (a:tuple(a1:int, a2:chararray)); B = LOAD 'B.txt' as (b:tuple(b1:int, b2:chararray)); C = JOIN A by a.a1, B by b.b1; DESCRIBE C; DUMP C; {code} A.txt contains the following lines: {code} (1,a) (2,aa) {code} B.txt contains the following lines: {code} (1,b) (2,bb) {code} Running the above script in local and map reduce mode on Hadoop 18 and Hadoop 20 produces the following: Hadoop 18 = (1,1) (2,2) = Hadoop 20 = (1,1) (2,2) = Local Mode: Pig with Hadoop 18 jar release = 2009-08-13 17:15:13,473 [main] INFO org.apache.pig.Main - Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log 09/08/13 17:15:13 INFO pig.Main: Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log C: {a: (a1: int,a2: chararray),b: (b1: int,b2: chararray)} 2009-08-13 17:15:13,932 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias C 09/08/13 17:15:13 ERROR grunt.Grunt: ERROR 1002: Unable to store alias C Details at logfile: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log = Caused by: java.lang.NullPointerException at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:206) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:191) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117) at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146) at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109) at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165) ... 9 more = Local Mode: Pig with Hadoop 20 jar release = ((1,a),(1,b)) ((2,aa),(2,bb) = -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-921) Strange use case for Join which produces different results in local and map reduce mode
[ https://issues.apache.org/jira/browse/PIG-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-921: --- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Patch committed. Strange use case for Join which produces different results in local and map reduce mode --- Key: PIG-921 URL: https://issues.apache.org/jira/browse/PIG-921 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.4.0 Environment: Hadoop 18 and Hadoop 20 Reporter: Viraj Bhat Assignee: Daniel Dai Fix For: 0.6.0 Attachments: A.txt, B.txt, joinusecase.pig, PIG-921-1.patch I have a script of the following form, which loads from two files, A.txt and B.txt: {code} A = LOAD 'A.txt' as (a:tuple(a1:int, a2:chararray)); B = LOAD 'B.txt' as (b:tuple(b1:int, b2:chararray)); C = JOIN A by a.a1, B by b.b1; DESCRIBE C; DUMP C; {code} A.txt contains the following lines: {code} (1,a) (2,aa) {code} B.txt contains the following lines: {code} (1,b) (2,bb) {code} Running the above script in local and map reduce mode on Hadoop 18 and Hadoop 20 produces the following: Hadoop 18 = (1,1) (2,2) = Hadoop 20 = (1,1) (2,2) = Local Mode: Pig with Hadoop 18 jar release = 2009-08-13 17:15:13,473 [main] INFO org.apache.pig.Main - Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log 09/08/13 17:15:13 INFO pig.Main: Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log C: {a: (a1: int,a2: chararray),b: (b1: int,b2: chararray)} 2009-08-13 17:15:13,932 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias C 09/08/13 17:15:13 ERROR grunt.Grunt: ERROR 1002: Unable to store alias C Details at logfile: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log = Caused by: java.lang.NullPointerException at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:206) at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:191) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117) at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146) at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109) at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165) ... 9 more = Local Mode: Pig with Hadoop 20 jar release = ((1,a),(1,b)) ((2,aa),(2,bb) = -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1018) FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter
[ https://issues.apache.org/jira/browse/PIG-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765896#action_12765896 ] Hadoop QA commented on PIG-1018: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422153/PIG-1018.patch against trunk revision 825308. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 315 release audit warnings (more than the trunk's current 309 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/console This message is automatically generated. 
FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter --- Key: PIG-1018 URL: https://issues.apache.org/jira/browse/PIG-1018 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Attachments: PIG-1018.patch
Nm: The field name org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.LogToPhyMap doesn't start with a lower case letter
Nm: The method name org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.CreateTuple(Object[]) doesn't start with a lower case letter
Nm: The class name org.apache.pig.backend.hadoop.executionengine.physicalLayer.util.operatorHelper doesn't start with an upper case letter
Nm: Class org.apache.pig.impl.util.WrappedIOException is not derived from an Exception, even though it is named as such
Nm: The method name org.apache.pig.pen.EquivalenceClasses.GetEquivalenceClasses(LogicalOperator, Map) doesn't start with a lower case letter
Nm: The field name org.apache.pig.pen.util.DisplayExamples.Result doesn't start with a lower case letter
Nm: The method name org.apache.pig.pen.util.DisplayExamples.PrintSimple(LogicalOperator, Map) doesn't start with a lower case letter
Nm: The method name org.apache.pig.pen.util.DisplayExamples.PrintTabular(LogicalPlan, Map) doesn't start with a lower case letter
Nm: The method name org.apache.pig.tools.parameters.TokenMgrError.LexicalError(boolean, int, int, int, String, char) doesn't start with a lower case letter
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1008) FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL
[ https://issues.apache.org/jira/browse/PIG-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765901#action_12765901 ] Hadoop QA commented on PIG-1008: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422155/PIG-1008.patch against trunk revision 825308. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/console This message is automatically generated. FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL --- Key: PIG-1008 URL: https://issues.apache.org/jira/browse/PIG-1008 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Attachments: PIG-1008.patch NPorg.apache.pig.data.DataByteArray.toString() may return null NP org.apache.pig.impl.streaming.StreamingCommand$HandleSpec.equals(Object) does not check for null argument -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.