[jira] Commented: (PIG-1016) Reading in map data seems broken

2009-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765415#action_12765415
 ] 

Hadoop QA commented on PIG-1016:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422031/PIG-1016.patch
  against trunk revision 824980.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/console

This message is automatically generated.

 Reading in map data seems broken
 

 Key: PIG-1016
 URL: https://issues.apache.org/jira/browse/PIG-1016
 Project: Pig
  Issue Type: Improvement
  Components: data
Affects Versions: 0.4.0
Reporter: hc busy
 Attachments: PIG-1016.patch


 Hi, I'm trying to load a map that has a tuple as its value. The read fails in 
 0.4.0 because of a misconfiguration in the parser, whereas almost all of the 
 documentation states that the value of a map can be of any type.
 I've attached a patch that allows us to read in complex objects as values, as 
 documented. I've done simple verification of loading in maps with tuple/map 
 values and writing them back out using LOAD and STORE. All seems to work fine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-755) Difficult to debug parameter substitution problems based on the error messages when running in local mode

2009-10-14 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765423#action_12765423
 ] 

Daniel Dai commented on PIG-755:


Now the error message changed to:
ERROR 2999: Unexpected internal error. Can not create a Path from an empty string

java.lang.IllegalArgumentException: Can not create a Path from an empty string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
    at org.apache.hadoop.fs.Path.<init>(Path.java:90)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.massageFilename(QueryParser.java:191)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1440)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1227)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1017)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:967)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:716)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
    at org.apache.pig.Main.main(Main.java:397)

 Difficult to debug parameter substitution problems based on the error 
 messages when running in local mode
 -

 Key: PIG-755
 URL: https://issues.apache.org/jira/browse/PIG-755
 Project: Pig
  Issue Type: Bug
  Components: grunt
Affects Versions: 0.3.0
Reporter: Viraj Bhat
 Attachments: inputfile.txt, localparamsub.pig


 I have a script in which I do a parameter substitution for the input file. I 
 have a use case where I find it difficult to debug based on the error 
 messages in local mode.
 {code}
 A = load '$infile' using PigStorage() as
  (
date: chararray,
count   : long,
gmean   : double
 );
 dump A;
 {code}
 1) I run it in local mode with the input file in the current working directory
 {code}
 prompt  $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main 
 -exectype local -param infile='inputfile.txt' localparamsub.pig
 {code}
 2009-04-07 00:03:51,967 [main] ERROR org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore - Received error from storer function: org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the load function.
 2009-04-07 00:03:51,970 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed jobs!!
 2009-04-07 00:03:51,971 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out of 1 failed!
 2009-04-07 00:03:51,974 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
 
 Details at logfile: /home/viraj/pig-svn/trunk/pig_1239062631414.log
 
 ERROR 1066: Unable to open iterator for alias A
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
     at org.apache.pig.PigServer.openIterator(PigServer.java:439)
     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
     at org.apache.pig.Main.main(Main.java:352)
 Caused by: java.io.IOException: Job terminated with anomalous status FAILED
     at org.apache.pig.PigServer.openIterator(PigServer.java:433)
     ... 5 more
 
 2) I run it in map reduce mode
 {code}
 prompt  $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main -param 
 infile='inputfile.txt' localparamsub.pig
 {code}
 2009-04-07 00:07:31,660 [main] INFO  
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
 to hadoop file system at: hdfs://localhost:9000
 2009-04-07 00:07:32,074 

[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan

2009-10-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765601#action_12765601
 ] 

Alan Gates commented on PIG-858:


Mostly looks straightforward and passes all the tests.  You made a number of 
changes in MRCompiler.visitUnion.  I don't understand what exactly you were 
changing there.  Could you give a brief overview of those changes?

 Order By followed by replicated join fails while compiling MR-plan from 
 physical plan
 ---

 Key: PIG-858
 URL: https://issues.apache.org/jira/browse/PIG-858
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.6.0

 Attachments: pig-858.patch


 Consider the query:
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0;
 explain C;
 {code}
 works. But if replicated join is used instead
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0 using replicated;
 explain C;
 {code}
 this fails with ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2034: Error 
 compiling operator POFRJoin
 relevant stacktrace:
 {code}
 Caused by: java.lang.RuntimeException: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:306)
     at org.apache.pig.PigServer.explain(PigServer.java:574)
     ... 8 more
 Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:942)
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:173)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:342)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:327)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:233)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:301)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.explain(MapReduceLauncher.java:278)
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:303)
     ... 9 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:901)
     ... 16 more
 {code}




[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.

2009-10-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765603#action_12765603
 ] 

Alan Gates commented on PIG-760:


At this point no one has contributed a PigStorageSchema as suggested above.  We 
remain open to such a contribution if someone has the time.  

 Serialize schemas for PigStorage() and other storage types.
 ---

 Key: PIG-760
 URL: https://issues.apache.org/jira/browse/PIG-760
 Project: Pig
  Issue Type: New Feature
Reporter: David Ciemiewicz

 I'm finding PigStorage() really convenient for storage and data interchange 
 because it compresses well and imports into Excel and other analysis 
 environments well.
 However, it is a pain when it comes to maintenance because the columns are in 
 fixed locations and I'd like to add columns in some cases.
 It would be great if load PigStorage() could read a default schema from a 
 .schema file stored with the data and if store PigStorage() could store a 
 .schema file with the data.
 I have tested this out and both Hadoop HDFS and Pig in -exectype local mode 
 will ignore a file called .schema in a directory of part files.
 So, for example, if I have a chain of Pig scripts I execute such as:
 A = load 'data-1' using PigStorage() as ( a: int , b: int );
 store A into 'data-2' using PigStorage();
 B = load 'data-2' using PigStorage();
 describe B;
 describe B should output something like { a: int, b: int }
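
 The round trip requested above could be sketched in Java roughly as follows. 
 This is only an illustration under stated assumptions: the one-line text 
 format of the .schema file, the class name SchemaFileSketch, and its methods 
 are all hypothetical and are not part of Pig or PigStorage. The store side 
 writes the schema next to the data, and the load side reads it back as a 
 default when the file is present.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: not Pig code, and the .schema format is an assumption.
public class SchemaFileSketch {

    static final String SCHEMA_FILE = ".schema";

    /** Store side: write the schema text into the data directory. */
    static void storeSchema(Path dataDir, String schema) throws IOException {
        Files.createDirectories(dataDir);
        Files.write(dataDir.resolve(SCHEMA_FILE),
                schema.getBytes(StandardCharsets.UTF_8));
    }

    /** Load side: return the stored schema, or null if no .schema file exists. */
    static String loadSchema(Path dataDir) throws IOException {
        Path schemaPath = dataDir.resolve(SCHEMA_FILE);
        if (!Files.exists(schemaPath)) {
            return null;
        }
        return new String(Files.readAllBytes(schemaPath), StandardCharsets.UTF_8).trim();
    }

    /** Stores a schema in a temporary directory and reads it back. */
    static String roundTrip() {
        try {
            Path dir = Files.createTempDirectory("data-2");
            storeSchema(dir, "a: int, b: int");
            return "{ " + loadSchema(dir) + " }";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

 Returning null when no .schema file exists keeps the current behaviour for 
 data that was stored without one.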




[jira] Created: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter

2009-10-14 Thread Thejas M Nair (JIRA)
optimizer pushes filter before the foreach that generates column used by filter
---

 Key: PIG-1022
 URL: https://issues.apache.org/jira/browse/PIG-1022
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair


grunt> l = load 'students.txt' using PigStorage() as (name:chararray, gender:chararray, age:chararray, score:chararray);
grunt> f = foreach l generate name, gender, age, score, '200' as gid:chararray;
grunt> g = group f by (name, gid);
grunt> f2 = foreach g generate group.name as name: chararray, group.gid as gid: chararray;
grunt> filt = filter f2 by gid == '200';
grunt> explain filt;

In the generated plan, filt is pushed up after the load and before the first 
foreach, even though the filter is on gid, which is generated in the first 
foreach.




[jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter

2009-10-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765612#action_12765612
 ] 

Thejas M Nair commented on PIG-1022:


{code}
grunt> explain filt;
#---
# Logical Plan:
#---

Store 1-1162 Schema: {name: chararray,gid: chararray} Type: Unknown
|
|---ForEach 1-1148 Schema: {name: chararray,gid: chararray} Type: bag
|   |
|   Project 1-1144 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
|   Input: Project 1-1145 Projections: [0] Overloaded: false|
|   |---Project 1-1145 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple
|   Input: CoGroup 1-1138
|   |
|   Project 1-1146 Projections: [1] Overloaded: false FieldSchema: gid: chararray Type: chararray
|   Input: Project 1-1147 Projections: [0] Overloaded: false|
|   |---Project 1-1147 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple
|   Input: CoGroup 1-1138
|
|---CoGroup 1-1138 Schema: {group: (name: chararray,gid: chararray),f: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray}} Type: bag
|   |
|   Project 1-1136 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
|   Input: ForEach 1-1135
|   |
|   Project 1-1137 Projections: [4] Overloaded: false FieldSchema: gid: chararray Type: chararray
|   Input: ForEach 1-1135
|
|---ForEach 1-1135 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray} Type: bag
|   |
|   Project 1-1130 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
|   Input: Filter 1-1152
|   |
|   Project 1-1131 Projections: [1] Overloaded: false FieldSchema: gender: chararray Type: chararray
|   Input: Filter 1-1152
|   |
|   Project 1-1132 Projections: [2] Overloaded: false FieldSchema: age: chararray Type: chararray
|   Input: Filter 1-1152
|   |
|   Project 1-1133 Projections: [3] Overloaded: false FieldSchema: score: chararray Type: chararray
|   Input: Filter 1-1152
|   |
|   Const 1-1134( 200 ) FieldSchema: chararray Type: chararray
|
|---Filter 1-1152 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag
|   |
|   Equal 1-1151 FieldSchema: boolean Type: boolean
|   |
|   |---Project 1-1149 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
|   |   Input: ForEach 1-1161
|   |
|   |---Const 1-1150( 200 ) FieldSchema: chararray Type: chararray
|
|---ForEach 1-1161 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag
|   |
|   Cast 1-1154 FieldSchema: name: chararray Type: chararray
|   |
|   |---Project 1-1153 Projections: [0] Overloaded: false FieldSchema: name: bytearray Type: bytearray
|   Input: Load 1-1123
|   |
|   Cast 1-1156 FieldSchema: gender: chararray Type: chararray
|   |
|   |---Project 1-1155 Projections: [1] Overloaded: false FieldSchema: gender: bytearray Type: bytearray
|   Input: Load 1-1123
|   |
|   Cast 1-1158 FieldSchema: age: chararray Type: chararray
|   |
|   |---Project 1-1157 Projections: [2] Overloaded: false FieldSchema: age: bytearray Type: bytearray
|   Input: Load 1-1123
|   |
|   Cast 1-1160 FieldSchema: score: chararray Type: chararray
|   |
|   |---Project 1-1159 Projections: [3] Overloaded: false FieldSchema: score: bytearray Type: bytearray
|   Input: Load 1-1123
|
|---Load 1-1123 Schema: {name: bytearray,gender: bytearray,age: bytearray,score: bytearray} Type: bag

{code}


[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.

2009-10-14 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765626#action_12765626
 ] 

Dmitriy V. Ryaboy commented on PIG-760:
---

This would be a nice proof-of-concept task for the new Load/StoreMetadata 
interfaces, as it removes the complexity of dealing with something like Owl.




[jira] Assigned: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-1022:
---

Assignee: Daniel Dai




[jira] Commented: (PIG-1014) Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records

2009-10-14 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765658#action_12765658
 ] 

Pradeep Kamath commented on PIG-1014:
-

To achieve 1. above, we would translate COUNT( A ) to COUNT_STAR( A ) during 
job compilation. Since 3. above has multiple options and does not seem to be a 
prevalent use case (SQL does not support it), another option is to disable it - 
thoughts?
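
As an illustration of the distinction under discussion, here is a small Java 
sketch. It assumes that COUNT skips tuples whose first field is null while 
COUNT_STAR counts every tuple; that reading of COUNT is an assumption, and the 
names CountSketch, count and countStar are hypothetical stand-ins, not Pig's 
built-in implementations.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: tuples are modelled as Object[] and a bag as a List.
public class CountSketch {

    /** Assumed COUNT behaviour: skip tuples whose first field is null. */
    static long count(List<Object[]> bag) {
        long n = 0;
        for (Object[] tuple : bag) {
            if (tuple.length > 0 && tuple[0] != null) {
                n++;
            }
        }
        return n;
    }

    /** COUNT_STAR behaviour: count every tuple regardless of null fields. */
    static long countStar(List<Object[]> bag) {
        return bag.size();
    }

    /** Three records, one of which has a null first field. */
    static List<Object[]> sampleBag() {
        return Arrays.asList(
                new Object[] { "alice", 1 },
                new Object[] { null, 2 },
                new Object[] { "bob", null });
    }

    public static void main(String[] args) {
        List<Object[]> bag = sampleBag();
        System.out.println(count(bag));     // 2
        System.out.println(countStar(bag)); // 3
    }
}
```

Under these assumptions, translating COUNT( A ) to COUNT_STAR( A ) changes the 
result only when some records have a null first field.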

 Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all 
 records are counted without considering nullness of the fields in the records
 

 Key: PIG-1014
 URL: https://issues.apache.org/jira/browse/PIG-1014
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Pradeep Kamath






[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1000:


Status: Patch Available  (was: Open)

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch


 The patch submitted for PIG-975 generates a javac warning and a FindBugs warning.




[jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter

2009-10-14 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765682#action_12765682
 ] 

Daniel Dai commented on PIG-1022:
-

Actually we cannot push the filter even before f2. Because we do not keep track 
of the source of data inside a tuple, gid should be treated as a generated 
field of f2. However, the projection map of f2 gives us the wrong result: it 
reports gid as a directly mapped field of group (which is a tuple (name, gid)), 
and this triggers everything that follows. The fix for this problem is to 
modify the projection map generation logic for the mapped field. 

Santhosh, do you have any comment?




[jira] Commented: (PIG-1003) FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators

2009-10-14 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765711#action_12765711
 ] 

Olga Natkovich commented on PIG-1003:
-

Added to exclude file for now

 FINDBUGS: CN_IDIOM_NO_SUPER_CALL in   
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators
 

 Key: PIG-1003
 URL: https://issues.apache.org/jira/browse/PIG-1003
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich

 All physical expression operators have this issue. In the clone method, they 
 instantiate a new object rather than call super.clone.
 This is a major change and for now I am planning to exclude this warning. We 
 will address it once we work on the frontend rewrite.
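
 For readers unfamiliar with the FindBugs pattern, the following Java sketch 
 shows the kind of problem CN_IDIOM_NO_SUPER_CALL warns about. The classes are 
 hypothetical and are not Pig operators: when clone() builds its result with 
 "new", a subclass that inherits clone() gets back an instance of the parent 
 class, whereas super.clone() preserves the runtime type.

```java
// Hypothetical classes for illustration only; none of these are Pig operators.
public class CloneIdiomSketch {

    /** clone() creates the copy with 'new', the pattern FindBugs flags. */
    static class NewBasedOp implements Cloneable {
        int key = 1;

        @Override
        public NewBasedOp clone() {
            NewBasedOp copy = new NewBasedOp();
            copy.key = this.key;
            return copy;
        }
    }

    /** Inherits clone(), so its copies are silently of the parent type. */
    static class NewBasedSubOp extends NewBasedOp { }

    /** clone() delegates to super.clone(), which keeps the runtime type. */
    static class SuperBasedOp implements Cloneable {
        int key = 1;

        @Override
        public SuperBasedOp clone() {
            try {
                return (SuperBasedOp) super.clone();
            } catch (CloneNotSupportedException e) {
                // Cannot happen: this class implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    static class SuperBasedSubOp extends SuperBasedOp { }

    static String clonedTypeWithNew() {
        return new NewBasedSubOp().clone().getClass().getSimpleName();
    }

    static String clonedTypeWithSuper() {
        return new SuperBasedSubOp().clone().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(clonedTypeWithNew());   // NewBasedOp
        System.out.println(clonedTypeWithSuper()); // SuperBasedSubOp
    }
}
```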




[jira] Commented: (PIG-1004) FINDBUGS: CN_IDIOM_NO_SUPER_CALL in org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators

2009-10-14 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765712#action_12765712
 ] 

Olga Natkovich commented on PIG-1004:
-

Added to exclude file for now

 FINDBUGS: CN_IDIOM_NO_SUPER_CALL in   
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators
 

 Key: PIG-1004
 URL: https://issues.apache.org/jira/browse/PIG-1004
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich

 Will address this during next cleanup:
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODistinct.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrangeForIllustrate.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POOptimizedForEach.clone() does not call super.clone()
 CN org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSort.clone() does not call super.clone()




[jira] Commented: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL

2009-10-14 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765718#action_12765718
 ] 

Olga Natkovich commented on PIG-1023:
-

This does not have to go through the patch test process. Could one of the 
committers please review?

 FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
 

 Key: PIG-1023
 URL: https://issues.apache.org/jira/browse/PIG-1023
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Attachments: PIG-1023.patch







[jira] Updated: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL

2009-10-14 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1023:


Attachment: PIG-1023.patch




[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan

2009-10-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765720#action_12765720
 ] 

Ashutosh Chauhan commented on PIG-858:
--

visitUnion has the same changes as the other visit functions: it adds the MR 
operator corresponding to POUnion to the phyToMROpMap map. The real changes are 
in visitFRJoin. Earlier, visitFRJoin looked through the compiledInputs array of 
MROper one by one, trying to match each MROper's leaf PO with POFRJoin using the 
operator key. Now it doesn't need to do that; it can simply look it up in 
phyToMROpMap.
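
The difference between the two lookup strategies can be sketched in Java as 
follows. The PhysicalOp and MROper classes here are simplified, hypothetical 
stand-ins rather than Pig's real classes, and the scenario (the wanted operator 
is not the leaf of its MR operator) is an assumption chosen only to show how a 
scan over leaves can miss while a map lookup succeeds. It is not a claim about 
the exact cause of the reported failure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for Pig's physical and MR operators; not the real classes.
public class PhyToMROpMapSketch {

    static final class PhysicalOp {
        final String key;
        PhysicalOp(String key) { this.key = key; }
    }

    static final class MROper {
        final String name;
        final PhysicalOp leaf; // last physical operator placed in this MR operator
        MROper(String name, PhysicalOp leaf) { this.name = name; this.leaf = leaf; }
    }

    /** Older style: scan compiled inputs and compare each leaf's key; -1 if none match. */
    static int findByScan(MROper[] compiledInputs, PhysicalOp target) {
        for (int i = 0; i < compiledInputs.length; i++) {
            if (compiledInputs[i].leaf.key.equals(target.key)) {
                return i;
            }
        }
        return -1;
    }

    /** Newer style: each visit records its operator in the map, so a lookup suffices. */
    static MROper findByMap(Map<PhysicalOp, MROper> phyToMROpMap, PhysicalOp target) {
        return phyToMROpMap.get(target);
    }

    static String demo() {
        PhysicalOp load = new PhysicalOp("load-1");
        PhysicalOp sort = new PhysicalOp("sort-2");
        // Both operators were compiled into the same MR operator, whose leaf is the sort.
        MROper mr = new MROper("mr-1", sort);

        Map<PhysicalOp, MROper> phyToMROpMap = new HashMap<>();
        phyToMROpMap.put(load, mr);
        phyToMROpMap.put(sort, mr);

        int scanned = findByScan(new MROper[] { mr }, load);
        MROper mapped = findByMap(phyToMROpMap, load);
        return scanned + " " + mapped.name;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "-1 mr-1"
    }
}
```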




[jira] Commented: (PIG-1020) Include an ant target to build pig.jar without hadoop libraries

2009-10-14 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765734#action_12765734
 ] 

Olga Natkovich commented on PIG-1020:
-

+1, please commit to trunk and the 0.5.0 branch

 Include an ant target to build pig.jar without hadoop libraries
 ---

 Key: PIG-1020
 URL: https://issues.apache.org/jira/browse/PIG-1020
 Project: Pig
  Issue Type: New Feature
  Components: build
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Priority: Minor
 Fix For: 0.6.0

 Attachments: PIG-1020-1.patch, PIG-1020-2.patch


 Provide an ant target to build pig.jar without any of the Hadoop-related 
 libraries. Users will provide the external Hadoop jars on the classpath before 
 invoking Pig.




[jira] Commented: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan

2009-10-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765735#action_12765735
 ] 

Ashutosh Chauhan commented on PIG-858:
--

It's been a while since I wrote that patch, so a bit more clarification: we are 
interested in finding the PO that corresponds to the fragment input of POFRJoin. 
This PO is already compiled and is in one of the MROpers. Earlier, we would 
iterate through the compiledInputs array, trying to match this PO with the POs 
contained in each MROper. This fails, as discussed in previous comments. With 
this change, since we keep track of the MR operator for each physical operator, 
we no longer need to do that and can simply look up the MROper corresponding to 
the fragment PO in the phyToMROpMap.
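
To illustrate the change described above, here is a minimal Java sketch. It is 
not the actual MRCompiler code: PhysOp, MROp and FragmentLookupSketch are 
hypothetical stand-ins, and only compiledInputs and phyToMROpMap are names taken 
from the comment. It contrasts scanning the compiled inputs for the fragment 
operator with a direct map lookup. A scan that finds nothing has to report that 
somehow (here with -1); whether that is exactly how the 
ArrayIndexOutOfBoundsException: -1 in the stack trace arises is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a physical operator (PO).
class PhysOp {
    final String name;
    PhysOp(String name) { this.name = name; }
}

// Hypothetical stand-in for a MapReduce operator holding compiled POs.
class MROp {
    final String name;
    final List<PhysOp> contained = new ArrayList<PhysOp>();
    MROp(String name) { this.name = name; }
}

class FragmentLookupSketch {
    // Scan approach: search each compiled input for the fragment PO.
    // Returns -1 when no compiled input contains it.
    static int indexByScan(MROp[] compiledInputs, PhysOp fragment) {
        for (int i = 0; i < compiledInputs.length; i++) {
            if (compiledInputs[i].contained.contains(fragment)) {
                return i;
            }
        }
        return -1;
    }

    // Map approach: the compiler records the MR operator for every PO it
    // compiles, so the fragment's MR operator is a single lookup.
    static MROp lookup(Map<PhysOp, MROp> phyToMROpMap, PhysOp fragment) {
        return phyToMROpMap.get(fragment);
    }

    public static void main(String[] args) {
        PhysOp fragment = new PhysOp("order-output");
        MROp sortJob = new MROp("sort-job");
        MROp otherJob = new MROp("other-job");

        // The fragment is not among the POs of the compiled input we scan.
        MROp[] compiledInputs = { otherJob };
        System.out.println(indexByScan(compiledInputs, fragment));

        // The map was filled in when the fragment was compiled.
        Map<PhysOp, MROp> phyToMROpMap = new HashMap<PhysOp, MROp>();
        phyToMROpMap.put(fragment, sortJob);
        System.out.println(lookup(phyToMROpMap, fragment).name);
    }
}
```

The map lookup does not depend on which MR operators happen to be in the 
compiled inputs, which is the property the comment relies on.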

 Order By followed by replicated join fails while compiling MR-plan from 
 physical plan
 ---

 Key: PIG-858
 URL: https://issues.apache.org/jira/browse/PIG-858
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.6.0

 Attachments: pig-858.patch


 Consider the query:
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0;
 explain C;
 {code}
 works. But if replicated join is used instead
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0 using replicated;
 explain C;
 {code}
 this fails with ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2034: Error 
 compiling operator POFRJoin
 relevant stacktrace:
 {code}
 Caused by: java.lang.RuntimeException: 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException:
  ERROR 2034: Error compiling operator POFRJoin
 at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:306)
 at org.apache.pig.PigServer.explain(PigServer.java:574)
 ... 8 more
 Caused by: 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException:
  ERROR 2034: Error compiling operator POFRJoin
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:942)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:173)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:342)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:327)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:233)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:301)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.explain(MapReduceLauncher.java:278)
 at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:303)
 ... 9 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:901)
 ... 16 more
 {code}




[jira] Commented: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL

2009-10-14 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765751#action_12765751
 ] 

Daniel Dai commented on PIG-1023:
-

+1. The targeted Findbugs warnings are suppressed. Findbugs now generates 37 
fewer warnings.

 FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
 

 Key: PIG-1023
 URL: https://issues.apache.org/jira/browse/PIG-1023
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Attachments: PIG-1023.patch







[jira] Commented: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct

2009-10-14 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765767#action_12765767
 ] 

Yan Zhou commented on PIG-944:
--

There was a typo in my earlier comment of 02/Oct/09 10:33 PM. Instead of 

This patch must be applied after the patch for Jira PIG-933 has been applied. 

it should have read 

This patch must be applied after the patch for Jira PIG-993 has been applied. 

 Zebra schema is taken from Pig through TableStorer's construct
 --

 Key: PIG-944
 URL: https://issues.apache.org/jira/browse/PIG-944
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Fix For: 0.6.0

 Attachments: SchemaConversion.patch, SchemaConversion.patch


 It should be from StoreConfig in TableOutputFormat.checkOutputSpecs method 
 because the information is dynamic in Pig's execution engine and should not 
 be taking a static argument to the constructor.




[jira] Resolved: (PIG-1023) FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL

2009-10-14 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-1023.
-

Resolution: Fixed

patch committed

 FINDBUGS: exclude CN_IDIOM_NO_SUPER_CALL
 

 Key: PIG-1023
 URL: https://issues.apache.org/jira/browse/PIG-1023
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Attachments: PIG-1023.patch







[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1000:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

This patch addresses javac and Findbugs warnings, so no unit test is needed. 
Patch committed. Thanks, Ying!

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch


 The patch submitted for PIG-975 generates a javac warning and a Findbugs 
 warning.




[jira] Commented: (PIG-1014) Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all records are counted without considering nullness of the fields in the records

2009-10-14 Thread Santhosh Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765779#action_12765779
 ] 

Santhosh Srinivasan commented on PIG-1014:
--

Another option is to change the implementation of COUNT to reflect the proposed 
semantics. If the underlying UDF is changed then the user should be notified 
via an information message. If the user checks the explain output then (s)he 
will notice COUNT_STAR and will be confused.
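
The difference in semantics under discussion can be sketched in Java as follows. 
This is only an illustration, not Pig's UDF code: CountSemanticsSketch and its 
methods are hypothetical, records are modelled as Object arrays, and the sketch 
assumes that COUNT skips records whose first field is null while COUNT_STAR 
counts every record, as the issue title describes.

```java
import java.util.Arrays;
import java.util.List;

class CountSemanticsSketch {
    // Assumed COUNT behaviour: a record is counted only if its first
    // field is present and not null.
    static long countNonNullFirst(List<Object[]> records) {
        long n = 0;
        for (Object[] record : records) {
            if (record.length > 0 && record[0] != null) {
                n++;
            }
        }
        return n;
    }

    // COUNT_STAR behaviour per the issue title: every record is counted,
    // regardless of nulls in its fields.
    static long countStar(List<Object[]> records) {
        return records.size();
    }

    public static void main(String[] args) {
        List<Object[]> records = Arrays.asList(
            new Object[] {1, "a"},
            new Object[] {null, "b"},   // first field is null
            new Object[] {3, null});    // only the second field is null

        System.out.println(countNonNullFirst(records)); // prints 2
        System.out.println(countStar(records));         // prints 3
    }
}
```

On the same input the two counts differ by the number of records with a null 
first field, which is why silently swapping one function for the other would be 
visible to a user reading the explain output.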

 Pig should convert COUNT(relation) to COUNT_STAR(relation) so that all 
 records are counted without considering nullness of the fields in the records
 

 Key: PIG-1014
 URL: https://issues.apache.org/jira/browse/PIG-1014
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Pradeep Kamath






[jira] Updated: (PIG-1020) Include an ant target to build pig.jar without hadoop libraries

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1020:


   Resolution: Fixed
Fix Version/s: 0.5.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

No unit test included since it only changes build.xml. Patch committed to both 
trunk and 0.5 branch. 

New target for pig.jar without hadoop libs is jar-withouthadoop.

 Include an ant target to build pig.jar without hadoop libraries
 ---

 Key: PIG-1020
 URL: https://issues.apache.org/jira/browse/PIG-1020
 Project: Pig
  Issue Type: New Feature
  Components: build
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Priority: Minor
 Fix For: 0.5.0, 0.6.0

 Attachments: PIG-1020-1.patch, PIG-1020-2.patch


 Provide an ant target to build pig.jar without all hadoop related libraries. 
 User will provide external hadoop jars in classpath before invoking pig.




[jira] Updated: (PIG-1018) FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter

2009-10-14 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1018:


Attachment: PIG-1018.patch

 FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower 
 case letter
 ---

 Key: PIG-1018
 URL: https://issues.apache.org/jira/browse/PIG-1018
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
 Attachments: PIG-1018.patch


 Nm: The field name 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.LogToPhyMap
  doesn't start with a lower case letter
 Nm: The method name 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.CreateTuple(Object[])
  doesn't start with a lower case letter
 Nm: The class name 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.util.operatorHelper
  doesn't start with an upper case letter
 Nm: Class org.apache.pig.impl.util.WrappedIOException is not derived from 
 an Exception, even though it is named as such
 Nm: The method name 
 org.apache.pig.pen.EquivalenceClasses.GetEquivalenceClasses(LogicalOperator, 
 Map) doesn't start with a lower case letter
 Nm: The field name org.apache.pig.pen.util.DisplayExamples.Result doesn't 
 start with a lower case letter
 Nm: The method name 
 org.apache.pig.pen.util.DisplayExamples.PrintSimple(LogicalOperator, Map) 
 doesn't start with a lower case letter
 Nm: The method name 
 org.apache.pig.pen.util.DisplayExamples.PrintTabular(LogicalPlan, Map) 
 doesn't start with a lower case letter
 Nm: The method name 
 org.apache.pig.tools.parameters.TokenMgrError.LexicalError(boolean, int, int, 
 int, String, char) doesn't start with a lower case letter




[jira] Created: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs

2009-10-14 Thread Daniel Dai (JIRA)
Script contains nested limit fail due to LOLimit does not support multiple 
outputs


 Key: PIG-1024
 URL: https://issues.apache.org/jira/browse/PIG-1024
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Daniel Dai
 Fix For: 0.6.0


The following script fails: 

a = load '1.txt' as (a0:int, a1:int, a2:int);
b = group a by a0;
c = foreach b {
    c1 = limit a 10;
    c2 = (c1.a0/c1.a1);
    c3 = (c1.a0/c1.a2);
    generate c2, c3;
}

Error message:

ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type
org.apache.pig.impl.logicalLayer.LOLimit multiple outputs.  This operator does 
not support multiple outputs.




[jira] Updated: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1024:


Attachment: PIG-1024-1.patch

Patch included. Thanks to Pradeep for the diagnosis.

 Script contains nested limit fail due to LOLimit does not support multiple 
 outputs
 

 Key: PIG-1024
 URL: https://issues.apache.org/jira/browse/PIG-1024
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1024-1.patch





[jira] Updated: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1024:


Status: Patch Available  (was: Open)

 Script contains nested limit fail due to LOLimit does not support multiple 
 outputs
 

 Key: PIG-1024
 URL: https://issues.apache.org/jira/browse/PIG-1024
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1024-1.patch





[jira] Assigned: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-1024:
---

Assignee: Daniel Dai

 Script contains nested limit fail due to LOLimit does not support multiple 
 outputs
 

 Key: PIG-1024
 URL: https://issues.apache.org/jira/browse/PIG-1024
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1024-1.patch





[jira] Updated: (PIG-1008) FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL

2009-10-14 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1008:


Status: Patch Available  (was: Open)

 FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL
 ---

 Key: PIG-1008
 URL: https://issues.apache.org/jira/browse/PIG-1008
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
 Attachments: PIG-1008.patch


 NP: org.apache.pig.data.DataByteArray.toString() may return null
 NP: org.apache.pig.impl.streaming.StreamingCommand$HandleSpec.equals(Object) 
 does not check for null argument
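
 The two warning types listed above are usually addressed along the following 
 lines. This is a hedged Java sketch, not the contents of PIG-1008.patch: 
 ByteArraySketch and HandleSpecSketch are hypothetical stand-ins for the 
 flagged classes, and returning an empty string for missing data is an 
 assumption about one reasonable fix.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a byte-array wrapper whose toString()
// must never return null.
class ByteArraySketch {
    private final byte[] data; // may be null

    ByteArraySketch(byte[] data) {
        this.data = data;
    }

    @Override
    public String toString() {
        // Return an empty string rather than null when there is no data.
        if (data == null) {
            return "";
        }
        return new String(data, StandardCharsets.UTF_8);
    }
}

// Hypothetical stand-in for a value class whose equals() must accept null.
class HandleSpecSketch {
    private final String name; // assumed non-null in this sketch

    HandleSpecSketch(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        // instanceof is false for null, so a null argument yields false
        // instead of a NullPointerException or ClassCastException.
        if (!(obj instanceof HandleSpecSketch)) {
            return false;
        }
        return name.equals(((HandleSpecSketch) obj).name);
    }

    @Override
    public int hashCode() {
        // Keep hashCode consistent with equals.
        return name.hashCode();
    }
}
```

 With these guards, callers can print a ByteArraySketch or compare a 
 HandleSpecSketch against null without special-casing either one.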




[jira] Updated: (PIG-1009) FINDBUGS: OS_OPEN_STREAM: Method may fail to close stream

2009-10-14 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1009:


Attachment: PIG-1009.patch

 FINDBUGS: OS_OPEN_STREAM: Method may fail to close stream
 -

 Key: PIG-1009
 URL: https://issues.apache.org/jira/browse/PIG-1009
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
 Attachments: PIG-1009.patch


 OS: org.apache.pig.impl.io.FileLocalizer.parseCygPath(String, int) may fail 
 to close stream
 OS: org.apache.pig.impl.logicalLayer.parser.QueryParser.which(String) may 
 fail to close stream
 OS: 
 org.apache.pig.impl.util.PropertiesUtil.loadPropertiesFromFile(Properties) 
 may fail to close stream
 OS: org.apache.pig.Main.configureLog4J(Properties, PigContext) may fail to 
 close stream
 OS: 
 org.apache.pig.tools.parameters.PreprocessorContext.executeShellCommand(String)
  may fail to close stream
 OS: 
 org.apache.pig.tools.parameters.PreprocessorContext.executeShellCommand(String)
  may fail to close stream
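
 Warnings of this kind are typically resolved by closing the stream in a 
 finally block, so that it is released even when reading throws. The Java 
 sketch below shows that pattern on a properties loader. It is not the code 
 from PIG-1009.patch: StreamCloseSketch, its load method and the example key 
 are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class StreamCloseSketch {
    // Reads properties from the stream and always closes it, whether or
    // not Properties.load throws.
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        try {
            props.load(in);
        } finally {
            in.close();
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a file in this sketch.
        InputStream in = new ByteArrayInputStream(
            "example.key=example-value\n".getBytes(StandardCharsets.ISO_8859_1));
        Properties props = load(in);
        System.out.println(props.getProperty("example.key"));
    }
}
```

 On Java 7 and later, a try-with-resources statement expresses the same 
 guarantee more compactly.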




[jira] Commented: (PIG-1024) Script contains nested limit fail due to LOLimit does not support multiple outputs

2009-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765853#action_12765853
 ] 

Hadoop QA commented on PIG-1024:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422154/PIG-1024-1.patch
  against trunk revision 825308.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/26/console

This message is automatically generated.

 Script contains nested limit fail due to LOLimit does not support multiple 
 outputs
 

 Key: PIG-1024
 URL: https://issues.apache.org/jira/browse/PIG-1024
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1024-1.patch





[jira] Commented: (PIG-921) Strange use case for Join which produces different results in local and map reduce mode

2009-10-14 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765864#action_12765864
 ] 

Pradeep Kamath commented on PIG-921:


+1. Minor comment: we can probably remove preds==null || preds.get(0)==null 
from the if(), since the project should always have a predecessor; if it does 
not, the execution would fail somewhere else.

 Strange use case for Join which produces different results in local and map 
 reduce mode
 ---

 Key: PIG-921
 URL: https://issues.apache.org/jira/browse/PIG-921
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
 Environment: Hadoop 18 and Hadoop 20
Reporter: Viraj Bhat
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: A.txt, B.txt, joinusecase.pig, PIG-921-1.patch


 I have a script of the following form, which loads from two files, A.txt and 
 B.txt:
 {code}
 A = LOAD 'A.txt' as (a:tuple(a1:int, a2:chararray));
 B = LOAD 'B.txt' as (b:tuple(b1:int, b2:chararray));
 C = JOIN A by a.a1, B by b.b1;
 DESCRIBE C;
 DUMP C;
 {code}
 A.txt contains the following lines:
 {code}
 (1,a)
 (2,aa)
 {code}
 B.txt contains the following lines:
 {code}
 (1,b)
 (2,bb)
 {code}
 Running the above script in local and map reduce mode on Hadoop 18 and 
 Hadoop 20 produces the following:
 Hadoop 18
 =
 (1,1)
 (2,2)
 =
 Hadoop 20
 =
 (1,1)
 (2,2)
 =
 Local Mode: Pig with Hadoop 18 jar release 
 =
 2009-08-13 17:15:13,473 [main] INFO  org.apache.pig.Main - Logging error 
 messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
 09/08/13 17:15:13 INFO pig.Main: Logging error messages to: 
 /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
 C: {a: (a1: int,a2: chararray),b: (b1: int,b2: chararray)}
 2009-08-13 17:15:13,932 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1002: Unable to store alias C
 09/08/13 17:15:13 ERROR grunt.Grunt: ERROR 1002: Unable to store alias C
 Details at logfile: 
 /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
 =
 Caused by: java.lang.NullPointerException
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:206)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:191)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
 at 
 org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
 at 
 org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146)
 at 
 org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109)
 at 
 org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
 ... 9 more
 =
 Local Mode: Pig with Hadoop 20 jar release
 =
 ((1,a),(1,b))
 ((2,aa),(2,bb)
 =




[jira] Updated: (PIG-921) Strange use case for Join which produces different results in local and map reduce mode

2009-10-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-921:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed.

 Strange use case for Join which produces different results in local and map 
 reduce mode
 ---

 Key: PIG-921
 URL: https://issues.apache.org/jira/browse/PIG-921
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
 Environment: Hadoop 18 and Hadoop 20
Reporter: Viraj Bhat
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: A.txt, B.txt, joinusecase.pig, PIG-921-1.patch





[jira] Commented: (PIG-1018) FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower case letter

2009-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765896#action_12765896
 ] 

Hadoop QA commented on PIG-1018:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422153/PIG-1018.patch
  against trunk revision 825308.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 315 release audit warnings 
(more than the trunk's current 309 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/80/console

This message is automatically generated.

 FINDBUGS: NM_FIELD_NAMING_CONVENTION: Field names should start with a lower 
 case letter
 ---

 Key: PIG-1018
 URL: https://issues.apache.org/jira/browse/PIG-1018
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
 Attachments: PIG-1018.patch





[jira] Commented: (PIG-1008) FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL

2009-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765901#action_12765901
 ] 

Hadoop QA commented on PIG-1008:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422155/PIG-1008.patch
  against trunk revision 825308.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/27/console

This message is automatically generated.

 FINDBUGS: NP_TOSTRING_COULD_RETURN_NULL
 ---

 Key: PIG-1008
 URL: https://issues.apache.org/jira/browse/PIG-1008
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
 Attachments: PIG-1008.patch

