[jira] Commented: (PIG-1468) DataByteArray.compareTo() does not compare in lexicographic order

2010-07-11 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887178#action_12887178
 ] 

Gianmarco De Francisci Morales commented on PIG-1468:
-

It is quite easy to fix DataType.compare() to take the unsigned logic into 
account, but I am starting to feel that all of this is probably not worth the 
trouble: it would make DataType.compare() for Bytes behave differently from 
Byte.compareTo().
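
For reference, the fix would look roughly like this (just a sketch, not the 
actual patch code):

{code}
// Compare two byte arrays lexicographically, treating each byte as an
// unsigned value (0x00..0xff) instead of as a signed Java byte.
public static int compareUnsigned(byte[] a, byte[] b) {
    int len = Math.min(a.length, b.length);
    for (int i = 0; i < len; i++) {
        int x = a[i] & 0xff;   // promote to int, dropping the sign
        int y = b[i] & 0xff;
        if (x != y) {
            return (x < y) ? -1 : 1;
        }
    }
    // All common bytes are equal: the shorter array sorts first.
    return a.length - b.length;
}
{code}

With this, 0xff correctly sorts after 0x00, whereas with the signed comparison 
it sorts first.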


 DataByteArray.compareTo() does not compare in lexicographic order
 -

 Key: PIG-1468
 URL: https://issues.apache.org/jira/browse/PIG-1468
 Project: Pig
  Issue Type: Bug
Reporter: Gianmarco De Francisci Morales
Assignee: Gianmarco De Francisci Morales
 Attachments: PIG-1468.patch


 The compareTo() method of org.apache.pig.data.DataByteArray does not compare 
 items in lexicographic order. Instead, it takes into account the sign of the 
 bytes that compose the DataByteArray, so, for example, 0xff compares as less 
 than 0x00.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1295) Binary comparator for secondary sort

2010-07-11 Thread Gianmarco De Francisci Morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gianmarco De Francisci Morales updated PIG-1295:


Attachment: PIG-1295_0.8.patch

I added code so that, if the user does not use DefaultTuple, we fall back to 
the default deserialization case. I assume a user-defined tuple will have a 
DataType byte different from DataType.TUPLE; if this is not the case, I see no 
way of telling DefaultTuple apart from any other Tuple implementation.
Anyway, I think this issue needs to be properly addressed in the context of 
[PIG-1472|https://issues.apache.org/jira/browse/PIG-1472].

I added support for BIGCHARARRAY.

UTF-8 decoding is quite convoluted: it is a variable-length encoding, so we 
cannot avoid going through a String. See [UTF-8|http://en.wikipedia.org/wiki/UTF-8].
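
In practice, comparing two serialized chararrays means decoding them first, 
roughly like this (a sketch; it assumes the values were written with 
DataOutput.writeUTF):

{code}
// Sketch: compare two serialized chararrays by decoding them into Strings.
// Decoding is unavoidable because UTF-8 is variable length: the byte count
// alone does not tell us where character boundaries fall.
static int compareCharArrays(java.io.DataInput in1, java.io.DataInput in2)
        throws java.io.IOException {
    String s1 = in1.readUTF();   // readUTF undoes DataOutput.writeUTF
    String s2 = in2.readUTF();
    return s1.compareTo(s2);
}
{code}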

Before tackling the integration with 
[PIG-1472|https://issues.apache.org/jira/browse/PIG-1472] I need to familiarize 
myself with the code in the patch. I will write a proposal for the integration 
in the next few days.

I also made some changes to DataByteArray to encapsulate the comparison logic 
in a publicly accessible method. This way the raw comparison stays consistent 
with the behavior of the class, similar to the other cases where I delegate 
comparison to the class.
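
Concretely, the shape of the change is something like this (an illustrative 
sketch, not the exact patch):

{code}
public class DataByteArray implements Comparable<DataByteArray> {
    private byte[] mData;

    // Publicly accessible comparison logic: both compareTo() and a raw
    // (binary) comparator can call this, so the two orderings cannot drift.
    public static int compare(byte[] b1, byte[] b2) {
        for (int i = 0; i < b1.length && i < b2.length; i++) {
            int diff = (b1[i] & 0xff) - (b2[i] & 0xff);   // unsigned bytes
            if (diff != 0) return diff;
        }
        return b1.length - b2.length;
    }

    @Override
    public int compareTo(DataByteArray other) {
        return compare(this.mData, other.mData);
    }
}
{code}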

 Binary comparator for secondary sort
 

 Key: PIG-1295
 URL: https://issues.apache.org/jira/browse/PIG-1295
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Gianmarco De Francisci Morales
 Fix For: 0.8.0

 Attachments: PIG-1295_0.1.patch, PIG-1295_0.2.patch, 
 PIG-1295_0.3.patch, PIG-1295_0.4.patch, PIG-1295_0.5.patch, 
 PIG-1295_0.6.patch, PIG-1295_0.7.patch, PIG-1295_0.8.patch


 When the Hadoop framework does the sorting, it will try to use a binary 
 version of the comparator if one is available. The benefit of a binary 
 comparator is that we do not need to instantiate the objects before comparing 
 them. We saw a ~30% speedup after switching to a binary comparator. Currently, 
 Pig uses a binary comparator in the following cases:
 1. When the semantics of the order do not matter. For example, in distinct we 
 need to sort in order to filter out duplicate values, but we do not care how 
 the comparator orders the keys. Group by shares this characteristic. In these 
 cases we rely on Hadoop's default binary comparator.
 2. When the semantics of the order matter but the key is of a simple type. In 
 this case we have implementations for simple types such as integer, long, 
 float, chararray, databytearray and string.
 However, if the key is a tuple and the sort semantics matter, we do not have a 
 binary comparator implementation. This especially matters when we switch to 
 secondary sort. In secondary sort, we convert the inner sort of a nested 
 foreach into the secondary key and rely on Hadoop to sort on both the main key 
 and the secondary key. The sorting key becomes a two-item tuple. Since the 
 secondary key is the sorting key of the nested foreach, the sorting semantics 
 matter. It turns out we have no binary comparator once we use secondary sort, 
 and we see a significant slowdown.
 A binary comparator for tuples should be doable once we understand the binary 
 structure of the serialized tuple. We can focus on the most common use case 
 first, which is a group by followed by a nested sort. In this case we will use 
 secondary sort. The semantics of the first key do not matter but the semantics 
 of the secondary key do. We need to identify the boundary between the main key 
 and the secondary key in the binary tuple buffer without instantiating the 
 tuple itself. Then, if the first keys are equal, we use a binary comparator to 
 compare the secondary keys. The secondary key can also be of a complex data 
 type, but as a first step we focus on a simple secondary key, which is the 
 most common use case.
 We mark this issue as a candidate project for the Google Summer of Code 2010 
 program.
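
 As an illustration, the group by followed by a nested sort described above 
 corresponds to a script like the following (made-up field names):
 {code}
 a = load 'input' as (name:chararray, ts:long, value:int);
 b = group a by name;            -- 'name' becomes the main key
 c = foreach b {
         s = order a by ts;      -- the inner sort becomes the secondary key
         generate group, s;
 }
 dump c;
 {code}
 With secondary sort, (name, ts) forms the two-item sorting key: the order of 
 the main key does not matter, but the order of the secondary key must follow 
 the nested order by.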

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1472) Optimize serialization/deserialization between Map and Reduce and between MR jobs

2010-07-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887265#action_12887265
 ] 

Daniel Dai commented on PIG-1472:
-

Patch looks good. A couple of comments:
1. The following code is never used in BinStorage and InterStorage and should 
be removed:
{code}
public static final int RECORD_1 = 0x01;
public static final int RECORD_2 = 0x02;
public static final int RECORD_3 = 0x03;
{code}

2. In BinInterSedes, why do we have the type GENERIC_WRITABLECOMPARABLE? When 
will it be used?

3. It seems InterStorage is a replacement for BinStorage, so why do we make it 
private? Shall we encourage users to use InterStorage in place of BinStorage, 
and deprecate BinStorage?

 Optimize serialization/deserialization between Map and Reduce and between MR 
 jobs
 -

 Key: PIG-1472
 URL: https://issues.apache.org/jira/browse/PIG-1472
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1472.2.patch, PIG-1472.3.patch, PIG-1472.patch


 In certain types of Pig queries, most of the execution time is spent 
 serializing/deserializing (sedes) records between Map and Reduce and between 
 MR jobs.
 For example, if the PigMix queries are modified to specify types for all the 
 fields in the load statement schema, some of the queries (L2, L3, L9, L10 in 
 PigMix v1) that have records with bags and maps being transmitted across map 
 or reduce boundaries run a lot longer (a runtime increase of a few times has 
 been seen).
 There are a few optimizations that have been shown to improve the performance 
 of sedes in my tests:
 1. Use a smaller number of bytes to store the length of a column. For example, 
 if a bytearray is smaller than 255 bytes, a single byte can be used to store 
 the length instead of the integer that is currently used.
 2. Instead of custom code to do sedes on Strings, use DataOutput.writeUTF and 
 DataInput.readUTF. This reduces the cost of serialization by more than half.
 Zebra and BinStorage are known to use the DefaultTuple sedes functionality. 
 The serialization format that these loaders use cannot change, so after the 
 optimization their format is going to be different from the format used 
 across M/R boundaries.
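
 As an illustration of optimization 1, the writer side could look roughly like 
 this (a sketch with made-up marker values, not the exact patch code):
 {code}
 static final byte TINYBYTEARRAY = 1, SMALLBYTEARRAY = 2, BYTEARRAY = 3; // illustrative markers

 static void writeByteArray(java.io.DataOutput out, byte[] data)
         throws java.io.IOException {
     if (data.length < 255) {
         out.writeByte(TINYBYTEARRAY);
         out.writeByte(data.length);     // length fits in one unsigned byte
     } else if (data.length < 65536) {
         out.writeByte(SMALLBYTEARRAY);
         out.writeShort(data.length);    // 2-byte length
     } else {
         out.writeByte(BYTEARRAY);
         out.writeInt(data.length);      // full 4-byte length, as today
     }
     out.write(data);
 }
 {code}
 The reader does the reverse: it looks at the marker byte and reads the length 
 with readUnsignedByte(), readUnsignedShort() or readInt() accordingly.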

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1472) Optimize serialization/deserialization between Map and Reduce and between MR jobs

2010-07-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887267#action_12887267
 ] 

Daniel Dai commented on PIG-1472:
-

Forget comment 2: GENERIC_WRITABLECOMPARABLE is also in DataReaderWriter; we 
just follow it.

 Optimize serialization/deserialization between Map and Reduce and between MR 
 jobs
 -

 Key: PIG-1472
 URL: https://issues.apache.org/jira/browse/PIG-1472
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1472.2.patch, PIG-1472.3.patch, PIG-1472.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1295) Binary comparator for secondary sort

2010-07-11 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887270#action_12887270
 ] 

Daniel Dai commented on PIG-1295:
-

With the changes of PIG-1472, we need to change the raw comparator accordingly:
1. Bag comparison should be changed to compare TINYBAG/SMALLBAG/BAG
2. Tuple comparison should be changed to compare TINYTUPLE/SMALLTUPLE/TUPLE 
(see the sketch below)
3. Map comparison should be changed to compare TINYMAP/SMALLMAP/MAP
4. Integer comparison should be changed to compare 
INTEGER_0/INTEGER_1/INTEGER_INBYTE/INTEGER_INSHORT/INTEGER
5. ByteArray comparison should be changed to compare 
TINYBYTEARRAY/SMALLBYTEARRAY/BYTEARRAY
6. Chararray comparison should be changed to compare SMALLCHARARRAY/CHARARRAY
7. The raw comparator now depends on the serialization format. We now have two 
serialization formats, DefaultTuple and BinSedesTuple. It would be better to 
move PigTupleRawComparatorNew inside BinSedesTuple, but in this project we only 
focus on BinSedesTuple, which addresses most use cases.
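
For example, for the tuple case the comparator would first read the field 
count according to which marker byte it sees, roughly as follows (a sketch; 
constant names as in BinInterSedes):

{code}
// Sketch: recover the number of fields of a serialized tuple from the
// BinInterSedes marker byte that precedes it.
static int readTupleSize(byte marker, java.io.DataInput in)
        throws java.io.IOException {
    switch (marker) {
        case BinInterSedes.TINYTUPLE:  return in.readUnsignedByte();  // 1-byte size
        case BinInterSedes.SMALLTUPLE: return in.readUnsignedShort(); // 2-byte size
        case BinInterSedes.TUPLE:      return in.readInt();           // 4-byte size
        default:
            throw new java.io.IOException("Not a tuple marker: " + marker);
    }
}
{code}

Bags, maps, bytearrays and chararrays follow the same pattern with their own 
markers.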

In the integration code, we should check whether the TupleFactory is actually 
a BinSedesTupleFactory: if it is, use this raw comparator; otherwise, use the 
original comparator.
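
Something along these lines in the job setup (a sketch; it assumes the old 
mapred JobConf API, and CurrentTupleComparator is only a placeholder name for 
the existing comparator):

{code}
// Sketch of the integration check. PigTupleRawComparatorNew is the new raw
// comparator; CurrentTupleComparator stands for whatever is wired up today.
if (TupleFactory.getInstance() instanceof BinSedesTupleFactory) {
    jobConf.setOutputKeyComparatorClass(PigTupleRawComparatorNew.class);
} else {
    jobConf.setOutputKeyComparatorClass(CurrentTupleComparator.class);
}
{code}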

I was wrong about the customized tuple: we do not need a fall-back scheme for 
customized tuples. In the serialized format, all Tuples, including customized 
ones, are serialized the same way.

It looks like UTF-8 encoding is convoluted; we can leave it for now.

 Binary comparator for secondary sort
 

 Key: PIG-1295
 URL: https://issues.apache.org/jira/browse/PIG-1295
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Gianmarco De Francisci Morales
 Fix For: 0.8.0

 Attachments: PIG-1295_0.1.patch, PIG-1295_0.2.patch, 
 PIG-1295_0.3.patch, PIG-1295_0.4.patch, PIG-1295_0.5.patch, 
 PIG-1295_0.6.patch, PIG-1295_0.7.patch, PIG-1295_0.8.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (PIG-1493) Column Pruner throw exception inconsistent pruning

2010-07-11 Thread Daniel Dai (JIRA)
Column Pruner throw exception inconsistent pruning


 Key: PIG-1493
 URL: https://issues.apache.org/jira/browse/PIG-1493
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.8.0, 0.7.0


The following script fails:
{code}
a = load '1.txt' as (a0:chararray, a1:chararray, a2);
b = foreach a generate CONCAT(a0,a1) as b0, a0, a2;
c = foreach b generate a0, a2;
dump c;
{code}

Error message:
ERROR 2185: Column $0 of (Name: b: ForEach 1-50 Operator Key: 1-50) 
inconsistent pruning

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open 
iterator for alias c
at org.apache.pig.PigServer.openIterator(PigServer.java:698)
at 
org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:595)
at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:291)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:451)
at org.apache.pig.Main.main(Main.java:103)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: 
Unable to store alias c
at org.apache.pig.PigServer.storeEx(PigServer.java:804)
at org.apache.pig.PigServer.store(PigServer.java:760)
at org.apache.pig.PigServer.openIterator(PigServer.java:680)
... 7 more
Caused by: org.apache.pig.impl.plan.optimizer.OptimizerException: ERROR 2212: 
Unable to prune plan
at 
org.apache.pig.impl.logicalLayer.optimizer.PruneColumns.prune(PruneColumns.java:826)
at 
org.apache.pig.impl.logicalLayer.optimizer.LogicalOptimizer.optimize(LogicalOptimizer.java:240)
at org.apache.pig.PigServer.compileLp(PigServer.java:1180)
at org.apache.pig.PigServer.storeEx(PigServer.java:799)
... 9 more
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2188: Cannot prune 
columns for (Name: b: ForEach 1-50 Operator Key: 1-50)
at 
org.apache.pig.impl.logicalLayer.ColumnPruner.prune(ColumnPruner.java:177)
at 
org.apache.pig.impl.logicalLayer.ColumnPruner.visit(ColumnPruner.java:202)
at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:132)
at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:47)
at 
org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at 
org.apache.pig.impl.logicalLayer.optimizer.PruneColumns.prune(PruneColumns.java:821)
... 12 more
Caused by: org.apache.pig.impl.plan.optimizer.OptimizerException: ERROR 2185: 
Column $0 of (Name: b: ForEach 1-50 Operator Key: 1-50) inconsistent pruning
at 
org.apache.pig.impl.logicalLayer.ColumnPruner.prune(ColumnPruner.java:148)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1493) Column Pruner throw exception inconsistent pruning

2010-07-11 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1493:


Attachment: PIG-1493-1.patch

 Column Pruner throw exception inconsistent pruning
 

 Key: PIG-1493
 URL: https://issues.apache.org/jira/browse/PIG-1493
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0, 0.8.0

 Attachments: PIG-1493-1.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1493) Column Pruner throw exception inconsistent pruning

2010-07-11 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1493:


Status: Patch Available  (was: Open)

 Column Pruner throw exception inconsistent pruning
 

 Key: PIG-1493
 URL: https://issues.apache.org/jira/browse/PIG-1493
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.8.0, 0.7.0

 Attachments: PIG-1493-1.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.