A more meaningful error message could be used when a non-existent column is accessed
----------------------------------------------------------------------------------
Key: PIG-146
URL: https://issues.apache.org/jira/browse/PIG-146
Project: Pig
Issue Type: Bug
Reporter: Xu Zhang
Priority: Minor
When accessing a non-existent column after producing the columns with a streaming
command, I got the following error, which is not very meaningful:
{noformat}
[main] ERROR org.apache.pig.tools.grunt.Grunt -
{noformat}
Here is the sample Pig script that I used. The data file has only 3 tab-separated
fields, so the streaming command on line 3 generates tuples with 3 columns. On
line 4 the script tries to access column $4, which does not exist, and the error
above occurs.
{code}
A = load 'data';
B = foreach A generate $2, $1, $0;
C = stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
D = foreach C generate $4;
store D into 'results';
{code}
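For reference, the input file 'data' is assumed to hold rows of three tab-separated fields. The actual file is not attached; the sample row below is hypothetical, inferred from the tuple printed in the stack trace later in this report (fields separated by tabs):
{noformat}
rachel ovid	21	2.93
{noformat}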
Here is what happens on my machine:
{code}
grunt> A = load 'data';
grunt> B = foreach A generate $2, $1, $0;
grunt> C = stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
grunt> D = foreach C generate $4;
2008-03-11 18:53:00,376 [main] ERROR org.apache.pig.tools.grunt.GruntParser -
grunt>
{code}
On a related note, Pig behaves differently if there is no streaming command in
the Pig script: in that case an IndexOutOfBoundsException is thrown at runtime.
{code}
grunt> A = load 'data';
grunt> B = foreach A generate $2, $1, $0;
grunt> D = foreach B generate $4;
grunt> store D into 'results';
2008-03-11 18:57:40,107 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job -----
2008-03-11 18:57:40,108 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: [data:org.apache.pig.builtin.PigStorage()]
2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->GENERATE {[PROJECT $2],[PROJECT $1],[PROJECT $0]}->GENERATE {[PROJECT $4]}]
2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: results:org.apache.pig.builtin.PigStorage
2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism: -1
2008-03-11 18:57:42,391 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Pig progress = 0%
2008-03-11 18:57:59,466 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Error message from task (map) tip_200802211201_1494_m_000000
java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
        at org.apache.pig.data.Tuple.getField(Tuple.java:153)
        at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
        at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
        at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
        at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
        at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
        at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
[... the same IndexOutOfBoundsException stack trace is repeated three more times ...]
2008-03-11 18:57:59,469 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Error message from task (reduce) tip_200802211201_1494_r_000000
2008-03-11 18:57:59,470 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable to store alias D
{code}
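For illustration only, here is a minimal sketch of the kind of bounds check and message that would make this failure easier to diagnose. This is hypothetical code, not the actual Pig implementation; only the failure mode (projecting a column index past the end of a 3-column tuple) is taken from the stack traces above, and the class and method names are invented for the example.
{code}
// Hypothetical sketch only -- not the real Pig source.
// Shows the kind of message a projection operator could raise instead of
// a bare IndexOutOfBoundsException or an empty Grunt error line.
import java.util.Arrays;
import java.util.List;

public class ProjectionCheck {
    static Object project(List<Object> tuple, int col) {
        if (col < 0 || col >= tuple.size()) {
            throw new RuntimeException("Cannot project column $" + col
                    + ": the input tuple has only " + tuple.size()
                    + " columns " + tuple);
        }
        return tuple.get(col);
    }

    public static void main(String[] args) {
        // Reproduces the shape of the failing case: a 3-column tuple, column $4 requested.
        List<Object> row = Arrays.asList((Object) "2.93", "21", "rachel ovid");
        System.out.println(project(row, 4));
    }
}
{code}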