Specifying 'using PigStorage('\t')' in a script causes pig to fail
------------------------------------------------------------------
Key: PIG-340
URL: https://issues.apache.org/jira/browse/PIG-340
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Alan Gates
Fix For: types_branch
The script
{code}
A = load '/user/pig/tests/data/perf/studenttab200m' using PigStorage('\t') as (name, age, gpa);
B = filter A by gpa < 3.6;
store B into 'filter10pct2' using PigStorage();
{code}
fails with the error message:
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:64)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
java.io.IOException: Request for field number 2 exceeds tuple size of 1
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:139)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:64)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
while
{code}
A = load '/user/pig/tests/data/perf/studenttab200m' as (name, age, gpa);
B = filter A by gpa < 3.6;
store B into 'filter10pct2';
{code}
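runs just fine over the same data. It appears that explicitly specifying the
load function or the delimiter triggers the failure. The "tuple size of 1" in
the error suggests that each input line is being loaded as a single field
rather than being split on tabs.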
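
One possible explanation, offered only as a guess and not confirmed anywhere in this report, is that the '\t' in the script reaches the loader as the two characters backslash and t instead of a real tab. The sketch below is plain Python, not Pig code; split_line, unescape_delimiter, and the sample record are all made up for illustration. It only shows how such a delimiter would produce a one-field tuple from tab-separated input.

```python
# Hypothetical illustration only; this is NOT Pig's PigStorage implementation.
# It shows how an uninterpreted '\t' delimiter yields a single-field record.


def split_line(line, delimiter):
    """Split one input record into fields on the given delimiter string."""
    return line.split(delimiter)


def unescape_delimiter(spec):
    """Map a two-character escape (backslash + letter) to the real character.

    Any other delimiter string is returned unchanged.
    """
    escapes = {"\\t": "\t", "\\n": "\n"}
    return escapes.get(spec, spec)


# Made-up record in the (name, age, gpa) layout, separated by real tabs.
record = "alice\t20\t3.5"

# Assumed form of the delimiter if the escape is never interpreted:
# two characters, a backslash followed by the letter t.
raw_spec = "\\t"

fields_raw = split_line(record, raw_spec)
fields_ok = split_line(record, unescape_delimiter(raw_spec))

print(len(fields_raw))  # 1: the whole line is one field
print(len(fields_ok))   # 3: name, age, gpa
```

With the uninterpreted delimiter, asking for field number 2 (gpa) would fail because the record has only one field, which matches the shape of the reported error. Whether this is what actually happens inside PigStorage still needs to be checked against the types_branch code.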