Pig appears to hang with this Pig script
----------------------------------------
Key: PIG-186
URL: https://issues.apache.org/jira/browse/PIG-186
Project: Pig
Issue Type: Bug
Reporter: Xu Zhang
Assignee: Arun C Murthy
Priority: Critical
Attachments: DataGuaranteeTest.pl
Pig stopped at 56% progress. There appear to have been exceptions on the
reduce task trackers (see below), but waiting for 20 reduce tasks to time out
on their own is excruciating and blocks my other tests.
Here is my Pig script:
{code}
define X `./home/xu/streamingscript/DataGuaranteeTest.pl -n 1`
ship('/home/xu/streamingscript/DataGuaranteeTest.pl');
A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
B = group A by name;
C = foreach B generate flatten(A);
D = stream C through X;
store D into 'results_24';
{code}
Here is the exception on the reduce task trackers:
{noformat}
java.lang.RuntimeException: java.io.IOException: Cannot run program "./home/xu/streamingscript/DataGuaranteeTest.pl": java.io.IOException: error=2, No such file or directory
    at org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.<init>(StreamSpec.java:132)
    at org.apache.pig.impl.eval.StreamSpec.setupDefaultPipe(StreamSpec.java:91)
    at org.apache.pig.impl.eval.CompositeEvalSpec.setupDefaultPipe(CompositeEvalSpec.java:51)
    at org.apache.pig.impl.eval.EvalSpec.setupPipe(EvalSpec.java:123)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupReducePipe(PigMapReduce.java:303)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:140)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
{noformat}
I will attach DataGuaranteeTest.pl to the report.
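
For reference, the "error=2, No such file or directory" in the trace is what Java's ProcessBuilder typically reports on Unix-like systems when the executable path does not resolve. The leading "./" makes the path relative to the task's working directory. The sketch below is not Pig's code: the class name StreamLaunchCheck and the method launchError are hypothetical, used only to show that this failure surfaces immediately as an IOException at launch time, so in principle a task could fail right away instead of waiting for a timeout.

```java
import java.io.IOException;

// Hypothetical illustration, not part of Pig: shows that launching a
// command whose path does not resolve fails immediately with an IOException.
public class StreamLaunchCheck {

    /**
     * Tries to start the given command.
     * Returns null if the process started, otherwise the IOException message.
     */
    static String launchError(String... command) {
        try {
            Process p = new ProcessBuilder(command).start();
            p.destroy(); // we only care whether the launch itself succeeded
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same relative path as in the DEFINE statement of the script above.
        String err = launchError("./home/xu/streamingscript/DataGuaranteeTest.pl", "-n", "1");
        System.out.println(err == null ? "started" : "failed at launch: " + err);
    }
}
```

Run from a directory that has no home/xu/streamingscript subdirectory, the launch fails at once rather than hanging; the exact message text depends on the platform.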