Got "broken pipe" with 2 shipped files
--------------------------------------
Key: PIG-187
URL: https://issues.apache.org/jira/browse/PIG-187
Project: Pig
Issue Type: Bug
Reporter: Xu Zhang
Assignee: Arun C Murthy
Currently, it seems the "broken pipe" error for the case where only one file is
shipped and executed has been fixed. I still get "broken pipe" error when 2
files are shipped and then one of them is used the command argument of the
other.
Here is my Pig script:
{code}
set stream.skippath '/home/xu/testdata/';
define X `MySimpleStreamApp.pl copyofstudenttab10k`
ship('./streamingscript/MySimpleStreamApp.pl',
'/home/xu/testdata/copyofstudenttab10k');
A = load '/user/pig/tests/data/singlefile/studenttab10k';
B = stream A through X as (name, age, gpa);
C = group B by name;
D = foreach C generate COUNT(B.$0);
store D into 'results_18';
{code}
Here is Pig's console output:
{noformat}
2008-04-04 18:34:15,418 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
hadoop file system at: wilbur11.labs.corp.sp1.yahoo.com:8020
2008-04-04 18:34:16,306 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job
-----
2008-04-04 18:34:16,306 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input:
[/user/pig/tests/data/singlefile/studenttab10k:org.apache.pig.builtin.PigStorage()]
2008-04-04 18:34:16,307 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->[EMAIL
PROTECTED]
2008-04-04 18:34:16,307 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: [GENERATE
{[PROJECT $0],[*]}]
2008-04-04 18:34:16,307 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
2008-04-04 18:34:16,307 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: GENERATE
{[COUNT(GENERATE {[PROJECT $1]->[PROJECT $0]})]}
2008-04-04 18:34:16,308 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output:
results_18:org.apache.pig.builtin.PigStorage
2008-04-04 18:34:16,308 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
2008-04-04 18:34:16,308 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
2008-04-04 18:34:16,308 [main] INFO
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism:
-1
219190
hdfs://wilbur11.labs.corp.sp1.yahoo.com:8020/user/pig/tests/data/singlefile/studenttab10k
2008-04-04 18:34:19,383 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Pig progress = 0%
2008-04-04 18:34:44,491 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (map) tip_200804041056_0168_m_000000
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException:
Broken pipe
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:147)
at
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
Caused by: java.lang.RuntimeException: java.io.IOException: Broken pipe
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
... 3 more
Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at
org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
at
org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:134)
at
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
... 4 more
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException:
Broken pipe
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:147)
at
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
Caused by: java.lang.RuntimeException: java.io.IOException: Broken pipe
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
... 3 more
Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at
org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
at
org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:134)
at
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
... 4 more
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException:
Broken pipe
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:147)
at
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
Caused by: java.lang.RuntimeException: java.io.IOException: Broken pipe
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
... 3 more
Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at
org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
at
org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:134)
at
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
... 4 more
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException:
Broken pipe
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:147)
at
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
at
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
Caused by: java.lang.RuntimeException: java.io.IOException: Broken pipe
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
at
org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
... 3 more
Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at
org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
at
org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:134)
at
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
at
org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
... 4 more
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000000
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000001
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000002
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000003
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000004
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000005
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000006
2008-04-04 18:34:44,501 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000007
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000008
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000009
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000010
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000011
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000012
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000013
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000014
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000015
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000016
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000017
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000018
2008-04-04 18:34:44,502 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher -
Error message from task (reduce) tip_200804041056_0168_r_000019
2008-04-04 18:34:44,507 [main] ERROR org.apache.pig.tools.grunt.Grunt -
java.io.IOException: Unable to store alias null
at
org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
at org.apache.pig.PigServer.registerQuery(PigServer.java:283)
at
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:446)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:226)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:62)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
at org.apache.pig.Main.main(Main.java:265)
Caused by: org.apache.pig.backend.executionengine.ExecException:
java.io.IOException: Job failed
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:288)
at org.apache.pig.PigServer.optimizeAndRunQuery(PigServer.java:400)
at org.apache.pig.PigServer.registerQuery(PigServer.java:280)
... 5 more
Caused by: java.io.IOException: Job failed
at
org.apache.pig.backend.hadoop.executionengine.POMapreduce.open(POMapreduce.java:180)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:274)
... 7 more
2008-04-04 18:34:44,508 [main] ERROR org.apache.pig.tools.grunt.Grunt - Unable
to store alias null
{noformat}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.