[
https://issues.apache.org/jira/browse/PIG-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun C Murthy updated PIG-182:
------------------------------
Attachment: PIG-182_0_20080404.patch
Xu, after discussing this with Olga, we concluded that we need to tweak the
semantics of ship().
We no longer auto-ship a file that is given as an absolute path. We also use
DistributedCache instead of the jar, so the files are available in the cwd of
the task itself. With the changes in this patch, 'myscript' will work and
you don't have to use './myscript'.
So, could you please try this again without any absolute path?
This patch also fixes ExecutableManager to use bash for launching the streaming
command, so that PATH and other environment variables work properly. It also
fixes an error-handling bug in DataCollector.finishPipe.
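To illustrate why going through bash matters, here is a minimal sketch of
launching a command via 'bash -c'. It is not the code in the patch: the
class BashLauncher and the method runViaBash are hypothetical names, and it
assumes bash is on the PATH of the launching process. The point is only that
the shell, not the JVM, resolves the command name and expands variables.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Pig's ExecutableManager.
class BashLauncher {

    // Runs the command through "bash -c" so that bash performs the PATH
    // lookup and environment-variable expansion. Returns the output lines.
    static List<String> runViaBash(String command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("bash", "-c", command);
        pb.redirectErrorStream(true);   // merge stderr into stdout
        Process p = pb.start();
        p.getOutputStream().close();    // this sketch sends no stdin

        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }

        // Report a failed command instead of silently ignoring it.
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("command exited with status " + exit + ": " + command);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // $PATH is expanded by bash, not by Java.
        for (String line : runViaBash("echo \"PATH is $PATH\"")) {
            System.out.println(line);
        }
    }
}
```

Checking the exit status here is only meant to show the general idea of
surfacing a failed streaming command early; the actual error handling in the
patch may differ.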
> Broken pipe if executing the streaming script via the stream command directory
> ------------------------------------------------------------------------------
>
> Key: PIG-182
> URL: https://issues.apache.org/jira/browse/PIG-182
> Project: Pig
> Issue Type: Bug
> Reporter: Xu Zhang
> Assignee: Arun C Murthy
> Priority: Blocker
> Attachments: MySimpleStreamApp.pl, PIG-182_0_20080404.patch,
> script.pig
>
>
> I got "broken pipe" exception with the following Pig script. I also attached
> the Pig script and the perl script to this bug report.
> {code}
> A = load '/user/pig/tests/data/singlefile/studenttab10k';
> B = stream A through `perl /home/xu/streamingscript/MySimpleStreamApp.pl` as
> (name, age, gpa);
> store B into 'results_9';
> {code}
> Here is Pig's console output
> {noformat}
> I can't find HOD configuration for piglet, hopefully you weren't planning on
> using HOD.
> 2008-04-02 18:37:29,214 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to hadoop file system at: wilbur11.labs.corp.sp1.yahoo.com:8020
> 2008-04-02 18:37:30,030 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce
> Job -----
> 2008-04-02 18:37:30,030 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input:
> [/user/pig/tests/data/singlefile/studenttab10k:org.apache.pig.builtin.PigStorage()]
> 2008-04-02 18:37:30,031 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->[EMAIL
> PROTECTED]
> 2008-04-02 18:37:30,031 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
> 2008-04-02 18:37:30,032 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
> 2008-04-02 18:37:30,032 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
> 2008-04-02 18:37:30,032 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output:
> results_9:org.apache.pig.builtin.BinaryStorage
> 2008-04-02 18:37:30,032 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
> 2008-04-02 18:37:30,032 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism:
> -1
> 2008-04-02 18:37:30,033 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce
> parallelism: -1
> 219190
> hdfs://wilbur11.labs.corp.sp1.yahoo.com:8020/user/pig/tests/data/singlefile/studenttab10k
> 2008-04-02 18:37:32,889 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Pig progress = 0%
> 2008-04-02 18:37:53,985 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (map) tip_200803281454_0803_m_000000
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
> at
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
> at
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:260)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
> at
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
> at
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
> ... 4 more
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
> at
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
> at
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:260)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
> at
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
> at
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
> ... 4 more
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
> at
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
> at
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:260)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
> at
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
> at
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
> ... 4 more
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
> at
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
> at
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:260)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> at
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
> at
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
> at
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
> at
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
> ... 4 more
> 2008-04-02 18:37:53,998 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000000
> 2008-04-02 18:37:53,998 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000001
> 2008-04-02 18:37:53,998 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000002
> 2008-04-02 18:37:53,998 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000003
> 2008-04-02 18:37:53,998 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000004
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000005
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000006
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000007
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000008
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000009
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000010
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000011
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000012
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000013
> 2008-04-02 18:37:53,999 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000014
> 2008-04-02 18:37:54,000 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000015
> 2008-04-02 18:37:54,001 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000016
> 2008-04-02 18:37:54,001 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000017
> 2008-04-02 18:37:54,001 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000018
> 2008-04-02 18:37:54,001 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher
> - Error message from task (reduce) tip_200803281454_0803_r_000019
> 2008-04-02 18:37:54,005 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> java.io.IOException: Unable to store alias null
> at
> org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:283)
> at
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:446)
> at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:226)
> at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:62)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
> at org.apache.pig.Main.main(Main.java:265)
> Caused by: org.apache.pig.backend.executionengine.ExecException:
> java.io.IOException: Job failed
> at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:288)
> at org.apache.pig.PigServer.optimizeAndRunQuery(PigServer.java:400)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:280)
> ... 5 more
> Caused by: java.io.IOException: Job failed
> at
> org.apache.pig.backend.hadoop.executionengine.POMapreduce.open(POMapreduce.java:179)
> at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:274)
> ... 7 more
> 2008-04-02 18:37:54,005 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> Unable to store alias null
> {noformat}