[ https://issues.apache.org/jira/browse/PIG-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated PIG-182:
------------------------------

    Attachment: PIG-182_0_20080404.patch

Xu, after discussing this with Olga, we both concluded that we need to tweak the 
semantics of ship().

Now we do not auto-ship a file that is specified with an absolute path. We also 
use the DistributedCache instead of packing files into the job jar, so shipped 
files are available in the cwd of the task itself; with the changes in this patch 
'myscript' will work and you don't have to use './myscript'.

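For example, with the new semantics your script could be shipped explicitly and 
then invoked by its bare name. Here is a sketch of the intended usage, assuming 
the streaming define/ship() syntax (the alias 'mycmd' is just an illustrative 
name; the paths are the ones from your original script):

{code}
define mycmd `perl MySimpleStreamApp.pl` ship('/home/xu/streamingscript/MySimpleStreamApp.pl');
A = load '/user/pig/tests/data/singlefile/studenttab10k';
B = stream A through mycmd as (name, age, gpa);
store B into 'results_9';
{code}

The shipped file ends up in the task's cwd, so the command refers to 
MySimpleStreamApp.pl with no path at all.
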
So, could you please try this again without using an absolute path?

This patch also changes ExecutableManager to launch the streaming command via 
bash so that PATH and other environment variables are honoured, and it includes 
a fix for an error-handling bug in DataCollector.finishPipe.

> Broken pipe if executing the streaming script via the stream command directly
> -----------------------------------------------------------------------------
>
>                 Key: PIG-182
>                 URL: https://issues.apache.org/jira/browse/PIG-182
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Xu Zhang
>            Assignee: Arun C Murthy
>            Priority: Blocker
>         Attachments: MySimpleStreamApp.pl, PIG-182_0_20080404.patch, 
> script.pig
>
>
> I got "broken pipe" exception with the following Pig script.  I also attached 
> the Pig script and the perl script to this bug report.
> {code}
> A = load '/user/pig/tests/data/singlefile/studenttab10k';
> B = stream A through `perl /home/xu/streamingscript/MySimpleStreamApp.pl` as (name, age, gpa);
> store B into 'results_9';
> {code}
> Here is Pig's console output
> {noformat}
> I can't find HOD configuration for piglet, hopefully you weren't planning on 
> using HOD.
> 2008-04-02 18:37:29,214 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: wilbur11.labs.corp.sp1.yahoo.com:8020
> 2008-04-02 18:37:30,030 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce 
> Job -----
> 2008-04-02 18:37:30,030 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: 
> [/user/pig/tests/data/singlefile/studenttab10k:org.apache.pig.builtin.PigStorage()]
> 2008-04-02 18:37:30,031 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->[EMAIL 
> PROTECTED]
> 2008-04-02 18:37:30,031 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
> 2008-04-02 18:37:30,032 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
> 2008-04-02 18:37:30,032 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
> 2008-04-02 18:37:30,032 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: 
> results_9:org.apache.pig.builtin.BinaryStorage
> 2008-04-02 18:37:30,032 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
> 2008-04-02 18:37:30,032 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: 
> -1
> 2008-04-02 18:37:30,033 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce 
> parallelism: -1
> 219190 
> hdfs://wilbur11.labs.corp.sp1.yahoo.com:8020/user/pig/tests/data/singlefile/studenttab10k
> 2008-04-02 18:37:32,889 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Pig progress = 0%
> 2008-04-02 18:37:53,985 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (map) tip_200803281454_0803_m_000000 
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
>         at 
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at 
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
>         at 
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
>         at 
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
>         ... 4 more
>  java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
>         at 
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at 
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
>         at 
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
>         at 
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
>         ... 4 more
>  java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
>         at 
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at 
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
>         at 
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
>         at 
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
>         ... 4 more
>  java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:152)
>         at 
> org.apache.pig.impl.eval.collector.DataCollector.finishPipe(DataCollector.java:131)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:119)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.io.IOException: Broken pipe
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at 
> org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:56)
>         at 
> org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:128)
>         at 
> org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:115)
>         at 
> org.apache.pig.impl.eval.StreamSpec$StreamDataCollector.finish(StreamSpec.java:148)
>         ... 4 more
> 2008-04-02 18:37:53,998 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000000
> 2008-04-02 18:37:53,998 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000001
> 2008-04-02 18:37:53,998 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000002
> 2008-04-02 18:37:53,998 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000003
> 2008-04-02 18:37:53,998 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000004
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000005
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000006
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000007
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000008
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000009
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000010
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000011
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000012
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000013
> 2008-04-02 18:37:53,999 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000014
> 2008-04-02 18:37:54,000 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000015
> 2008-04-02 18:37:54,001 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000016
> 2008-04-02 18:37:54,001 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000017
> 2008-04-02 18:37:54,001 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000018
> 2008-04-02 18:37:54,001 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher 
> - Error message from task (reduce) tip_200803281454_0803_r_000019
> 2008-04-02 18:37:54,005 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
> java.io.IOException: Unable to store alias null
>         at 
> org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:283)
>         at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:446)
>         at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:226)
>         at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:62)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
>         at org.apache.pig.Main.main(Main.java:265)
> Caused by: org.apache.pig.backend.executionengine.ExecException: 
> java.io.IOException: Job failed
>         at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:288)
>         at org.apache.pig.PigServer.optimizeAndRunQuery(PigServer.java:400)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:280)
>         ... 5 more
> Caused by: java.io.IOException: Job failed
>         at 
> org.apache.pig.backend.hadoop.executionengine.POMapreduce.open(POMapreduce.java:179)
>         at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:274)
>         ... 7 more
> 2008-04-02 18:37:54,005 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
> Unable to store alias null
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
