[ 
https://issues.apache.org/jira/browse/PIG-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507822#comment-15507822
 ] 

Koji Noguchi commented on PIG-4976:
-----------------------------------

bq. Ok, so in this case it seems that the file is not getting created in 
FileOutputHandler:

The file not being created is expected.  The streaming process failed, so I'd 
rather keep it that way than risk a false positive from an empty output 
file.


Trying to sort out the issue.  So far, I've only tested on my MacBook.  

I saw two code paths for the error.
(1) With small input (including the test from Daniel's patch), 
it fails in close() at  
{code}
308                     if (inp != null && inp.returnStatus == POStatus.STATUS_EOP) {
309                         // signal cleanup in ExecutableManager
310                         close();
311                         return;
312                     }
{code}
which then calls 
-> ExecutableManager.close():{{114 inputHandler.close(process);}}, which 
somehow passes, and then
-> ExecutableManager.close():{{160 outputHandler.bindTo("", null, 0, -1);}} 
fails with 
{noformat}
2016-09-20 16:59:42,425 [Thread-30] ERROR org.apache.pig.impl.streaming.ExecutableManager - Error while reading from POStream and passing it to thes
java.io.FileNotFoundException: foo (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at org.apache.pig.impl.streaming.FileOutputHandler.bindTo(FileOutputHandler.java:57)
        at org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:160)
        at org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:131)
        at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:310)
{noformat}
then the outer catch block calls killProcess, which also fails with 
{noformat}
Exception in thread "Thread-30" java.lang.NullPointerException
        at org.apache.pig.impl.streaming.OutputHandler.close(OutputHandler.java:178)
        at org.apache.pig.impl.streaming.ExecutableManager.killProcess(ExecutableManager.java:184)
        at org.apache.pig.impl.streaming.ExecutableManager.access$200(ExecutableManager.java:52)
        at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:368)
{noformat}
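
One plausible reading of the NullPointerException above is that the handler's close() dereferences a stream that bindTo never got to assign, because bindTo had already thrown the FileNotFoundException. I have not confirmed that against the source. The sketch below is not Pig's code: SketchOutputHandler, its field, and both close variants are hypothetical names used only to show the pattern and what a null guard would change.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CleanupSketch {
    /** Hypothetical stand-in for a file-backed output handler; not Pig's class. */
    static class SketchOutputHandler {
        private final String fileName;
        private InputStream in; // stays null if bindTo() throws

        SketchOutputHandler(String fileName) {
            this.fileName = fileName;
        }

        void bindTo() throws IOException {
            // Throws FileNotFoundException when the staging file was never created.
            in = new FileInputStream(fileName);
        }

        void closeUnguarded() throws IOException {
            in.close(); // NullPointerException if bindTo() never succeeded
        }

        void closeGuarded() throws IOException {
            if (in != null) {
                in.close();
                in = null;
            }
        }
    }

    /** Binds to a file that does not exist, then runs the chosen cleanup variant. */
    static String run(boolean guarded) {
        SketchOutputHandler h = new SketchOutputHandler("no-such-dir-for-sketch/foo");
        try {
            h.bindTo();
            return "bound";
        } catch (IOException bindFailure) {
            try {
                if (guarded) {
                    h.closeGuarded();
                } else {
                    h.closeUnguarded();
                }
                return "cleanup ok after " + bindFailure.getClass().getSimpleName();
            } catch (NullPointerException | IOException e) {
                return "cleanup failed with " + e.getClass().getSimpleName();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

With the unguarded variant the cleanup itself throws and hides the original failure; with the guard, the FileNotFoundException remains the only error reported.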

(2) When the input is large, {{326 inputHandler.putNext(t);}} threw an 
exception
{code} 
324                         try {
325                             t = (Tuple) inp.result;
326                             inputHandler.putNext(t);
327                         } catch (IOException e) {
328                             // if input type is synchronous then it could
329                             // be related to the process terminating
330                             if(inputHandler.getInputType() == InputType.SYNCHRONOUS) {
331                                 LOG.warn("Exception while trying to write to stream binary's input", e);
...
343                                 close();
344                                 return;
{code}

{noformat}
2016-09-20 17:07:12,362 [Thread-30] WARN  org.apache.pig.impl.streaming.ExecutableManager - Exception while trying to write to stream binary's input
java.io.IOException: Stream closed
        at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
        at java.io.OutputStream.write(OutputStream.java:116)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at org.apache.pig.impl.streaming.InputHandler.putNext(InputHandler.java:72)
        at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:326)
{noformat} 

then the {{343 close();}} call inside the catch block failed at 
ExecutableManager.close():{{114 inputHandler.close(process);}} with 

{noformat}
2016-09-20 17:07:12,365 [Thread-30] ERROR org.apache.pig.impl.streaming.ExecutableManager - Error while reading from POStream and passing it to thes
java.io.IOException: Stream closed
        at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
        at java.io.OutputStream.write(OutputStream.java:116)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at org.apache.pig.impl.streaming.InputHandler.close(InputHandler.java:93)
        at org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:50)
        at org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:114)
        at org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:131)
        at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:343)
{noformat}

Then you would also see killProcess fail with a NullPointerException, just 
like in (1). 
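
Both traces in (2) pass through BufferedOutputStream.flushBuffer, which may be why the flush during close() fails again right after the write has failed: BufferedOutputStream keeps its pending bytes when the underlying write throws, so the next flush retries the same write. Here is a minimal sketch of that behaviour using only JDK classes. ClosedSink is a hypothetical stand-in for the stdin of a process that has already exited, and the 8-byte buffer is chosen only to make the overflow easy to trigger.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ClosedStdinSketch {
    /** Hypothetical sink that rejects every write, like the stdin of an exited process. */
    static class ClosedSink extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("Stream closed");
        }
    }

    /** Returns the outcome of a small write, a buffer-overflowing write, and a flush. */
    static String[] run() {
        DataOutputStream out =
                new DataOutputStream(new BufferedOutputStream(new ClosedSink(), 8));
        String[] results = new String[3];

        try {
            out.write(new byte[4]); // fits in the buffer, the sink is never touched
            results[0] = "buffered";
        } catch (IOException e) {
            results[0] = e.getMessage();
        }

        try {
            out.write(new byte[6]); // does not fit, forces a flush to the sink
            results[1] = "buffered";
        } catch (IOException e) {
            results[1] = e.getMessage();
        }

        try {
            out.flush(); // the 4 pending bytes are still there, so this fails too
            results[2] = "flushed";
        } catch (IOException e) {
            results[2] = e.getMessage();
        }
        return results;
    }

    public static void main(String[] args) {
        for (String r : run()) {
            System.out.println(r);
        }
    }
}
```

The small write succeeding silently also shows that a failure on the process side is only noticed once enough data has been written to force a flush, or when close() flushes explicitly.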


Daniel's {{PIG-4976-1.patch}} and {{PIG-4976-2.patch}} both handle issue (2) 
at different levels.  I am not sure why the provided test case hits (1) in 
my environment but (2) in Daniel's environment. 

> streaming job with store clause stuck if the script fail
> --------------------------------------------------------
>
>                 Key: PIG-4976
>                 URL: https://issues.apache.org/jira/browse/PIG-4976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-4976-1.patch, PIG-4976-2.patch, PIG-4976-3.patch, 
> PIG-4976-4.patch
>
>
> When investigating PIG-4972, I also noticed that a Pig job gets stuck when 
> the Perl script has a syntax error. This happens if we have an output clause 
> in the stream specification (meaning a file is used as staging). The bug 
> exists in both Tez and MR, and it is not a regression.
> Here is an example:
> {code}
> define CMD `perl kk.pl` output('foo') ship('kk.pl');
> A = load 'studenttab10k' as (name, age, gpa);
> B = foreach A generate name;
> C = stream B through CMD;
> store C into 'ooo';
> {code}
> kk.pl is any Perl script containing a syntax error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
