[ https://issues.apache.org/jira/browse/PIG-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554210#comment-13554210 ]
James commented on PIG-3113:
----------------------------
Sorta. The sub-process's stdout and stderr streams should be drained in
separate threads that are started before calling waitFor(). Once waitFor()
returns, the I/O threads can hand their contents back for evaluation by
executeShellCommand(). I'll see if I can put together a patch, time permitting.
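
A rough sketch of the approach described above, not the actual patch. The
class ShellCommandSketch, its nested Result and StreamDrainer types, and the
run() method are hypothetical names made up for illustration and are not part
of Pig. It also assumes the command's output is UTF-8 text and that the command
needs no stdin.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the Pig implementation.
final class ShellCommandSketch {

    /** Exit code and captured output of a finished sub-process. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Reads one stream to EOF on its own thread so the child never blocks on a full pipe. */
    private static final class StreamDrainer extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private volatile IOException failure;

        StreamDrainer(InputStream in) {
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] chunk = new byte[8192];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
            } catch (IOException e) {
                failure = e; // reported later by contents()
            }
        }

        /** Call only after join(); assumes UTF-8 output. */
        String contents() throws IOException {
            if (failure != null) {
                throw failure;
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    static Result run(String... command) {
        try {
            Process p = Runtime.getRuntime().exec(command);
            p.getOutputStream().close(); // nothing to send; child sees EOF on stdin

            // Start both readers BEFORE waitFor() so the pipes keep draining.
            StreamDrainer out = new StreamDrainer(p.getInputStream());
            StreamDrainer err = new StreamDrainer(p.getErrorStream());
            out.start();
            err.start();

            int exitCode = p.waitFor();
            out.join(); // wait until both streams have reached EOF
            err.join();
            return new Result(exitCode, out.contents(), err.contents());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

With something like this, executeShellCommand() could call
ShellCommandSketch.run("sh", "-c", "cat /path/to/somebigfile.txt") and then
inspect the exit code, stdout, and stderr of the returned Result. Checked
exceptions are wrapped in unchecked ones here only to keep the sketch short.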
> Shell command execution hangs job
> ---------------------------------
>
> Key: PIG-3113
> URL: https://issues.apache.org/jira/browse/PIG-3113
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.1
> Reporter: James
>
> Executing a shell command inside a Pig script has the potential to deadlock
> the job. For example, the following statement will block when somebigfile.txt
> is sufficiently large:
> {code}
> %declare input `cat /path/to/somebigfile.txt`
> {code}
> This happens because PreprocessorContext.executeShellCommand(String)
> incorrectly uses Runtime.exec(). The sub-process's stderr and stdout streams
> should be read in separate threads to prevent p.waitFor() from hanging when
> the sub-process's output is larger than the operating system's pipe buffer.
> Per the Java Process class javadoc: "Because some native platforms only
> provide limited buffer size for standard input and output streams, failure to
> promptly write the input stream or read the output stream of the subprocess
> may cause the subprocess to block, and even deadlock".
> See http://www.javaworld.com/jw-12-2000/jw-1229-traps.html for a correct
> solution.