I was at least partly responsible for the pipe buffer cleanup code.

When a subprocess terminates, it may have left some data in the pipe buffer
(typically 64k on modern Linux).  Usually the pipe buffer is empty, but in
case it's not, you don't want to lose the straggler data.  So you drain it
into memory and close the file descriptor, because it's easier to manage the
memory than the fd.  Messy, but I didn't see a better way.
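
For illustration, the drain-and-close idea could be sketched roughly as
below. This is a simplified sketch, not the actual ProcessImpl code; the
class and method names are made up for this example, and it assumes that
available() reports how many bytes are still sitting in the pipe buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class DrainOnExit {
    /**
     * Copies whatever can be read from pipeEnd without blocking into memory,
     * closes pipeEnd, and returns an in-memory stream over the copied bytes.
     * Hypothetical helper: meant to be called after the child has exited.
     */
    static InputStream drainAndClose(InputStream pipeEnd) throws IOException {
        ByteArrayOutputStream stragglers = new ByteArrayOutputStream();
        try (InputStream in = pipeEnd) {          // closes the underlying fd
            int avail;
            while ((avail = in.available()) > 0) {
                byte[] chunk = new byte[avail];
                int n = in.read(chunk);
                if (n == -1) {
                    break;                        // end of stream
                }
                stragglers.write(chunk, 0, n);
            }
        }
        // The caller keeps reading from memory; no fd is held any longer.
        return new ByteArrayInputStream(stragglers.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the read end of a pipe with leftover data.
        InputStream fakePipe = new ByteArrayInputStream(
                "straggler data".getBytes(StandardCharsets.UTF_8));
        InputStream drained = drainAndClose(fakePipe);
        System.out.println(drained.available() + " bytes kept in memory");
    }
}
```

The amount copied is bounded by the pipe buffer size, which is why trading
the fd for a small byte[] is cheap.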

On Tue, Apr 14, 2015 at 11:31 PM, Peter Levart <peter.lev...@gmail.com>
wrote:

> Hi Roger,
>
> So I started new thread...
>
>
> On 04/14/2015 11:33 PM, Roger Riggs wrote:
>
>>
>> On 4/14/2015 11:47 AM, Peter Levart wrote:
>>
>>> I have been thinking of another small Process API update. Some people
>>> find it odd how redirected in/out/err streams are exposed:
>>>
>>> http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html
>>>
>> yep, I've read that several times.
>>
>
> To be fair, it's mostly, but not entirely correct. The part that says:
>
> " So when the child process exits, the any data waiting to be read from
> its output stream is drained into a buffer. All of it. In memory.
>
> Did you launch a process that writes a gigabyte of data to its output
> stream and then terminates? Well, friend, I sure hope you have a gigabyte
> of memory, because the JDK is going to read that sucker in and there's
> nothing you can do about it. And let's hope there's not more than 2GB of
> data, since this code basically just grows a byte[], which in Java can only
> grow to 2GB. If there's more than 2GB of data on that stream, this logic
> errors out and the data is lost forever."
>
> ...is an exaggeration. This does not happen as the pipe has a bounded buffer.
> When the subprocess exits, there is at most that much data left in the buffer
> (64k typically), and only that much is sucked into the Java process before
> the underlying handle is closed.
>
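
The bounded buffer is easy to observe from Java. The sketch below is only
an illustration: it uses an in-process java.nio.channels.Pipe rather than a
real subprocess pipe, and the class name, method name and safety cap are
made up for this example. It writes without ever reading and reports how
many bytes were accepted before a non-blocking write returned 0.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class PipeCapacity {
    // Arbitrary upper limit so the loop always terminates (assumption).
    static final long SAFETY_CAP = 64L * 1024 * 1024;

    /** Fills a fresh pipe without reading and returns the bytes accepted. */
    static long fillUntilFull() throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.configureBlocking(false);        // write() returns 0 when full
            ByteBuffer chunk = ByteBuffer.allocate(1024);
            long total = 0;
            while (total < SAFETY_CAP) {
                chunk.clear();
                int n = sink.write(chunk);
                if (n == 0) {
                    break;                        // buffer is full
                }
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("pipe accepted " + fillUntilFull()
                + " bytes before filling up");
    }
}
```

The number printed depends on the OS and on how Pipe is implemented there,
so it should be read as an observation, not a guarantee.
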
>
>>> They basically don't like:
>>>
>>> - that exposed Input/Output streams are buffered
>>> - that underlying streams are File(Input/Output)Streams which, although
>>> backed by OS pipes rather than files, don't expose selectable channels,
>>> so non-blocking event-based IO cannot be performed on them.
>>> - that exposed IO streams are automatically "managed" in UNIX variants
>>> of ProcessImpl, which needs subtle "hacks" to do it in a seemingly
>>> transparent way (delayed close, draining input on exit and making it
>>> available after the underlying handle is already closed, ...)
>>>
>>> So I've been playing with the idea of exposing the "real" pipe channels
>>> in the last couple of days. Here's the prototype I came up with:
>>>
>>> http://cr.openjdk.java.net/~plevart/jdk9-sandbox/JDK-8046092-branch/Process.PipeChannel/webrev.01/
>>>
>>> This adds a new Redirect type to the API and 3 new methods to Process that
>>> return Pipe channels when this new Redirect type is used. It's interesting
>>> that no native code changes were necessary. The behavior of pipes on
>>> Windows is a little different (perhaps because the Pipe NIO API uses
>>> sockets under the hood on Windows - why is that? Windows does have a pipe
>>> equivalent). What bothers me is that file handles opened on files (when
>>> redirecting to/from File) can be closed as soon as the subprocess is
>>> started and the subprocess is still able to read/write from the files (like
>>> with UNIX). It's not the same with pipe (i.e. socket) handles on Windows.
>>> They must be closed only after subprocess exits.
>>>
>>> If this subtle difference between file handles and socket handles on
>>> Windows could be dealt with (perhaps some options exist that affect
>>> subprocess spawning), then the extra waiting thread would not be needed on
>>> Windows.
>>>
>>> So what do you think of this API update?
>>>
>> Definitely worthy of a separate thread.  It looks promising and addresses
>> some of the issues raised, while moving other problems, such as closing of
>> the channels and cleanup, from the implementation to the application.  I
>> worry about how the resources are freed if the code spawning the app
>> doesn't do the cleanup.  Will it require hooks (like a finalizer) to do the
>> cleanup?
>> Also, it doesn't help with Martin's goal of being able to implement emacs
>> in Java since it doesn't provide pty control.
>> As you are aware, the complexity in Process is there to ensure a timely
>> cleanup and to allow the Process to terminate and release the process
>> resources when it is done, without having to wait for the stdout/stderr
>> consumer.
>>
>
> I wonder how this automatic stream cleanup really helps in real-world
> programs. It doesn't help the Process to terminate and release the process
> resources any sooner as the process terminates on its own (unless killed)
> and the OS releases its resources without outside help anyway. Draining
> and closing the stream after the process has already exited just releases
> one file handle (the consuming side of the pipe) promptly. This
> could be left to the user and/or finalizer. Draining after the process has
> already exited does not help the process to exit any sooner as it happens
> after the fact. A program that doesn't consume the stream can cause the
> process to hang forever as the pipe's buffer is bounded (64k typically). So
> draining and closing after the process has exited only potentially helps
> for the last 64k of the stream and only to release one file handle in a
> potentially more timely manner.
>
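
To make the hang scenario concrete: a caller that wants the child to keep
making progress has to consume its output while it runs. A minimal sketch,
with made-up class and method names, and using the running JVM's own
launcher as a stand-in command, might look like this:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

public class PumpStdout {
    /** Starts a daemon thread that copies in to out until end-of-stream. */
    static Thread pump(InputStream in, OutputStream out) {
        Thread t = new Thread(() -> {
            byte[] chunk = new byte[8192];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "stdout-pump");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Runs a command, consuming its merged stdout/stderr while it runs. */
    static byte[] runAndCapture(List<String> command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        Thread pump = pump(p.getInputStream(), captured);
        p.waitFor();   // the child is never blocked on a full pipe
        pump.join();   // wait until the last bytes have been copied
        return captured.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command (assumption): the current JVM's launcher.
        List<String> command = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList(
                        System.getProperty("java.home") + "/bin/java",
                        "-version");
        byte[] output = runAndCapture(command);
        System.out.println("captured " + output.length + " bytes");
    }
}
```

Here the output is collected in memory only to keep the example short; a
real caller would more likely stream it to a file or parse it on the fly.
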
> OTOH now that ProcessImpl for UNIX does that (and why does the Windows
> implementation not do that?), sloppy programs might exist that would
> potentially break if the status quo is not maintained.
>
> But new functionality need not be so permissive. I'll take a look at how
> and if Channel(s) do any kind of automatic cleanup based on reachability
> and whether this can be bolted on for Process use. I doubt it is possible
> to drain and close a Channel without disturbing the ongoing Selector IO
> processing...
>
> Regards, Peter
>
>
>> Thanks, Roger
>>
>>
>>
>
