Re: [haskell-pipes] Re: pipes-process

Gabriel Gonzalez Sat, 22 Feb 2014 20:09:21 -0800

Yeah, prioritizing stderr makes sense in general.

On 02/23/2014 09:50 AM, Patrick Wheeler wrote:

Thanks for putting this out, I just got through a few experiments with Shelly, pipes-shell, and Jeremy Shaw's pipes-process repo.

Using `cmdOut <|> cmdErr` to mix cmdOut and cmdErr works until the cmdOut buffer is always full; at that point cmdErr is held back until cmdOut is empty.

I have heard of non-determinism in the order of cmdOut and cmdErr causing errors in the shell scripting world, though rarely. Does anyone know of a workaround for this? A few Google searches tell me that you cannot depend on the ordering of the two, and if there is a workaround it is not standard knowledge.

I did learn that cmdOut is buffered while cmdErr normally is not. This might be better modeled by giving cmdErr the right of way over cmdOut, as `cmdErr <|> cmdOut`; this way the error is more likely to end up close to the associated output, if any.
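
A minimal sketch of what "right of way" means here, assuming two in-memory queues stand in for the process's stderr and stdout. The names `nextChunk`, `errQ` and `outQ` are hypothetical, and this uses plain STM rather than the pipes-process code itself. STM's `orElse` is left-biased, so the stderr queue is always tried first and stdout is read only when no stderr chunk is waiting:

```haskell
import Control.Concurrent.STM
import Control.Monad (replicateM)

-- | Take the next available chunk, preferring stderr (Left) over stdout (Right).
-- `orElse` runs its right-hand action only if the left-hand one retries,
-- which `readTQueue` does when its queue is empty.
nextChunk :: TQueue String -> TQueue String -> STM (Either String String)
nextChunk errQ outQ =
    (Left <$> readTQueue errQ) `orElse` (Right <$> readTQueue outQ)

-- | Hypothetical demo: stdout was written first, but stderr is delivered first.
demo :: IO [Either String String]
demo = do
    errQ <- newTQueueIO
    outQ <- newTQueueIO
    atomically $ do
        writeTQueue outQ "out 1"
        writeTQueue outQ "out 2"
        writeTQueue errQ "err 1"
    replicateM 3 (atomically (nextChunk errQ outQ))
```

Note that this only prioritizes chunks that have already arrived; it cannot recover the order in which the child process actually wrote them.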


Patrick

On Sat, Feb 22, 2014 at 8:14 PM, Gabriel Gonzalez <[email protected]> wrote:


    Alright, I wrote up what I had in mind and you can find my draft here:

    https://github.com/Gabriel439/pipes-process


    On 02/17/2014 05:10 AM, Daniel Díaz wrote:

    I recently had to work with Ruby's 'Open3' package, and that got
    me thinking about this thread again.

    I have cobbled together a few helper functions and wrappers over
    System.Process that implement some of the ideas floated in the
    thread. Ideas like avoiding deadlock by reading continuously from
    the handles and buffering the results in memory. I've also tried
    to avoid throwing exceptions, making errors explicit in the type
    signatures.

    The repo is at https://github.com/danidiaz/process-streaming

    and some examples at:

    https://github.com/danidiaz/process-streaming/blob/milestone2/examples/Main.hs

    -- stdout and stderr to different files, using pipes-safe.
    example1 :: IO (Either String ((),()))
    example1 = ec show $
       execute2 (proc "script1.bat" [])
                show
                (consume "stdout.log")
                (consume "stderr.log")
       where
       consume file = surely . safely . useConsumer $
                          S.withFile file WriteMode toHandle

    The code is not exactly well tested, I must say.

    Any comments or suggestions welcome!

    On Wednesday, December 4, 2013 7:27:16 PM UTC+1, Gabriel Gonzalez
    wrote:

        If you want to keep the buffers in memory, this is exactly
        what `pipes-concurrency` does.  Just use `spawn` to create a
        buffer that you can write to and read from at your leisure.
        It lets you specify a bounded or unlimited buffer size.


        This will also make sure that consumers of the buffers
        properly wait for more input when they exhaust the buffer and
        terminate when the buffer is done.  You don't need to keep
        track of the number of bytes written to the buffer.

        I'm not certain this is the best approach, because I
        haven't had time to think it through yet, but I just wanted
        to mention this potential solution to what you just described.
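
The buffer behaviour described above (a bounded in-memory buffer whose reader waits when it is empty and stops when the writer is done) can be sketched with a plain STM queue. This is a stand-in for illustration only, not the `pipes-concurrency` API: the `Nothing` end marker and the names `feed`, `drainAll` and `demoBuffer` are assumptions made for this example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forM_)

-- | Writer side: push every chunk, then a Nothing to mark the end.
-- `writeTBQueue` blocks while the queue is full, so memory stays bounded.
feed :: TBQueue (Maybe String) -> [String] -> IO ()
feed q chunks = do
    forM_ chunks $ \c -> atomically (writeTBQueue q (Just c))
    atomically (writeTBQueue q Nothing)

-- | Reader side: blocks while the queue is empty, stops at the end marker.
-- No byte counting is needed to know where the written data ends.
drainAll :: TBQueue (Maybe String) -> IO [String]
drainAll q = do
    next <- atomically (readTBQueue q)
    case next of
        Nothing -> return []
        Just c  -> (c :) <$> drainAll q

-- | Hypothetical demo: four chunks pass through a buffer of capacity two.
demoBuffer :: IO [String]
demoBuffer = do
    q <- newTBQueueIO 2
    _ <- forkIO (feed q ["a", "b", "c", "d"])
    drainAll q
```

A bounded queue trades the risk of unbounded memory use for back-pressure on the writer; an unbounded one makes the opposite trade.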

        On 12/04/2013 04:23 PM, Daniel Díaz wrote:

        To avoid the possibility of filling the output buffers and
        blocking the process, while still keeping separate stdout
        and stderr producers, perhaps two temporary files could be
        created. Stdout would be written to one and stderr to the
        other. Clients would read the temporary files as they are
        being written, but would always block before reaching the
        "not yet written" zone (we would ensure this by keeping
        track of the number of bytes written to each file.)

        Or perhaps these intermediate buffers could be kept in
        memory, if they don't grow too big.

        Could this work?

        On Wednesday, September 25, 2013 5:48:16 AM UTC+2, Jeremy
        Shaw wrote:

            On Tue, Sep 24, 2013 at 1:46 PM, John Wiegley
            <[email protected]> wrote:

                >>>>> Gabriel Gonzalez <[email protected]> writes:

                >     readProcess :: Process
                >         -> Producer (Either ByteString ByteString) (SafeT m) ()

                Wouldn't it be better to give two Producers, one for
                stdout and one for stdin?
                They can be written to at the same time by the process,
                can't they?  It would
                then seem odd that they can only be processed in
                sequence.


            I assume you mean one for stdout and one for *stderr*?

            Alas, the unix process model is so fundamentally stupid
            that I think we really need both variants. Many
            command-line apps are run from the command-line where
            stdout and stderr are interleaved in a somewhat
            arbitrary manner. But, there is some time-based
            information there -- even if there is a bit of
            fuzziness. For example, an app could print several lines
            of success to stdout, some error message to stderr, and
            more success to stdout. So, the stuff on stderr is
            presented in the context of what happened around the
            same time on stdout.

            If you treat them as two completely independent sources,
            then you lose that temporal context.

            So, I think it is useful to have a version that does
            interleave the stdout/stderr in whatever order it seems
            to get them. In theory you can just use partitionEithers
            to separate them if you don't want them interleaved like
            that. But that is not always the most convenient thing
            to do. It's clear that there are times when it seems
            like having stdout and stderr be separate Producers
            would be the most convenient solution.
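
As a small illustration of the `partitionEithers` point, here is a sketch over an already-collected list of chunks. Which side of the `Either` carries which stream is an assumption here (`Left` for stderr, `Right` for stdout), and the sample data is made up:

```haskell
import Data.Either (partitionEithers)

-- | Hypothetical interleaved output: Left = stderr chunk, Right = stdout chunk.
interleaved :: [Either String String]
interleaved = [Right "ok 1", Right "ok 2", Left "warning", Right "ok 3"]

-- | Split into (stderr chunks, stdout chunks). Order within each stream is
-- preserved, but the relative timing between the two streams is lost.
separated :: ([String], [String])
separated = partitionEithers interleaved
```

This works on a finished list; doing the same split on a live stream is where the separate-Producer question comes back.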

            On the other hand -- I think there is a real danger to
            have two Producers, one for stdout and one for stderr.
            Let's say you only care about stdout and you don't do
            anything with stderr. Since you are ignoring it, nobody
            is reading from stderr and now stderr is at risk of
            blocking due to having a full output buffer, and the
            whole process may then block. Even worse, maybe you do
            care about stdout and stderr, but you try to do something
            where you first write all of stdout to a file, and then
            all of stderr. You could still end up blocked. If you
            want to safely process stdout and stderr separately,
            then I think you must do that in separate threads so
            that you don't deadlock?

            I think it is necessary that we always read data from
            stdout and stderr when it becomes available, though we
            can choose to discard one or the other if we don't
            actually want it.
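
One way to sketch "always read both streams" with only `System.Process` and `forkIO`: each handle gets its own thread that reads it to the end, so neither pipe can fill up while the other is being read. The helper names `drain` and `runBoth` are hypothetical, the output is held in memory, and the example assumes a POSIX shell is available.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import System.Exit (ExitCode (..))
import System.IO (Handle, hGetContents)
import System.Process

-- | Read a handle to the end on its own thread.
-- Returns an action that waits for, and yields, the collected output.
drain :: Handle -> IO (IO String)
drain h = do
    done <- newEmptyMVar
    _ <- forkIO $ do
        s <- hGetContents h
        _ <- evaluate (length s)  -- force the lazy read to completion
        putMVar done s
    return (takeMVar done)

-- | Run a shell command, draining stdout and stderr concurrently.
runBoth :: String -> IO (ExitCode, String, String)
runBoth cmd = do
    (_, Just hOut, Just hErr, ph) <-
        createProcess (shell cmd) { std_out = CreatePipe, std_err = CreatePipe }
    waitOut <- drain hOut   -- both readers are started before either is awaited
    waitErr <- drain hErr
    out  <- waitOut
    err  <- waitErr
    code <- waitForProcess ph
    return (code, out, err)
```

A caller that only wants stdout can simply ignore the third component; stderr is still read, so the child never blocks on it.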

            Now, we should also note that a similar problem exists
            in the current code. If we start the process and use
            only writeProcess, but not readProcess, then the process
            might block trying to write output and the input will
            never get read.

            So modeling a process as a Pipe does not work, but
            modelling it as an independent Consumer and Producer is not
            entirely correct either. There is, in fact, some
            interaction between the Consumer and Producer ends of a
            process -- but not in a way that we can really reason
            about it?

            Still, I feel like allowing the user to read only stdout
            or only stderr is asking for more trouble than allowing
            the user to call only readProcess vs only writeProcess.

            Unfortunately, it is extremely easy to deadlock when
            calling a unix process that streams both inputs and
            outputs. I wonder if there is another way we can wrap a
            process into a pipe that is safer?

            - jeremy

        --
        You received this message because you are subscribed to the
        Google Groups "Haskell Pipes" group.
        To unsubscribe from this group and stop receiving emails
        from it, send an email to [email protected].
        To post to this group, send email to [email protected].




--
Patrick Wheeler
[email protected]
--

