Re: [haskell-pipes] Re: pipes-process

Patrick Wheeler Sat, 22 Feb 2014 18:51:12 -0800

Thanks for putting this out. I just got through a few experiments with
Shelly, pipes-shell, and Jeremy Shaw's pipes-process repo.


Using `cmdOut <|> cmdErr` to mix cmdOut and cmdErr works until the cmdOut
buffer is always full; at that point cmdErr output is delayed until cmdOut
is empty.

I have heard of non-determinism in the order of cmdOut and cmdErr causing
errors in the shell scripting world, though rarely.  Does anyone know of a
workaround for this?  A few Google searches tell me that you cannot
depend on the ordering of the two, and if there is a workaround it is not
standard knowledge.

I did learn that cmdOut is buffered while cmdErr normally is not.
This might be better modeled by giving cmdErr the right of way over cmdOut,
as `cmdErr <|> cmdOut`; this way the error is more likely to end up close to
the associated output, if any.
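
For reference, here is a minimal sketch of the "read both handles
continuously, in separate threads" idea discussed further down the thread.
It uses only base and System.Process rather than pipes, and the names
`Tagged`, `pump`, and `runInterleaved` are made up for illustration; they
are not part of any of the libraries mentioned. Each stream gets its own
thread, so neither pipe can fill up and block the child, and lines are
tagged Left (stderr) or Right (stdout) in the order they arrive.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import System.IO (Handle, hClose, hGetLine, hIsEOF)
import System.Process
  ( CreateProcess (..), StdStream (CreatePipe)
  , createProcess, shell, waitForProcess )

-- Hypothetical convention: Left = a stderr line, Right = a stdout line.
type Tagged = Either String String

-- Forward every line of a handle to the channel; send Nothing at EOF.
pump :: (String -> Tagged) -> Handle -> Chan (Maybe Tagged) -> IO ()
pump tag h chan = loop
  where
    loop = do
      eof <- hIsEOF h
      if eof
        then hClose h >> writeChan chan Nothing
        else do
          line <- hGetLine h
          writeChan chan (Just (tag line))
          loop

-- Run a shell command and collect its tagged lines in arrival order.
runInterleaved :: String -> IO [Tagged]
runInterleaved cmd = do
  (_, Just out, Just err, ph) <-
    createProcess (shell cmd) { std_out = CreatePipe, std_err = CreatePipe }
  chan <- newChan
  -- One reader thread per stream, so both pipes are always being drained.
  _ <- forkIO (pump Left err chan)
  _ <- forkIO (pump Right out chan)
  xs <- collect chan 2
  _ <- waitForProcess ph
  return xs

-- Read from the channel until both reader threads have reported EOF.
collect :: Chan (Maybe Tagged) -> Int -> IO [Tagged]
collect _ 0 = return []
collect chan n = do
  m <- readChan chan
  case m of
    Nothing -> collect chan (n - 1)
    Just x  -> (x :) <$> collect chan n
```

Calling `runInterleaved "echo out; echo err 1>&2"` should return both
lines, but the relative order is still whatever the threads happened to
see. This avoids the deadlock; it does not fix the ordering problem, since
the child's own stdout buffering happens before anything reaches the pipe.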

Patrick


On Sat, Feb 22, 2014 at 8:14 PM, Gabriel Gonzalez <[email protected]> wrote:

>  Alright, I wrote up what I had in mind and you can find my draft here:
>
> https://github.com/Gabriel439/pipes-process
>
>
> On 02/17/2014 05:10 AM, Daniel Díaz wrote:
>
>  I recently had to work with Ruby's 'Open3' package, and that got me
> thinking about this thread again.
>
>  I have cobbled together a few helper functions and wrappers over
> System.Process that implement some of the ideas floated in the thread.
> Ideas like avoiding deadlock by reading continuously from the handles and
> buffering the results in memory. I've also tried to avoid throwing
> exceptions, making errors explicit in the type signatures.
>
>  The repo is at https://github.com/danidiaz/process-streaming
>
>  and some examples at:
> https://github.com/danidiaz/process-streaming/blob/milestone2/examples/Main.hs
>
>  -- stdout and stderr to different files, using pipes-safe.
>  example1 :: IO (Either String ((),()))
>  example1 = ec show $
>     execute2 (proc "script1.bat" [])
>              show
>              (consume "stdout.log")
>              (consume "stderr.log")
>     where
>     consume file = surely . safely . useConsumer $
>                        S.withFile file WriteMode toHandle
>
>  The code is not exactly well tested, I must say.
>
>  Any comments or suggestions welcome!
>
> On Wednesday, December 4, 2013 7:27:16 PM UTC+1, Gabriel Gonzalez wrote:
>>
>>  If you want to keep the buffers in memory, this is exactly what
>> `pipes-concurrency` does.  Just use `spawn` to create a buffer that you can
>> write to and read from at your leisure.  It lets you specify a bounded or
>> unlimited buffer size.
>>
>> This will also make sure that consumers of the buffers properly wait for
>> more input when they exhaust the buffer and terminate when the buffer is
>> done.  You don't need to keep track of the number of bytes written to the
>> buffer.
>>
>> I'm not certain this is the best approach, yet, because I haven't had
>> time to think about this yet, but I just wanted to mention this potential
>> solution to what you just described.
>>
>> On 12/04/2013 04:23 PM, Daniel Díaz wrote:
>>
>> To avoid the possibility of filling the output buffers and blocking the
>> process, while still keeping separate stdout and stderr producers, perhaps
>> two temporary files could be created. Stdout would be written to one and
>> stderr to the other. Clients would read the temporary files as they are
>> being written, but would always block before reaching the "not yet written"
>> zone (we would ensure this by keeping track of the number of bytes written
>> to each file.)
>>
>> Or perhaps these intermediate buffers could be kept in memory, if they
>> didn't grow too big.
>>
>> Could this work?
>>
>> On Wednesday, September 25, 2013 5:48:16 AM UTC+2, Jeremy Shaw wrote:
>>>
>>> On Tue, Sep 24, 2013 at 1:46 PM, John Wiegley <[email protected]> wrote:
>>>
>>>> >>>>> Gabriel Gonzalez <[email protected]> writes:
>>>>
>>>> >     readProcess :: Process -> Producer (Either ByteString ByteString) (SafeT m) ()
>>>>
>>>>  Wouldn't it be better to give two Producers, one for stdout and one
>>>> for stdin?  They can be written to at the same time by the process,
>>>> can't they?  It would then seem odd that they can only be processed in
>>>> sequence.
>>>>
>>>
>>>  I assume you mean one for stdout and one for *stderr*?
>>>
>>>  Alas, the unix process model is so fundamentally stupid that I think
>>> we really need both variants. Many command-line apps are run from the
>>> command-line where stdout and stderr are interleaved in a somewhat
>>> arbitrary manner. But, there is some time-based information there -- even
>>> if there is a bit of fuzziness. For example, an app could print several
>>> lines of success to stdout, some error message to stderr, and more success
>>> to stdout. So, the stuff on stderr is presented in the context of what
>>> happened around the same time on stdout.
>>>
>>>  If you treat them as two completely independent sources, then you lose
>>> that temporal context.
>>>
>>>  So, I think it is useful to have a version that does interleave the
>>> stdout/stderr in whatever order it seems to get them. In theory you can
>>> just use partitionEithers to separate them if you don't want them
>>> interleaved like that. But that is not always the most convenient thing to
>>> do. It's clear that there are times when it seems like having stdout and
>>> stderr be separate Producers would be the most convenient solution.
>>>
>>>  On the other hand -- I think there is a real danger in having two
>>> Producers, one for stdout and one for stderr. Let's say you only care about
>>> stdout and you don't do anything with stderr. Since you are ignoring it,
>>> nobody is reading from stderr and now stderr is at risk of blocking due to
>>> having a full output buffer, and the whole process may then block. Even
>>> worse, maybe you do care about stdout and stderr, but you try to do something
>>> where you first write all of stdout to a file, and then all of stderr. You
>>> could still end up blocked. If you want to safely process stdout and stderr
>>> separately, then I think you must do that in separate threads so that you
>>> don't deadlock?
>>>
>>>  I think it is necessary that we always read data from stdout and
>>> stderr when it becomes available, though we can choose to discard one or
>>> the other if we don't actually want it.
>>>
>>>  Now, we should also note that a similar problem exists in the current
>>> code. If we start the process and use only writeProcess, but not
>>> readProcess, then the process might block trying to write output and the
>>> input will never get read.
>>>
>>>  So modeling a process as a Pipe does not work, but modeling it as an
>>> independent Consumer and Producer is not entirely correct either. There is,
>>> in fact, some interaction between the Consumer and Producer ends of a
>>> process -- but not in a way that we can really reason about it?
>>>
>>>  Still, I feel like allowing the user to read only stdout or only stderr
>>> is asking for more trouble than allowing the user to call only readProcess
>>> vs only writeProcess.
>>>
>>>  Unfortunately, it is extremely easy to deadlock when calling a unix
>>> process that streams both inputs and outputs. I wonder if there is another
>>> way we can wrap a process into a pipe that is safer?
>>>
>>>  - jeremy
>>>
>>>
>>>
>>>
>>>    --
>> You received this message because you are subscribed to the Google Groups
>> "Haskell Pipes" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>>
>>



-- 
Patrick Wheeler
[email protected]
[email protected]
[email protected]
