Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-17 Thread Alexander Kjeldaas
On 16 May 2011 21:31, dm-list-haskell-c...@scs.stanford.edu wrote:

 At Mon, 16 May 2011 10:56:02 +0100,
 Simon Marlow wrote:
 
  Yes, it's not actually documented as far as I know, and we should fix
  that.  But if you think about it, sequential consistency is really the
  only sensible policy: suppose one processor creates a heap object and
  writes a reference to it in the IORef, then another processor reads the
  IORef.  The writes that created the heap object must be visible to the
  second processor, otherwise it will encounter uninitialised memory and
  crash.  So sequential consistency is necessary to ensure concurrent
  programs can't crash.
 
  Now perhaps it's possible to have a relaxed memory model that provides
  the no-crashes guarantee but still allows IORef writes to be reordered
  (e.g. some kind of causal consistency).  That might be important if
 there is some processor architecture that provides that memory model, but
  as far as I know there isn't.

 Actually, in your heap object example, it sounds like you only really
 care about preserving program order, rather than write atomicity.
 Thus, you can get away with less-than-sequential consistency and not
 crash.

 The x86 is an example of a relaxed memory model that provides the
 no-crashes guarantee you are talking about.  Specifically, the x86
 deviates from sequential consistency in two ways

  1. A load can finish before an earlier store to a different memory
 location.  [intel, Sec. 8.2.3.4]

  2. A thread can read its own writes early. [intel, 8.2.3.5]

  [Section references are to the intel architecture manual, vol 3a:
   http://www.intel.com/Assets/PDF/manual/253668.pdf]

 One could imagine an implementation of IORefs that relies on the fact
 that pointer writes are atomic and that program order is preserved to
 avoid mutex overhead for most calls.  E.g.:

  struct IORef {
    spinlock_t lock;   /* Only ever used by atomicModifyIORef */
    HaskellValue *val; /* Updated atomically because pointer-sized
                          writes are atomic */
  };

  HaskellValue *
  readIORef (struct IORef *ref)
  {
    return ref->val;
  }

  void
  writeIORef (struct IORef *ref, HaskellValue *val)
  {
    /* Note that if *val was initialized in the same thread, then by
     * the time another CPU sees ref->val, it will also see the
     * correct contents of *ref->val, because stores are seen in a
     * consistent order by other processors [intel, Sec. 8.2.3.7].
     *
     * If *val was initialized in a different thread, then since this
     * thread has seen it, other threads will too, because x86
     * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
     */
    ref->val = val;
  }

  /* modifyIORef is built out of readIORef and writeIORef */

  HaskellValue *
  atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
  {
    HaskellValue *result;
    spinlock_acquire (&ref->lock);

    result = modifyIORef (ref, f);

    spinlock_release (&ref->lock);
    return result;
  }

 This is actually how I assumed IORefs worked.  But then consider the
 following program:

  maybePrint :: IORef Bool -> IORef Bool -> IO ()
  maybePrint myRef yourRef = do
    writeIORef myRef True
    yourVal <- readIORef yourRef
    unless yourVal $ putStrLn "critical section"

  main :: IO ()
  main = do
    r1 <- newIORef False
    r2 <- newIORef False
    forkOS $ maybePrint r1 r2
    forkOS $ maybePrint r2 r1
    threadDelay 100

 Under sequential consistency, the string "critical section" should be
 output at most once.  However, with the above IORef implementation on
 x86, since a read can finish before a write to a different location,
 both threads might see False for the value of yourVal.

 To prevent this deviation from sequential consistency, you would need
 to do something like stick an MFENCE instruction at the end of
 writeIORef, and that would slow down the common case where you don't
 care about sequential consistency.  In fact, I would argue that if you
 care about S.C., you should either be using atomicModifyIORef or
 MVars.


MFENCE is apparently slower than a locked add; see
http://blogs.oracle.com/dave/entry/instruction_selection_for_volatile_fences
So using MFENCE would make writeIORef slower than atomicModifyIORef, and with
weaker guarantees.  Not a good combination.

Alexander


 Can you explain what actually happens inside the real IORef
 implementation?

 As an aside, these days one sees a lot of hand-wringing over the fact
 that CPU clock rates have been flat for a while and the only way to
 get more performance is through parallelism.  "How are we going to
 teach programmers to write concurrent code when it's so hard to write
 and debug?" I've heard numerous people ask.

 Haskell could be a major step in the right direction, since in the
 absence of variables, it's impossible to have data races.  (You can
 still have deadlock and other kinds of race condition, such as the one
 in maybePrint above, if you had my definition of IORef, but data races
 are by far the most pernicious concurrency problems.)

Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread Simon Marlow

On 13/05/2011 21:12, Bernie Pope wrote:

On 13 May 2011 19:06, Simon Marlow marlo...@gmail.com wrote:

As far as memory consistency goes, we claim to provide sequential
consistency for IORef and IOArray operations, but not for peeks and
pokes.


Hi Simon,

Could you please point me to more information about the sequential
consistency of IORefs? I was looking for something about this recently
but couldn't find it. I don't see anything in the Haddock for Data.IORef.


Yes, it's not actually documented as far as I know, and we should fix 
that.  But if you think about it, sequential consistency is really the 
only sensible policy: suppose one processor creates a heap object and 
writes a reference to it in the IORef, then another processor reads the 
IORef.  The writes that created the heap object must be visible to the 
second processor, otherwise it will encounter uninitialised memory and 
crash.  So sequential consistency is necessary to ensure concurrent 
programs can't crash.


Now perhaps it's possible to have a relaxed memory model that provides 
the no-crashes guarantee but still allows IORef writes to be reordered 
(e.g. some kind of causal consistency).  That might be important if 
there is some processor architecture that provides that memory model, but 
as far as I know there isn't.


For some background there was a discussion about this on the 
haskell-prime mailing list a few years ago, I think.


Cheers,
Simon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread Bernie Pope
On 16 May 2011 19:56, Simon Marlow marlo...@gmail.com wrote:

 On 13/05/2011 21:12, Bernie Pope wrote:

Could you please point me to more information about the sequential
 consistency of IORefs? I was looking for something about this recently
 but couldn't find it. I don't see anything in the Haddock for Data.IORef.


 Yes, it's not actually documented as far as I know, and we should fix that.


Thanks Simon. I was thinking about this in the context of a blog post by
Lennart Augustsson:


http://augustss.blogspot.com/2011/04/ugly-memoization-heres-problem-that-i.html

He says that "There's no guarantee about readIORef and writeIORef when doing
multi-threading." But I was wondering if that was true, and if it were,
what the consequences would be. If you read his reply to my question on the
blog, then I believe that he was saying that sequential consistency was not
guaranteed.

If you have time to read his blog article, I wonder if you could comment on
the need (or lack of need) for MVars or atomicModifyIORef?  If I understand
correctly, it would be okay to use readIORef/writeIORef, assuming that it is
okay for some computations to be repeated.


 But if you think about it, sequential consistency is really the only
 sensible policy: suppose one processor creates a heap object and writes a
 reference to it in the IORef, then another processor reads the IORef.  The
 writes that created the heap object must be visible to the second processor,
 otherwise it will encounter uninitialised memory and crash.  So sequential
 consistency is necessary to ensure concurrent programs can't crash.


Yes, I agree, and that was what I was thinking. Otherwise well-typed
programs could go horribly wrong.

For some background there was a discussion about this on the haskell-prime
 mailing list a few years ago, I think.


Thanks, I'll try to dig it up.

Cheers,
Bernie.


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread dm-list-haskell-cafe
At Mon, 16 May 2011 10:56:02 +0100,
Simon Marlow wrote:

 Yes, it's not actually documented as far as I know, and we should fix 
 that.  But if you think about it, sequential consistency is really the 
 only sensible policy: suppose one processor creates a heap object and 
 writes a reference to it in the IORef, then another processor reads the 
 IORef.  The writes that created the heap object must be visible to the 
 second processor, otherwise it will encounter uninitialised memory and 
 crash.  So sequential consistency is necessary to ensure concurrent 
 programs can't crash.
 
 Now perhaps it's possible to have a relaxed memory model that provides
 the no-crashes guarantee but still allows IORef writes to be reordered
 (e.g. some kind of causal consistency).  That might be important if
 there is some processor architecture that provides that memory model, but
 as far as I know there isn't.

Actually, in your heap object example, it sounds like you only really
care about preserving program order, rather than write atomicity.
Thus, you can get away with less-than-sequential consistency and not
crash.

The x86 is an example of a relaxed memory model that provides the
no-crashes guarantee you are talking about.  Specifically, the x86
deviates from sequential consistency in two ways

  1. A load can finish before an earlier store to a different memory
 location.  [intel, Sec. 8.2.3.4]

  2. A thread can read its own writes early. [intel, 8.2.3.5]

  [Section references are to the intel architecture manual, vol 3a:
   http://www.intel.com/Assets/PDF/manual/253668.pdf]

One could imagine an implementation of IORefs that relies on the fact
that pointer writes are atomic and that program order is preserved to
avoid mutex overhead for most calls.  E.g.:

  struct IORef {
    spinlock_t lock;   /* Only ever used by atomicModifyIORef */
    HaskellValue *val; /* Updated atomically because pointer-sized
                          writes are atomic */
  };

  HaskellValue *
  readIORef (struct IORef *ref)
  {
    return ref->val;
  }

  void
  writeIORef (struct IORef *ref, HaskellValue *val)
  {
    /* Note that if *val was initialized in the same thread, then by
     * the time another CPU sees ref->val, it will also see the
     * correct contents of *ref->val, because stores are seen in a
     * consistent order by other processors [intel, Sec. 8.2.3.7].
     *
     * If *val was initialized in a different thread, then since this
     * thread has seen it, other threads will too, because x86
     * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
     */
    ref->val = val;
  }

  /* modifyIORef is built out of readIORef and writeIORef */

  HaskellValue *
  atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
  {
    HaskellValue *result;
    spinlock_acquire (&ref->lock);

    result = modifyIORef (ref, f);

    spinlock_release (&ref->lock);
    return result;
  }

This is actually how I assumed IORefs worked.  But then consider the
following program:

  maybePrint :: IORef Bool -> IORef Bool -> IO ()
  maybePrint myRef yourRef = do
    writeIORef myRef True
    yourVal <- readIORef yourRef
    unless yourVal $ putStrLn "critical section"

  main :: IO ()
  main = do
    r1 <- newIORef False
    r2 <- newIORef False
    forkOS $ maybePrint r1 r2
    forkOS $ maybePrint r2 r1
    threadDelay 100

Under sequential consistency, the string "critical section" should be
output at most once.  However, with the above IORef implementation on
x86, since a read can finish before a write to a different location,
both threads might see False for the value of yourVal.

To prevent this deviation from sequential consistency, you would need
to do something like stick an MFENCE instruction at the end of
writeIORef, and that would slow down the common case where you don't
care about sequential consistency.  In fact, I would argue that if you
care about S.C., you should either be using atomicModifyIORef or
MVars.
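As the paragraph above suggests, routing the write through atomicModifyIORef restores sequential consistency for this example. A hedged sketch (maybePrintSC is a hypothetical name, and this assumes atomicModifyIORef compiles to a locked read-modify-write, which is a full fence on x86):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (unless)
import Data.IORef

-- Like maybePrint, but the write is a locked read-modify-write.
-- On x86 a locked instruction is a full fence, so the readIORef
-- below cannot complete before the write is globally visible,
-- and "critical section" is printed at most once.
maybePrintSC :: IORef Bool -> IORef Bool -> IO ()
maybePrintSC myRef yourRef = do
  atomicModifyIORef myRef (\_ -> (True, ()))
  yourVal <- readIORef yourRef
  unless yourVal $ putStrLn "critical section"

main :: IO ()
main = do
  r1 <- newIORef False
  r2 <- newIORef False
  _ <- forkIO $ maybePrintSC r1 r2
  _ <- forkIO $ maybePrintSC r2 r1
  threadDelay 100000
```

The cost is exactly the locked-instruction overhead discussed elsewhere in the thread, paid only by callers that need the ordering.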

Can you explain what actually happens inside the real IORef
implementation?

As an aside, these days one sees a lot of hand-wringing over the fact
that CPU clock rates have been flat for a while and the only way to
get more performance is through parallelism.  "How are we going to
teach programmers to write concurrent code when it's so hard to write
and debug?" I've heard numerous people ask.

Haskell could be a major step in the right direction, since in the
absence of variables, it's impossible to have data races.  (You can
still have deadlock and other kinds of race condition, such as the one
in maybePrint above, if you had my definition of IORef, but data races
are by far the most pernicious concurrency problems.)  Of course, the
key to making Haskell useful in a parallel setting is that things like
the memory model have to be fully specified...

Thanks,
David


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread dm-list-haskell-cafe
At Tue, 17 May 2011 02:18:55 +1000,
Bernie Pope wrote:
 
  http://augustss.blogspot.com/2011/04/ugly-memoization-heres-problem-that-i.html
 
 He says that "There's no guarantee about readIORef and writeIORef when doing
 multi-threading." But I was wondering if that was true, and if it were, what
 the consequences would be. If you read his reply to my question on the blog,
 then I believe that he was saying that sequential consistency was not
 guaranteed.

While I don't know how IORefs work and I'd love to understand this
better, I can't imagine any IORef implementation in which memoIO (in
the blog post) would give the wrong answer on x86.  It might, of
course, cause f x to be evaluated multiple times on the same x.

However, on other CPUs (e.g., the DEC alpha), there could maybe, maybe
be issues.  Though I'm not sure, since to avoid crashes, the alpha
implementation of IORef would need to include a memory barrier.  The
question is whether there is an architecture in which IORef avoids
crashes AND memoIO can give you the wrong answer.  Also, if Simon's
original post means that IORef operations all contain barrier
instructions, it could be that memoIO is simply correct and the blog
post is simply wrong about needing MVars.
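For reference, the memoIO under discussion looks roughly like this (a reconstruction from the blog post's description, not its exact code). Note why a lost update is benign: a racing thread can at worst overwrite the map and drop a cache entry, so f x may be recomputed, but the value returned is always f x:

```haskell
import Data.IORef
import qualified Data.Map as Map

-- Memoize a pure function with a plain IORef holding a Map.
-- readIORef/writeIORef are not atomic as a pair, so two threads
-- can both miss and both compute f x; neither gets a wrong answer.
memoIO :: Ord a => (a -> b) -> IO (a -> IO b)
memoIO f = do
  ref <- newIORef Map.empty
  return $ \x -> do
    m <- readIORef ref
    case Map.lookup x m of
      Just y  -> return y
      Nothing -> do
        let y = f x
        writeIORef ref (Map.insert x y m)
        return y
```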

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread Simon Marlow

On 16/05/11 20:31, dm-list-haskell-c...@scs.stanford.edu wrote:

At Mon, 16 May 2011 10:56:02 +0100,
Simon Marlow wrote:


Yes, it's not actually documented as far as I know, and we should fix
that.  But if you think about it, sequential consistency is really the
only sensible policy: suppose one processor creates a heap object and
writes a reference to it in the IORef, then another processor reads the
IORef.  The writes that created the heap object must be visible to the
second processor, otherwise it will encounter uninitialised memory and
crash.  So sequential consistency is necessary to ensure concurrent
programs can't crash.

Now perhaps it's possible to have a relaxed memory model that provides
the no-crashes guarantee but still allows IORef writes to be reordered
(e.g. some kind of causal consistency).  That might be important if
there is some processor architecture that provides that memory model, but 
as far as I know there isn't.


Actually, in your heap object example, it sounds like you only really
care about preserving program order, rather than write atomicity.
Thus, you can get away with less-than-sequential consistency and not
crash.

The x86 is an example of a relaxed memory model that provides the
no-crashes guarantee you are talking about.  Specifically, the x86
deviates from sequential consistency in two ways

   1. A load can finish before an earlier store to a different memory
  location.  [intel, Sec. 8.2.3.4]

   2. A thread can read its own writes early. [intel, 8.2.3.5]

   [Section references are to the intel architecture manual, vol 3a:
http://www.intel.com/Assets/PDF/manual/253668.pdf]

One could imagine an implementation of IORefs that relies on the fact
that pointer writes are atomic and that program order is preserved to
avoid mutex overhead for most calls.  E.g.:

  struct IORef {
    spinlock_t lock;   /* Only ever used by atomicModifyIORef */
    HaskellValue *val; /* Updated atomically because pointer-sized
                          writes are atomic */
  };

  HaskellValue *
  readIORef (struct IORef *ref)
  {
    return ref->val;
  }

  void
  writeIORef (struct IORef *ref, HaskellValue *val)
  {
    /* Note that if *val was initialized in the same thread, then by
     * the time another CPU sees ref->val, it will also see the
     * correct contents of *ref->val, because stores are seen in a
     * consistent order by other processors [intel, Sec. 8.2.3.7].
     *
     * If *val was initialized in a different thread, then since this
     * thread has seen it, other threads will too, because x86
     * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
     */
    ref->val = val;
  }

  /* modifyIORef is built out of readIORef and writeIORef */

  HaskellValue *
  atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
  {
    HaskellValue *result;
    spinlock_acquire (&ref->lock);

    result = modifyIORef (ref, f);

    spinlock_release (&ref->lock);
    return result;
  }

This is actually how I assumed IORefs worked.


Right, that is how IORefs work. (well, atomicModifyIORef is a bit 
different, but the differences aren't important here)



But then consider the
following program:

  maybePrint :: IORef Bool -> IORef Bool -> IO ()
  maybePrint myRef yourRef = do
    writeIORef myRef True
    yourVal <- readIORef yourRef
    unless yourVal $ putStrLn "critical section"

  main :: IO ()
  main = do
    r1 <- newIORef False
    r2 <- newIORef False
    forkOS $ maybePrint r1 r2
    forkOS $ maybePrint r2 r1
    threadDelay 100

Under sequential consistency, the string "critical section" should be
output at most once.  However, with the above IORef implementation on
x86, since a read can finish before a write to a different location,
both threads might see False for the value of yourVal.

To prevent this deviation from sequential consistency, you would need
to do something like stick an MFENCE instruction at the end of
writeIORef, and that would slow down the common case where you don't
care about sequential consistency.  In fact, I would argue that if you
care about S.C., you should either be using atomicModifyIORef or
MVars.


Good example - so it looks like we don't get full sequential consistency 
on x86 (actually I'd been thinking only about write ordering and 
forgetting that reads could be reordered around writes).


But that's bad because it means Haskell has a memory model, and we have 
to say what it is, or at least say that ordering is undefined.


In practice I don't think anyone actually does use IORef in this way. 
Typically you need at least one atomicModifyIORef somewhere, and that 
acts as a barrier.



As an aside, these days one sees a lot of hand-wringing over the fact
that CPU clock rates have been flat for a while and the only way to
get more performance is through parallelism.  "How are we going to
teach programmers to write concurrent code when it's so hard to write
and debug?" I've heard numerous people ask.

Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-13 Thread Simon Marlow

On 12/05/2011 18:24, dm-list-haskell-c...@scs.stanford.edu wrote:

At Thu, 12 May 2011 16:45:02 +0100,
Simon Marlow wrote:



There are no locks here, thanks to the message-passing implementation we
use for throwTo between processors.


Okay, that sounds good.  So then there is no guarantee about ordering
of throwTo exceptions?  That seems like a good thing since there are
other mechanisms for synchronization.


What kind of ordering guarantee did you have in mind?  We do guarantee
that in

 throwTo t e1
 throwTo t e2

Thread t will receive e1 before e2 (obviously, because throwTo is
synchronous and only returns when the exception has been raised).
...
Pending exceptions are processed in LIFO order (for no good reason other
than performance)...


I mean, suppose you have three CPUs, A, B, and C running threads ta,
tb, and tc.  Then should the following order of events be permitted?

      A                  B                  C
  throwTo tc e1
  throwTo tb e2
                     catch e2
                     poke p x
  peek p (sees x)
                                        catch e1

I would argue that this is just fine, that one should rely on MVars if
one cares about ordering.  But I'm not sure what "Pending exceptions
are processed in LIFO order" means in the presence of relaxed memory
consistency.


Oh, that can't happen.  A's first throwTo only returns when the 
exception has been raised in C - throwTo is like a synchronous 
communication in this sense.


We went to-and-fro on this aspect of the throwTo design a few times. 
The synchronous semantics for throwTo tends to be more useful for the 
programmer, but is harder to implement.  If you want asynchronous 
throwTo, you can always get it with forkIO.throwTo.
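The forkIO-plus-throwTo idiom mentioned here amounts to a one-liner (a sketch; asyncThrowTo is a hypothetical name, not a GHC API):

```haskell
import Control.Concurrent (ThreadId, forkIO, throwTo)
import Control.Exception (Exception)
import Control.Monad (void)

-- Fire-and-forget throwTo: unlike throwTo itself, the caller does
-- not block until the exception has been raised in the target
-- thread, so no ordering across the two throws is implied.
asyncThrowTo :: Exception e => ThreadId -> e -> IO ()
asyncThrowTo t e = void . forkIO $ throwTo t e
```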


As far as memory consistency goes, we claim to provide sequential 
consistency for IORef and IOArray operations, but not for peeks and pokes.



The reason I'm asking is that I want to make sure I never end up
having to pay the overhead of an MFENCE instruction or equivalent
every time I use unmaskAsyncExceptions#...


Right, I don't think that should be necessary.

Cheers,
Simon



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-13 Thread Bernie Pope
On 13 May 2011 19:06, Simon Marlow marlo...@gmail.com wrote:

As far as memory consistency goes, we claim to provide sequential
 consistency for IORef and IOArray operations, but not for peeks and pokes.


Hi Simon,

Could you please point me to more information about the sequential
consistency of IORefs? I was looking for something about this recently but
couldn't find it. I don't see anything in the Haddock for Data.IORef.

Cheers,
Bernie.


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-12 Thread Simon Marlow

On 11/05/2011 23:57, dm-list-haskell-c...@scs.stanford.edu wrote:

At Wed, 11 May 2011 13:02:21 +0100,
Simon Marlow wrote:



However, if there's some simpler way to guarantee that >>= is the
point where exceptions are thrown (and might be the case for GHC in
practice), then I basically only need to update the docs.  If someone
with more GHC understanding could explain how asynchronous exceptions
work, I'd love to hear it...


There's no guarantee of the form that you mention - asynchronous
exceptions can occur anywhere.  However, there might be a way to do what
you want (disclaimer: I haven't looked at the implementation of iterIO).

Control.Exception will have a new operation in 7.2.1:

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

which allows an asynchronous exception to be thrown inside mask (until
7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).
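A sketch of the intended use: a long computation run under mask_ that polls for pending asynchronous exceptions only at points where interruption is safe (processAll is a hypothetical name; allowInterrupt is defined exactly as quoted above):

```haskell
import Control.Exception (mask_)
import Control.Monad (forM_)
import GHC.IO (unsafeUnmask)

-- As in the message above; provided by Control.Exception from 7.2.1.
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

-- Process items with async exceptions masked, but open a window
-- for a pending throwTo between items.
processAll :: [a] -> (a -> IO ()) -> IO ()
processAll xs step = mask_ $ forM_ xs $ \x -> do
  step x          -- runs masked: no exception in mid-step
  allowInterrupt  -- safe point: a pending exception may fire here
```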


So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:

  9 ns - return () :: IO ()   -- baseline (meaningless in itself)
 13 ns - unsafeUnmask $ return () -- with interrupts enabled
 18 ns - unsafeUnmask $ return () -- inside a mask_

 13 ns - ffi  -- a null FFI call (getpid cached by libc)
 18 ns - unsafeUnmask ffi -- with interrupts enabled
 22 ns - unsafeUnmask ffi -- inside a mask_


Those are lower than I was expecting, but look plausible.  There's room 
for improvement too (by inlining some or all of unsafeUnmask#).


However, the general case of unsafeUnmask E, where E is something more 
complex than return (), will be more expensive because a new closure for 
E has to be created.  E.g. try "return x" instead of "return ()", and 
try to make sure that the closure has to be created once per 
unsafeUnmask, not lifted out and shared.



131 ns - syscall  -- getppid through FFI
135 ns - unsafeUnmask syscall -- with interrupts enabled
140 ns - unsafeUnmask syscall -- inside a mask_



So it seems that the cost of calling unsafeUnmask inside every liftIO
would be about 22 cycles per liftIO invocation, which seems eminently
reasonable.  You could then safely run your whole program inside a big
mask_ and not worry about exceptions happening between >>=
invocations.  Though truly compute-intensive workloads could have
issues, the kind of applications targeted by iterIO will spend most of
their time doing I/O, so this shouldn't be an issue.

Better yet, for programs that don't use asynchronous exceptions, if
you don't put your whole program inside a mask_, the cost drops
roughly in half.  It's hard to imagine any real application whose
performance would take a significant hit because of an extra 11 cycles
per liftIO.

Is there anything I'm missing?  For instance, my machine only has one
CPU, and the tests all ran with one thread.  Does
unmaskAsyncExceptions# acquire a spinlock that could lock the memory
bus?  Or is there some other reason unsafeUnmask could become
expensive on NUMA machines, or in the presence of concurrency?


There are no locks here, thanks to the message-passing implementation we 
use for throwTo between processors.  unmaskAsyncExceptions# basically 
pushes a small stack frame, twiddles a couple of bits in the thread 
state, and checks a word in the thread state to see whether any 
exceptions are pending.  The stack frame untwiddles the bits again and 
returns.


Cheers,
Simon







Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-12 Thread Simon Marlow

On 12/05/2011 16:04, David Mazieres expires 2011-08-10 PDT wrote:

At Thu, 12 May 2011 09:57:13 +0100,
Simon Marlow wrote:



So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:

   9 ns - return () :: IO ()   -- baseline (meaningless in itself)
  13 ns - unsafeUnmask $ return () -- with interrupts enabled
  18 ns - unsafeUnmask $ return () -- inside a mask_

  13 ns - ffi  -- a null FFI call (getpid cached by libc)
  18 ns - unsafeUnmask ffi -- with interrupts enabled
  22 ns - unsafeUnmask ffi -- inside a mask_


Those are lower than I was expecting, but look plausible.  There's room
for improvement too (by inlining some or all of unsafeUnmask#).


Do you mean inline unsafeUnmask, or unmaskAsyncExceptions#?  I tried
inlining unsafeUnmask by writing my own version and giving it the
INLINE pragma, and it didn't affect performance at all.


Right, I meant inlining unmaskAsyncExceptions#, which would require 
compiler support.



However, the general case of unsafeUnmask E, where E is something more
complex than return (), will be more expensive because a new closure for
E has to be created.  E.g. try "return x" instead of "return ()", and
try to make sure that the closure has to be created once per
unsafeUnmask, not lifted out and shared.


Okay.  I'm surprised the getpid example wasn't already stressing this,
but, indeed, I see a tiny difference with the following code:

ffi >>= return . (1 +) -- where ffi calls getpid

13 ns - no unmasking
20 ns - unsafeUnmask when not inside mask_
25 ns - unsafeUnmask when benchmark loop is inside one big mask_

So now we're talking about 28 cycles or something instead of 22.
Still not a huge deal.


Ok, sounds reasonable.


There are no locks here, thanks to the message-passing implementation we
use for throwTo between processors.


Okay, that sounds good.  So then there is no guarantee about ordering
of throwTo exceptions?  That seems like a good thing since there are
other mechanisms for synchronization.


What kind of ordering guarantee did you have in mind?  We do guarantee 
that in


   throwTo t e1
   throwTo t e2

Thread t will receive e1 before e2 (obviously, because throwTo is 
synchronous and only returns when the exception has been raised).


Pending exceptions are processed in LIFO order (for no good reason other 
than performance), so there's no kind of fairness guarantee of the kind 
you get with MVars.  One thread doing throwTo can be starved by others. 
 I don't think that's a serious problem.


Cheers,
Simon





Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-12 Thread dm-list-haskell-cafe
At Thu, 12 May 2011 16:45:02 +0100,
Simon Marlow wrote:
 
  There are no locks here, thanks to the message-passing implementation we
  use for throwTo between processors.
 
  Okay, that sounds good.  So then there is no guarantee about ordering
  of throwTo exceptions?  That seems like a good thing since there are
  other mechanisms for synchronization.
 
 What kind of ordering guarantee did you have in mind?  We do guarantee 
 that in
 
 throwTo t e1
 throwTo t e2
 
 Thread t will receive e1 before e2 (obviously, because throwTo is 
 synchronous and only returns when the exception has been raised).
 ...
 Pending exceptions are processed in LIFO order (for no good reason other 
 than performance)...

I mean, suppose you have three CPUs, A, B, and C running threads ta,
tb, and tc.  Then should the following order of events be permitted?

      A                  B                  C
  throwTo tc e1
  throwTo tb e2
                     catch e2
                     poke p x
  peek p (sees x)
                                        catch e1

I would argue that this is just fine, that one should rely on MVars if
one cares about ordering.  But I'm not sure what "Pending exceptions
are processed in LIFO order" means in the presence of relaxed memory
consistency.

The reason I'm asking is that I want to make sure I never end up
having to pay the overhead of an MFENCE instruction or equivalent
every time I use unmaskAsyncExceptions#...

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread dm-list-haskell-cafe
At Mon, 9 May 2011 17:55:17 +0100,
John Lato wrote:
 
 Felipe Almeida Lessa wrote:

  So, in the enumerator vs. iterIO challenge, the only big differences I
 see are:
 
   a) iterIO has a different exception handling mechanism.
   b) iterIO can have pure iteratees that don't touch the monad.
   c) iterIO's iteratees can send control messages to their enumerators.
   d) iterIO's enumerators are enumeratees, but enumerator's enumerators
  are simpler.
   e) enumerator has fewer dependencies.
   f) enumerator uses conventional nomenclature.
   g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
  MPTC and functional dependencies).
 

 'a' is important, but I think a lot of people underestimate the
 value of 'c', which is why a control system was implemented in
 'iteratee'. ...  it's relatively simple for an enumerator to return
 data to an iteratee using an IORef for example.

Would you just embed IORefs for the result into an Exception type?
That's actually a pretty simple solution when you can do it.  It's a
bit harder for my setting, because I'm using this stuff in support of
a research project that doesn't make the IO Monad available to most
code.  I'd like to write Inums/Enumeratees that work with both the IO
Monad and my own weird monads.  This is admittedly a fringe problem,
so IORef is probably fine for most settings.  But if there's any
possible way you could do it with STRefs, that would be really cool...
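For illustration, here is a hedged sketch of the "IORef embedded in an Exception" idea (the type and its field are hypothetical, not from iteratee or iterIO): the iteratee throws the request, and whoever catches it fills in the slot.

```haskell
import Control.Exception
import Data.IORef

-- Hypothetical control request: the iteratee throws this, and the
-- enumerator that catches it writes its answer into the IORef.
data PositionRequest = PositionRequest (IORef Integer)

instance Show PositionRequest where
  show _ = "PositionRequest"

instance Exception PositionRequest
```

The enumerator would pattern-match the exception out of SomeException, write into the shared IORef, and resume the iteratee.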

After further thought, though, I'm still not 100% satisfied with
iterIO's control mechanism.  Someone earlier in this thread pointed
out that my SSL module doesn't support STARTTLS particularly
conveniently.  I read that and decided to go add a function to make
STARTTLS really convenient.  What I came up with ended up using MVars
to communicate the switch from the enumerator to the iteratee and was
ugly enough that I did not commit it.

What you really want is the ability to send both upstream and
downstream control messages.  Right now, I'd say iterIO has better
support for upstream control messages, while iteratee has better
support for downstream messages, since iteratee can just embed an
Exception in a Stream.  (I'm assuming you could have something like a
'Flush' exception to cause output to be flushed by an Iteratee that
was for some reason buffering some.)

I'm curious how this works in practice, though.  What is the
convention for Enumeratees receiving Exceptions they don't know about
in the Stream?  Are they supposed to throw the exceptions upwards
(which wouldn't help), or propagate them downwards.  And how do they
synchronize exceptions?  Suppose you have a pipeline with an
Enumeratee transcoding utf8 bytes to Chars, and another implementing
text compression or something that requires buffering:

   ByteString  +--------------+  [Char]   +--------+  [Char]
   ----------> | UTF8-DECODER | --------> | BUFFER | ------->
               +--------------+           +--------+

Now say a Stream with EOF (Just Flush) arrives at the UTF8-DECODER in
the middle of a multi-byte character.  Do you defer the Flush until
the character is complete, or let it skip ahead to the end of the
previous character and immediately send it to the next state?  Or,
worse, propagate it back up as an uncaught exception?

  And adding support to keep track of the stream position would be a
 pretty simple (and possibly desirable) change.

Can you explain how iteratee could keep track of the stream position?
I'm not saying it's impossible, just that it's a challenging puzzle to
make the types come out and I'd love to see the solution.  Somehow you
would need to pass the onCont continuation to itself to preserve it,
and then type a gets in the way because it's possibly no longer the
right type.  In other words, you could try something like:

{-# LANGUAGE Rank2Types #-}

data Iteratee s m a =
    Iteratee (forall r. (a -> Stream s -> m r) -> OnCont s m a r -> m r)

data OnCont s m a r =
    OnCont (OnCont s m a r -> (Stream s -> Iteratee s m a)
            -> Maybe SomeException -> m r)

But now I have no way of using or unpacking the OnCont in an Iteratee
that doesn't return type a, and in general a control handler has no
idea what type the iteratee that threw the exception has--it's in fact
likely a different type from whatever enclosing function is wrapped by
a catch call.  Even if you do solve the type problem, another problem
is that you don't know how many times you need to call the
continuation function before you stop getting buffered data and start
actually causing IO to happen.

Part of the reason iterIO doesn't have this problem is that iterIO's
Chunk structure (which is vaguely equivalent to iteratee's Stream) is
a Monoid, so it's really easy to save up multiple chunks of residual
and ungotten data.  Every Iter is passed all buffered data of its
input type in its entirety (and the inner pipeline stages can 
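For reference, a simplified sketch of such a Monoid-al chunk type (payload plus an EOF flag; this is an approximation for illustration, not iterIO's exact definition):

```haskell
-- Simplified sketch of an iterIO-style chunk: buffered data plus an
-- EOF flag.  Concatenation appends the payloads and remembers
-- whether either side had seen EOF.
data Chunk t = Chunk t Bool deriving (Eq, Show)

instance Semigroup t => Semigroup (Chunk t) where
  Chunk a ea <> Chunk b eb = Chunk (a <> b) (ea || eb)

instance Monoid t => Monoid (Chunk t) where
  mempty = Chunk mempty False
```

Because chunks compose like this, saving up residual and ungotten data is just mappend.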

Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread Simon Marlow

On 06/05/2011 16:56, dm-list-haskell-c...@scs.stanford.edu wrote:

At Fri, 6 May 2011 10:15:50 +0200,
Gregory Collins wrote:


Hi David,

Re: this comment from catchI:


It is not possible to catch asynchronous exceptions, such as
lazily evaluated divide-by-zero errors, the throw function, or
exceptions raised by other threads using throwTo if those
exceptions might arrive anywhere outside of a liftIO call.


It might be worth investigating providing a version which can catch
asynchronous exceptions if the underlying monad supports it (via
MonadCatchIO or something similar). One of the most interesting
advantages I can see for IterIO over the other iteratee
implementations is that you actually have some control over resource
usage -- not being able to catch asynchronous exceptions nullifies
much of that advantage. A clear use case for this is timeouts on
server threads, where you typically throw a TimeoutException exception
to the handling thread using throwTo if the timeout is exceeded.


Excellent point.  There's actually a chance that iterIO already
catches those kinds of exceptions, but I wasn't sure enough about how
the Haskell runtime works to make that claim.  I've noticed in
practice that asynchronous exceptions tend to come exactly when I
execute the IO >>= operation.  If that's true, then since each IO >>=
is wrapped in a try block, the exceptions will all be caught (well,
not divide by zero, but things like throwTo, which I think are more
important).

One way I was thinking of implementing this was wrapping the whole
execution in block, and then calling unblock (unless iterIO's own
hypothetical block function is called) for every invocation of liftIO.
Unfortunately, the block and unblock functions now seem to be
deprecated, and the replacement mask/unmask ones would not be as
amenable to this technique.

However, if there's some simpler way to guarantee that >>= is the
point where exceptions are thrown (as might be the case for GHC in
practice), then I basically only need to update the docs.  If someone
with more GHC understanding could explain how asynchronous exceptions
work, I'd love to hear it...


There's no guarantee of the form that you mention - asynchronous 
exceptions can occur anywhere.  However, there might be a way to do what 
you want (disclaimer: I haven't looked at the implementation of iterIO).


Control.Exception will have a new operation in 7.2.1:

  allowInterrupt :: IO ()
  allowInterrupt = unsafeUnmask $ return ()

which allows an asynchronous exception to be thrown inside mask (until 
7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).


As I like saying, mask switches from fully asynchronous mode to polling 
mode, and allowInterrupt is the way you poll.
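A hedged sketch of that polling pattern, with allowInterrupt defined locally as in the message so it also works before 7.2.1:

```haskell
import Control.Exception hiding (allowInterrupt)
import GHC.IO (unsafeUnmask)

-- The pre-7.2.1 definition from the message above:
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask (return ())

-- A loop that runs masked but polls for pending asynchronous
-- exceptions exactly once per iteration.
maskedLoop :: Int -> IO ()
maskedLoop 0 = return ()
maskedLoop n = mask_ $ do
  allowInterrupt              -- the one point an async exception can land
  maskedLoop (n - 1)
```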


Cheers,
Simon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread dm-list-haskell-cafe
At Wed, 11 May 2011 13:02:21 +0100,
Simon Marlow wrote:
 
 There's no guarantee of the form that you mention - asynchronous 
 exceptions can occur anywhere.  However, there might be a way to do what 
 you want (disclaimer: I haven't looked at the implementation of iterIO).
 
 Control.Exception will have a new operation in 7.2.1:
 
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()
 
 which allows an asynchronous exception to be thrown inside mask (until 
 7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).

Ah.  I didn't know about unsafeUnmask.  Is unmaskAsyncExceptions# low
enough overhead that it would be reasonable to wrap every invocation
of liftIO in unsafeUnmask?

I'm now thinking it might be reasonable to execute all liftIO actions
inside unsafeUnmask (with maybe a special liftIOmasked function for
those few places where you don't want asynchronous exceptions).  Most
of the uses of mask are because you need two or more binds to execute
without interruption, e.g.:

bracket before after thing =
  mask $ \restore -> do
    a <- before
    -- Big problem if exception happens here --
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r

But when bind sites are the only place an exception can be thrown,
things get a lot simpler.  For instance, it is perfectly reasonable to
write:

bracket before after thing = do
  a <- before
  thing a `finallyI` after a

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread dm-list-haskell-cafe
At Wed, 11 May 2011 13:02:21 +0100,
Simon Marlow wrote:
 
  However, if there's some simpler way to guarantee that >>= is the
  point where exceptions are thrown (as might be the case for GHC in
  practice), then I basically only need to update the docs.  If someone
  with more GHC understanding could explain how asynchronous exceptions
  work, I'd love to hear it...
 
 There's no guarantee of the form that you mention - asynchronous 
 exceptions can occur anywhere.  However, there might be a way to do what 
 you want (disclaimer: I haven't looked at the implementation of iterIO).
 
 Control.Exception will have a new operation in 7.2.1:
 
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()
 
 which allows an asynchronous exception to be thrown inside mask (until 
 7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).

So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:

  9 ns - return () :: IO ()        -- baseline (meaningless in itself)
 13 ns - unsafeUnmask $ return ()  -- with interrupts enabled
 18 ns - unsafeUnmask $ return ()  -- inside a mask_

 13 ns - ffi                       -- a null FFI call (getpid cached by libc)
 18 ns - unsafeUnmask ffi          -- with interrupts enabled
 22 ns - unsafeUnmask ffi          -- inside a mask_

131 ns - syscall                   -- getppid through FFI
135 ns - unsafeUnmask syscall      -- with interrupts enabled
140 ns - unsafeUnmask syscall      -- inside a mask_

So it seems that the cost of calling unsafeUnmask inside every liftIO
would be about 22 cycles per liftIO invocation, which seems eminently
reasonable.  You could then safely run your whole program inside a big
mask_ and not worry about exceptions happening between >>=
invocations.  Though truly compute-intensive workloads could have
issues, the kind of applications targeted by iterIO will spend most of
their time doing I/O, so this shouldn't be an issue.

Better yet, for programs that don't use asynchronous exceptions, if
you don't put your whole program inside a mask_, the cost drops
roughly in half.  It's hard to imagine any real application whose
performance would take a significant hit because of an extra 11 cycles
per liftIO.
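For anyone wanting to reproduce this on their own machine, a rough harness along these lines (using the CPU-time clock from base; its resolution is coarse, so treat the numbers as indicative only):

```haskell
import Control.Exception (mask_)
import Control.Monad (replicateM_)
import GHC.IO (unsafeUnmask)
import System.CPUTime (getCPUTime)

-- Average cost of an IO action in picoseconds per call, measured
-- crudely over n repetitions with the CPU-time clock.
perCall :: Int -> IO () -> IO Integer
perCall n act = do
  t0 <- getCPUTime
  replicateM_ n act
  t1 <- getCPUTime
  return ((t1 - t0) `div` fromIntegral n)

-- The three conditions from the table: bare, unmasked, and
-- unmasked inside a mask_.
conditions :: IO () -> IO (Integer, Integer, Integer)
conditions act = do
  bare <- perCall 100000 act
  unm  <- perCall 100000 (unsafeUnmask act)
  msk  <- mask_ (perCall 100000 (unsafeUnmask act))
  return (bare, unm, msk)
```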

Is there anything I'm missing?  For instance, my machine only has one
CPU, and the tests all ran with one thread.  Does
unmaskAsyncExceptions# acquire a spinlock that could lock the memory
bus?  Or is there some other reason unsafeUnmask could become
expensive on NUMA machines, or in the presence of concurrency?

Thanks,
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-09 Thread John Lato

 From: dm-list-haskell-c...@scs.stanford.edu

 At Fri, 6 May 2011 10:10:26 -0300,
 Felipe Almeida Lessa wrote:

  So, in the enumerator vs. iterIO challenge, the only big differences I
 see are:
 
   a) iterIO has a different exception handling mechanism.
   b) iterIO can have pure iteratees that don't touch the monad.
   c) iterIO's iteratees can send control messages to their enumerators.
   d) iterIO's enumerators are enumeratees, but enumerator's enumerators
  are simpler.
   e) enumerator has fewer dependencies.
   f) enumerator uses conventional nomenclature.
   g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
  MPTC and functional dependencies).
 
  Anything that I missed?
 
  The bottomline: the biggest advantage I see right now in favor of
  iterIO is c),

 I basically agree with this list, but think you are underestimating
 the value of a.  I would rank a as the most important difference
 between the packages.  (a also is the reason for d.)


'a' is important, but I think a lot of people underestimate the value of
'c', which is why a control system was implemented in 'iteratee'.  I would
argue that iteratee's control system is more powerful than you say.  For
example, the only reason iteratee can't implement 'tell' is that it doesn't
keep track of the position in the stream; it's relatively simple for an
enumerator to return data to an iteratee using an IORef, for example.  And
adding support to keep track of the stream position would be a pretty simple
(and possibly desirable) change.  But it's definitely not as sophisticated
as IterIO, and probably won't become so unless I have need of those
features.

I like the MonadTrans implementation a lot.  The vast majority of iteratees
are pure, and GHC typically produces more efficient code for pure functions,
so this is possibly a performance win.  Although it makes something like the
mutable-iter package very difficult to implement...

John Lato
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-08 Thread Richard O'Keefe

On 7/05/2011, at 2:44 PM, Mario Blažević wrote:
 As I said, the most usual name for the Enumerator concept would be Generator.
 That term is already used in several languages to signify this kind of
 restricted coroutine. I'm not aware of any good alternative naming for 
 Iteratee.

This being Haskell, I'm expecting to see Cogenerator (:-) (:-).


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-07 Thread Maciej Marcin Piechotka
Sorry for third post but I wonder why the many instances are restricted
by Monad.

Both Functor and Applicative can by constructed without Monad:

 instance (Functor m) => Functor (CtlArg t m) where
     fmap f (CtlArg arg g c) = CtlArg arg (fmap f . g) c

 instance (Functor m) => Functor (Iter t m) where
     {-# INLINE fmap #-}
     fmap f (Iter g) = Iter (fmap f . g)

 instance (Functor m) => Functor (IterR t m) where
     fmap f (IterF i) = IterF (fmap f i)
     fmap f (IterM i) = IterM (fmap (fmap f) i)
     fmap f (IterC c) = IterC (fmap f c)
     fmap f (Done a c) = Done (f a) c
     fmap f (Fail i m mc) = Fail i (fmap f m) mc

 instance (Functor m) => Applicative (Iter t m) where
     {-# INLINE pure #-}
     pure x = Iter $ Done x
     {-# INLINE (<*>) #-}
     Iter a <*> bi@(Iter b) = Iter $ \c -> fix (\f ir -> case ir of
         IterF cont -> cont <*> bi
         IterM m -> IterM $ fmap f m
         IterC (CtlArg a cn ch) ->
             IterC (CtlArg a (\r -> cn r <*> bi) ch)
         Done v ch -> fmap v (b ch)
         Fail f _ ch -> Fail f Nothing ch) (a c)

Since every monad is applicative (or rather should be), it doesn't lose
generality.

joinI can also be defined using only Functor:

 joinI :: (Functor m) => Iter t m (Iter t m a) -> Iter t m a
 joinI (Iter i) = Iter $ \c -> fix (\f x -> case x of
      IterF cont -> IterF (joinI cont)
      IterM m -> IterM $ fmap f m
      IterC (CtlArg a cn ch) ->
          IterC (CtlArg a (\r -> joinI (cn r)) ch)
      Done v ch -> runIter v ch
      Fail f _ ch -> Fail f Nothing ch) (i c)

Regards

PS. I haven't tested the code or benchmarked it - but it seems it is
possible.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-07 Thread dm-list-haskell-cafe
At Sat, 07 May 2011 21:50:13 +0100,
Maciej Marcin Piechotka wrote:
 
 Sorry for third post but I wonder why the many instances are restricted
 by Monad.

It would be great if Functor were a superclass of Monad.  However,
since it isn't, and since I can't think of anything particularly
useful to do with Iters over Functors that aren't also Monads, I'd
rather just pass one dictionary around than two.  So my convention
throughout the library is that m has to be a Monad but doesn't have to
be a Functor.

In general, I try to place as few requirements in the contexts of
functions as possible.  However, I also want to be able to call most
functions from most other ones.  If some of the useful low-level
functions end up requiring Functor, then most functions in the library
are going to end up requiring (Functor m, Monad m) = instead of
(Monad m) =, which will actually end up increasing the amount of
stuff in contexts.

(Of course, (Iter t m) itself is an Applicative Functor, even when m
is just a Monad.  So that I make use of in the parsing module.)
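The generic fact behind that parenthetical, that a Monad dictionary alone suffices to define Functor and Applicative operations, can be sketched with a hypothetical wrapper (not iterIO's actual instances):

```haskell
import Control.Monad (ap, liftM)

-- Hypothetical wrapper: given only Monad m, all three classes
-- for (Wrap m) fall out of return and (>>=).
newtype Wrap m a = Wrap { unWrap :: m a }

instance Monad m => Functor (Wrap m) where
  fmap = liftM               -- derived from (>>=) and return

instance Monad m => Applicative (Wrap m) where
  pure  = Wrap . return
  (<*>) = ap                 -- likewise derived

instance Monad m => Monad (Wrap m) where
  return = pure
  Wrap m >>= k = Wrap (m >>= unWrap . k)
```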

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-07 Thread wren ng thornton

On 5/7/11 5:15 PM, dm-list-haskell-c...@scs.stanford.edu wrote:

In general, I try to place as few requirements in the contexts of
functions as possible.


One counterargument to this philosophy is that there are many cases 
where fmap can be defined more efficiently than the liftM derived from 
return and (=). Similarly, the applicative operators (*) and (*) 
often admit more efficient implementations than the default.


So, when dealing with monads that have those more efficient definitions, 
you're restricting performance unnecessarily by forcing them to use the 
generic monadic definitions. There's nothing wrong with having multiple 
constraints in the context.
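For reference, these are the generic definitions in question; both route through (>>=) and return, which is why a hand-written fmap or (<*>) can often skip intermediate structure (primed names used here to avoid clashing with Control.Monad's exports):

```haskell
-- The derived liftM wren refers to: one bind plus one return.
liftM' :: Monad m => (a -> b) -> m a -> m b
liftM' f m = m >>= \x -> return (f x)

-- The derived ap: two binds plus a return.
ap' :: Monad m => m (a -> b) -> m a -> m b
ap' mf mx = mf >>= \f -> mx >>= \x -> return (f x)
```

A type-specific fmap (like the structural Functor instances earlier in this thread) can traverse the value once instead of going through the bind machinery.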


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Gregory Collins
Hi David,

Re: this comment from catchI:

 It is not possible to catch asynchronous exceptions, such as lazily evaluated 
 divide-by-zero errors, the throw function, or exceptions raised by other 
 threads using throwTo if those exceptions might arrive anywhere outside of a 
 liftIO call.

It might be worth investigating providing a version which can catch
asynchronous exceptions if the underlying monad supports it (via
MonadCatchIO or something similar). One of the most interesting
advantages I can see for IterIO over the other iteratee
implementations is that you actually have some control over resource
usage -- not being able to catch asynchronous exceptions nullifies
much of that advantage. A clear use case for this is timeouts on
server threads, where you typically throw a TimeoutException exception
to the handling thread using throwTo if the timeout is exceeded.

Another question re: resource cleanup: in the docs I see:

 Now suppose inumHttpBody fails (most likely because it receives an EOF before 
 reading the number of bytes specified in the Content-Length header). Because 
 inumHttpBody is fused to handler, the failure will cause handler to receive 
 an EOF, which will cause foldForm to fail, which will cause handleI to 
 receive an EOF and return, which will ensure hClose runs and the file handle 
 h is not leaked.

 Once the EOFs have been processed, the exception will propagate upwards 
 making inumHttpServer fail, which in turn will send an EOF to iter. Then the 
 exception will cause enum to fail, after which sock will be closed. In 
 summary, despite the complex structure of the web server, because all the 
 components are fused together with pipe operators, corner cases like this 
 just work with no need to worry about leaked file descriptors.

Could you go into a little bit of detail about the mechanism behind this?

Thanks!

G
-- 
Gregory Collins g...@gregorycollins.net

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread David Virebayre
2011/5/6 David Mazieres dm-list-haskell-c...@scs.stanford.edu:
   * Every aspect of the library is thoroughly documented in haddock
     including numerous examples of use.

I'm reading the documentation, it's impressively well detailed. It has
explanations, examples, all that one could dream for.

Thanks !

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Ertugrul Soeylemez
David Mazieres dm-list-haskell-c...@scs.stanford.edu wrote:

 Hi, everyone.  I'm pleased to announce the release of a new iteratee
 implementation, iterIO:

   http://hackage.haskell.org/package/iterIO

 IterIO is an attempt to make iteratees easier to use through an
 interface based on pipeline stages reminiscent of Unix command
 pipelines.  Particularly if you've looked at iteratees before and been
 intimidated, please have a look at iterIO to see if it makes them more
 accessible.

 [...]

 Please enjoy.  I'd love to hear feedback.

Thanks a lot, David.  This looks like really good work.  I'm using the
'enumerator' package, and looking at the types your library seems to use
a similar, but more complicated representation.  Is there any particular
reason, why you didn't base your library on an existing iteratee package
like 'enumerator'?


Greets,
Ertugrul


-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread David Virebayre
2011/5/6 Ertugrul Soeylemez e...@ertes.de:
 David Mazieres dm-list-haskell-c...@scs.stanford.edu wrote:

 Please enjoy.  I'd love to hear feedback.

 Thanks a lot, David.  This looks like really good work.  I'm using the
 'enumerator' package, and looking at the types your library seems to use
 a similar, but more complicated representation.  Is there any particular
 reason, why you didn't base your library on an existing iteratee package
 like 'enumerator'?

David has documented some design decisions in
http://hackage.haskell.org/packages/archive/iterIO/0.1/doc/html/Data-IterIO.html#g:3

Perhaps you may find some answers there.

David (another one :) )

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Maciej Marcin Piechotka
On Thu, 2011-05-05 at 21:15 -0700, David Mazieres wrote:
 Hi, everyone.  I'm pleased to announce the release of a new iteratee
 implementation, iterIO:
 
   http://hackage.haskell.org/package/iterIO
 
 IterIO is an attempt to make iteratees easier to use through an
 interface based on pipeline stages reminiscent of Unix command
 pipelines.  Particularly if you've looked at iteratees before and been
 intimidated, please have a look at iterIO to see if it makes them more
 accessible.
 
 Some aspects of iterIO that should simplify learning and using
 iteratees are:
 
* Every aspect of the library is thoroughly documented in haddock
  including numerous examples of use.
 
* Enumerators are easy to build out of iteratees.
 
* There is no difference between enumerators and enumeratees
  (i.e., inner pipeline stages).  The former is just a
  type-restricted version of the latter.
 
* Parsing combinators provide detailed error reporting and support
  LL(*) rather than LL(1) parsing, leading to fewer non-intuitive
  parsing failures.  A couple of tricks avoid consuming excessive
  memory for backtracking.
 
* Super-fast LL(1) parsing is also available through seamless
  integration with attoparsec.
 
* A universal exception mechanism works across invocations of mtl
  monad transformers, thereby unifying error handling.
 
* All pipe operators have uniform semantics, eliminating corner
  cases.  In particular, if the writing end of a pipe fails, the
  reading end always gets EOF, allowing it to clean up resources.
 
* One can catch exceptions thrown by any contiguous subset of
  stages in a pipeline.  Moreover, enumerator exception handlers
  can resume downstream stages that haven't failed.
 
* The package is full of useful iteratees and enumerators,
  including basic file and socket processing, parsec-like
  combinators, string search, zlib/gzip compression, SSL, HTTP, and
  loopback enumerator/iteratee pairs for testing a protocol
  implementation against itself.
 
 Please enjoy.  I'd love to hear feedback.
 
 David

1. It looks nice - however it causes a problem, as we now have 3 iteratee
packages, all of which have some advantages. 4 if we count coroutine. (I
don't count original implementations).

2. What is the reason of using Inum/Onum instead of
Iteratee/Enumerator/Enumeratee. The latter seems to be a standard naming
in the community?

Regards


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Felipe Almeida Lessa
On Fri, May 6, 2011 at 6:46 AM, David Virebayre
dav.vire+hask...@gmail.com wrote:
 2011/5/6 Ertugrul Soeylemez e...@ertes.de:
 David Mazieres dm-list-haskell-c...@scs.stanford.edu wrote:

 Please enjoy.  I'd love to hear feedback.

 Thanks a lot, David.  This looks like really good work.  I'm using the
 'enumerator' package, and looking at the types your library seems to use
 a similar, but more complicated representation.  Is there any particular
 reason, why you didn't base your library on an existing iteratee package
 like 'enumerator'?

 David has documented some design decisions in
 http://hackage.haskell.org/packages/archive/iterIO/0.1/doc/html/Data-IterIO.html#g:3

 Perhaps you may find some answers there.

He says that enumerator's Iteratee doesn't have special support for
pure Iteratees.  When he says that the iteratee package doesn't have
special support for control messages, the same applies for enumerator
as well.  He also says that enumerator can't distinguish failures from
iteratees and enumeratees.

He also says that the enumerator package's Enumerators aren't
iteratees, only iterIO's enumerators are.  Well, that's not what I'm
reading:

  -- from enumerator package
  newtype Iteratee a m b = Iteratee {runIteratee :: m (Step a m b)}
  type Enumerator a m b = Step a m b -> Iteratee a m b
  type Enumeratee ao ai m b = Step ai m b -> Iteratee ao m (Step ai m b)

  -- from iterIO package
  newtype Iter t m a = Iter {runIter :: Chunk t -> IterR t m a}
  type Inum tIn tOut m a = Iter tOut m a -> Iter tIn m (IterR tOut m a)
  type Onum t m a = Inum () t m a

The enumerator package's Enumerator *is* an iteratee, an so is its
Enumeratee.  The only real difference is that iterIO represents
enumerators as enumeratees from () to something.  In enumerator
package terms, that would be

  -- enumerator package's enumerator if it was iterIO's :)
  -- note that Inum's tIn and tOut are reversed w.r.t. Enumeratee's ao and ai
  type Enumerator a m b = Enumeratee () a m b

Whether this representation is better or worse isn't clear to me.

Now, one big problem that iterIO has that enumerator hasn't, is that
iterIO is a *big* library with many dependencies, including OpenSSL.
IMHO, that package should be split into many others.

So, in the enumerator vs. iterIO challenge, the only big differences I see are:

 a) iterIO has a different exception handling mechanism.
 b) iterIO can have pure iteratees that don't touch the monad.
 c) iterIO's iteratees can send control messages to their enumerators.
 d) iterIO's enumerators are enumeratees, but enumerator's enumerators
are simpler.
 e) enumerator has fewer dependencies.
 f) enumerator uses conventional nomenclature.
 g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
MPTC and functional dependencies).

Anything that I missed?

The bottomline: the biggest advantage I see right now in favor of
iterIO is c), although it still has the problem that you may get
runtime errors if you send the wrong control message.  However, right
now e) and g) may stop many users of enumerator from porting to
iterIO, even if they like its approach.

Cheers! =)

-- 
Felipe.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Henk-Jan van Tuyl
On Fri, 06 May 2011 15:10:26 +0200, Felipe Almeida Lessa  
felipe.le...@gmail.com wrote:


So, in the enumerator vs. iterIO challenge, the only big differences I  
see are:


 a) iterIO has a different exception handling mechanism.
 b) iterIO can have pure iteratees that don't touch the monad.
 c) iterIO's iteratees can send control messages to their enumerators.
 d) iterIO's enumerators are enumeratees, but enumerator's enumerators
are simpler.
 e) enumerator has fewer dependencies.
 f) enumerator uses conventional nomenclature.
 g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
MPTC and functional dependencies).

Anything that I missed?


iterIO cannot be compiled on Windows, because it depends on the package  
unix.


Regards,
Henk-Jan van Tuyl


--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
--

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Felipe Almeida Lessa
On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
 iterIO cannot be compiled on Windows, because it depends on the package
 unix.

That's a big showstopper.  I wonder if the package split I recommend
could solve this issue, or if it's something deeper.

Cheers,

-- 
Felipe.



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Alex Mason
Hi All,

I really love the look of this package, but if this is going to be *the* iteratee 
package, I would absolutely love to see it fix some of the biggest mistakes in 
the other iteratee packages, specifically naming. A change in naming for the 
terms iteratee, enumerator and enumeratee would go a hell of a long way here; 
Peaker on #haskell suggested Consumer/Producer/Transformer, and there is a lot 
of agreement in the channel that these are vastly better names. They’re also 
far less intimidating to users.

I personally feel that maybe Transformer isn't such a great name (being closely 
associated with monad transformers), and that maybe something like Mapper would 
be better, but I'm by no means in love with that name either. More people in 
#haskell seem to like Transformer, and I don't think my argument against it is 
very strong, so the hivemind seems to have settled on the 
Producer/Transformer/Consumer trilogy.

I'd love to hear thoughts on the issue, especially from David.

Cheers,

Alex Mason


On 06/05/2011, at 20:17, Maciej Marcin Piechotka wrote:

 On Thu, 2011-05-05 at 21:15 -0700, David Mazieres wrote:
 Hi, everyone.  I'm pleased to announce the release of a new iteratee
 implementation, iterIO:
 
  http://hackage.haskell.org/package/iterIO
 
 IterIO is an attempt to make iteratees easier to use through an
 interface based on pipeline stages reminiscent of Unix command
 pipelines.  Particularly if you've looked at iteratees before and been
 intimidated, please have a look at iterIO to see if it makes them more
 accessible.
 
 Some aspects of iterIO that should simplify learning and using
 iteratees are:
 
   * Every aspect of the library is thoroughly documented in haddock,
 including numerous examples of use.
 
   * Enumerators are easy to build out of iteratees.
 
   * There is no difference between enumerators and enumeratees
 (i.e., inner pipeline stages).  The former is just a
 type-restricted version of the latter.
 
   * Parsing combinators provide detailed error reporting and support
 LL(*) rather than LL(1) parsing, leading to fewer non-intuitive
 parsing failures.  A couple of tricks avoid consuming excessive
 memory for backtracking.
 
   * Super-fast LL(1) parsing is also available through seamless
 integration with attoparsec.
 
   * A universal exception mechanism works across invocations of mtl
 monad transformers, thereby unifying error handling.
 
   * All pipe operators have uniform semantics, eliminating corner
 cases.  In particular, if the writing end of a pipe fails, the
 reading end always gets EOF, allowing it to clean up resources.
 
   * One can catch exceptions thrown by any contiguous subset of
 stages in a pipeline.  Moreover, enumerator exception handlers
 can resume downstream stages that haven't failed.
 
   * The package is full of useful iteratees and enumerators,
 including basic file and socket processing, parsec-like
 combinators, string search, zlib/gzip compression, SSL, HTTP, and
 loopback enumerator/iteratee pairs for testing a protocol
 implementation against itself.
 
 Please enjoy.  I'd love to hear feedback.
 
 David
 
 1. It looks nice - however it causes a problem, as we have 3 iteratee
 packages, all of which have some advantages. 4 if we count coroutine. (I
 don't count original implementations).
 
 2. What is the reason for using Inum/Onum instead of
 Iteratee/Enumerator/Enumeratee? The latter seems to be the standard
 naming in the community.
 
 Regards


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:15:50 +0200,
Gregory Collins wrote:
 
 Hi David,
 
 Re: this comment from catchI:
 
  It is not possible to catch asynchronous exceptions, such as
  lazily evaluated divide-by-zero errors, the throw function, or
  exceptions raised by other threads using throwTo if those
  exceptions might arrive anywhere outside of a liftIO call.
 
 It might be worth investigating providing a version which can catch
 asynchronous exceptions if the underlying monad supports it (via
 MonadCatchIO or something similar). One of the most interesting
 advantages I can see for IterIO over the other iteratee
 implementations is that you actually have some control over resource
 usage -- not being able to catch asynchronous exceptions nullifies
 much of that advantage. A clear use case for this is timeouts on
 server threads, where you typically throw a TimeoutException exception
 to the handling thread using throwTo if the timeout is exceeded.

Excellent point.  There's actually a chance that iterIO already
catches those kinds of exceptions, but I wasn't sure enough about how
the Haskell runtime works to make that claim.  I've noticed in
practice that asynchronous exceptions tend to come exactly when I
execute the IO >>= operation.  If that's true, then since each IO >>=
is wrapped in a try block, the exceptions will all be caught (well,
not divide by zero, but things like throwTo, which I think are more
important).

One way I was thinking of implementing this was wrapping the whole
execution in block, and then calling unblock (unless iterIO's own
hypothetical block function is called) for every invocation of liftIO.
Unfortunately, the block and unblock functions now seem to be
deprecated, and the replacement mask/unmask ones would not be as
amenable to this technique.

However, if there's some simpler way to guarantee that >>= is the
point where exceptions are thrown (and might be the case for GHC in
practice), then I basically only need to update the docs.  If someone
with more GHC understanding could explain how asynchronous exceptions
work, I'd love to hear it...
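The scheme described above can be sketched with the newer mask API. This is a toy sketch only; runStepMasked is an invented name for illustration, not part of iterIO:

```haskell
import Control.Exception (SomeException, mask, try)

-- Toy sketch of the scheme described above: keep asynchronous
-- exceptions masked for a whole pipeline step, and restore (unmask)
-- them only around the embedded IO action -- the part that is
-- already wrapped in a try.  runStepMasked is an invented name;
-- iterIO has no such function.
runStepMasked :: IO a -> IO (Either SomeException a)
runStepMasked io = mask $ \restore -> try (restore io)
```

An asynchronous throwTo delivered during the step can then only land inside `restore io`, where the try catches it.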

 Another question re: resource cleanup: in the docs I see:
 
  Now suppose inumHttpBody fails (most likely because it receives an
  EOF before reading the number of bytes specified in the
  Content-Length header). Because inumHttpBody is fused to handler,
  the failure will cause handler to receive an EOF, which will cause
  foldForm to fail, which will cause handleI to receive an EOF and
  return, which will ensure hClose runs and the file handle h is not
  leaked.
 
  Once the EOFs have been processed, the exception will propagate
  upwards making inumHttpServer fail, which in turn will send an EOF
  to iter. Then the exception will cause enum to fail, after which
  sock will be closed. In summary, despite the complex structure of
  the web server, because all the components are fused together with
  pipe operators, corner cases like this just work with no need to
  worry about leaked file descriptors.
 
 Could you go into a little bit of detail about the mechanism behind this?

Yes, absolutely.  This relies on the fact that an Inum must always
return its target Iter, even when the Inum fails.  This invariant is
ensured by the two Inum construction functions, mkInumC and mkInumM,
which catch exceptions thrown by the codec iteratee and add in the
state of the target iteratee.

Now when you execute code like inum .| iter, the immediate result of
running inum is IterR tIn m (IterR tOut m a)--i.e., the result of an
iteratee returning the result of an iteratee (because Inums are
iteratees, too).  If the Inum failed, then the outer IterR will use
the Fail constructor:

Fail !IterFail !(Maybe a) !(Maybe (Chunk t))

Where the Maybe a will be a Maybe (IterR tOut m b), and, because
of the Inum invariant, will be Just an actual result.  .| then must
translate the inner iteratee result to the appropriate return type for
the Inum (since the Inum's type (IterR tIn m ...) is different from
the Iter's (Iter tOut m ...)).  This happens through the internal
function joinR, which says:

joinR (Fail e (Just i) c) = flip onDoneR (runR i) $ \r ->
    case r of
      Done a _    -> Fail e (Just a) c
      Fail e' a _ -> Fail e' a c
      _           -> error "joinR"

Where the 'runR' function basically keeps feeding EOF to an Iter (and
executing its monadic actions and rejecting its control requests)
until it returns a result, at which point the result's residual input
can be discarded and replaced with the residual input of the Inum.
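The invariant can be illustrated with a stripped-down model. These are not iterIO's real types; ToyDone/ToyFail are invented stand-ins for the Done and Fail constructors:

```haskell
-- A stripped-down model (not iterIO's real types) of the invariant
-- that a failing Inum still carries its target iteratee's result,
-- so a joinR-like function can salvage it rather than lose the
-- inner state.
data ToyResult a
  = ToyDone a                    -- stage finished normally
  | ToyFail String (Maybe a)     -- stage failed, inner result kept

joinToy :: ToyResult a -> Either String a
joinToy (ToyDone a)          = Right a
joinToy (ToyFail _ (Just a)) = Right a  -- inner result recovered
joinToy (ToyFail e Nothing)  = Left e   -- nothing to salvage
```

For instance, joinToy (ToyFail "EOF" (Just 42)) yields Right 42: the enumerator failed, but the iteratee's value survives.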

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:10:26 -0300,
Felipe Almeida Lessa wrote:
 
 He also says that the enumerator package's Enumerators aren't
 iteratees, only iterIO's enumerators are.  Well, that's not what I'm
 reading:
 
   -- from enumerator package
   newtype Iteratee a m b = Iteratee {runIteratee :: m (Step a m b)}
   type Enumerator a m b = Step a m b -> Iteratee a m b
   type Enumeratee ao ai m b = Step ai m b -> Iteratee ao m (Step ai m b)
 
   -- from iterIO package
   newtype Iter t m a = Iter {runIter :: Chunk t -> IterR t m a}
   type Inum tIn tOut m a = Iter tOut m a -> Iter tIn m (IterR tOut m a)
   type Onum t m a = Inum () t m a
 
 The enumerator package's Enumerator *is* an iteratee, and so is its
 Enumeratee.

Strictly speaking, I guess that's precise if you look at the type of
Enumerator.  However, it's not really an iteratee in the spirit of
iteratees, since it isn't really a data sink and has no input type.

 The only real difference is that iterIO represents
 enumerators as enumeratees from () to something.  In enumerator
 package terms, that would be
 
   -- enumerator package's Enumerator if it were iterIO's :)
   -- note that Inum's tIn and tOut are reversed w.r.t. Enumeratee's ao and ai
   type Enumerator a m b = Enumeratee () a m b
 
 Whether this representation is better or worse isn't clear for me.

Exactly.  The reason it's better (and for a long time my library was
more like the enumerator one) is that the mechanics of uniform error
handling are complex enough as it is.  When enumerators and
enumeratees are two different types, you need two different mechanisms
for constructing them, and then have to worry about handing errors in
the two different cases.  I found that unifying enumerators and
enumeratees (or Inums and Onums as I call them) significantly
simplified a lot of code.

 Now, one big problem that iterIO has that enumerator hasn't, is that
 iterIO is a *big* library with many dependencies, including OpenSSL.
 IMHO, that package should be split into many others.

Yes, this is definitely true.

 So, in the enumerator vs. iterIO challenge, the only big differences I see 
 are:
 
  a) iterIO has a different exception handling mechanism.
  b) iterIO can have pure iteratees that don't touch the monad.
  c) iterIO's iteratees can send control messages to their enumerators.
  d) iterIO's enumerators are enumeratees, but enumerator's enumerators
 are simpler.
  e) enumerator has fewer dependencies.
  f) enumerator uses conventional nomenclature.
  g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
 MPTC and functional dependencies).
 
 Anything that I missed?
 
 The bottom line: the biggest advantage I see right now in favor of
 iterIO is c),

I basically agree with this list, but think you are underestimating
the value of a.  I would rank a as the most important difference
between the packages.  (a also is the reason for d.)

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:54:16 -0300,
Felipe Almeida Lessa wrote:
 
 On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl hjgt...@chello.nl wrote:
  iterIO cannot be compiled on Windows, because it depends on the package
  unix.
 
 That's a big showstopper.  I wonder if the package split I recommend
 could solve this issue, or if it's something deeper.

It's actually worse than this, unfortunately.

The unix package dependency is mostly there for efficiency.  For the
HTTP package, in order to handle things like directories,
If-Modified-Since, and Content-Length, I need to look at file
attributes.  The platform-independent code lets me do this, but I
would have to make many more system calls.  Also, I would have a
slight race condition, because it's hard to get the attributes of the
file you actually opened (to make sure the length hasn't changed,
etc), while the unix package gets me access to both stat and fstat.

This has all been abstracted away by the FileSystemCalls class, so if
there's a way to implement those five functions on Windows, we could
move defaultFileSystemCalls to its own module (or even its own
package), and solve the problem without sacrificing performance or
correctness on unix.
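The abstraction described above could look roughly like this. The shape is guessed for illustration; iterIO's actual FileSystemCalls class has different members and names:

```haskell
import System.IO

-- Rough sketch (invented names; iterIO's real FileSystemCalls class
-- differs) of bundling the needed filesystem operations into a
-- record, so a Windows backend could be swapped in for the unix one.
data FSCalls = FSCalls
  { fsOpen     :: FilePath -> IO Handle   -- open for reading
  , fsSize     :: FilePath -> IO Integer  -- like stat(2)
  , fsOpenSize :: Handle   -> IO Integer  -- like fstat(2), race-free
  }

-- A portable default built only from System.IO, at the cost of
-- extra opens compared with the unix package's stat/fstat.
portableFS :: FSCalls
portableFS = FSCalls
  { fsOpen     = \p -> openFile p ReadMode
  , fsSize     = \p -> withFile p ReadMode hFileSize
  , fsOpenSize = hFileSize
  }
```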

Unfortunately, there are two worse unix dependencies:

 1) I'm using the network IO package to do IO on ByteStrings, and the
network library claims this doesn't work on windows.

 2) Proper implementation of many network protocols requires the
ability to send a TCP FIN segment without closing the underlying
file descriptor (so you can still read from it).  Thus, I'm using
FFI to call the shutdown() system call on the file descriptors of
Handles.  I have no idea how to make this work on Windows.

I'm hoping that time eventually solves problem #1.  As for problem #2,
the ideal solution would be to get something like hShutdown into the
system libraries.
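On POSIX systems, the FFI binding mentioned above can be sketched as follows. The constant 1 is SHUT_WR on Linux and the BSDs; a robust version should obtain the value through the C API rather than hard-coding it:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt(..))

-- Sketch of the shutdown(2) binding described above, POSIX only.
-- Calling shutdownWrite on a connected socket's file descriptor
-- sends a TCP FIN while leaving the descriptor open for reading.
foreign import ccall unsafe "sys/socket.h shutdown"
  c_shutdown :: CInt -> CInt -> IO CInt

shutdownWrite :: CInt -> IO ()
shutdownWrite fd = do
  r <- c_shutdown fd 1   -- 1 == SHUT_WR on Linux and the BSDs
  if r /= 0
    then ioError (userError ("shutdown failed on fd " ++ show fd))
    else return ()
```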

I'd obviously love to make my stuff work on Windows, but probably lack
the experience to do it on my own.  Suggestions and help are of course
welcome...

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Heinrich Apfelmus

Alex Mason wrote:


I really love the look of this package, but if this is going to be *the*
iteratee package, I would absolutely love to see it fix some of the
biggest mistakes in the other iteratee packages, specifically naming.
A change in naming for the terms iteratee, enumerator and enumeratee
would go a hell of a long way here; Peaker on #haskell suggested
Consumer/Producer/Transformer, and there is a lot of agreement in the
channel that these are vastly better names. They’re also far less
intimidating to users.

I personally feel that maybe Transformer isn't such a great name
(being closely associated with monad transformers), and that maybe
something like Mapper would be better, but I'm by no means in love
with that name either. More people in #haskell seem to like
Transformer, and I don't think my argument against it is very strong,
so the hivemind seems to have settled on the
Producer/Transformer/Consumer trilogy.

I'd love to hear thoughts on the issue, especially from David.


I vastly prefer the names Producer/Transformer/Consumer over the others. 
Then again, I never quite understood what Iteratees were all about in 
the first place.



Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com




Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Sat, 7 May 2011 01:15:25 +1000,
Alex Mason wrote:
 
 Hi All,
 
 I really love the look of this package, but if this is going to be
 *the* iteratee package, I would absolutely love to see it fix some
 of the biggest mistakes in the other iteratee packages, specifically
 naming. A change in naming for the terms iteratee, enumerator and
 enumeratee would go a hell of a long way here; Peaker on #haskell
 suggested Consumer/Producer/Transformer, and there is a lot of
 agreement in the channel that these are vastly better names. They’re
 also far less intimidating to users.
 
 I personally feel that maybe Transformer isn't such a great name
 (being closely associated with monad transformers), and that maybe
 something like Mapper would be better, but I'm by no means in love
 with that name either. More people in #haskell seem to like
 Transformer, and I don't think my argument against it is very
 strong, so the hivemind seems to have settled on the
 Producer/Transformer/Consumer trilogy.
 
 I'd love to hear thoughts on the issue, especially from David.

This is a question I struggled a lot with.  I definitely agree that
the terms are pretty intimidating to new users.

At least one thing I've concluded is that it really should be
presented as two concepts, rather than three.  So we should talk
about, e.g., producers, consumers, and pipeline stages that do both.

I'd been thinking about using the terms Source and Sink, but Source is
very overloaded, and SinkSource doesn't exactly roll off the tongue
or evoke a particularly helpful intuition.

In the end, I decided just to come up with new terms that wouldn't
carry any pre-conceptions (e.g., what's an Inum?), and then build
the intuition through copious documentation...

I'm open to suggestion here.  I've already overhauled the naming
conventions in the library once.  Initially I used the names EnumI and
EnumO for Inum and Onum.  I think the old names were much worse,
especially since Enum is a fundamental typeclass that has absolutely
nothing to do with enumerators.

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Tom Brow

 At least one thing I've concluded is that it really should be
 presented as two concepts, rather than three.  So we should talk
 about, e.g., producers, consumers, and pipeline stages that do both.


I think that's a great idea.

I'd been thinking about using the terms Source and Sink, but Source is
 very overloaded, and SinkSource doesn't exactly roll off the tongue
 or evoke a particularly helpful intuition.


One good thing I can say for the Enumerator/Iteratee nomenclature is that it
nicely connotes the inversion of control (i.e., the push data flow) that
enumerator is all about. Enumera*tor* feeds Itera*tee* -- subject, verb,
object. Producer/Consumer connotes the same by allusion to the
producer-consumer pattern of thread synchronization.

Tom

On Fri, May 6, 2011 at 9:47 AM, dm-list-haskell-c...@scs.stanford.edu wrote:

 This is a question I struggled a lot with.  I definitely agree that
 the terms are pretty intimidating to new users.

 At least one thing I've concluded is that it really should be
 presented as two concepts, rather than three.  So we should talk
 about, e.g., producers, consumers, and pipeline stages that do both.

 I'd been thinking about using the terms Source and Sink, but Source is
 very overloaded, and SinkSource doesn't exactly roll off the tongue
 or evoke a particularly helpful intuition.

 In the end, I decided just to come up with new terms that wouldn't
 carry any pre-conceptions (e.g., what's an Inum?), and then build
 the intuition through copious documentation...

 I'm open to suggestion here.  I've already overhauled the naming
 conventions in the library once.  Initially I used the names EnumI and
 EnumO for Inum and Onum.  I think the old names were much worse,
 especially since Enum is a fundamental typeclass that has absolutely
 nothing to do with enumerators.

 David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Maciej Marcin Piechotka
Sorry for the second posting. In addition to the problems mentioned
elsewhere (too-big packages), I would like to point out problems with SSL:

 - From what I understand, it uses OpenSSL, which is not compatible with
GPL-2, as it uses the Apache 1.0 licence (in addition to BSD4), which
requires mentioning OpenSSL ("This product includes software developed by
the OpenSSL Project for use in the OpenSSL Toolkit").
 - It doesn't allow using it after a STARTTLS command

Regards




Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Henk-Jan van Tuyl
On Fri, 06 May 2011 18:28:07 +0200,  
dm-list-haskell-c...@scs.stanford.edu wrote:



At Fri, 6 May 2011 10:54:16 -0300,
Felipe Almeida Lessa wrote:


On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl hjgt...@chello.nl  
wrote:
 iterIO cannot be compiled on Windows, because it depends on the  
package

 unix.

That's a big showstopper.  I wonder if the package split I recommend
could solve this issue, or if it's something deeper.




[...]

I'd obviously love to make my stuff work on Windows, but probably lack
the experience to do it on my own.  Suggestions and help are of course
welcome...


Is the unix-compat package any good?

Regards,
Henk-Jan van Tuyl


--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
--



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Sat, 07 May 2011 00:09:46 +0200,
Henk-Jan van Tuyl wrote:
 
  On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl hjgt...@chello.nl  
  wrote:
   iterIO cannot be compiled on Windows, because it depends on the  
  package
   unix.
 [...]
  I'd obviously love to make my stuff work on Windows, but probably lack
  the experience to do it on my own.  Suggestions and help are of course
  welcome...
 
 Is the unix-compat package any good?

Thanks for the suggestion.  I'm not sure I totally understand how to
use unix-compat, though.  It gives me calls like

 getFdStatus :: Fd -> IO FileStatus

which is one of the things I need.  But how do I get an Fd in the
first place?  (unix-compat seems to have no equivalent of openFd.)

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Mario Blažević

On 11-05-06 11:15 AM, Alex Mason wrote:

Hi All,

I really love the look of this package, but if this is going to be *the* iteratee 
package, I would absolutely love to see it fix some of the biggest mistakes in 
the other iteratee packages, specifically naming. A change in naming for the 
terms iteratee, enumerator and enumeratee would go a hell of a long way here; 
Peaker on #haskell suggested Consumer/Producer/Transformer, and there is a lot 
of agreement in the channel that these are vastly better names. They’re also 
far less intimidating to users.

I personally feel that maybe Transformer isn't such a great name (being closely 
associated with monad transformers), and that maybe something like Mapper would 
be better, but I'm by no means in love with that name either. More people in 
#haskell seem to like Transformer, and I don't think my argument against it is 
very strong, so the hivemind seems to have settled on the 
Producer/Transformer/Consumer trilogy.

I'd love to hear thoughts on the issue, especially from David.


The Producer/Consumer terminology, if I'm not mistaken, is usually 
applied to coroutine pairs. I use these terms myself in the SCC package, 
together with terms Transducer and Splitter. The former term is also 
well established, the latter was my own.


Though I like and use this terminology, I'm not sure it's a good 
fit for the existing Enumerator/Iteratee pairs, which are not real 
symmetric coroutines. Enumerators are more like the Python (2.5) 
Generators. I don't know what the Python terminology would be for the 
Iteratee.



On 11-05-06 12:47 PM, dm-list-haskell-c...@scs.stanford.edu wrote:

This is a question I struggled a lot with.  I definitely agree that
the terms are pretty intimidating to new users.

At least one thing I've concluded is that it really should be
presented as two concepts, rather than three.  So we should talk
about, e.g., producers, consumers, and pipeline stages that do both.

I'd been thinking about using the terms Source and Sink, but Source is
very overloaded, and SinkSource doesn't exactly roll off the tongue
or evoke a particularly helpful intuition.


The SCC package happens to use Source and Sink names as well. They 
are used not for coroutines directly, but instead for references to 
coroutines of the appropriate type. Every consumer thus owns a Source 
from which it fetches its input, and that Source is always bound to 
another coroutine that yields those values through a Sink. Source and 
Sink are a passive handle to a Producer and Consumer. I may be 
subjective, but I find this use of the terms very fitting.






Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 06 May 2011 21:27:21 -0400,
Mario Blažević wrote:
 
  I'd been thinking about using the terms Source and Sink, but Source is
  very overloaded, and SinkSource doesn't exactly roll off the tongue
  or evoke a particularly helpful intuition.
 
  The SCC package happens to use Source and Sink names as well. They 
 are used not for coroutines directly, but instead for references to 
 coroutines of the appropriate type. Every consumer thus owns a Source 
 from which it fetches its input, and that Source is always bound to 
 another coroutine that yields those values through a Sink. Source and 
 Sink are a passive handle to a Producer and Consumer. I may be 
 subjective, but I find this use of the terms very fitting.

You mean fitting for references to coroutines, or fitting for the
replacement names for Enumerator/Iteratee?

If there's overwhelming consensus, I would certainly consider changing
the names in the iterIO library, but it's a pretty big change...

David



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread Mario Blažević

On 11-05-06 09:58 PM, dm-list-haskell-c...@scs.stanford.edu wrote:

At Fri, 06 May 2011 21:27:21 -0400,
Mario Blažević wrote:

I'd been thinking about using the terms Source and Sink, but Source is
very overloaded, and SinkSource doesn't exactly roll off the tongue
or evoke a particularly helpful intuition.

  The SCC package happens to use Source and Sink names as well. They
are used not for coroutines directly, but instead for references to
coroutines of the appropriate type. Every consumer thus owns a Source
from which it fetches its input, and that Source is always bound to
another coroutine that yields those values through a Sink. Source and
Sink are a passive handle to a Producer and Consumer. I may be
subjective, but I find this use of the terms very fitting.

You mean fitting for references to coroutines, or fitting for the
replacement names for Enumerator/Iteratee?


The former, unfortunately. As I said, the most usual name for the 
Enumerator concept would be Generator. That term is already used in 
several languages to signify this kind of restricted coroutine. I'm not 
aware of any good alternative naming for Iteratee.






Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread wren ng thornton

On 5/6/11 11:15 AM, Alex Mason wrote:

Hi All,

I really love the look of this package, but if this is going to be *the* iteratee 
package, I would absolutely love to see it fix some of the biggest mistakes in 
the other iteratee packages, specifically naming. A change in naming for the 
terms iteratee, enumerator and enumeratee would go a hell of a long way here; 
Peaker on #haskell suggested Consumer/Producer/Transformer, and there is a lot 
of agreement in the channel that these are vastly better names. They’re also 
far less intimidating to users.

I personally feel that maybe Transformer isn't such a great name (being closely 
associated with monad transformers), and that maybe something like Mapper would 
be better, but I'm by no means in love with that name either. More people in 
#haskell seem to like Transformer, and I don't think my argument against it is 
very strong, so the hivemind seems to have settled on the 
Producer/Transformer/Consumer trilogy.


I believe transducer is the proper term. (Of course, producers and 
consumers are both special cases of transducers, trivializing the input 
or output stream, respectively.)


Though, IMO, I don't find the names producer and consumer 
enlightening as to why this particular pattern of iteration/enumeration 
is different from the conventional pattern found in OOP's iterators. Any 
time you have a bunch of things being created and passed around you have 
producers and consumers; the terminology is insufficient to define the 
pattern. The shift from iterator to enumerator helps to capture that 
difference. Given as there's no common name for the code calling an 
iterator, it's not immediately apparent what the push-based enumerative 
version should be called; iteratee seems as good as any other name, 
because it expresses the duality involved in the switch from the 
iterative style.


control  : pull, push
producer : iterator, enumerator
consumer : ???,  iteratee

Of course, this pattern of names suggests that enumeratee should 
properly be a backformation for naming the consumer of the pull-based 
iterative pattern. But then we're still left with the problem of what 
the transducers should be called in both cases.


--
Live well,
~wren



Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-05 Thread Eugene Kirpichov
Sounds just terrific! Thanks!



On 06.05.2011, at 8:15, David Mazieres dm-list-haskell-c...@scs.stanford.edu 
wrote:

 Hi, everyone.  I'm pleased to announce the release of a new iteratee
 implementation, iterIO:
 
http://hackage.haskell.org/package/iterIO
 
 IterIO is an attempt to make iteratees easier to use through an
 interface based on pipeline stages reminiscent of Unix command
 pipelines.  Particularly if you've looked at iteratees before and been
 intimidated, please have a look at iterIO to see if it makes them more
 accessible.
 
 Some aspects of iterIO that should simplify learning and using
 iteratees are:
 
    * Every aspect of the library is thoroughly documented in haddock,
 including numerous examples of use.
 
   * Enumerators are easy to build out of iteratees.
 
   * There is no difference between enumerators and enumeratees
 (i.e., inner pipeline stages).  The former is just a
 type-restricted version of the latter.
 
   * Parsing combinators provide detailed error reporting and support
 LL(*) rather than LL(1) parsing, leading to fewer non-intuitive
 parsing failures.  A couple of tricks avoid consuming excessive
 memory for backtracking.
 
   * Super-fast LL(1) parsing is also available through seamless
 integration with attoparsec.
 
   * A universal exception mechanism works across invocations of mtl
 monad transformers, thereby unifying error handling.
 
   * All pipe operators have uniform semantics, eliminating corner
 cases.  In particular, if the writing end of a pipe fails, the
 reading end always gets EOF, allowing it to clean up resources.
 
   * One can catch exceptions thrown by any contiguous subset of
 stages in a pipeline.  Moreover, enumerator exception handlers
 can resume downstream stages that haven't failed.
 
   * The package is full of useful iteratees and enumerators,
 including basic file and socket processing, parsec-like
 combinators, string search, zlib/gzip compression, SSL, HTTP, and
 loopback enumerator/iteratee pairs for testing a protocol
 implementation against itself.
 
 Please enjoy.  I'd love to hear feedback.
 
 David
 