Re: [Haskell-cafe] [iteratee] empty chunk as special case of input

2011-07-14 Thread Sergey Mironov
2011/7/14 John Lato jwl...@gmail.com:
> Sorry for the followup, but I forgot about one other important reason
> (probably the real reason) for the nullC case in bindIteratee.  Note
> what happens in the regular case: the iteratee is run, and if it's in
> a completed state, the result is passed to the bound function (in the
> m_done line), which is then also run.  Examine what happens if the
> inner iteratee is also complete:
>
>  const . flip onDone stream
>
> which would be more clearly written as
>
>  \b _str -> onDone b stream
>
> so in this case the leftover stream result from the first iteratee
> (stream) is used as the result of the second iteratee, and the
> leftover stream from the second iteratee (_str) is discarded.
>
> This doesn't seem right; what should happen is that the two streams
> should be appended somehow.

Yes, I see. From this point of view, ignoring the second iteratee's
leftover stream is neither worse nor better than the other possible
options, such as ignoring the first iteratee's stream or appending the
two somehow. I thought about it, and now it seems that the whole
problem exists because an iteratee is able to jump into the done state
without processing any data.

I came to iteratees from the IncrementalGet library (binary-strict
package) and thought that the two use similar concepts, but now I see
a big difference: IncrementalGet's approach doesn't allow such a state
change. This is how it defines /Get/ (an iteratee-like structure):

newtype Get r a = Get { unGet :: S -> (a -> S -> IResult r) -> IResult r }

data IResult a = IFailed S String
               | IFinished S a
               | IPartial (B.ByteString -> IResult a)

data S = S ...  -- contains data chunk (bytestring) and some other state holders

unGet has a similar design in its onDone branch, but onCont is hidden
inside IResult. So the user can't obtain the result without providing
a stream as input. Well, there is also some black magic there, but I
think it makes it impossible to end up with two conflicting iteratees
of the kind bindIteratee may encounter.
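
The point about Get can be illustrated with a toy model. This is only a
sketch under stated assumptions: String stands in for both S and
B.ByteString, anyChar and describe are hypothetical helpers, and none of
this is binary-strict's actual code. It shows that a single input state
is threaded through bind, so there is never a second leftover stream to
reconcile, and that a result is only produced once runGet has been given
input.

```haskell
-- Toy model only: String replaces binary-strict's S and ByteString.
data IResult a
  = IFailed String String            -- remaining input, error message
  | IFinished String a               -- remaining input, result
  | IPartial (String -> IResult a)   -- needs another chunk

newtype Get r a = Get { unGet :: String -> (a -> String -> IResult r) -> IResult r }

instance Functor (Get r) where
  fmap f (Get g) = Get (\s k -> g s (k . f))

instance Applicative (Get r) where
  pure a = Get (\s k -> k a s)       -- passes the current input through untouched
  Get f <*> Get g = Get (\s k -> f s (\h s' -> g s' (k . h)))

instance Monad (Get r) where
  Get g >>= f = Get (\s k -> g s (\a s' -> unGet (f a) s' k))

-- Hypothetical primitive: read one character, asking for more input if needed.
anyChar :: Get r Char
anyChar = Get (\s k -> case s of
  (c:cs) -> k c cs
  []     -> IPartial (\more -> unGet anyChar more k))

-- Running always starts from an input chunk.
runGet :: Get r r -> String -> IResult r
runGet (Get g) input = g input (\a s -> IFinished s a)

-- Hypothetical helper for inspecting results (functions cannot be shown).
describe :: Show a => IResult a -> String
describe (IFailed _ msg)    = "failed: " ++ msg
describe (IFinished rest a) = "finished " ++ show a ++ ", leftover " ++ show rest
describe (IPartial _)       = "partial"
```

In this model, describe (runGet (pure 'b' >> anyChar) "xyz") reports
'x' with leftover "yz", and describe (runGet anyChar "") reports a
partial result that waits for another chunk.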

I would like to compare these approaches and decide which is better
(it depends on the task, of course, but how?). binary-strict's code is
easier to understand, but iteratees are more general and offer more
features, including very powerful stream transformations. Is it a good
idea to merge the two approaches somehow? For example, if I replace
IncrementalGet's hardcoded stream type with a type variable, as
iteratees do, will I be able to implement convStream on top of Get?
What do you think? And what about enumeratees?

By the way, the iteratee package contains itertut.lhs, a very good
tutorial, thanks! It says that CPS was used to eliminate constructors.
Do you think I may hope that one day the compiler will be able to
transform the constructor-based approach introduced there into CPS
automatically?
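
For reference, the relationship between the two encodings can be
sketched by hand. The types below are a simplified assumption (String
chunks, no EOF, no monad, no exceptions) and are not the tutorial's or
the package's definitions; IterD, IterK, toK, fromK and headD are
hypothetical names. The sketch only shows that the constructor-based and
CPS forms carry the same information, since each converts to the other.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Constructor-based iteratee over String chunks (toy version).
data IterD a
  = Done a String               -- result and leftover input
  | Cont (String -> IterD a)    -- waiting for the next chunk

-- The same structure with the constructors replaced by two continuations.
newtype IterK a = IterK
  { runK :: forall r. (a -> String -> r) -> ((String -> IterK a) -> r) -> r }

-- Each constructor becomes a call to the matching continuation.
toK :: IterD a -> IterK a
toK (Done a s) = IterK (\onDone _ -> onDone a s)
toK (Cont k)   = IterK (\_ onCont -> onCont (toK . k))

-- Passing the constructors as continuations rebuilds the data form.
fromK :: IterK a -> IterD a
fromK i = runK i Done (\k -> Cont (fromK . k))

-- Hypothetical example iteratee: return the first character of the input.
headD :: IterD Char
headD = Cont step
  where
    step (c:cs) = Done c cs
    step []     = headD
```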

Thanks,
Sergey

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [iteratee] empty chunk as special case of input

2011-07-13 Thread Sergey Mironov
Hi community, hi John. I have been reading the bindIteratee[1]
function for several days now, and something keeps me from completely
understanding the concept. The most noticeable thing is the "nullC"
guard in the definition. To demonstrate the consequences of this
decision, let me define an iteratee like

myI = Iteratee $ \onDone _ -> onDone 'a' (Chunk "xyz")

It is a bit unusual, since myI substitutes a fake stream ("xyz") for
the real one. Now let's define two actions that produce different
results in an unusual manner:

printI i = enumPure1Chunk ['a'..'g'] i >>= run >>= print

i1 = (return 'b' >> myI >> I.head)  -- myI substitutes the stream,
                                    -- last /I.head/ produces 'x', OK
i2 = (I.head >> myI >> I.head)      -- produces 'b'!  I expected another
                                    -- 'x' here but myI's stream was ignored by >>=

Well, I understand that this is probably the expected behaviour, but
what is it for? Why can't we handle null input the same way as
non-null input? The iteratee could just stay in its current state in
that case.

Thanks in advance
Sergey

--
[1] - bindIteratee (basically, >>=) code from Data.Iteratee.Base.hs

bindIteratee :: (Monad m, Nullable s)
    => Iteratee s m a
    -> (a -> Iteratee s m b)
    -> Iteratee s m b
bindIteratee = self
    where
        self m f = Iteratee $ \onDone onCont ->
             let m_done a (Chunk s)
                   | nullC s      = runIter (f a) onDone onCont
                 m_done a stream = runIter (f a) (const . flip onDone stream) f_cont
                   where f_cont k Nothing = runIter (k stream) onDone onCont
                         f_cont k e       = onCont k e
             in runIter m m_done (onCont . (flip self f .))
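
The behaviour described in this message can be reproduced without the
package in a toy model. The sketch below assumes a heavily simplified
iteratee: String chunks, no monad parameter, and no exception argument
for onCont, so it is not the library's code. Iter, doneI, returnI,
bindI, thenI and headI are hypothetical stand-ins for Iteratee, idone,
return, bindIteratee, >> and I.head. bindI keeps the same null-chunk
guard and the same "keep the first leftover, drop the second" rule.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Toy model: String chunks, no monad parameter, no exceptions.
data Stream = EOF | Chunk String

newtype Iter a = Iter
  { runIter :: forall r. (a -> Stream -> r) -> ((Stream -> Iter a) -> r) -> r }

-- A finished iteratee with an explicit leftover stream.
doneI :: a -> Stream -> Iter a
doneI a s = Iter (\onDone _ -> onDone a s)

-- Plays the role of 'return': done, with an empty leftover chunk.
returnI :: a -> Iter a
returnI a = doneI a (Chunk "")

-- Same shape as bindIteratee, minus the exception argument of onCont.
bindI :: Iter a -> (a -> Iter b) -> Iter b
bindI m f = Iter (\onDone onCont ->
  let mDone a (Chunk s)
        | null s = runIter (f a) onDone onCont
      mDone a stream = runIter (f a) (\b _str -> onDone b stream) fCont
        where fCont k = runIter (k stream) onDone onCont
  in runIter m mDone (\k -> onCont (\s -> bindI (k s) f)))

-- Sequencing that ignores the first result, like (>>).
thenI :: Iter a -> Iter b -> Iter b
thenI m n = bindI m (const n)

-- Return the first character of the stream.
headI :: Iter Char
headI = Iter (\_ onCont -> onCont step)
  where
    step (Chunk (c:cs)) = doneI c (Chunk cs)
    step (Chunk [])     = headI
    step EOF            = error "headI: EOF"

-- The iteratee from the message: done immediately, with a fake leftover.
myI :: Iter Char
myI = doneI 'a' (Chunk "xyz")

-- Feed one chunk to an iteratee that is waiting for input.
enumPure1Chunk :: String -> Iter a -> Iter a
enumPure1Chunk str i = runIter i doneI (\k -> k (Chunk str))

-- Extract the result, sending EOF if the iteratee still wants input.
run :: Iter a -> a
run i = runIter i (\a _ -> a)
  (\k -> runIter (k EOF) (\a _ -> a) (\_ -> error "run: divergent iteratee"))

i1, i2 :: Iter Char
i1 = returnI 'b' `thenI` myI `thenI` headI
i2 = headI `thenI` myI `thenI` headI

result1, result2 :: Char
result1 = run (enumPure1Chunk ['a'..'g'] i1)   -- 'x'
result2 = run (enumPure1Chunk ['a'..'g'] i2)   -- 'b'
```

In i1 the first iteratee finishes with a null chunk, so the guard hands
myI's "xyz" on to the final headI. In i2 the first headI leaves "bcdefg"
behind, so the second clause of mDone applies and myI's "xyz" is
dropped in favour of that leftover.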



Re: [Haskell-cafe] [iteratee] empty chunk as special case of input

2011-07-13 Thread John Lato
Hi Sergey,

iteratee (the package) uses a null chunk to signify that no further
stream data is available within the iteratee, that is, at some point
the stream has been entirely consumed.  Therefore, if any of the
composed iteratees haven't run to completion, they need to get more
data from an enumerator.  Thus 'bindIteratee' has the nullC guard in
the definition as an optimization; there's no need to send the null
chunk to bound iteratees because in most cases they won't be able to
do anything with it.

I've recently considered removing this, but at present when I take it
out some unit tests fail and I haven't had time to explore further.
Since this would have other benefits I would like to do so provided it
doesn't strongly impact performance.  Rather than simply removing the
case I could add a null case to the Stream type, but that could cause
some extra work for users.

Also, one rule for writing iteratees is that they shouldn't put
elements into the stream.  Doing so may cause various transformers to
behave incorrectly.  If you want to modify a stream rather than simply
consuming elements, the correct approach is to create an enumeratee
(stream transformer).

John L.



Re: [Haskell-cafe] [iteratee] empty chunk as special case of input

2011-07-13 Thread John Lato
Sorry for the followup, but I forgot about one other important reason
(probably the real reason) for the nullC case in bindIteratee.  Note
what happens in the regular case: the iteratee is run, and if it's in
a completed state, the result is passed to the bound function (in the
m_done line), which is then also run.  Examine what happens if the
inner iteratee is also complete:

 const . flip onDone stream

which would be more clearly written as

 \b _str -> onDone b stream

so in this case the leftover stream result from the first iteratee
(stream) is used as the result of the second iteratee, and the
leftover stream from the second iteratee (_str) is discarded.
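
The equivalence of the two forms above can be checked in isolation. In
this small sketch onDone and stream are passed as ordinary arguments
(an assumption made only so the two versions can be compared side by
side), and pointFree and pointed are hypothetical names.

```haskell
-- Point-free form, as it appears in bindIteratee.
pointFree :: (b -> s -> r) -> s -> b -> s -> r
pointFree onDone stream = const . flip onDone stream

-- Expanded form: the second stream argument (_str) is ignored.
pointed :: (b -> s -> r) -> s -> b -> s -> r
pointed onDone stream = \b _str -> onDone b stream
```

With (,) as onDone, both pointFree (,) "first" 'x' "second" and
pointed (,) "first" 'x' "second" evaluate to ('x', "first"): the first
stream is kept and the second is dropped.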

This doesn't seem right; what should happen is that the two streams
should be appended somehow.  It works because at this stage an
iteratee won't have been enumerated over (by the current stream at
least), so it can't have any leftover data, just a null chunk.  But
bindIteratee explicitly checks for the null chunk case also so that's
not a problem.  If the iteratee was enumerated over by another stream
and therefore does have leftover data, then since that data isn't part
of the current stream it's rightfully discarded anyway.

This is why your function produced an unexpected result; it's in a
completed state without having been enumerated over, but also has
leftover data, which bindIteratee ignores.

Now that I've thought about it, I'm not convinced this is always
correct; in particular I suspect it of being responsible for a
slightly convoluted implementation of enumFromCallbackCatch.  I'll
have to expend more brain cells on it, I think.

John L.


