On 06.09.2012 10:44, Sven Van Caekenberghe wrote:
Mariano,

What would be wrong with something more elegant, like

consumeWhitespace
        "Strip whitespaces from the input stream."

        [ readStream atEnd not and: [ readStream peek isSeparator ] ]
                whileTrue: [
                        readStream next ]

?

In my book #peek, which means a one-element pushback buffer, is a whole lot 
less of a requirement that a general #position[:] interface which actually 
assumes that all contents is in memory, a contradiction to the concept of 
streaming.
Considering we are talking of PositionableStream subclasses, using the position: protocol (also to implement peek) isn't exactly heresy. :) Though, neglecting converter state (which in one case involves more than position) when one does so is a bad idea.

Rewriting #peek to employ a buffer instead of position: for positionable streams(double dispatched through saveStateOf:) purely for the sake of not dealing with position: seems like a bad use of time imho, though you'd lose the compexity of dd'ing through the converter, dealing with the buffer translates into quite a few changes elsewhere.

For non-positionable streams, a buffer is really the only choice though, and isn't _that_ complicated to deal with in face of no arbitrary position: changes.

(Ironically in light of the above, the use cases of CompoundTextConverter, which is the one which _does_ care about other state than position in saveStateOf:, should never necessitate arbitary position: changes, so it seems targetted purely at making peek work)
Maybe these uglier versions are more efficient, but who can say for sure they 
are correct ?

;-)

Sven
When it comes to streams on in-memory collections (like MBBOTS), using peek shouldn't be very noticeable. For streams with external sources, position: can be costly, inlining the position book-keeping peek does (as MBFS>>skipSeparators does) might be justified. With the amount of whitespace one normally parse through, I sort of doubt it even in that case.

Anyways, who can say for sure the peek approach is correct? ;)
As mentioned above, for MultiByteBinaryOrTextStreams using CompoundTextConverters it won't be., since #peek in MBBOTS doesn't account for converter state. (Just like skipSeparators, which in addition does not account for multibyte elements, the source of the bug Mariano ran into in the first place, and why peek fixes his non-CompTextConv usecase)

TLDR;
- Yap, using peek is cleaner, and is a nicer implementation as long as internally you use a buffer / position: is cheap. - MultiByteBinaryOrTextStream>>#peek needs fixing, along the lines Mariano originally intended for skipSeparaters wrt. copying what is done in MultiByteFileStreams, in order to work correctly with all converters.

Cheers,
Henry

Reply via email to