On 06.09.2012 10:44, Sven Van Caekenberghe wrote:
Mariano,
What would be wrong with something more elegant, like
consumeWhitespace
"Strip whitespaces from the input stream."
[ readStream atEnd not and: [ readStream peek isSeparator ] ]
whileTrue: [
readStream next ]
?
In my book #peek, which means a one-element pushback buffer, is a whole lot
less of a requirement that a general #position[:] interface which actually
assumes that all contents is in memory, a contradiction to the concept of
streaming.
Considering we are talking of PositionableStream subclasses, using the
position: protocol (also to implement peek) isn't exactly heresy. :)
Though, neglecting converter state (which in one case involves more than
position) when one does so is a bad idea.
Rewriting #peek to employ a buffer instead of position: for positionable
streams(double dispatched through saveStateOf:) purely for the sake of
not dealing with position: seems like a bad use of time imho, though
you'd lose the compexity of dd'ing through the converter, dealing with
the buffer translates into quite a few changes elsewhere.
For non-positionable streams, a buffer is really the only choice though,
and isn't _that_ complicated to deal with in face of no arbitrary
position: changes.
(Ironically in light of the above, the use cases of
CompoundTextConverter, which is the one which _does_ care about other
state than position in saveStateOf:, should never necessitate arbitary
position: changes, so it seems targetted purely at making peek work)
Maybe these uglier versions are more efficient, but who can say for sure they
are correct ?
;-)
Sven
When it comes to streams on in-memory collections (like MBBOTS), using
peek shouldn't be very noticeable.
For streams with external sources, position: can be costly, inlining the
position book-keeping peek does (as MBFS>>skipSeparators does) might be
justified.
With the amount of whitespace one normally parse through, I sort of
doubt it even in that case.
Anyways, who can say for sure the peek approach is correct? ;)
As mentioned above, for MultiByteBinaryOrTextStreams using
CompoundTextConverters it won't be., since #peek in MBBOTS doesn't
account for converter state.
(Just like skipSeparators, which in addition does not account for
multibyte elements, the source of the bug Mariano ran into in the first
place, and why peek fixes his non-CompTextConv usecase)
TLDR;
- Yap, using peek is cleaner, and is a nicer implementation as long as
internally you use a buffer / position: is cheap.
- MultiByteBinaryOrTextStream>>#peek needs fixing, along the lines
Mariano originally intended for skipSeparaters wrt. copying what is done
in MultiByteFileStreams, in order to work correctly with all converters.
Cheers,
Henry