Hi Diego,

Streams as they are currently defined, especially ReadStreams, can hardly be 
called IO streams. Many operations assume that there is a single collection 
over which you can move in any direction without restrictions. As a consequence, 
most parsing code relies on this ability, looking forward and going back multiple 
times, for example by using #skip: with a negative argument.

Real IO ReadStreams like SocketStreams or FileStreams cannot be assumed to 
keep more than just a buffer in memory, hence the unrestricted operations are 
not implementable.

IMO, parsing code should be written against a much more restricted API, 
assuming at most a one element peek/buffer. I always try to do that; I even 
have Mock ReadStreams that ensure this.
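
A minimal sketch of such a restricted stream, to make the idea concrete. The 
class name OneElementPeekStream and its instance variables stream and peeked 
are hypothetical, made up for this mail, and the sketch assumes the wrapped 
stream never answers nil as a real element (true for characters and bytes):

OneElementPeekStream>on: aReadStream
      "Wrap any stream that understands #next and #atEnd."
      stream := aReadStream.
      peeked := nil

OneElementPeekStream>atEnd
      "A buffered element still counts as unread input."
      ^ peeked isNil and: [ stream atEnd ]

OneElementPeekStream>peek
      "Fill the one element buffer on demand. Answer nil at the end."
      peeked isNil ifTrue: [
            stream atEnd ifTrue: [ ^ nil ].
            peeked := stream next ].
      ^ peeked

OneElementPeekStream>next
      "Answer the buffered element if there is one, else read on."
      | element |
      peeked isNil ifTrue: [ ^ stream next ].
      element := peeked.
      peeked := nil.
      ^ element

There is deliberately no #position:, #skip: or #reset. Parsing code that works 
against this protocol should also work on a socket or a file.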

While benchmarking I have also found that slow #peek behaviour can be a 
bottleneck, especially on some of the more complex streams.

Sven

PS: Have a look at ZnCharacterReadStream for example, which implements this 
restricted behaviour: it is not positionable, but allows a one character peek 
using a one character internal buffer.
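
A small usage sketch, assuming that #on: defaults to UTF-8 decoding and that 
the input is a plain in-memory byte stream; the variable name in is arbitrary:

      | in |
      in := ZnCharacterReadStream on: 'abc' asByteArray readStream.
      in peek.    "$a, nothing is consumed"
      in next.    "$a"
      in next.    "$b"
      in peek.    "$c"
      in atEnd.   "false, $c is still pending"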

On 04 Nov 2013, at 09:57, Diego Lont <diego.l...@delware.nl> wrote:

> Working on Petit Delphi we found a strange implementation for asPetitStream:
> Stream>asPetitStream
>       ^ self contents asPetitStream
> 
> Further investigation showed that the basic peek was not fast enough for 
> Petit Parser, as it is used a lot. So it implemented an "improved unchecked 
> peek":
> PPStream>peek
>       "An improved version of peek, that is slightly faster than the built in version."
>       ^ self atEnd ifFalse: [ collection at: position + 1 ]
> 
> PPStream>uncheckedPeek
>       "An unchecked version of peek that throws an error if we try to peek over the end of the stream, even faster than #peek."
>       ^ collection at: position + 1
> 
> But to my knowledge a basic peek should be fast. The real problem is the 
> underlying peek:
> PositionableStream>peek
>       "Answer what would be returned if the message next were sent to the 
>       receiver. If the receiver is at the end, answer nil."
> 
>       | nextObject |
>       self atEnd ifTrue: [^nil].
>       nextObject := self next.
>       position := position - 1.
>       ^nextObject
> 
> That actually uses "self next". The least one should do is to cache the 
> next object. But isn't there a primitive for peek in a file stream? Because 
> all overriding peeks of PositionableStream have basically the same 
> implementation: reading the next and restoring the state to before the peek 
> (that is slow). So we would like to be able to remove PPStream without 
> causing performance issues, as the only added method is the "improved peek".
> 
> Stephan and Diego

