Re: efficient input range buffering

Martin Nowak Sun, 02 Oct 2011 18:19:16 -0700

On Mon, 03 Oct 2011 03:12:00 +0200, Timon Gehr <[email protected]> wrote:

On 10/03/2011 01:14 AM, Martin Nowak wrote:

On Sun, 02 Oct 2011 21:15:12 +0200, Timon Gehr <[email protected]>wrote:

On 10/02/2011 04:12 PM, Martin Nowak wrote:

I've written a wrapper to promote input ranges to buffered forward
ranges.
It allows to write range agnostic lexers/parsers with infinite
lookahead.

Buffering is done through a singly linked list of memory blocks thatare

reference counted.
Each saved range holds a reference to all future blocks.
Blocks are recycled when being no longer used.

https://gist.github.com/1257196

There is a major issue with the somewhat broken
implicit-save-through-copy concept.
A lot of copies in function parameters, foreach loops etc. will also
create
references and thus can be easily responsible for inefficiencies.

martin


I have implemented something similar (but it is not generic). It uses
a dynamic circular buffer on top of a single dynamic array. I believe
it is more efficient that way for usual parsing tasks. (and uses less
lines of code) The clients of the range have to explicitly state that
they want to go back to a certain state. It works in LIFO order, which
is sufficient for parsing with lookahead.

Which is about the same as I did where free memory blocks are recycled
and appended to the front. So to say a cyclic singly linked list.
Only you have two performance issues when using a ring buffer based on
arrays.
- The head reader needs to check if he can overwrite a location for
every element.

It does not. The only position that is important is that of the veryfirst anchor, therefore there is only one check per buffer refill. (therange saves the very first anchor, as well as how many anchors arearound).

- When resizing the buffer you need to copy the already read data
(possibly causing costly copy ctor invocations).

The buffer is always doubled in size, quickly reaching the maximal sizeneeded (I start with a quite large buffer, usually it is never resized).There are no invocations of the copy ctor, because the data can be moved.


With block based reads you never need to copy read data and it also
frees from
checking for overwrites until the block end is reached.

I never need to copy data, and moving data only happens on degenerateinput.

I see, little more elaborate than I guessed from your description.
Is the code public?

Re: efficient input range buffering

Reply via email to