On 3/25/2014 2:29 PM, Andrei Alexandrescu wrote:
We want to appeal to the high performance coders. To maximize
performance, ranges should be optimized to the inner loop case, which is:

     while (!r.empty) { auto e = r.front; ... e ...; r.popFront(); }

Actually generic code often prefers:

     while (!r.empty) { ... r.front ...; r.popFront(); }

That saves copying r.front if it returns by ref.

Yes, I agree. The idiom foreach (ref e; r) { ... } must also work.
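The ref-return point can be made concrete. Below is a minimal sketch in C++, mirroring D's empty/front/popFront protocol (IntSlice and sumAndDouble are invented names for illustration, not Phobos code): because front returns by reference, the loop body reads and even mutates elements in place with no per-iteration copy, the analogue of `foreach (ref e; r)`.

```cpp
#include <cstddef>

// A minimal slice-style range over an array, mirroring D's
// empty/front/popFront protocol.  front returns by reference,
// so the loop body can read (or mutate) the element in place
// without copying it.
struct IntSlice {
    int* ptr;
    std::size_t len;

    bool empty() const { return len == 0; }
    int& front() { return *ptr; }        // by ref: no copy, mutable
    void popFront() { ++ptr; --len; }
};

int sumAndDouble(IntSlice r) {
    int sum = 0;
    while (!r.empty()) {
        sum += r.front();   // read the element through the ref
        r.front() *= 2;     // in-place mutation, as with `ref e`
        r.popFront();
    }
    return sum;
}
```

Passing the slice by value copies only the pointer and length; the elements themselves are never copied.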


Since front is guaranteed to succeed if !empty, this puts a requirement
on many ranges that they have a non-trivial constructor that 'primes the
pump'. Of course, priming may fail, and so construction may throw, which
is not good design.
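The "priming" being objected to can be sketched as follows (a C++ illustration mirroring the D protocol; EvenFilter is an invented example, not Phobos code): a filtering range must advance to its first match before front is valid, so that work, and any failure in it, lands in the constructor.

```cpp
#include <cstddef>

// Sketch of a range that must "prime the pump": a filter over an
// int array that has to advance to the first even element before
// front can be called.  The priming happens in the constructor,
// so a failure while priming would surface as a throwing
// constructor.
struct EvenFilter {
    const int* ptr;
    std::size_t len;

    EvenFilter(const int* p, std::size_t n) : ptr(p), len(n) {
        prime();                    // constructor does real work
    }

    bool empty() const { return len == 0; }
    int front() const { return *ptr; }   // valid whenever !empty
    void popFront() { ++ptr; --len; prime(); }

  private:
    void prime() {                  // skip to the next even value
        while (len != 0 && (*ptr % 2) != 0) { ++ptr; --len; }
    }
};
```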

Again, I disagree with this assertion.

The throwing part, or the not good design part?


Next, priming requires a buffer in the range.
Priming has nothing to do with the range being buffered. The entire design of
ranges is a one-element buffer.
Buffers add size, making them slower to copy, and will often require
another field saying if the buffer has data in it or not, further
bumping the size.
That's an argument for adding an unbuffered range abstraction.

I don't know how that would fit into the ecosystem.

In any case, with the protocol I suggest, the designer of the range is free to distribute the functionality among the three functions however it makes best sense. If any of them may be called arbitrarily, in any order, then all three need some sort of signalling mechanism between them.
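That distribution can be sketched under the assumption that the protocol guarantees the call order empty, then front, then popFront for each element (C++ mirroring the D protocol; CountingRange is an invented stand-in for an arbitrary source): the per-element fetch can live entirely in empty, leaving front and popFront trivial and flag-free.

```cpp
// If the protocol guarantees the call order
// empty -> front -> popFront for each element, the range designer
// can put the "fetch next element" work entirely inside empty,
// and front/popFront stay trivial: no "is the buffer valid?" flag
// is needed.  A simple counter stands in for an arbitrary source.
struct CountingRange {
    int next = 0;       // next value to hand out
    int limit;
    int current = 0;    // one-element buffer filled by empty()

    explicit CountingRange(int n) : limit(n) {}

    bool empty() {                  // does the per-element work
        if (next >= limit) return true;
        current = next;             // "fetch" into the buffer
        return false;
    }
    int front() const { return current; }  // buffer known valid
    void popFront() { ++next; }            // just advance
};
```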


All this saves for the user is one call to empty for the entire
algorithm, at a cost incurred with every iteration. I.e. it accepts an
O(n) cost to save an O(1) one.

I don't understand how that O() reasoning works.

I meant adding extra signalling with every iteration of the loop, rather than requiring the user to call empty before front.


B) I want to be able to call front multiple times in a row, and expect
to get the same value.

Correct.

This can force the range to incorporate a one element buffer and a flag
indicating the status of that buffer.

It may, but many ranges naturally work that way (e.g. arrays).

Sure, but there are far more range types than arrays.
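The buffer-plus-flag cost under discussion can be sketched like this (C++ mirroring the D protocol; BufferedRange is an invented example over a read-once source): to make repeated calls to front return the same value regardless of call order, the range carries a one-element buffer and a validity flag, both touched on every iteration.

```cpp
// If front may be called at any time and must return the same
// value until popFront, a range over a "destructive" source
// (modelled here by a counter that advances on every read) needs
// a one-element buffer plus a flag saying whether that buffer
// currently holds data.
struct BufferedRange {
    int nextRead = 0;       // stands in for a read-once source
    int limit;
    int buffer = 0;         // the one-element buffer
    bool haveData = false;  // the status flag being described

    explicit BufferedRange(int n) : limit(n) {}

    bool empty() { return !haveData && nextRead >= limit; }
    int front() {
        if (!haveData) {            // flag checked on every call
            buffer = nextRead++;    // destructive read, cached
            haveData = true;
        }
        return buffer;              // repeated calls: same value
    }
    void popFront() { haveData = false; }
};
```

The flag and buffer enlarge the range and add a branch to every call to front, which is the per-iteration cost the argument above is weighing.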


The range instance gets bigger and
more expensive to copy, and the cost of manipulating the flag and the
buffer is added to every loop iteration. Note that the user of a range
can trivially use:
      auto e = r.front;
      ... using e multiple times ...
instead.

That would pessimize code using arrays of large structs.

You're already requiring copying with the buffering requirement. And besides, if I were creating a range of expensive-to-copy objects, I would add a layer of indirection to cheapen it.


The proposed protocol pessimizes arrays of large structs

Not really more than the existing protocol does (i.e. required buffering).

and will trip the unwary if calling r.front again returns something else.

I'm not proposing that calling them wrongly would make things unsafe. But I don't think it's unreasonable to expect the protocol to be followed, just like file open/read/close.


I don't think the proposal is good.

I don't like being in the position where, when I need high-performance code, I have to implement my own ranges & algorithms, or tell customers they need to do so.
