On Thu, 26 Aug 2010 10:18:51 -0400, Pillsy <[email protected]> wrote:
These thoughts were inspired by the recent thread, "Retrieving the
traversed range". I am generally pretty sold on the range concept,
but I think for some operations some sort of cursor is required.
However, there are some unsafe cursor-related operations that are
best avoided.
However, as far as I can see, the unsafe operations revolve around
synthesizing a new range out of two cursors. However, you can get
a lot of the desired functionality associated with finding elements and
splitting ranges *without* allowing the synthesis of of ranges from
unrelated cursors. The key to doing this is to have each cursor be
permanently bound to the underlying range which it was derived from.
If you have this kind of cursor, you only really need it to support two
operations to get a huge amount of benefit:
Range before ();
Range after ();
The before () operation just returns a range of the appropriate type
(Forward, Bidirectional, RandomAccess) of all the elements that come
before the cursor is pointing to, while the after () operation returns a
range of the appropriate type with all the elements after the cursor.
Given this concept, a cursor really points at the boundary between
elements of a range, if you will.
If find and related operations return this kind of cursor, all of the
operations involving splitting ranges become trivial. It's completely
consistent for either before () or after () to be empty, and since you
can't glue two of these cursors together, you are always guaranteed to
get well-defined ranges out.
While I think the before () and after () operations are the only ones
which are strictly necessary for the desiderata I've seen so far, people
could write their own generic algorithms if cursors supported more of
the iterator operations, for moving forward (and backward, where
appropriate) and inspecting the element immediately after the cursor (or
before, where appropriate), and if ranges supported operations to return
cursors from reasonable places (like from the front or back, or from a
specific indexed elemnt for RandomAccess ranges).
The key idea is that these cursors aren't a primitive part of a range;
instead, they take a range and add a position *inside* the range.
They're a perfect fit for all those "three-legged" operations out there
because they actually have three legs.
That's exactly how cursors perform in dcollections. An excerpt from
dcollections' concepts document:
Cursors:
Cursors are special 1 or 0 element ranges. They implement the forward
range
interface. The benefit to using cursors is that they always refer to
exactly
one element. A normal range uses two marker elements, a begin and an end
element. Therefore, cursors are less susceptible to invalidation on
removal of
elements.
Cursors support these additional features:
- Use a cursor as a reference point when dealing with a container.
- Use as an end point when slicing a container. Note that some containers
only
support slicing with an arbitrary cursor and the beginning or end of the
container. Slicing is guaranteed to be a fast operation (lgn or better).
Determining cursor/range ownership:
All collections can identify whether a cursor/range belongs to them. Each
collection class has a 'belongs' method to determine ownership. The
ownership check is guaranteed to be a fast operation (O(lgN) or better).
-------------------
The 'three-legged' cursor as you propose was an early idea of mine too.
Because dcollections containers are classes however, there is an easy
point of ownership to establish -- the class reference. So a dcollections
cursor doesn't have to refer to its owner, but it can in order to fulfill
the requirement.
My view is that a cursor should be separate from a range, and a range
should either be able to tell if a cursor is part of it (@safe) or be told
that a cursor is a part of it (@system or @trusted). Inside a
three-legged algorithm which accepts a range as input, and generates a
cursor from the range for the third leg, it can force the range to assume
the cursor is a part of it, knowing that it's true. You can make the
second operation @safe by wrapping a cursor + range in a struct that
generates the cursor from the range (and therefore knows the cursor is a
part of the range).
For two examples of ranges that can tell if a cursor is a part of it, an
array can tell if a cursor is a part of it without any extra mechanisms
because it is an O(1) check for an interior pointer. A balanced tree can
also perform an O(lgn) check that a cursor is part of it (though it's
questionable whether a tree is a range).
-Steve