SelectFS provides different way of accessing the selection:

- as list
- as array
- as FS iterator
- as a single result
- ...

I believe it is a desirable property that these behave consistently.

In particular, when obtaining the iterator and moving over it in any
order, one should encounter the same results as when accessing the
results as a list or array (although hopefully more efficiently, 
in particular if not all of the results need to be looked at).

I fixed up the code such that going through the iterator front-to-back
and back-to-front yields results consistent with the list/array.

But working on this, I found two issues:

== limit(x)

It is possible to limit the results of the selection operation. This
is implemented as a limit on the number of get() operations one can 
make on the iterator. That means, one can move the iterator around
after obtaining it (without calling get()) without triggering the limit:

it = select().limit(2).fsIterator();
it.moveToNext();
it.moveToNext();
it.moveToNext();
it.get() <= can still get here!

I would rather expect that the limit would restrict size of the
result space and that the iterator could be used to freely move
inside that limited space while allowing to call get() as often
as one likes. So the limit should in my opinion rather be imposed
on the move operations than on the get operations.

WDYT?


== shifted(x < 0)

The shift operation is defined as moving the iterator to the left/right
from the starting position. Consider the following situation:

t1 = new Token(0,1)
t2 = new Token(2,3)
t3 = new Token(4,5)
t4 = new Token(6,7)
t5 = new Token(8,9)

select().following(t3) => {t4, t5}
select().preceding(t3) => {t1, t2} 

NOTE: results are returned in index order, not in iteration order!
select.preceding(t3) iterates backwards starting from t3, but then 
reverses the result list.

So with positive shift, we can skip some of the results before returning.

select().shifted(1).following(t3) => {t5}

When I started looking into this, it was also possible to use a negative
shift with following, e.g.:

select().shifted(-1).following(t3) => {t3, t4, t5}

So the "following" operation would select "t4" as the starting point and
the shift would then move the iterator one to the left so that the result
list would include t3 as well.

However, in order to align the following and preceding operations with the
I have installed filtered iterators to ensure that edge cases with
zero-width annotations are properly handled. These filters put a hard limit
on the iterator when the boundaries of the startFS (here t3) are hit and
will not allow moving beyond this limit. That means that using shift with
following/preceding returns an empty list because moving the iterator
backwards from its starting position invariably hits the filtered iterator
limit and invalidates it.

select().shifted(-1).following(t3) => {}
select().shifted(-1).preceding(t3) => {}

IMHO that makes sense. Operations like preceding/following/coveredBy/etc.
are supposed to work offset-oriented and to gloss over the index order
(and type priorities). Mixing this with an operation that strongly
depends on index order like shift feels like a bad idea. It works out
nicely for positive shift though - it boils down to skipping the first x
elements of the result. However, allowing a negative shift to move to some
element just before the start position for this kind of operation seems
like a bad idea conceptually and also causes quite a bit of headache
at the implementation level.

Instead of using a negative shift with preceding/following, it should
be used in conjunction with startAt. startAt itself is strongly dependent
on index order and the shift operation is well defined in conjunction with
startAt.

So when SelectFS following/preceding is used with a negative shift now,
a warning is logged suggesting the use of startAt. 

I believe this change is defendable in a feature-level release because 
- the behavior of shift is quite odd to start with (and IMHO should also
 be adjusted in other cases)
- I would not expect much if any code to actually rely on the shift 
 operator with a negative index

WDYT?

Cheers,

-- Richard

Reply via email to