Re: Adjusting the behavior of SelectFS limit and shifted operations

Mario Juric Fri, 13 Nov 2020 03:09:31 -0800

Hi Richard,

I haven’t used any of the mentioned features yet, so I first have to think of 
use cases of ours where they could be applied.


> it = select().limit(2).fsIterator();
> it.moveToNext();
> it.moveToNext();
> it.moveToNext();
> it.get() <= can still get here!
>
> I would rather expect that the limit would restrict size of the
> result space and that the iterator could be used to freely move
> inside that limited space while allowing to call get() as often
> as one likes. So the limit should in my opinion rather be imposed
> on the move operations than on the get operations.

The closest use case we have for something like this is a top-N filter, which 
picks the first N items that satisfy some constraints. Limiting the range of 
the underlying iteration makes no sense there, because we don’t know in advance 
whether the next item satisfies the constraints. I would instead apply the 
limit after the filter, something like this:

it = select().filter(t -> satisfiesConstraints(t)).limit(n);

In this case it makes sense to limit the iterator range as you propose, and 
this is basically also how our implementation works with streams.
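
To make the filter-then-limit idea concrete, here is a rough sketch in plain 
Java. It uses only java.util.stream on an ordinary list rather than SelectFS, 
and the names TopNSketch and topN are made up for illustration, so it only 
approximates the one-liner above:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class TopNSketch {
    // Hypothetical helper: returns the first n items, in encounter order,
    // that satisfy the given constraint. The limit is applied AFTER the
    // filter, so it bounds the result size, not the range that is scanned.
    static <T> List<T> topN(List<T> items, Predicate<T> satisfiesConstraints, int n) {
        return items.stream()
                .filter(satisfiesConstraints)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Plain integers stand in for annotation begin offsets.
        List<Integer> begins = List.of(0, 2, 4, 6, 8);
        System.out.println(topN(begins, b -> b >= 2, 2)); // prints [2, 4]
    }
}
```

Because the stream is lazy, it stops pulling items from the source once n 
matches have been found.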

> == shifted(x < 0)
>
> The shift operation is defined as moving the iterator to the left/right
> from the starting position. Consider the following situation:
>
> t1 = new Token(0,1)
> t2 = new Token(2,3)
> t3 = new Token(4,5)
> t4 = new Token(6,7)
> t5 = new Token(8,9)
>
> select().following(t3) => {t4, t5}
> select().preceding(t3) => {t1, t2}
>
> NOTE: results are returned in index order, not in iteration order!
> select.preceding(t3) iterates backwards starting from t3, but then
> reverses the result list.
>
> So with positive shift, we can skip some of the results before returning.
>
> select().shifted(1).following(t3) => {t5}
>
> When I started looking into this, it was also possible to use a negative
> shift with following, e.g.:
>
> select().shifted(-1).following(t3) => {t3, t4, t5}
>
> So the "following" operation would select "t4" as the starting point and
> the shift would then move the iterator one to the left so that the result
> list would include t3 as well.
>
> However, in order to align the following and preceding operations with the
> I have installed filtered iterators to ensure that edge cases with
> zero-width annotations are properly handled. These filters put a hard limit
> on the iterator when the boundaries of the startFS (here t3) are hit and
> will not allow moving beyond this limit. That means that using shift with
> following/preceding returns an empty list because moving the iterator
> backwards from its starting position invariably hits the filtered iterator
> limit and invalidates it.
>
> select().shifted(-1).following(t3) => {}
> select().shifted(-1).preceding(t3) => {}
>
> IMHO that makes sense. Operations like preceding/following/coveredBy/etc.
> are supposed to work offset-oriented and to gloss over the index order
> (and type priorities). Mixing this with an operation that strongly
> depends on index order like shift feels like a bad idea. It works out
> nicely for positive shift though - it boils down to skipping the first x
> elements of the result. However, allowing a negative shift to move to some
> element just before the start position for this kind of operation seems
> like a bad idea conceptually and also causes quite a bit of headache
> at the implementation level.
>
> Instead of using a negative shift with preceding/following, it should
> be used in conjunction with startAt. startAt itself is strongly dependent
> on index order and the shift operation is well defined in conjunction with
> startAt.
>
> So when SelectFS following/preceding is used with a negative shift now,
> a warning is logged suggesting the use of startAt.
>
> I believe this change is defendable in a feature-level release because
> - the behavior of shift is quite odd to start with (and IMHO should also
> be adjusted in other cases)
> - I would not expect much if any code to actually rely on the shift
> operator with a negative index

We have no use cases where something like shift is used, especially negative 
shifts, so it’s hard for me to have a qualified opinion about this. The closest 
thing it reminds me of is negative list indices in Python, and I am not sure 
that is comparable. I guess invalidating the iterator or logging a warning in 
the mentioned edge cases seems reasonable for now.
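
To check my understanding of the described behavior, here is a small model in 
plain Java. It is only a sketch under simplifying assumptions: it works on an 
ordinary list in index order instead of a CAS, it ignores offsets and 
zero-width annotations, and ShiftSketch and its following method are 
hypothetical names, not the UIMA API:

```java
import java.util.ArrayList;
import java.util.List;

public class ShiftSketch {
    // Models following(start) on an index-ordered list: all elements after
    // 'start'. A positive shift skips that many results. A negative shift
    // yields an empty result, mirroring the behavior described in the quoted
    // message (where a warning suggesting startAt is logged as well).
    static <T> List<T> following(List<T> indexOrder, T start, int shift) {
        int pos = indexOrder.indexOf(start);
        if (pos < 0 || shift < 0) {
            return new ArrayList<>();
        }
        // Clamp to the list size; long arithmetic avoids int overflow.
        int from = (int) Math.min((long) indexOrder.size(), (long) pos + 1 + shift);
        return new ArrayList<>(indexOrder.subList(from, indexOrder.size()));
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("t1", "t2", "t3", "t4", "t5");
        System.out.println(following(tokens, "t3", 0));  // prints [t4, t5]
        System.out.println(following(tokens, "t3", 1));  // prints [t5]
        System.out.println(following(tokens, "t3", -1)); // prints []
    }
}
```

In this model a positive shift is nothing more than skipping the first x 
results, which matches the description in the quoted text.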

Cheers
Mario


