I think I'm getting a handle on what might be the issue with another test
difference running with UIMA v3 and Ruta, but would like to confirm something
about how Ruta operates.


I see the RutaStream checkAnchor code "splitting" a RutaBasic annotation into
two.  It carefully removes the original from the index first, changes its "end",
creates a new annotation, and then adds both of them back to the index.
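
To make sure I am describing the same sequence, here is a minimal sketch of
that split in plain Java. It is a toy model, not Ruta or UIMA code: Span,
SpanIndex and split are hypothetical names, and the index is just a TreeSet
sorted on (begin, end), so unlike a real index it cannot hold two spans with
identical offsets. The point is only the ordering: remove first, because the
sort key is about to change, then modify, then add both pieces back.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Toy stand-in for an annotation span (hypothetical; not RutaBasic). */
final class Span {
    int begin;
    int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

/** Toy sorted index keyed on (begin, end) (hypothetical; not a UIMA index). */
final class SpanIndex {
    private final TreeSet<Span> sorted = new TreeSet<>(
            Comparator.<Span>comparingInt(s -> s.begin).thenComparingInt(s -> s.end));

    void add(Span s) {
        sorted.add(s);
    }

    void remove(Span s) {
        sorted.remove(s);
    }

    /** Returns a copy of the current contents in index order. */
    List<Span> snapshot() {
        return new ArrayList<>(sorted);
    }

    /** Splits original at anchor and returns the new right-hand span. */
    Span split(Span original, int anchor) {
        if (anchor <= original.begin || anchor >= original.end) {
            throw new IllegalArgumentException("anchor must fall strictly inside the span");
        }
        // Remove before changing "end": the span's sort key is about to change,
        // and a sorted structure cannot find an element whose key was altered in place.
        remove(original);
        Span right = new Span(anchor, original.end);
        original.end = anchor;
        // Add both pieces back under their new keys.
        add(original);
        add(right);
        return right;
    }
}
```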

It does this while the stream has iterator(s), e.g. the currentIt field. 

Normally, this is not allowed (updating an index while iterating over it), but
UIMA v2 made an exception: this was allowed if the first operation on the
iterator was a moveTo first/last/some-specific-FS.  These moveTo operations
"reset" the iterator state to a known position, using the then-current values of
the indexes.

In version 3, we added a copy-on-write style for iterators to avoid throwing
ConcurrentModificationExceptions, and that changed this behavior (I need to fix
that).  The fix is for iterators that do one of the special moveTo operations
that formerly "reset" the state to acquire the new current state of the index
if a copy-on-write has occurred, so they can "see" the changed index.
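
A rough illustration of the behavior I have in mind, again as a toy model in
plain Java rather than the actual v3 code. CowIndex and CowIterator are
hypothetical names, and a real moveTo(fs) would need the same treatment as the
two repositioning methods shown. An iterator keeps reading the snapshot it
started with, and only a moveToFirst/moveToLast picks up the index as it is
now.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy copy-on-write index (hypothetical; not the UIMA v3 implementation). */
final class CowIndex<T> {
    // Once published, a list is never mutated; updates replace it with a copy.
    private List<T> current = new ArrayList<>();

    void add(T item) {
        List<T> next = new ArrayList<>(current);
        next.add(item);
        current = next;
    }

    void remove(T item) {
        List<T> next = new ArrayList<>(current);
        next.remove(item);
        current = next;
    }

    List<T> currentState() {
        return current;
    }

    CowIterator<T> iterator() {
        return new CowIterator<>(this);
    }
}

/** Iterator over a snapshot; repositioning re-acquires the current index state. */
final class CowIterator<T> {
    private final CowIndex<T> index;
    private List<T> snapshot;
    private int pos;

    CowIterator(CowIndex<T> index) {
        this.index = index;
        this.snapshot = index.currentState();
        this.pos = 0;
    }

    /** "Reset": pick up whatever the index holds now, then go to the start. */
    void moveToFirst() {
        snapshot = index.currentState();
        pos = 0;
    }

    /** "Reset": pick up whatever the index holds now, then go to the end. */
    void moveToLast() {
        snapshot = index.currentState();
        pos = snapshot.size() - 1;
    }

    /** Plain navigation stays on the old snapshot, so it never sees updates. */
    void moveToNext() {
        pos++;
    }

    boolean isValid() {
        return pos >= 0 && pos < snapshot.size();
    }

    T get() {
        if (!isValid()) {
            throw new NoSuchElementException();
        }
        return snapshot.get(pos);
    }
}
```

With this model, an iterator created before an update still returns the old
elements until it is repositioned, and no exception is thrown either way.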

Before I embark on this fix, I'd feel better if I could get some confirmation
that Ruta is operating in this manner (at least for this test case), i.e.:

1) adding Annotations to indexes
2) getting iterator(s) over those in RutaStream
3) removing and adding Annotations to the indexes while holding on to these
iterators
4) avoiding any ConcurrentModificationExceptions by always doing one of the
three iterator repositioning operations (moveTo first/last/a specific feature
structure) before doing any other operation on the iterator.

Thanks. -Marshall
