Re: How to properly update offsets of an annotation?

Richard Eckart de Castilho Wed, 13 Aug 2014 07:46:05 -0700

On 13.08.2014, at 14:49, Marshall Schor <[email protected]> wrote:

> hi,
> 
> some things to check: 
> 
> When you say the tokens "remain in the CAS", I think you mean the tokens remain
> in one or more indexes, because, of course, the removeFromIndexes doesn't
> remove things from the CAS.


Sure. 

> The behavior of the removeFromIndexes depends on the kinds of indexes you have
> defined;  if you have a bag or sorted index (that is, not a "set" index), then
> it is quite possible to "add-to-indexes" the same feature structure multiple
> times.  If this has happened, and then you do just one "remove-from-index", the
> other indexing would still be in the index.

No custom indexes are defined - so we're talking only about the default index
over the Annotation type.
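
For reference, the double-add case described above can be sketched with plain
Java collections. This is only an analogy: an ArrayList stands in for a bag-like
index and a HashSet for a set index, and the class and method names are
hypothetical. No UIMA API is used.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BagVersusSetSketch {
    /** Adds the same object twice to a bag-like list, removes it once. */
    static boolean stillInBagAfterOneRemove() {
        Object fs = new Object();            // stand-in for a feature structure
        List<Object> bag = new ArrayList<>();
        bag.add(fs);
        bag.add(fs);                         // second entry for the same object
        bag.remove(fs);                      // removes only the first entry
        return bag.contains(fs);
    }

    /** Same sequence against a set, which keeps at most one entry per object. */
    static boolean stillInSetAfterOneRemove() {
        Object fs = new Object();
        Set<Object> set = new HashSet<>();
        set.add(fs);
        set.add(fs);                         // no effect, already present
        set.remove(fs);
        return set.contains(fs);
    }

    public static void main(String[] args) {
        System.out.println(stillInBagAfterOneRemove()); // true
        System.out.println(stillInSetAfterOneRemove()); // false
    }
}
```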

> What kinds of indexes do you have defined, here, and what index is being
> selected to use in the
> 
> "for (def token : ....)"
> 
> syntax?

The annotation index is used here: cas.getAnnotationIndex(type)
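
One possible explanation, offered as a hypothesis and not confirmed anywhere in
this thread: the annotation index is sorted by begin offset (then by end offset,
longer span first), so calling setBegin() on a token that is still indexed may
leave it at a position that no longer matches its key. A later lookup for a
neighbouring token can then walk the wrong way through the index. The sketch
below reproduces that effect with a plain java.util.TreeSet standing in for the
index. Span, ORDER, mergeInPlace and mergeWithReindex are hypothetical names,
and none of this is UIMA code.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class SortedIndexSketch {
    /** Hypothetical stand-in for an annotation: just begin/end offsets. */
    static final class Span {
        int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Begin ascending, then end descending (longer span first). */
    static final Comparator<Span> ORDER = (x, y) ->
            x.begin != y.begin ? Integer.compare(x.begin, y.begin)
                               : Integer.compare(y.end, x.end);

    static Span[] tokens() {
        return new Span[] { new Span(0, 4), new Span(5, 9), new Span(10, 14) };
    }

    static TreeSet<Span> buildIndex(Span[] spans) {
        TreeSet<Span> set = new TreeSet<>(ORDER);
        for (Span s : spans) set.add(s);
        return set;
    }

    /** Changes the begin of an indexed token, then tries to remove its predecessor. */
    static boolean mergeInPlace(int previousIdx) {
        Span[] spans = tokens();
        TreeSet<Span> index = buildIndex(spans);
        Span previous = spans[previousIdx], token = spans[previousIdx + 1];
        token.begin = previous.begin;      // sort key changes while still indexed
        return index.remove(previous);     // lookup may miss 'previous'
    }

    /** Removes the token first, edits it, re-adds it, then removes its predecessor. */
    static boolean mergeWithReindex(int previousIdx) {
        Span[] spans = tokens();
        TreeSet<Span> index = buildIndex(spans);
        Span previous = spans[previousIdx], token = spans[previousIdx + 1];
        index.remove(token);               // take it out under its old key
        token.begin = previous.begin;
        index.add(token);                  // put it back under its new key
        return index.remove(previous);
    }

    public static void main(String[] args) {
        System.out.println(mergeInPlace(0));     // false: 'previous' stays in the set
        System.out.println(mergeInPlace(1));     // true: happens to be found
        System.out.println(mergeWithReindex(0)); // true
        System.out.println(mergeWithReindex(1)); // true
    }
}
```

With the same three tokens, the in-place edit fails for one pair and works for
the other, which would match behaviour that depends only on the text. Removing
the token before changing its begin and adding it back afterwards keeps the
ordering intact in both cases.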

Note that the only difference between the tests I ran was the text, and
consequently the number of tokens and their offsets. The rest of the
setup (type system, indexes, etc.) was identical.

I'm confused...

Cheers,

-- Richard

> On 8/13/2014 5:33 AM, Richard Eckart de Castilho wrote:
>> Hi all,
>> 
>> I am facing a very odd situation with the following type of (pseudo-)code:
>> 
>> def previousToken;
>> def toDelete = [];
>> for (def token : select(jcas, Token)) {
>>  if (previousToken && isName(previousToken, token)) {
>>    token.setBegin(previousToken.getBegin());
>>    toDelete.add(previousToken);
>>  }
>>  previousToken = token;
>> }
>> 
>> for (def token : toDelete) {
>>  token.removeFromIndexes();
>> }
>> 
>> Depending on the text in the CAS, sometimes I get
>> the effect that the tokens in toDelete actually remain
>> in the CAS.
>> 
>> I tried a different approach in which I also record the
>> tokens with the updated start index and then do a
>> 
>> for (def token : toReindex) {
>>  token.removeFromIndexes();
>>  token.addToIndexes();
>> }
>> 
>> That seems to flip around the situation. If a token was
>> previously correctly removed, it now remains, and if a
>> token was not removed, it is removed now.
>> 
>> I would like to avoid having to create a new token annotation
>> with new offsets and then delete both the old annotations.
>> 
>> If need be, I can probably set up a minimal test case, but
>> before that, maybe somebody could give me a clue...
>> 
>> Cheers!
>> 
>> -- Richard
