Re: How to properly update offsets of an annotation?

Marshall Schor Tue, 19 Aug 2014 14:19:59 -0700
easy test case?

Then I'll take a look :-) -Marshall
On 8/13/2014 10:44 AM, Richard Eckart de Castilho wrote:
> On 13.08.2014, at 14:49, Marshall Schor <[email protected]> wrote:
>
>> hi,
>>
>> some things to check: 
>>
>> When you say the tokens "remain in the CAS", I think you mean the tokens
>> remain in one or more indexes, because, of course, the removeFromIndexes
>> doesn't remove things from the CAS.
> Sure. 
>
>> The behavior of removeFromIndexes depends on the kinds of indexes you have
>> defined; if you have a bag or sorted index (that is, not a "set" index),
>> then it is quite possible to "add-to-indexes" the same feature structure
>> multiple times. If this has happened, and then you do just one
>> "remove-from-index", the other entry would still be in the index.
> No custom indexes are defined - so we're talking only about the default index
> over the Annotation type.
>
>> What kinds of indexes do you have defined, here, and what index is being
>> selected to use in the
>>
>> "for (def token : ....)"
>>
>> syntax?
> The annotation index is used here: cas.getAnnotationIndex(type)
>
> Mind that the only difference between the tests I did was the text and
> consequently the number of tokens and different offsets. The rest of the
> setup (type system, indexes, etc) was all the same.
>
> I'm confused...
>
> Cheers,
>
> -- Richard
>
>> On 8/13/2014 5:33 AM, Richard Eckart de Castilho wrote:
>>> Hi all,
>>>
>>> I am facing a very odd situation with the following type of (pseudo-)code:
>>>
>>> def previousToken;
>>> def toDelete[];
>>> for (def token : select(jcas, Token)) {
>>>  if (previousToken && isName(previousToken, token)) {
>>>    token.setBegin(previousToken.getBegin());
>>>    toDelete.add(previousToken);
>>>  }
>>>  previousToken = token;
>>> }
>>>
>>> for (def token : toDelete) {
>>>  token.removeFromIndexes();
>>> }
>>>
>>> Depending on the text in the CAS, sometimes I get
>>> the effect that the tokens in toDelete actually remain
>>> in the CAS.
>>>
>>> I tried a different approach in which I also record the
>>> tokens with the updated start index and then do a
>>>
>>> for (def token : toReindex) {
>>>  token.removeFromIndexes();
>>>  token.addToIndexes();
>>> }
>>>
>>> That seems to flip around the situation. If a token was
>>> previously correctly removed, it now remains, and if a
>>> token was not removed, it is removed now.
>>>
>>> I would like to avoid having to create a new token annotation
>>> with new offsets and then delete both the old annotations.
>>>
>>> If need be, I can probably set up a minimal test case, but
>>> before that, maybe somebody could give me a clue...
>>>
>>> Cheers!
>>>
>>> -- Richard
>
>
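
One possible explanation for the symptom described above, offered as a
sketch and not as a diagnosis confirmed in this thread: the annotation index
is sorted by begin (ascending) and then end (descending), so calling
setBegin() on a token that is still indexed changes a sort key behind the
index's back. A later lookup, such as the one removeFromIndexes() needs, can
then take a wrong turn and miss an entry that is still there. The example
below does not use UIMA at all. It models the index with a plain
java.util.TreeSet, and Span, ORDER and IndexKeyDemo are hypothetical names
made up for the illustration.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class IndexKeyDemo {
    // Hypothetical stand-in for an annotation: a mutable pair of offsets.
    static final class Span {
        int begin;
        int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Ordering modelled on the annotation index: begin ascending, then end descending.
    static final Comparator<Span> ORDER = (x, y) -> {
        int c = Integer.compare(x.begin, y.begin);
        return c != 0 ? c : Integer.compare(y.end, x.end);
    };

    // Builds a sorted "index" holding the given spans.
    static TreeSet<Span> index(Span... spans) {
        TreeSet<Span> set = new TreeSet<>(ORDER);
        for (Span s : spans) {
            set.add(s);
        }
        return set;
    }

    // Changes a sort key while the element is still in the set,
    // then tries to remove the neighbouring element.
    static boolean mutateInPlaceThenRemove() {
        Span a = new Span(0, 5), b = new Span(6, 10), c = new Span(11, 15);
        TreeSet<Span> idx = index(a, b, c);
        b.begin = a.begin;      // key changed while 'b' is still indexed
        return idx.remove(a);   // the lookup for 'a' may now go astray
    }

    // Takes the element out of the set before changing its key and puts it
    // back afterwards, so the set never holds an entry with a stale position.
    static boolean reindexThenRemove() {
        Span a = new Span(0, 5), b = new Span(6, 10), c = new Span(11, 15);
        TreeSet<Span> idx = index(a, b, c);
        idx.remove(b);
        b.begin = a.begin;
        idx.add(b);
        return idx.remove(a);
    }

    public static void main(String[] args) {
        System.out.println("in place:  removed = " + mutateInPlaceThenRemove());
        System.out.println("reindexed: removed = " + reindexThenRemove());
    }
}
```

With these three spans, the in-place variant fails to remove the first span
while the reindexing variant removes it. In terms of the pseudo-code above,
the analogous ordering would be token.removeFromIndexes(), then
token.setBegin(...), then token.addToIndexes(), all before previousToken is
removed. Whether this is what happens in the reported case would still need
the minimal test case to confirm.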