On 13.08.2014, at 14:49, Marshall Schor <[email protected]> wrote:
> hi,
>
> some things to check:
>
> When you say the tokens "remain in the CAS", I think you mean the tokens remain
> in one or more indexes, because, of course, removeFromIndexes doesn't remove
> things from the CAS.
Sure.
> The behavior of removeFromIndexes depends on the kinds of indexes you have
> defined; if you have a bag or sorted index (that is, not a "set" index), then
> it is quite possible to "add-to-indexes" the same feature structure multiple
> times. If this has happened, and you then do just one "remove-from-index", the
> other entries would still be in the index.
No custom indexes are defined - so we're talking only about the default index
over the Annotation type.
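To make the point about repeated add-to-indexes concrete, here is a toy model in plain Java. It is only a sketch: the ArrayList stands in for a bag-like index, the class and method names are made up for illustration, and no UIMA API is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: an ArrayList plays the role of a bag-like index.
// Class and method names are hypothetical, not part of UIMA.
class BagIndexSketch {

    /** Adds the same entry `times` times, removes it once, and returns how many entries remain. */
    static int entriesLeftAfterOneRemove(int times) {
        List<String> bag = new ArrayList<>();
        String token = "token";
        for (int i = 0; i < times; i++) {
            bag.add(token);   // the same object is added repeatedly
        }
        bag.remove(token);    // removes only the first matching entry
        return bag.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesLeftAfterOneRemove(1));
        System.out.println(entriesLeftAfterOneRemove(2));
    }
}
```

In this model, an entry that was added twice is still present after a single remove, which is the situation described in the quoted paragraph.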
> What kinds of indexes do you have defined, here, and what index is being
> selected to use in the
>
> "for (def token : ....)"
>
> syntax?
The annotation index is used here: cas.getAnnotationIndex(type)
Note that the only difference between the tests I did was the text and,
consequently, the number of tokens and their offsets. The rest of the
setup (type system, indexes, etc.) was all the same.
I'm confused...
Cheers,
-- Richard
> On 8/13/2014 5:33 AM, Richard Eckart de Castilho wrote:
>> Hi all,
>>
>> I am facing a very odd situation with the following type of (pseudo-)code:
>>
>> def previousToken;
>> def toDelete = [];
>> for (def token : select(jcas, Token)) {
>>   if (previousToken && isName(previousToken, token)) {
>>     token.setBegin(previousToken.getBegin());
>>     toDelete.add(previousToken);
>>   }
>>   previousToken = token;
>> }
>>
>> for (def token : toDelete) {
>>   token.removeFromIndexes();
>> }
>>
>> Depending on the text in the CAS, sometimes I get
>> the effect that the tokens in toDelete actually remain
>> in the CAS.
>>
>> I tried a different approach in which I also record the
>> tokens with the updated start index and then do a
>>
>> for (def token : toReindex) {
>>   token.removeFromIndexes();
>>   token.addToIndexes();
>> }
>>
>> That seems to flip around the situation. If a token was
>> previously correctly removed, it now remains, and if a
>> token was not removed, it is removed now.
>>
>> I would like to avoid having to create a new token annotation
>> with new offsets and then delete both the old annotations.
>>
>> If need be, I can probably set up a minimal test case, but
>> before that, maybe somebody could give me a clue...
>>
>> Cheers!
>>
>> -- Richard
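
One possible explanation for the behavior described above, which the thread itself does not confirm, is that the begin offset is changed while the token is still in a sorted index. The sketch below models this with a plain java.util.TreeSet instead of UIMA. The Span class, the comparator (begin ascending, then end descending) and the method names are assumptions made for illustration only.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Toy model only: a TreeSet plays the role of a sorted annotation index.
// Span, ORDER and the method names are hypothetical, not part of UIMA.
class SortedIndexSketch {

    static final class Span {
        int begin;       // mutable, like an annotation's begin offset
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Assumed ordering: begin ascending, then end descending (longer spans first).
    static final Comparator<Span> ORDER =
            Comparator.comparingInt((Span s) -> s.begin)
                      .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    /** Changes the sort key while the element is still in the set, then removes its neighbor. */
    static boolean removeAfterInPlaceUpdate() {
        TreeSet<Span> index = new TreeSet<>(ORDER);
        Span prev = new Span(0, 3), tok = new Span(4, 9), other = new Span(10, 12);
        index.add(prev); index.add(tok); index.add(other);
        tok.begin = prev.begin;      // key changed behind the set's back
        return index.remove(prev);   // lookup may now take a wrong turn at tok
    }

    /** Takes the element out, changes the key, puts it back, then removes its neighbor. */
    static boolean removeAfterReindexedUpdate() {
        TreeSet<Span> index = new TreeSet<>(ORDER);
        Span prev = new Span(0, 3), tok = new Span(4, 9), other = new Span(10, 12);
        index.add(prev); index.add(tok); index.add(other);
        index.remove(tok);           // remove while the old key is still valid
        tok.begin = prev.begin;
        index.add(tok);              // re-add under the new key
        return index.remove(prev);
    }

    public static void main(String[] args) {
        System.out.println("in-place update, remove succeeded: " + removeAfterInPlaceUpdate());
        System.out.println("reindexed update, remove succeeded: " + removeAfterReindexedUpdate());
    }
}
```

With the standard TreeSet implementation, the first method is expected to report a failed removal and the second a successful one, and which elements are affected depends on where they sit in the tree, so on the data. If the same effect is at work in the code above, the corresponding order there would be token.removeFromIndexes(), then token.setBegin(...), then token.addToIndexes(), performed outside the loop that iterates over the index.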