I've been debugging a StackOverflowError that occurs when a document is
processed with the pipeline defined by
DrugAggregatePlaintextUMLSProcessor.xml.
I have traced it to what I think is the cause and have a code
modification that appears to solve the problem, but there is probably a more
appropriate solution.
Could someone more experienced with the code evaluate
the bug?
Bug Detail:
Test document text (minimal text necessary to trigger the bug):
aspirin decreased from 2:00 PM.
Call path:
generateDrugMentionsAndAnnotations
statusChangePhraseGenerator
(infinite recursion starts here)
generateAdditionalNER
generateDrugMentionsAndAnnotations
statusChangePhraseGenerator
generateAdditionalNER
...
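I believe the problem occurs in generateAdditionalNER() at ~line 2213, where
getAdjustedWindowSpan() returns -1 as the first element of its result.
That value is assigned to beginChunk, so the call to
generateDrugMentionsAndAnnotations() appears to process the whole document
again rather than only the remaining portion of the text. The same
status-change phrase is then found again and the cycle repeats.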
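
To illustrate the suspected mechanism, here is a toy model of the recursion. It is only a sketch and not the actual annotator code: the class WindowRecursionSketch, its methods, and the MAX_DEPTH cap are hypothetical stand-ins, and the stub for getAdjustedWindowSpan() is assumed to always return -1, as I observed for the test document.

```java
final class WindowRecursionSketch {
    static final int NOT_FOUND = -1;
    // Cap that stands in for the StackOverflowError, so the sketch terminates.
    static final int MAX_DEPTH = 50;

    // Hypothetical stub: assumed to find no adjusted span, so begin is -1.
    static int[] adjustedWindowSpan(int begin, int end) {
        return new int[] {NOT_FOUND, end};
    }

    // Stand-in for the generateDrugMentionsAndAnnotations ->
    // statusChangePhraseGenerator -> generateAdditionalNER cycle.
    // Returns the recursion depth that was reached.
    static int process(int begin, int end, int statusBegin, int statusEnd, int depth) {
        if (depth >= MAX_DEPTH) {
            return depth;
        }
        // The cycle only continues if the status phrase lies inside the window.
        boolean phraseInWindow = begin <= statusBegin && statusEnd <= end;
        if (!phraseInWindow) {
            return depth;
        }
        // A begin of -1 puts the window back in front of the status phrase.
        int nextBegin = adjustedWindowSpan(statusEnd, end)[0];
        return process(nextBegin, end, statusBegin, statusEnd, depth + 1);
    }

    public static void main(String[] args) {
        String text = "aspirin decreased from 2:00 PM.";
        int statusBegin = text.indexOf("decreased from");
        int statusEnd = statusBegin + "decreased from".length();
        // Reaches MAX_DEPTH because the window never moves past the phrase.
        System.out.println(process(0, text.length(), statusBegin, statusEnd, 0));
    }
}
```

In this model, a window that starts at the end of the status phrase stops immediately, while a window that starts at -1 keeps finding the same phrase until the depth cap is hit.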
Source code snippet:
    } else if (drugChangeStatus.getChangeStatus().compareTo(
            DrugChangeStatusToken.DECREASEFROM) == 0)
    {
        if (noPriorMention) { // Look for lowest value on right side
            beginChunk = getAdjustedWindowSpan(jcas, beginChunk,
                    endSpan, true)[0];
        }
        String[] changeStatusArray = new String[] {
                DrugChangeStatusToken.DECREASE,
                new Integer(drugChangeStatus.getBegin()).toString(),
                new Integer(drugChangeStatus.getEnd()).toString()};
        generateDrugMentionsAndAnnotations(jcas,
                buildNewNER, beginChunk, endSpan,
                tokenDrugNER, changeStatusArray, count, globalNER);
A simple fix might be to add the two lines marked with -->:
        if (noPriorMention) { // Look for lowest value on right side
            beginChunk = getAdjustedWindowSpan(jcas, beginChunk,
                    endSpan, true)[0];
-->         if (beginChunk == -1)
-->             beginChunk = drugChangeStatus.getEnd();
        }
Changing the logic inside getAdjustedWindowSpan() might be a more correct
and complete fix, but would require much more detailed knowledge of the code.
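
If the -1 fallback turns out to be acceptable, it could also be isolated in a small helper so that the same guard can be reused wherever the first element of getAdjustedWindowSpan() is consumed. This is only a sketch: the class BeginChunkGuard, the method resolveBegin, and the constant NOT_FOUND are hypothetical names that do not exist in the current code, and I am assuming -1 is the only "not found" value returned.

```java
final class BeginChunkGuard {
    // Assumed sentinel: the value observed from getAdjustedWindowSpan(...)[0].
    static final int NOT_FOUND = -1;

    private BeginChunkGuard() {
    }

    // Returns adjustedBegin unless it is the -1 sentinel, in which case the
    // caller-supplied fallback (e.g. the end of the status phrase) is used.
    static int resolveBegin(int adjustedBegin, int fallbackBegin) {
        return adjustedBegin == NOT_FOUND ? fallbackBegin : adjustedBegin;
    }
}
```

At the call site this would read roughly as:
    beginChunk = BeginChunkGuard.resolveBegin(
            getAdjustedWindowSpan(jcas, beginChunk, endSpan, true)[0],
            drugChangeStatus.getEnd());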
Your evaluation of the correctness of this code modification would be
appreciated.
Bruce Tietjen
Perfect Search Corp.