I'll take a look at the patch. Also be aware of
https://issues.apache.org/jira/browse/CTAKES-31, which discusses a way of
improving performance if we are willing to assume that the annotations
(currently BaseTokens) are sorted. At present they are always BaseTokens and
always sorted; I'm just not sure we want to code to that assumption.

________________________________________
From: ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org 
[ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org] on behalf 
of Tim Miller [[email protected]]
Sent: Monday, February 04, 2013 3:43 PM
To: [email protected]
Subject: assistance with dictionary lookup issue

Pei helped me track down an issue with performance I'd noticed in the
dictionary annotator, and I have filed the issue here:
https://issues.apache.org/jira/browse/CTAKES-143

I implemented a quick-and-dirty proof-of-concept fix and noticed a
dramatic performance improvement.  I attached the patch to the issue,
but it involves changing an interface (it currently does not try to fix
the other implementing classes, so it is obviously not ready for
primetime), so I wanted to ask the list first in case anyone with better
knowledge of that module has better engineering ideas than what I came
up with.

Thanks,

--
Tim Miller, PhD
Postdoctoral Research Fellow
Children's Hospital Informatics Program
Children's Hospital Boston and Harvard Medical School
617-919-1223
