Looks good to me, with one question.

Instead of getting an iterator and then building a new list, can we just skip 
getting the iterator and use the list that selectCovered returns?

I will mock up a diff here of what I mean:
-       Iterator btaItr = org.uimafit.util.JCasUtil.selectCovered(jcas, BaseToken.class, covering).iterator();
-       while (btaItr.hasNext())
-               {
-                       BaseToken bta = (BaseToken) btaItr.next();
-                               ltList.add(lt);
-                       }
-               }

+       ltList = org.uimafit.util.JCasUtil.selectCovered(jcas, BaseToken.class, covering);
        
        return ltList;
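
To make the question concrete outside of cTAKES, here is a minimal, self-contained sketch of the two shapes. Token and the local selectCovered are hypothetical stand-ins (they are not BaseToken or the uimaFIT call), and the sketch assumes the loop adds each element unchanged. If the real loop wraps each token in another type before adding it, a conversion step would still be needed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SelectCoveredSketch {
    // Hypothetical stand-in for a token annotation; not the cTAKES BaseToken class.
    static final class Token {
        final int begin;
        final int end;
        Token(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Hypothetical stand-in for selectCovered: returns the tokens that lie
    // entirely inside the span [begin, end].
    static List<Token> selectCovered(List<Token> all, int begin, int end) {
        List<Token> covered = new ArrayList<Token>();
        for (Token t : all) {
            if (t.begin >= begin && t.end <= end) {
                covered.add(t);
            }
        }
        return covered;
    }

    // Current shape: get an iterator and copy every element into a new list.
    static List<Token> viaIterator(List<Token> all, int begin, int end) {
        List<Token> result = new ArrayList<Token>();
        Iterator<Token> it = selectCovered(all, begin, end).iterator();
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }

    // Proposed shape: skip the iterator and return the list as-is.
    static List<Token> direct(List<Token> all, int begin, int end) {
        return selectCovered(all, begin, end);
    }
}
```

Both methods return the same elements in the same order; the second just avoids the extra copy.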

I know you said it was quick and dirty at the moment. My 2 cents: unless
someone comes up with a better-engineered solution, I think we could add the
new method (with a name like getLookupTokens) and leave the old one so we don't
have to deprecate anything, and then phase in the change to the various
*LookupInitializerImpl classes if needed.
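
One possible way to sketch the "add the new method, keep the old one" idea is shown below. Everything here is illustrative: the Initializer interface, getTokenIterator, and the two implementing classes are hypothetical stand-ins rather than the actual cTAKES types, and the default method assumes Java 8 or later. On an older Java, an abstract base class could play the same role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PhaseInSketch {
    // Hypothetical stand-in for the initializer interface; not the real cTAKES API.
    interface Initializer {
        // Old method, left untouched so existing callers keep working.
        Iterator<String> getTokenIterator();

        // New method. The default body (Java 8+) falls back to the old method,
        // so implementing classes compile unchanged until they opt in.
        default List<String> getLookupTokens() {
            List<String> tokens = new ArrayList<>();
            Iterator<String> it = getTokenIterator();
            while (it.hasNext()) {
                tokens.add(it.next());
            }
            return tokens;
        }
    }

    // Not yet migrated: implements only the old method.
    static class LegacyImpl implements Initializer {
        private final List<String> tokens = Arrays.asList("chest", "pain");

        @Override
        public Iterator<String> getTokenIterator() {
            return tokens.iterator();
        }
    }

    // Migrated: overrides the new method and hands back its list directly.
    static class MigratedImpl extends LegacyImpl {
        private final List<String> direct = Arrays.asList("chest", "pain");

        @Override
        public List<String> getLookupTokens() {
            return direct;
        }
    }
}
```

Callers can switch to getLookupTokens right away, while each implementing class is migrated on its own schedule.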

-- James


> -----Original Message-----
> From: ctakes-dev-return-1138-Masanz.James=mayo....@incubator.apache.org
> [mailto:ctakes-dev-return-1138-Masanz.James=mayo....@incubator.apache.org]
> On Behalf Of Masanz, James J.
> Sent: Monday, February 04, 2013 4:01 PM
> To: [email protected]
> Subject: RE: assistance with dictionary lookup issue
> 
> I'll take a look at the patch. Also be aware of
> https://issues.apache.org/jira/browse/CTAKES-31 which talks about a way of
> enhancing performance  -- if willing to assume annotations (BaseTokens
> currently) are sorted. Currently it's always BaseToken and always sorted,
> just not sure if we want to code to that assumption.
> 
> ________________________________________
> From: ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org
> [ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org] on
> behalf of Tim Miller [[email protected]]
> Sent: Monday, February 04, 2013 3:43 PM
> To: [email protected]
> Subject: assistance with dictionary lookup issue
> 
> Pei helped me track down an issue with performance I'd noticed in the
> dictionary annotator, and I have filed the issue here:
> https://issues.apache.org/jira/browse/CTAKES-143
> 
> I implemented a quick and dirty proof of concept fix and noticed dramatic
> performance improvement.  I attached the patch to the issue, but it
> involves changing an interface (currently does not try to fix other
> implementing classes so obviously not ready for primetime), so I wanted to
> solicit the list first in case anyone with better knowledge of that module
> has some better engineering ideas than what I came up with.
> 
> Thanks,
> 
> --
> Tim Miller, PhD
> Postdoctoral Research Fellow
> Children's Hospital Informatics Program
> Children's Hospital Boston and Harvard Medical School
> 617-919-1223
