Looks good to me, with one question.
Instead of getting an iterator and then building a new list, can we just skip
getting the iterator and use the list that selectCovered returns?
I will mock up a diff here of what I mean:
- Iterator btaItr = org.uimafit.util.JCasUtil.selectCovered(jcas, BaseToken.class, covering).iterator();
- while (btaItr.hasNext())
- {
- BaseToken bta = (BaseToken) btaItr.next();
- ltList.add(lt);
- }
- }
+ ltList = org.uimafit.util.JCasUtil.selectCovered(jcas, BaseToken.class, covering);
return ltList;
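To make the question concrete, here is a minimal standalone sketch of the two shapes. It does not use UIMA at all: selectCoveredStub is a hypothetical stand-in for the selectCovered call, and plain strings stand in for BaseToken. One caveat: the removed loop in my mock-up adds `lt`, which the mock-up never defines. If the real loop converts each BaseToken into something else before adding it, returning the list directly would not be equivalent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CoveredTokensSketch {

    // Hypothetical stand-in for JCasUtil.selectCovered(jcas, BaseToken.class, covering);
    // plain strings take the place of BaseToken annotations.
    public static List<String> selectCoveredStub() {
        return Arrays.asList("chest", "pain", "today");
    }

    // Current shape: get an iterator, then copy every element into a new list.
    public static List<String> copyViaIterator() {
        List<String> ltList = new ArrayList<String>();
        Iterator<String> btaItr = selectCoveredStub().iterator();
        while (btaItr.hasNext()) {
            ltList.add(btaItr.next());
        }
        return ltList;
    }

    // Proposed shape: return the list as-is, with no iterator and no copy.
    public static List<String> useListDirectly() {
        return selectCoveredStub();
    }

    public static void main(String[] args) {
        // Both shapes yield the same elements in the same order.
        System.out.println(copyViaIterator().equals(useListDirectly()));
    }
}
```

The only difference a caller should see is that the direct version returns whatever list the underlying call hands back rather than a fresh ArrayList.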
I know you said it was quick and dirty at the moment. My 2 cents: unless
someone comes up with a better-engineered solution, I think we could add the
new method (with a name like getLookupTokens) and leave the old one so we don't
have to deprecate anything, then phase in the change to the various
*LookupInitializerImpl classes if needed.
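A rough sketch of how adding the method next to the old one could look. Everything here is hypothetical: the interface name, the old method's name and signature, and the use of strings instead of real token types are all placeholders, not the actual cTAKES API. It also assumes Java 8 default methods, which may not be an option for us; without them, an abstract base class could play the same role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LookupApiSketch {

    // Hypothetical interface; names and signatures are placeholders.
    public interface LookupInitializerSketch {
        // Stands in for the existing method, left unchanged.
        Iterator<String> getLookupTokenIterator();

        // New list-returning method. The default body falls back to the old
        // method, so implementations that have not been migrated still work.
        default List<String> getLookupTokens() {
            List<String> tokens = new ArrayList<String>();
            Iterator<String> it = getLookupTokenIterator();
            while (it.hasNext()) {
                tokens.add(it.next());
            }
            return tokens;
        }
    }

    // An implementation that has not been touched yet.
    public static class LegacyImpl implements LookupInitializerSketch {
        public Iterator<String> getLookupTokenIterator() {
            return Arrays.asList("a", "b").iterator();
        }
    }

    // An implementation that has been phased over to the new method.
    public static class MigratedImpl implements LookupInitializerSketch {
        private final List<String> tokens = Arrays.asList("a", "b");

        public Iterator<String> getLookupTokenIterator() {
            return tokens.iterator();
        }

        @Override
        public List<String> getLookupTokens() {
            return tokens; // no intermediate copy
        }
    }
}
```

Callers could switch to getLookupTokens right away, and each Impl class could override it whenever someone gets to it.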
-- James
> -----Original Message-----
> From: ctakes-dev-return-1138-Masanz.James=mayo....@incubator.apache.org
> [mailto:ctakes-dev-return-1138-Masanz.James=mayo....@incubator.apache.org]
> On Behalf Of Masanz, James J.
> Sent: Monday, February 04, 2013 4:01 PM
> To: [email protected]
> Subject: RE: assistance with dictionary lookup issue
>
> I'll take a look at the patch. Also be aware of
> https://issues.apache.org/jira/browse/CTAKES-31 which talks about a way of
> enhancing performance -- if willing to assume annotations (BaseTokens
> currently) are sorted. Currently it's always BaseToken and always sorted,
> just not sure if we want to code to that assumption.
>
> ________________________________________
> From: ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org
> [ctakes-dev-return-1137-Masanz.James=mayo....@incubator.apache.org] on
> behalf of Tim Miller [[email protected]]
> Sent: Monday, February 04, 2013 3:43 PM
> To: [email protected]
> Subject: assistance with dictionary lookup issue
>
> Pei helped me track down an issue with performance I'd noticed in the
> dictionary annotator, and I have filed the issue here:
> https://issues.apache.org/jira/browse/CTAKES-143
>
> I implemented a quick and dirty proof of concept fix and noticed dramatic
> performance improvement. I attached the patch to the issue, but it
> involves changing an interface (currently does not try to fix other
> implementing classes so obviously not ready for primetime), so I wanted to
> solicit the list first in case anyone with better knowledge of that module
> has some better engineering ideas than what I came up with.
>
> Thanks,
>
> --
> Tim Miller, PhD
> Postdoctoral Research Fellow
> Children's Hospital Informatics Program
> Children's Hospital Boston and Harvard Medical School
> 617-919-1223