Re: Questions about dictionary-lookup and dictionary-lookup-fast

Maite Meseure Hugues Tue, 10 Mar 2015 10:05:57 -0700

Thank you, Sean, for your thorough reply; it's helpful.

On Tue, Mar 10, 2015 at 11:53 AM, Finan, Sean <
[email protected]> wrote:


> Hi Maite,
>
> > Does anyone know why it [UmlsDictionaryLookupAnnotator] is so slow?
> The top 5 reasons (1-3 are 90% of the problem):
> 1.  The dictionary database is bloated with unwanted entries
> 2.  The dictionary database indexing is sub-optimal
> 3.  The second drug lookup with orangebook filtering takes extra time
> 4.  The matching algorithm does a little more work than is necessary
> 5.  There is some redundancy
>
> > my interest is to be able to create my own HsqlDb-based dictionary
> If you want to build a database using a subset of UMLS, check out the
> Dictionary Tool in Sandbox.  It can build custom hsqldb dictionaries in
> both the new (-fast) and old format using sources, tuis, filters, etc. that
> you specify in plaintext parameter files.  Several types of default setups
> are already available.  It is fully functional, but it has been a
> work-in-progress during my off-hours, so the functionality is still changing
> and documentation is lacking.  There is, however, a howto.txt in the
> dictionarytool/doc/ directory.
>
> *NOTE: if your custom dictionaries are small (~1000 entries?) then it
> would probably be easier to just throw them into a bar-separated value
> (bsv) file.  There are examples in the dictionary-fast-res example/bsv/
> directory.
>
> Sean
>
> -----Original Message-----
> From: Maite Meseure Hugues [mailto:[email protected]]
> Sent: Tuesday, March 10, 2015 12:35 PM
> To: [email protected]
> Subject: Questions about dictionary-lookup and dictionary-lookup-fast
>
> Hi everyone,
>
> 1) I am currently working on BagOfCuisGenerator.java with the analysis
> engine 'AggregatePlaintextUMLSProcessor.xml', but that process is very slow
> at that step:
>
> INFO UmlsDictionaryLookupAnnotator - process(JCas)
>
> Does anyone know why it is so slow?
>
> 2) I also tried 'AggregatePlaintextFastUMLSProcessor.xml' and it is
> actually pretty fast, as its name suggests. However, my interest is to be
> able to create my own HsqlDb-based dictionary, like we can do with a Lucene
> index, and integrate it in the process. Is that possible with the fast
> version? Do you have any pointers that could allow me to do that?
>
> Thank you very much for your time.
>
> --
> --
>  Maïté Meseure Hugues
>



-- 
--
 Maïté Meseure Hugues
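
For the bar-separated value (bsv) option Sean mentions for small custom
dictionaries, a minimal sketch of writing and reading such lines might look
like the following. The three-column layout (CUI, TUI, term text) and the
helper names to_bsv_line and parse_bsv_line are assumptions made for
illustration, and the codes are made-up placeholders; the files in the
dictionary-fast-res example/bsv/ directory remain the reference for the
actual layout.

```python
# Sketch only: the CUI|TUI|text column order is an assumption, not confirmed
# by the thread. Check the shipped example bsv files for the real layout.

def to_bsv_line(cui, tui, text):
    """Join one dictionary entry into a single bar-separated line."""
    fields = (cui, tui, text)
    for field in fields:
        # A bar or newline inside a field would corrupt the line structure.
        if "|" in field or "\n" in field:
            raise ValueError(f"field cannot contain '|' or a newline: {field!r}")
    return "|".join(fields)


def parse_bsv_line(line):
    """Split one bar-separated line back into its fields; None for blank lines."""
    line = line.strip()
    if not line:
        return None
    return tuple(part.strip() for part in line.split("|"))


if __name__ == "__main__":
    # Placeholder entries; these codes and terms are invented for the example.
    entries = [
        ("C0000001", "T047", "example disorder"),
        ("C0000002", "T121", "example drug"),
    ]
    lines = [to_bsv_line(*entry) for entry in entries]
    print("\n".join(lines))
    # Round trip: parsing the lines should give back the original entries.
    print([parse_bsv_line(line) for line in lines] == entries)
```

Keeping the entries in a plain list like this makes it easy to review a
small dictionary by hand before deciding whether a full hsqldb build is
worth the effort.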
