Re: Ctakes to process 5000K recoreds

2014-09-09 Thread Pei Chen
Nick, When you say no medication is being annotated, I presume you mean the medication attributes (e.g. dosage, frequency) are not being annotated? I think the DrugNER needs a list of section names in the config; I think it includes SIMPLE_SEGMENT. I am very surprised that

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Nick Nikandish
Pei, I need the names of the medications for the application that I wrote, which uses cTAKES. So I cache the medications in DictionaryLookupAnnotator (in performLookup()) and use them in my program, but when I have SimpleSegmentAnnotator it just takes forever. After taking SimpleSegmentAnnotator

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Masanz, James J.
I suspect that when you take out the simple segment annotator, nothing is getting processed, and that is why it appears so fast. At least some of the annotators loop through the list of sections/segments, which is why there is a simple segment annotator - so that there is at least one
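
To illustrate the point about annotators looping over segments, here is a toy Python model. It is not cTAKES code: `Segment`, `simple_segment_annotator`, `segment_based_annotator` and the tiny drug list are all hypothetical names made up for this sketch. It only shows why a pipeline with no segment annotation can look very fast while producing nothing.

```python
# Toy model only (not cTAKES code): an annotator that iterates over
# segments does no work at all when no segment annotation exists.
from dataclasses import dataclass


@dataclass
class Segment:
    begin: int
    end: int


def simple_segment_annotator(text):
    """Mimic the idea of a single segment covering the whole document."""
    return [Segment(0, len(text))]


def segment_based_annotator(text, segments):
    """Hypothetical annotator that only inspects text inside segments."""
    known_drugs = {"aspirin", "ibuprofen"}  # made-up mini dictionary
    found = []
    for seg in segments:
        for token in text[seg.begin:seg.end].split():
            if token.lower() in known_drugs:
                found.append(token)
    return found


note = "Patient takes aspirin daily"
with_segment = segment_based_annotator(note, simple_segment_annotator(note))
without_segment = segment_based_annotator(note, [])
print(with_segment)
print(without_segment)
```

With the single whole-document segment the drug mention is found; with an empty segment list the loop body never runs, so the call returns immediately with no annotations.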

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Masanz, James J.
If you just need the medication names, you can remove these: <node>ContextDependentTokenizerAnnotator</node> <node>DependencyParser</node> <node>AssertionAnnotator</node> You might be able to get rid of the LvgAnnotator and still get decent results since variations of word form should not affect
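
A minimal Python sketch of dropping flow entries from a UIMA aggregate descriptor, assuming the usual `fixedFlow`/`node` layout in the `http://uima.apache.org/resourceSpecifier` namespace. The `SAMPLE` descriptor is a cut-down stand-in rather than the real cTAKES file, `DictionaryLookupAnnotator` is a hypothetical node name used only for illustration, and `drop_nodes` and `flow_names` are helper names invented here.

```python
import xml.etree.ElementTree as ET

NS = "http://uima.apache.org/resourceSpecifier"
ET.register_namespace("", NS)  # keep the default namespace if serialized

# Cut-down stand-in for an aggregate descriptor (not the real cTAKES file).
SAMPLE = """<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <analysisEngineMetaData>
    <flowConstraints>
      <fixedFlow>
        <node>SimpleSegmentAnnotator</node>
        <node>ContextDependentTokenizerAnnotator</node>
        <node>DictionaryLookupAnnotator</node>
        <node>DependencyParser</node>
        <node>AssertionAnnotator</node>
      </fixedFlow>
    </flowConstraints>
  </analysisEngineMetaData>
</analysisEngineDescription>"""


def drop_nodes(xml_text, names):
    """Return the parsed descriptor with the named flow nodes removed."""
    root = ET.fromstring(xml_text)
    for flow in root.iter(f"{{{NS}}}fixedFlow"):
        for node in list(flow):  # copy, since we mutate while iterating
            if node.text and node.text.strip() in names:
                flow.remove(node)
    return root


def flow_names(root):
    """List the node names still present in the flow, in order."""
    return [n.text.strip() for n in root.iter(f"{{{NS}}}node")]


trimmed = drop_nodes(
    SAMPLE,
    {"ContextDependentTokenizerAnnotator", "DependencyParser", "AssertionAnnotator"},
)
remaining = flow_names(trimmed)
print(remaining)
```

This only edits the flow list. Whether the matching delegate entries should also be removed, and whether downstream annotators still get the input they need, has to be checked against the actual pipeline.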

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Hi Nick, I think that the bottleneck is probably the lookup module itself. So, I just sent you a secure email/ftp link. It contains a build of the new dictionary-lookup-fast module. Should you choose to try it, let me know how things turn out. Sean

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Nick Nikandish
Hi Sean, Many thanks, I will try it tomorrow. Do you have any special instructions for running that script, or do I just use it with cTakes? Thanks, Nick -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:24 PM To:

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Just use it with cTakes. Instead of removing other modules from the pipeline, replace the dictionary-lookup with dictionary-lookup-fast. In desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml, you would modify: delegateAnalysisEngine
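
The preview is cut off before the actual descriptor change is shown, so the following is not Sean's instruction. It is a generic Python sketch of re-pointing one delegate's import in a UIMA aggregate descriptor, assuming the common `delegateAnalysisEngine` / `import location` layout. The key `LookupAnnotator` and both `path/to/...` locations are hypothetical placeholders, as are the `SAMPLE` descriptor and the `repoint_delegate` helper.

```python
import xml.etree.ElementTree as ET

NS = "http://uima.apache.org/resourceSpecifier"
ET.register_namespace("", NS)

# Stand-in descriptor; the key and the paths are placeholders, not real values.
SAMPLE = """<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="LookupAnnotator">
      <import location="path/to/dictionary-lookup/descriptor.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
</analysisEngineDescription>"""


def repoint_delegate(xml_text, key, new_location):
    """Set the import location of the delegate with the given key.

    Returns the parsed root and the number of imports changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for delegate in root.iter(f"{{{NS}}}delegateAnalysisEngine"):
        if delegate.get("key") == key:
            for imp in delegate.findall(f"{{{NS}}}import"):
                imp.set("location", new_location)
                changed += 1
    return root, changed


root, changed = repoint_delegate(
    SAMPLE,
    "LookupAnnotator",
    "path/to/dictionary-lookup-fast/descriptor.xml",  # hypothetical path
)
new_location = root.find(f".//{{{NS}}}import").get("location")
print(changed, new_location)
```

Because the delegate key stays the same, the flow list would not need to change in this sketch; the real descriptor may use a different key or require additional edits.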

Re: Ctakes to process 5000K recoreds

2014-09-09 Thread Bruce Tietjen
Sean, If that is a script for generating a dictionary for use with dictionary-lookup-fast, I would also be very interested in checking it out. Thanks, Bruce

Re: Ctakes to process 5000K recoreds

2014-09-09 Thread Chen, Pei
(Trying to avoid passing individual jars via email) Sent from my iPhone On Sep 9, 2014, at 5:26 PM, Chen, Pei pei.c...@childrens.harvard.edu wrote: Sean- Aren't the scripts to generate the DB already available in the sandbox area? Sent from my iPhone On Sep 9, 2014, at 5:24 PM,

Recommendation for ctakes default (UMLS) dictionaries

2014-09-09 Thread andy mcmurry
Greetings ctakes-dev: UMLS license restrictions have been getting more lax over the years; much of the UMLS can be downloaded directly from the NCBI official FTP site. In fact, the NIH (and implicitly the NLM) have already made the standard terms public for some medical specialities. For