There's one more thing I'd like advice on. I am annotating clinical documents to find the Disorder Mentions in them. I know that cTAKES does this to some extent, and I am currently using the AggregatePlainTextUMLSProcessor for it. Is there another analysis engine that does the job better? More generally, how can I get more accurate Disorder Mention annotations for clinical documents? Thanks! :)
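For context, here is a minimal sketch of how the Disorder Mentions could be pulled out of the per-document XML files that the CPE run quoted below writes. It is not a cTAKES API, only a standard-library Python script. It assumes that disorder annotations are serialized as elements whose type name ends in "DiseaseDisorderMention" and that each carries "begin" and "end" character offsets; the sample XML and the helper names (disorder_spans, covered_text) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: disorder annotations are elements whose (possibly namespaced)
# type name ends in "DiseaseDisorderMention" and that carry integer
# "begin"/"end" attributes holding character offsets into the note.
DISORDER_TYPE_SUFFIX = "DiseaseDisorderMention"


def disorder_spans(xml_text):
    """Return sorted (begin, end) offsets of disorder mentions in one CAS XML string."""
    root = ET.fromstring(xml_text)
    spans = []
    for elem in root.iter():
        if not elem.tag.endswith(DISORDER_TYPE_SUFFIX):
            continue
        begin, end = elem.get("begin"), elem.get("end")
        if begin is not None and end is not None:
            spans.append((int(begin), int(end)))
    return sorted(spans)


def covered_text(document_text, spans):
    """Slice the original note with the offsets to recover the mention strings."""
    return [document_text[begin:end] for begin, end in spans]


# Hypothetical, hand-written sample; real output files contain many more elements.
SAMPLE_NOTE = "Patient denies diabetes and fever."
SAMPLE_XML = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
    xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">
  <textsem:DiseaseDisorderMention xmi:id="12" begin="15" end="23"/>
  <textsem:SignSymptomMention xmi:id="13" begin="28" end="33"/>
</xmi:XMI>"""

if __name__ == "__main__":
    spans = disorder_spans(SAMPLE_XML)
    print(spans, covered_text(SAMPLE_NOTE, spans))
```

Matching on the end of the type name rather than on a full namespace keeps the sketch independent of how exactly the CAS was serialized, at the cost of also matching any other type that happens to share that suffix.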

On Wed, Jun 18, 2014 at 8:35 PM, Abhishek Raj <[email protected]> wrote:

> Thanks a lot for your replies. CPE did the job for me. I used it with the
> "test_plaintext.xml" CPE descriptor and "AggregatePlainTextUmlsProcessor"
> as the Analysis Engine. Gave the path to input directory and gave a custom
> output directory for writing CAS to XML file and that did it for me! Now I
> have the annotation for each input file stored in an XML file in the output
> directory. :)
>
>
> On Wed, Jun 18, 2014 at 8:03 PM, Pei Chen <[email protected]> wrote:
>
>> Also check out the main class in:
>>
>> https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/src/main/java/org/apache/ctakes/clinicalpipeline/ClinicalPipelineFactory.java
>> It uses uimaFIT style to programmatically wire up a pipeline and one can
>> also use uimaFIT to access the Annotations (TypeSystem).
>>
>> --Pei
>>
>>
>> On Wed, Jun 18, 2014 at 10:16 AM, vijay garla <[email protected]> wrote:
>>
>>> To Annotate:
>>> If you have a CPE, and all the components in your pipeline are
>>> threadsafe (i.e. drop LVG from your pipeline), you can increase the threads
>>> in the cpe config
>>> You can use this class: org.apache.ctakes.ytex.tools.RunCPE to run a
>>> cpe from the command line/script
>>>
>>> Alternatively, run multiple CPE's in parallel (they need to be
>>> processing different subsets of the corpus)
>>>
>>> To extract annotations:
>>> Add the YTEX DBConsumer to store the annotations in a database (see
>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1.2+-+YTEX+DBConsumer
>>> )
>>> Make sure you configure 'types to ignore' - you don't want to store
>>> annotations for punctuation.
>>>
>>> You can add the DBConsumer to any pipeline/CPE - you don't need any
>>> other YTEX components (however, you do have to set up a database).
>>>
>>>
>>> On Wed, Jun 18, 2014 at 4:56 AM, Richard Eckart de Castilho <
>>> [email protected]> wrote:
>>>
>>>> A Groovy script has been mentioned on the developers list that
>>>> illustrates how to use uimaFIT to compose and run a cTAKES pipeline. [1]
>>>>
>>>> I do not know if these scripts are only in SVN or if they are (planned
>>>> to) be part of a release or of some documentation.
>>>>
>>>> Cheers,
>>>>
>>>> -- Richard
>>>>
>>>> [1]
>>>> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3c996fc801c05df64a84246a106facacd021a...@msgpexcha08a.mfad.mfroot.org%3E
>>>>
>>>> On 18.06.2014, at 07:33, Abhishek Raj <[email protected]> wrote:
>>>>
>>>> > Hello. I have been looking for a way to run ctakes programatically to
>>>> annotate large number of documents and extract those annotations. I haven't
>>>> come across any docs so far which explains how to do that. If someone could
>>>> throw some light on this issue, it'd be great. Thanks! :)
>>>>
>>>
>>
>
