Hi,
I am currently leveraging cTAKES inside of Apache Spark and have
written a function that takes in a single clinical note as a string
and does the following:
1) Sets the UMLS system properties.
2) Instantiates a JCas object.
3) Runs the default pipeline.
4) (Not shown below) Grabs the annotation
Peter
wrote:
> About your second question with UMLS, you can build the pipeline
> initially and it will verify the license info, then just reuse the
> pipeline on each call.
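
The build-once, reuse-on-each-call advice above can be sketched in Scala with a lazily initialised singleton. This is only an illustration: StubPipeline and PipelineHolder are hypothetical stand-ins, not cTAKES classes, and the real pipeline construction (including any UMLS credential setup) would go where the comment indicates.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an expensive-to-build pipeline; NOT a cTAKES class.
final class StubPipeline(val id: Int) {
  // Placeholder "processing": counts whitespace-separated tokens in the note.
  def process(note: String): Int = note.split("\\s+").count(_.nonEmpty)
}

object PipelineHolder {
  // Counts how many times the expensive build step actually runs.
  val builds = new AtomicInteger(0)

  // A lazy val is initialised once per JVM, on first access, in a thread-safe way.
  lazy val pipeline: StubPipeline = {
    // Assumption: in a real job, credentials would be set and the real
    // pipeline constructed here, so that this work happens only once.
    new StubPipeline(builds.incrementAndGet())
  }

  // Every call reuses the same pipeline instance.
  def annotate(note: String): Int = pipeline.process(note)
}
```

In a Spark job, referencing an object like PipelineHolder from inside a map function would typically give one pipeline per executor JVM rather than one per note.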
>
>
>
> On 7/25/17, 4:53 PM, "Michael Trepanier" wrote:
>
>>Hi,
>>
>>I
Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++
>
>
> On 7/28/17, 1
Hi All,
We've been attempting to scale our cTAKES Pipeline on top of Spark, so
we've switched from using the "getDefaultPipeline" method to the
"getFastPipeline" method to boost the processing speed. However, while the
default pipeline works fine with Spark, the fast pipeline is throwing the
below
ulting
> in
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of
> range:
> -7
>
> Are you using cTAKES 4.0 (either from the convenience binary download or
> as a Maven dependency) or are you using cTAKES in some other way?
>
> -- James
>
>
> On Fri, Sep 1, 2017 at 3:
I am attempting to run cTAKES from an executable UberJar. While the fast
pipeline seems to run correctly (in terms of producing an output), when
stepping through the LvgAnnotator related steps, cTAKES produces the below
error.
26 Sep 2017 22:47:01 INFO LvgAnnotator - URL for lvg.properties
=file:
revisit that when I look at that patch.
>
> -- James
>
>
>
> On Tue, Sep 26, 2017 at 5:53 PM, Michael Trepanier
> wrote:
>
>> I am attempting to run cTAKES from an executable UberJar. While the fast
>> pipeline seems to run correctly (in terms of producing an o
Hi,
I am attempting to run the default FastPipeline to extract various features
from clinical text. One of the features I'd like to capture is the covered
text. However, when running the below scala code, calling getOriginalText
yields a "null" value for every annotation of type IdentifiedAnnotati
Thanks Jessica - that does exactly what I need.
On Tue, Feb 13, 2018 at 1:49 PM, Jessica Glover
wrote:
> Hi Mike,
>
> Have you tried the getCoveredText() method that IdentifiedAnnotation
> inherits from Annotation?
>
> - Jessica
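
For context on the suggestion above: in UIMA, an annotation's covered text is the slice of the document text between its begin and end offsets. A minimal sketch of that idea follows, assuming a hypothetical Span case class in place of a real annotation type; nothing here calls cTAKES or UIMA.

```scala
// Hypothetical stand-in for an annotation: only the begin/end character offsets.
final case class Span(begin: Int, end: Int)

object CoveredText {
  // Covered text is the document text from begin (inclusive) to end (exclusive).
  def coveredText(documentText: String, span: Span): String =
    documentText.substring(span.begin, span.end)
}
```

For example, with the text "Patient has a history of diabetes mellitus." a span with offsets 25 and 42 covers "diabetes mellitus".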
>
> On Tue, Feb 13, 2018 at 2:42 PM, Mi
Hi All,
Is it possible to avoid the UMLS credential check each time cTAKES is run?
It seems like cTAKES could be configured to use UMLS credentials once to
acquire the sno_rx_16abterms dictionary, and then not need to check against
UMLS in future runs.
In particular, I am thinking
eryone a while back. From what we
> understand so far it seems that this may go away once you build and load
> your own dictionary vs using the default you mentioned. But we haven't
> tested that yet. Lincoln
>
>
>
> *From:* Michael Trepanier [mailto:m...@metistream.com]
Hi All,
I am attempting to package cTAKES in a jar while avoiding it copying
the lvg related files to /tmp/ as it does
in
/ctakes/trunk/ctakes-lvg/src/main/java/org/apache/ctakes/lvg/ae/LvgAnnotator.java.
Everything works up until cTAKES tries to path the lvg.properties file
within the jar
Ory,
In response to Gandhi's comments, the video below outlines custom
dictionary creation in detail:
https://www.youtube.com/watch?v=4aOnafv-NQs
Best,
Mike
On Mon, Aug 20, 2018 at 2:09 AM, Gandhi Rajan Natarajan <
gandhi.natara...@arisglobal.com> wrote:
> Hi Ory,
>
> I guess RxNORM and SNO
Hi,
I am running multiple cTAKES pipelines on a single machine in parallel,
each in their own JVM. Looking across the logs of each JVM, it appears that
severe blocking is occurring after the annotations are generated for a
particular segment. In particular, it looks like only one JVM is processing
The cTAKES Default Processing Pipeline requires a minimum of about 3 GB of
RAM due to the size of the embedded HSQLDBs (that is the default). However,
providing a fair bit of overhead is generally a good idea.
As for multi-threading, I have been using the ThreadSafeLvg class. Per the
component-use g
Hi,
I have a pipeline defined by the below aggregate description (in Scala).
aed = {
builder.add(SimpleSegmentAnnotator.createAnnotatorDescription)
builder.add(SentenceDetector.createAnnotatorDescription)
builder.add(TokenizerAnnotatorPTB.createAnnotatorDescription)
builder.add(ThreadSafeL
Hi,
We currently have a pipeline which is generating ontology mappings for a
repository of clinical notes. However, this repository contains documents
which, after RTF parsing, can contain over 900,000 characters (albeit this
is a very small percentage of notes, out of ~13 million, around 50 conta
endently from the smaller documents.
>
> Best,
>
> Dima
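
One way to handle very large documents separately from smaller ones, as the reply above appears to suggest, is to split the corpus by character length before processing. The sketch below is an assumption-laden illustration: DocumentRouting and the 100,000-character cutoff are hypothetical and do not come from the thread.

```scala
object DocumentRouting {
  // Hypothetical cutoff in characters; choose a value that suits your corpus.
  val LargeDocThreshold: Int = 100000

  // Returns (large, small): notes longer than the threshold, then all the rest.
  def splitBySize(
      notes: Seq[String],
      threshold: Int = LargeDocThreshold
  ): (Seq[String], Seq[String]) =
    notes.partition(_.length > threshold)
}
```

The two groups could then be sent to separate jobs, or processed with different memory settings, so that a handful of very long notes does not hold up the rest.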
>
>
>
>
> On Feb 27, 2019, at 16:59, Michael Trepanier wrote:
>
> Hi,
>
> We currently have a pipeline which is generating ontology mappings for a
> repository of clinical notes. However, this repos
curs once
> documents get above 12K-13K. We also target processing as many as 10
> annotators in a single pass of the corpus. This approach has worked well
> for us.
>
>
>
> Thanks,
>
> Ron
>
> *From: *Michael Trepanier
> *
e of a DiseaseDisorderMention, calling
diseaseDisorderMention.getRelativeTemporalContext always returns null. Am
I missing pipeline steps which link the TemporalTextRelation instances to
EventMention instances, or is it necessary to manually do this?
Thanks,
Mike
On Tue, Feb 19, 2019 at 4:47 PM Michael Trepanier
wrot