Sean's fix did the trick. Thanks for the suggestion, though. I'm wondering how this would work with our custom implementation of MetaMap with UIMA-AS (it is SLOW as molasses).
Best!

Greg--

On Tue, Sep 24, 2019 at 7:18 PM Petersam, John Contractor <john.peter...@ssa.gov> wrote:

> Hi Greg,
>
> We regularly process documents that are over 5000 pages (not lines). What
> we've found is that many of the annotators within the standard
> distribution operate at O(n^2). The standard dependency parser is one
> example among many.
>
> The good news is that you can achieve linear results if you convert these
> classes to use TreeMaps. We build the TreeMaps once and cache them in
> ThreadLocal variables, which lets us process documents on multiple
> threads simultaneously.
>
> Hope this helps,
> John
>
> -----Original Message-----
> From: Greg Silverman <g...@umn.edu>
> Sent: Tuesday, September 24, 2019 6:47 PM
> To: dev@ctakes.apache.org
> Subject: [EXTERNAL] Large files taking forever to process
>
> Any suggestions on how to speed up processing of large clinical text
> notes approaching 13K lines? This is a very old corpus culled from EPIC
> notes back in 2009. I thought about splitting the notes into smaller
> chunks, but then I would have to deal with adjusting the offsets when
> analyzing system output against the manual annotations that had already
> been done.
>
> As it is, I've tried different garbage collection options (this seemed to
> have worked well with CLAMP on the same set of notes).
>
> TIA!
>
> Greg--
>
> --
> Greg M. Silverman
> Senior Systems Developer
> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> Department of Surgery
> University of Minnesota
> g...@umn.edu
>
> › evaluate-it.org ‹

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu

› evaluate-it.org ‹
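
A minimal sketch of the ThreadLocal-cached TreeMap pattern John describes
above. All class, field, and method names here are illustrative
assumptions, not actual cTAKES or UIMA APIs; the point is only that the
index is built once per document per thread, and each subsequent range
lookup costs O(log n) rather than a linear rescan of the annotation list:

    import java.util.List;
    import java.util.TreeMap;

    // Hypothetical annotation type: begin/end character offsets plus text.
    final class Annotation {
        final int begin;
        final int end;
        final String text;
        Annotation(int begin, int end, String text) {
            this.begin = begin; this.end = end; this.text = text;
        }
    }

    final class AnnotationIndex {
        // One TreeMap per worker thread, created lazily on first use.
        // Keyed by begin offset so range queries become log-time slices
        // instead of nested-loop scans over the full annotation list.
        private static final ThreadLocal<TreeMap<Integer, Annotation>> INDEX =
                ThreadLocal.withInitial(TreeMap::new);

        // Build the index once for this thread's current document.
        static void build(List<Annotation> annotations) {
            TreeMap<Integer, Annotation> map = INDEX.get();
            map.clear();
            for (Annotation a : annotations) {
                map.put(a.begin, a);
            }
        }

        // All annotations beginning inside [begin, end): an O(log n)
        // subMap slice rather than an O(n) pass per query.
        static Iterable<Annotation> startingWithin(int begin, int end) {
            return INDEX.get().subMap(begin, true, end, false).values();
        }
    }

Note that real annotations can share a begin offset, so a production
version would map each key to a list of annotations; this sketch keeps one
per key for brevity.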
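
And for the chunking idea from Greg's original message, a sketch of one
way to split a long note while keeping each chunk's base offset, so spans
found in a chunk can be shifted back to document coordinates for
comparison against the existing manual annotations (again, hypothetical
helper types, not a cTAKES API):

    import java.util.ArrayList;
    import java.util.List;

    final class Chunker {
        // A chunk remembers where it started in the original note, so any
        // offsets produced against the chunk can be shifted back later.
        record Chunk(int baseOffset, String text) {}

        // Split near targetSize characters, backing up to the previous
        // newline so no line is cut in half, recording each start offset.
        static List<Chunk> split(String note, int targetSize) {
            List<Chunk> chunks = new ArrayList<>();
            int start = 0;
            while (start < note.length()) {
                int end = Math.min(start + targetSize, note.length());
                int nl = note.lastIndexOf('\n', end - 1);
                if (nl > start && end < note.length()) end = nl + 1;
                chunks.add(new Chunk(start, note.substring(start, end)));
                start = end;
            }
            return chunks;
        }

        // Map a span found within a chunk back to document coordinates.
        static int[] toDocumentSpan(Chunk chunk, int begin, int end) {
            return new int[] { chunk.baseOffset() + begin,
                               chunk.baseOffset() + end };
        }
    }

Usage would look like split(noteText, 20000), running the pipeline on each
chunk, then shifting every begin/end through toDocumentSpan before scoring
against the manual annotations.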