New to NLP and navigating the options.

Mark Ettinger Mon, 06 Apr 2009 18:05:35 -0700

Hello all,

I am a trained mathematician/computer scientist/programmer jumping into NLP,
excited by the challenge but intimidated by the range of algorithms and software
options.  Specifically, I am at the University of Texas and am charged with
putting our large database of (more or less unused) clinical notes to good use.
My strategy is roughly:


1.  Learn the theory of NLP and Information Extraction.
2.  Understand the publicly available software packages so as to avoid
reinventing the wheel.
3.  Apply the packages from #2 to our database and begin experimenting.

My question in this post centers on #2.  Not being a software engineer (though
having lots of scientific programming experience), I am sometimes puzzled by
"frameworks" and "components".  I think of everything as libraries of functions.
Yes, I know this view is outdated.  I can wrap my head around NLP packages like
LingPipe and NLTK, but I am unclear what a framework like UIMA offers over and
above these pure libraries.

Given what I've told you about my background (scientist, programmer, but NOT
software engineer), can someone explain to me how investing the time to learn
UIMA will pay off in the long run?  I've started to dig into the UIMA API but
thought I'd throw this rather basic question out there, hoping no one will
think it too naive for this forum.

Thanks in advance!

Mark Ettinger
