One follow-up: is the constituency parser needed for good results with the 
assertion modules (History, Generic, Uncertainty, etc.)?

From: Miller, Timothy [mailto:[email protected]]
Sent: Monday, March 07, 2016 11:01 AM
To: [email protected]
Subject: Re: MaxentParserWrapper

Hi Brandon,
I wrote the constituency parser module. It is basically a wrapper for the 
OpenNLP constituency parser. The only thing our module does is convert from our 
type system into tokens for the parser, run the parser, then convert the output 
back into our type system.
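
The convert/parse/convert-back pattern described above can be sketched roughly as follows. This is only an illustration, not the actual module: `Token`, `Constituent`, `toy_parser`, and `parse_sentence` are hypothetical names, and the toy parser stands in for the real OpenNLP call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (label, first token index, one past the last token index)
Span = Tuple[str, int, int]


@dataclass
class Token:
    """Hypothetical stand-in for a token annotation with character offsets."""
    text: str
    begin: int
    end: int


@dataclass
class Constituent:
    """Hypothetical stand-in for a parse-node annotation with character offsets."""
    label: str
    begin: int
    end: int


def toy_parser(words: List[str]) -> List[Span]:
    """Placeholder for the real parser: emits a single S span over the sentence."""
    return [("S", 0, len(words))] if words else []


def parse_sentence(
    tokens: List[Token],
    parser: Callable[[List[str]], List[Span]] = toy_parser,
) -> List[Constituent]:
    # Step 1: convert pipeline annotations into the plain strings the parser expects.
    words = [t.text for t in tokens]
    # Step 2: run the parser on those strings.
    spans = parser(words)
    # Step 3: map token-index spans back to character offsets in the document.
    return [
        Constituent(label, tokens[start].begin, tokens[stop - 1].end)
        for label, start, stop in spans
    ]
```

The wrapper itself does no linguistic work in this sketch, so any slowness would come from the parser call in step 2 or from what is fed into it.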

As for the slowness, there are known issues with extremely long 
sentences (I believe the algorithm is O(n^3) in the input length, so this makes 
sense). But we have found (Sean Finan pointed this out) that the problem often 
comes from upstream: strings of punctuation used as section delimiters are 
misclassified and tokenized/segmented as extremely long sentences. I believe he 
implemented workarounds in some of our pipelines to recognize "non-real" 
sentences and have the parser skip them, but I don't know off the top of my 
head where that code is or whether it's checked in.
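
A minimal sketch of that kind of workaround, assuming sentences arrive as lists of token strings. The function name and both thresholds are hypothetical and are not taken from the pipelines mentioned above; the actual workaround may use different criteria.

```python
from typing import List

# Assumed cutoffs for illustration only; tune them on your own data.
MAX_TOKENS = 150
MIN_ALNUM_RATIO = 0.2


def looks_like_real_sentence(
    tokens: List[str],
    max_tokens: int = MAX_TOKENS,
    min_alnum_ratio: float = MIN_ALNUM_RATIO,
) -> bool:
    """Return False for spans that should be skipped before parsing."""
    # Empty or extremely long spans are not worth sending to the parser.
    if not tokens or len(tokens) > max_tokens:
        return False
    # Count tokens containing at least one letter or digit; delimiter
    # rows such as "-----" or "*****" contain none.
    with_alnum = sum(1 for tok in tokens if any(ch.isalnum() for ch in tok))
    return with_alnum / len(tokens) >= min_alnum_ratio
```

A pipeline could call this check on each sentence and only hand the ones that pass to the parser. If the cost really is cubic in sentence length, a 500-token pseudo-sentence costs roughly (500/25)^3 = 8000 times as much as a 25-token one, so skipping even a few of them can matter.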

Maybe Sean can chime in with more info if that sounds familiar.

Tim

On 03/07/2016 09:06 AM, Geise, Brandon D. wrote:
Hi,

Can someone point me in the direction of where I can dig deeper into the 
MaxentParserWrapper? I'm seeing significant slowness once I get to this point 
in the pipeline and would like to understand what's going on a little better.

Thanks,
Brandon

