RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
On my end it looks like my email was reformatted and some of my -newline- markers were removed in those last examples ...

-Original Message-
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, January 22, 2014 3:42 PM
To: dev@ctakes.apache.org
Subject: RE: sentence detector

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
Thanks James

> but then no typical sentence ending punctuation at the end of the line

Gotcha.

> So simply using Lines would not suffice in those cases because it would run
> together sentences where there are more than one on a line

I was actually thinking about something like a Line using -

RE: sentence detector newline behavior

2014-01-22 Thread Masanz, James J.
I know there are notes where there are multiple sentences on a line, but then no typical sentence ending punctuation at the end of the line (or no punctuation at all at the end of the line). And in those sections, negation can be important. So simply using Lines would not suffice in those case
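
To illustrate the case described above, here is a minimal Python sketch (not cTAKES code; the function name and the sample note are made up for illustration). It assumes a sentence ends at ., ! or ? followed by whitespace, and it treats every newline as a boundary as well, so a line holding several sentences is still split even when the line itself ends without punctuation.

```python
import re

# Hypothetical helper, not part of cTAKES or OpenNLP.
# Assumption: ., ! or ? followed by whitespace ends a sentence.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def split_lines_then_sentences(text):
    """Split on newlines first, then on punctuation inside each line."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        segments.extend(s for s in _SENTENCE_END.split(line) if s)
    return segments

# Invented sample: two sentences on the first line, no final punctuation.
note = "Denies chest pain. Reports mild cough\nNo fever"

print(note.splitlines())                 # Lines only: first line runs two sentences together
print(split_lines_then_sentences(note))  # Lines plus punctuation: three segments
```

With Lines alone, "Denies chest pain. Reports mild cough" stays one unit, so a negation cue in the first sentence could be applied to the second.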

RE: sentence detector newline behavior

2014-01-22 Thread Finan, Sean
Just whistling in the wind here ... Perhaps before any changes are made to universally toggle cTAKES in one direction or the other, we can take a poll of when and where cTAKES/YTEX/OpenNLP/Omaha needs a Sentence (ignoring CR/LF) as opposed to a Line (CR/LF delimited PLUS -sentence-). If some capa

RE: sentence detector newline behavior

2014-01-22 Thread Masanz, James J.
The only rule I know of is that cTAKES (prior to YTEX integration) always forces a sentence break at a newline. This was because the clinical notes cTAKES originally processed never had newlines in the middle of a sentence, but did need sentence breaks to occur at end of sentence for good negation
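
The trade-off behind that rule can be shown with a small Python sketch. This is not the cTAKES sentence detector: the function, its break_on_newline flag, and the sample text are hypothetical, and the punctuation rule (., ! or ? followed by whitespace) is a simplifying assumption.

```python
import re

def detect_sentences(text, break_on_newline=True):
    """Toy sentence splitter with a switch for the newline rule.

    Hypothetical illustration only; cTAKES does not expose this function.
    """
    if break_on_newline:
        # Every newline is a hard boundary, in addition to punctuation.
        pattern = r"(?<=[.!?])[ \t]+|\n+"
    else:
        # Treat newlines as ordinary whitespace, then split on punctuation.
        text = re.sub(r"\s*\n\s*", " ", text)
        pattern = r"(?<=[.!?])\s+"
    return [s.strip() for s in re.split(pattern, text) if s.strip()]

# Invented sample: a sentence hard-wrapped in the middle.
wrapped = "Patient denies fever,\nchills or cough. No rash."

print(detect_sentences(wrapped, break_on_newline=True))
print(detect_sentences(wrapped, break_on_newline=False))
```

With the forced break, the wrapped sentence is cut after "fever," so "chills or cough" is separated from "denies". Without it, the sentence stays whole, but notes that rely on line breaks instead of punctuation would run together.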