On my end it looks like my email was reformatted and some of my newlines were
removed in those last examples ...
-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, January 22, 2014 3:42 PM
To: dev@ctakes.apache.org
Subject: RE: sentence detector
Thanks James
> but then no typical sentence ending punctuation at the end of the line
Gotcha.
> So simply using Lines would not suffice in those cases because it would run
> together sentences where there is more than one on a line
I was actually thinking about something like a Line using -
I know there are notes where there are multiple sentences on a line, but then
no typical sentence ending punctuation at the end of the line (or no
punctuation at all at the end of the line). And in those sections, negation can
be important. So simply using Lines would not suffice in those case
Just whistling in the wind here ...
Perhaps before any changes are made to universally toggle cTakes in one
direction or the other, we can take a poll of when & where
cTakes/Ytex/OpenNLP/Omaha needs a Sentence (ignoring CR/LF) as opposed to a
Line (CR/LF delimited PLUS -sentence-)
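
To make the Sentence vs. Line distinction concrete, here is a minimal Python
sketch of the three segmentation behaviours being discussed. It is only an
illustration: the function names, the sample note, and the punctuation regex
are assumptions made for this example, and the regex is a naive stand-in for a
real sentence detector, not the cTAKES or OpenNLP implementation.

```python
import re

# Naive end-of-sentence pattern: '.', '!' or '?' followed by whitespace.
# Assumption for illustration only; a trained sentence detector is far smarter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """'Sentence' mode: split on punctuation only; CR/LF is plain whitespace."""
    flat = " ".join(text.split())
    return [s for s in _SENTENCE_END.split(flat) if s]


def split_lines(text):
    """'Line' mode: split on CR/LF only; punctuation inside a line is ignored."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def split_lines_then_sentences(text):
    """'Line PLUS sentence' mode: break at every CR/LF, then split each line."""
    return [s for ln in split_lines(text) for s in split_sentences(ln)]


# Hypothetical note: two sentences on the first line, no punctuation at the
# end of lines 1-2, and one sentence wrapped across lines 3-4.
note = "No fever. No cough\nDenies chest pain\nPatient was seen\nin clinic today."

for mode in (split_sentences, split_lines, split_lines_then_sentences):
    print(mode.__name__, mode(note))
```

On this sample, Sentence mode runs "No cough" together with everything after
it, Line mode keeps "No fever. No cough" as one unit, and Line PLUS sentence
mode separates both but also cuts the wrapped sentence in two. Which trade-off
is acceptable depends on the notes being processed.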
If some capa
The only rule I know of is that cTAKES (prior to ytex integration) always
forces a sentence break at a newline.
This was because the clinical notes cTAKES originally processed never had
newlines in the middle of a sentence, but did need sentence breaks to occur at
end of sentence for good negation