That sounds bizarre! I can think of two possibilities: a sentence break in the 
middle of the word (unlikely), or the different sentence splits confused the 
POS tagger, which then tagged the word aspirin as a forbidden part of speech, 
such as a preposition. If you check the token annotation on the word aspirin, 
you should be able to see the part-of-speech tag for that word.
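
To illustrate the second possibility, here is a minimal sketch (not cTAKES 
code) of a lookup step that skips tokens whose POS tag is on an exclusion list. 
The class name, the tag set, and the helper method are all hypothetical; the 
tags follow Penn Treebank conventions (NN = noun, IN = preposition). It only 
shows why a mis-tagged "aspirin" could drop out before dictionary lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PosExclusionSketch {
    // Hypothetical exclusion list (Penn Treebank tags); not taken from any cTAKES configuration.
    static final Set<String> EXCLUDED_TAGS = Set.of("IN", "DT", "CC", "TO");

    /** Returns the words that remain lookup candidates, given parallel word and tag lists. */
    static List<String> lookupCandidates(List<String> words, List<String> tags) {
        if (words.size() != tags.size()) {
            throw new IllegalArgumentException("words and tags must have the same length");
        }
        List<String> candidates = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            // A token with an excluded tag never reaches the dictionary lookup.
            if (!EXCLUDED_TAGS.contains(tags.get(i))) {
                candidates.add(words.get(i));
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<String> words = List.of("took", "aspirin");
        // Tagged as a noun, "aspirin" stays a candidate.
        System.out.println(lookupCandidates(words, List.of("VBD", "NN"))); // [took, aspirin]
        // Mis-tagged as a preposition, "aspirin" is silently skipped.
        System.out.println(lookupCandidates(words, List.of("VBD", "IN"))); // [took]
    }
}
```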

From: Tomasz Oliwa <>
Sent: Tuesday, March 13, 2018 5:34 PM
Subject: Re: Sentence splitter [EXTERNAL]


I tested SentenceDetectorAnnotatorBIO in cTAKES 4.0.0, simply by replacing 
SentenceDetectorAnnotator.xml with SentenceDetectorAnnotatorBIO.xml in 

While it seemed to work, I noticed that in one example an IdentifiedAnnotation 
was missing, even though it was found for the same input with just 

Could somebody check this please? Run the cTAKES CVD with the following input 
(without the "):


his leg

On the machine where I tested this, the MedicationMention does not show up with 
SentenceDetectorAnnotatorBIO, but it does with SentenceDetectorAnnotator.

From: Masoud Rouhizadeh <>
Sent: Tuesday, March 13, 2018 3:02:35 PM
Subject: Re: Sentence splitter [EXTERNAL]

Hi Sean,

Thank you for the pointer. I was able to run the SentenceDetectorAnnotatorBIO 
from ctakes-core. The results are way better than the SentenceDetectorAnnotator 
but I still see some issues such as splitting “Dr.” as a separate sentence 
(most likely due to the period after the abbreviation). Do you think there is a 
way to define an abbreviation list for SentenceDetectorAnnotatorBIO so that it 
knows that this is a word-final (i.e. abbreviation-final) and not a 
sentence-final period?
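
To make the question concrete, here is a minimal sketch of the kind of 
abbreviation handling I have in mind, written as a post-processing step over 
the sentence strings. It is plain Java, not cTAKES code: the class name, the 
abbreviation list, and the merge method are all hypothetical, and I do not know 
whether SentenceDetectorAnnotatorBIO offers any hook for this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class AbbreviationMerge {
    // Hypothetical abbreviation list; not part of cTAKES.
    static final Set<String> ABBREVIATIONS = Set.of("Dr.", "Mr.", "Mrs.", "Ms.", "vs.");

    /** Joins a sentence to the following one when it ends with a listed abbreviation. */
    static List<String> mergeAbbreviationSplits(List<String> sentences) {
        List<String> merged = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        for (String sentence : sentences) {
            String trimmed = sentence.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty fragments
            }
            if (pending.length() > 0) {
                pending.append(' ');
            }
            pending.append(trimmed);
            String text = pending.toString();
            // Last whitespace-delimited token of the text accumulated so far.
            String lastToken = text.substring(text.lastIndexOf(' ') + 1);
            if (!ABBREVIATIONS.contains(lastToken)) {
                merged.add(text);      // a real sentence end: emit it
                pending.setLength(0);
            }
            // Otherwise keep accumulating: the period belonged to an abbreviation.
        }
        if (pending.length() > 0) {
            merged.add(pending.toString()); // trailing text that ended in an abbreviation
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> split = List.of("Dr.", "Nutritious saw the patient.", "He takes aspirin.");
        System.out.println(mergeAbbreviationSplits(split));
    }
}
```

A real fix would have to adjust the Sentence annotation offsets rather than 
strings, but the merge rule would be the same.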

Thanks again,

On 3/9/18, 5:35 PM, "Finan, Sean" <> wrote:

    Hi Masoud,

    There is a very nice SentenceDetectorAnnotatorBIO in ctakes-core.  It will 
    split sentences based upon features other than just a newline character, 
    which appears to be what you want.


    From: Masoud Rouhizadeh <>
    Sent: Friday, March 9, 2018 4:41 PM
    Subject: Sentence splitter [EXTERNAL]

    Hello cTAKES team!

    I was wondering what types of sentence splitters are available in cTAKES. 
    The default sentence splitter does not appear to be the best one. See the 
    output for the demo example in the cTAKES installation guide:

    Dr. Nutritious Medical Nutrition Therapy for Hyperlipidemia Referral from:

    Julie Tester, RD, LD, CNSD Phone contact:


    555-1212 Height:

    144 cm Current Weight:

    45 kg Date of current weight: 02-29-2001 Admit Weight:


    Thanks so much,



    Masoud Rouhizadeh, PhD

    NLP Specialist / Software Engineer

    Institute for Clinical and Translational Research

    Johns Hopkins University

