Hi Sean,

Thank you for the pointer. I was able to run the SentenceDetectorAnnotatorBIO from ctakes-core. The results are much better than those of the SentenceDetectorAnnotator, but I still see some issues, such as “Dr.” being split off as a separate sentence (most likely because of the period after the abbreviation). Do you think there is a way to define an abbreviation list for SentenceDetectorAnnotatorBIO, so that it treats such a period as word-final (i.e. abbreviation-final) rather than sentence-final?
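To make the idea concrete, here is a minimal sketch of the behaviour I have in mind, written as a plain post-processing step in Java. It does not use any cTAKES API: the class name, the method names and the abbreviation list are all hypothetical, and the input is assumed to be the list of sentence strings that a detector has already produced. A fragment whose last token is a listed abbreviation is simply re-joined with the fragment that follows it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AbbreviationMerge {

    // Hypothetical abbreviation list for illustration; not shipped with cTAKES.
    public static final Set<String> ABBREVIATIONS =
            new HashSet<>(Arrays.asList("Dr.", "Mr.", "Mrs.", "Ms.", "vs."));

    // Returns the last whitespace-delimited token of the text ("" if the text is blank).
    public static String lastToken(String text) {
        String[] tokens = text.trim().split("\\s+");
        return tokens[tokens.length - 1];
    }

    // Re-joins any fragment that ends in a listed abbreviation with the fragment after it.
    public static List<String> mergeAtAbbreviations(List<String> sentences,
                                                    Set<String> abbreviations) {
        List<String> merged = new ArrayList<>();
        boolean joinNext = false;
        for (String sentence : sentences) {
            if (joinNext && !merged.isEmpty()) {
                // The previous fragment ended in an abbreviation: append to it.
                int last = merged.size() - 1;
                merged.set(last, merged.get(last) + " " + sentence);
            } else {
                merged.add(sentence);
            }
            // A period on a listed abbreviation is word-final, not sentence-final.
            joinNext = abbreviations.contains(lastToken(sentence));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up detector output in which "Dr." was split off on its own.
        List<String> detected = Arrays.asList(
                "Dr.", "Nutritious saw the patient.", "Follow up in two weeks.");
        for (String sentence : mergeAtAbbreviations(detected, ABBREVIATIONS)) {
            System.out.println(sentence);
        }
    }
}
```

This only patches the output after the fact, and it will wrongly merge a sentence that genuinely ends in an abbreviation, so a way to give the list to the detector itself would still be preferable.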
Thanks again,
Masoud

On 3/9/18, 5:35 PM, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:

Hi Masoud,

There is a very nice SentenceDetectorBIO in ctakes-core. It will split sentences based upon features other than just a newline character, which appears to be what you want.

Sean
________________________________________
From: Masoud Rouhizadeh <m...@jhu.edu>
Sent: Friday, March 9, 2018 4:41 PM
To: dev@ctakes.apache.org
Subject: Sentence splitter [EXTERNAL]

Hello cTAKES team!

I was wondering what types of sentence splitters are available in cTAKES? The default sentence splitter does not appear to be the best one. See output for the demo example from the example in cTAKES installation guide:

Dr. Nutritious Medical Nutrition Therapy for Hyperlipidemia Referral from: Julie Tester, RD, LD, CNSD Phone contact: (555) 555-1212 Height: 144 cm Current Weight: 45 kg Date of current weight: 02-29-2001 Admit Weight: [...]

Thanks so much,
Masoud
----
Masoud Rouhizadeh, PhD
NLP Specialist / Software Engineer
Institute for Clinical and Translational Research
Johns Hopkins University
https://urldefense.proofpoint.com/v2/url?u=http-3A__pages.jh.edu_-7Emrouhiz1&d=DwIGaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=aZ4yDE4zQbRJuUQ8p-T5nPrjhYvXF28sFoJWEtP3sGU&s=ob0U2sSfS7UijTI8PqCh_MwMucxPc14ovmcC2vq7rDA&e=