Before I embark on building an RTF annotator, I thought I'd ask around to see whether anyone has already built such a thing. Most of the medical notes I have to handle are in RTF format. I can easily extract just the text using something like Apache Tika, but there is important information in the formatting as well (bold, italic, font sizes, centering, tables, etc.) that I'd like to use. Is anyone aware of a UIMA annotator that does this already?
Thanks, Dave Kincaid
