Hi David,

There is work being done on Tika/OCR integration, but I am not aware of any cTAKES RTF annotators. What do others think? Having such additional metadata does sound very interesting, especially mark-up (bold/italics) and semi-structured data such as tables...
--Pei

On Sun, Sep 1, 2013 at 5:41 PM, David Kincaid <kincaid.d...@gmail.com> wrote:
> Before I embark on building an RTF annotator, I thought I'd ask around a
> bit to see if anyone had built such a thing. Most of the medical notes I
> have to handle are in RTF format. I can pretty easily extract just the text
> using something like Apache Tika, but there is important information in the
> formatting as well (bold, italic, font sizes, centering, tables, etc.) that
> I'd like to use. Is anyone aware of a UIMA annotator that does this already?
>
> Thanks,
>
> Dave Kincaid