Dear UIMA devs,

We have recently developed an AnnotationReader for UIMA which uses Tika to convert markup into annotations. The resource consists of a CollectionReader, a CasAnnotator and a utility class which can populate a CAS with markup annotations. It is certainly not perfect, but it does a decent job. For what inspired the type system, see http://cwiki.apache.org/UIMA/uima-sandbox-components.html

I would be more than happy to donate the code to the Sandbox. What is the procedure for that?

Have a good weekend,

Julien

--
DigitalPebble Ltd
http://www.digitalpebble.com
