Julien Nioche wrote:
Hi,

Some time ago I contributed an annotator to the sandbox that uses Tika to
convert original markup into UIMA annotations. It does not seem to be listed
on the website, but it should be in the sandbox's SVN repository.

Tika supports numerous formats, such as PDF, XML, and HTML.
I checked in the code 4 months ago. Please have a look at it to make
sure everything is as intended.

Here is the SVN link:
http://svn.apache.org/viewvc/incubator/uima/sandbox/trunk/TikaAnnotator/

Jörn
