Hi,

Lucene is up for an Apache board report this month, and Tika should
contribute its part to the report. Here's a quick draft:

<draft>
Apache Tika is a toolkit for detecting and extracting metadata and
structured text content from various documents using existing parser
libraries.

Development towards Tika 0.3 is ongoing. Metadata handling and
metadata frameworks like XMP have been a source of much discussion,
but so far no clear consensus has been reached on whether or how
the metadata features in Tika should be extended.

A wiki was created for Tika.
</draft>

Anything I'm missing/misrepresenting?

BR,

Jukka Zitting
