Text and metadata extraction processor

Dmitry Goldenberg Thu, 24 Mar 2016 08:45:09 -0700

Hi,

I see that the ExtractText processor extracts text using regex.


What about a processor that extracts text and metadata from incoming
files?  That doesn't seem to exist - but perhaps I didn't quite look in the
right spots.

If that doesn't exist, I'd like to implement and contribute it, using Apache
Tika.  There may also be a couple of related processors to go along with it.

Thoughts?

Thanks,
- Dmitry
