Hi,

using the standard Nutch parsers, I am able to access the org.apache.nutch.protocol.Content of the original URI and pull data to index from it when that data is not already present in the Metadata object.

With Nutch 1.1 I want to use the Tika parsers and wonder whether this can still be done; the API does not look like it allows it. Maybe I am missing the glue that makes such things possible, perhaps via my own Tika parser (where would I register it with Nutch?). Or is it possible to stack parsers, e.g. let Tika do its "standard" work and then let the next Nutch parser run to do this extra step?

Any hints appreciated.

Thanks,
Torsten

--
Please do not send me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.de.html

"Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect." -- Linus Torvalds

