Hello everyone, I'm currently thinking of using Nutch in a new website project. My aim is to index files (HTML, TXT, PDF, ...) stored on a filesystem (which Nutch can do), but some of the files may have meta-information stored in a separate file. A web user would then search the index containing those files.
For example, the file "technical_documentation.pdf" may have a "technical_documentation.xml" linked to it (for example, in the same folder), with the XML containing information such as "<type>documentation</type>" and so on. Is there any way to achieve this using Nutch? Is it able to combine information/content from two files into a single searchable item? Or maybe I'm not choosing the right tool for this?

Thanks in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/Merging-Searching-both-file-and-meta-information-file-tp2574567p2574567.html
Sent from the Nutch - User mailing list archive at Nabble.com.
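P.S. In case it helps to show what I mean by "a single searchable item", here is a rough sketch in plain Python. It does not use any Nutch API. The function names (read_sidecar, build_item) and the <meta> root element are my own assumptions for illustration, and extracting the text content of the PDF itself is left out.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_sidecar(doc_path: Path) -> dict:
    """Return the fields of the XML file sharing doc_path's base name, or {}."""
    sidecar = doc_path.with_suffix(".xml")
    if not sidecar.is_file():
        return {}
    root = ET.parse(sidecar).getroot()
    # Each direct child, e.g. <type>documentation</type>, becomes one field.
    return {child.tag: (child.text or "").strip() for child in root}


def build_item(doc_path: Path) -> dict:
    """Combine the file's identity with its sidecar metadata into one record."""
    item = {"path": str(doc_path), "name": doc_path.name}
    for field, value in read_sidecar(doc_path).items():
        # Do not let metadata overwrite the built-in "path" and "name" fields.
        item.setdefault(field, value)
    return item


# Small demo with a placeholder PDF and a hypothetical sidecar layout.
with tempfile.TemporaryDirectory() as folder:
    pdf = Path(folder) / "technical_documentation.pdf"
    pdf.write_bytes(b"%PDF-1.4 placeholder")
    pdf.with_suffix(".xml").write_text(
        "<meta><type>documentation</type></meta>", encoding="utf-8"
    )
    demo_item = build_item(pdf)
    print(demo_item["name"], demo_item["type"])
```

What I am after is the equivalent of build_item at indexing time, so that a search on the "type" field would return the PDF itself.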

