Does anyone have experience of keeping a single index of content that comes both from a web crawl and from a series of database queries? By this I mean using Nutch to crawl the web and Lucene to index database content, then merging the resulting indices so that one query can return results from either source. It would be useful to know whether this has been tried before. I notice that the Nutch index doesn't hold the content itself but rather references to segment indices, which makes it difficult to see how it could be merged with content indexed from other sources.

Thanks,
Kelvin

