Hello,
First, let me thank all the developers who have created Nutch -- it is wonderful and elegant code.

Second, a simple question: I am using "bin/nutch crawl" to crawl and index two separate sites: one is an HTTP site, and the second is a network file system. The two crawls have completely different URL seed files and different crawl-urlfilter.txt files. When the two crawls are done, I'd like to merge the indexes into a single index for the webapp to search. How should I do this?

I tried using "bin/nutch merge" to simply merge the index directories into a third directory. This created a valid Lucene index (verified with Luke), but it won't work with the search.jsp in the webapp. I assume that I need to merge the crawldb and linkdb as well, but I can't see how to do this.

Thanks in advance,
--david