David,
you don't need the crawldb and linkdb merged. You do need to provide a linkdb, but it is only used for some detail information; I personally removed this feature from the JSPs. Merging the indexes will work. It is just a question of where you store the index, how you name the folder, and that you provide at least a dummy linkdb. I'm not sure what the merged index folder should be named. I guess "index", but you can look at the NutchBean init methods to verify this.
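
To make the layout concrete, here is a rough sketch in Python of arranging the directory the webapp points at. It is only an illustration under assumptions: the folder names "index" and "linkdb" are the guesses from above and should be checked against the NutchBean init code, the helper name prepare_search_dir and all paths are hypothetical, and the placeholder linkdb here is just an empty directory, which may or may not be enough for your Nutch version.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_search_dir(merged_index: Path, search_dir: Path,
                       index_name: str = "index",
                       linkdb_name: str = "linkdb") -> Path:
    """Copy a merged index into search_dir and add a placeholder linkdb.

    index_name and linkdb_name are assumptions (hypothetical defaults);
    verify the names the webapp really expects before relying on them.
    """
    search_dir.mkdir(parents=True, exist_ok=True)
    target = search_dir / index_name
    if target.exists():
        # Refuse to overwrite an index that is already in place.
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(merged_index, target)
    # Placeholder only: an empty directory standing in for a dummy linkdb.
    (search_dir / linkdb_name).mkdir(exist_ok=True)
    return search_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the output of "bin/nutch merge" (hypothetical path).
        merged = Path(tmp) / "merged-index"
        merged.mkdir()
        (merged / "segments").write_text("stand-in for Lucene index files")
        out = prepare_search_dir(merged, Path(tmp) / "search")
        print(sorted(p.name for p in out.iterdir()))
```

The helper copies rather than moves the merged index, so the original output of the merge step stays untouched while you experiment with folder names.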
HTH
Stefan

On 05.02.2006 at 04:54, McCallie, David wrote:



Hello,

First, let me thank all the developers who have created Nutch -- it is
wonderful and elegant code.

Second, a simple question:

I am using "bin/nutch crawl" to crawl and index two separate sites: one
is an http site, and the second is a network file system. These two
crawls have completely different URL seed files, and different
crawl-urlfilter.txt files.  When the two crawls are done, I'd like to
merge the indexes into a single index for the webapp to search.  How
should I do this?  I tried using "bin/nutch merge" to simply merge the
index directories into a third directory.  This created a valid Lucene
index (verified with Luke), but it won't work with the search.jsp in the
webapp. I assume that I need to merge the crawldb and linkdb as well,
but I can't see how to do this.

Thanks in advance,

--david






---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net

