I am by no means a Nutch expert yet, but this is how I merged two
separate segments so I could search through them:
Step 1:
$ bin/nutch mergesegs -local -o testmerge -i \
    ../crawls/foo/segments/20051018224434/ \
    ../crawls/bar/segments/20051018225505/
< bunch of stuff happens >
This creates a segment 20051023112848 in the testmerge folder. The
segment contains a combined index as well as copies of all information
from the two input segments.
Step 2:
The merged segment alone wasn't quite enough to search with, however.
I copied its index folder to the top of testmerge and arranged the
directories in the same structure a crawl produces. After that I was
able to run the Tomcat searcher on the new segment.
After copying/moving/reorganizing I have:
$ ls -l testmerge/
total 0
drwxrwxrwx+ 2 Oct 23 11:42 index
drwxrwxrwx+ 3 Oct 23 11:42 segments
$ ls -l testmerge/segments/
total 0
drwxrwxrwx+ 7 Oct 23 11:28 20051023112848
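For what it's worth, the reorganizing in Step 2 could be scripted
roughly as below. This is only a sketch, not exactly what I typed: the
function name layout_searcher_dir is made up for illustration, and it
assumes that mergesegs left the new segment directly under the output
folder and that the combined index sits in an index subfolder of that
segment.

```shell
#!/bin/sh
# Hypothetical helper: arrange a mergesegs output folder into the
# index/ + segments/ layout shown in the listing above.
layout_searcher_dir() {
    merged_dir=$1   # output folder given to -o, e.g. testmerge
    segment=$2      # name of the merged segment, e.g. 20051023112848

    # Assumption: the merged segment sits directly under $merged_dir.
    if [ ! -d "$merged_dir/$segment" ]; then
        echo "no segment $segment in $merged_dir" >&2
        return 1
    fi

    # Refuse to overwrite an index that is already in place.
    if [ -e "$merged_dir/index" ]; then
        echo "$merged_dir/index already exists" >&2
        return 1
    fi

    # Move the segment under segments/, as a crawl would lay it out.
    mkdir -p "$merged_dir/segments" || return 1
    mv "$merged_dir/$segment" "$merged_dir/segments/$segment" || return 1

    # Assumption: the combined index lives in the segment's index/ folder.
    # Copy it (rather than move it) to the top level for the searcher.
    cp -R "$merged_dir/segments/$segment/index" "$merged_dir/index"
}
```

With the names from this example it would be called as
layout_searcher_dir testmerge 20051023112848. Try it on a copy first,
since it moves the segment.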
Step 3:
Then place this in Tomcat's nutch-site.xml file:
<nutch-conf>
  <property>
    <name>searcher.dir</name>
    <value>C:\path_to_testmerge\testmerge</value>
  </property>
</nutch-conf>
Run Tomcat and search away.
Hope this helps,
-Graham
> -----Original Message-----
> From: AJ Chen [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, October 25, 2005 4:03 PM
> To: [email protected]
> Subject: merge indices from multiple webdb
>
> Has anyone merged indices from two separate webdb? I have two
> separate webdb and need to find a good way to combine them
> for unified search.
> AJ
>
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers