Murat Ali Bayir wrote:
> Hi, I want to know whether there is any method for merging the outputs of
> multiple crawls. Assume that we have one main
> crawler with time period 4T:
> /MainCrawler/crawldb
> /MainCrawler/segments
> /MainCrawler/linkdb
> Then we have a topic-specific focused crawler with time period T:
> /FocusedCrawler/crawldb
> /FocusedCrawler/segments
> /FocusedCrawler/linkdb
> I want to know whether there is any way to merge these two databases. Another
> question: do I need to merge them for
> indexing and querying purposes? Can anyone suggest an architecture
> for this?

"mergedb" and "mergelinkdb" serve exactly this purpose. Yes, you need to 
merge them if you want to index the segments to form a single index (and 
you need the merged linkdb on the searcher if you want to use anchors.jsp).
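
For the layout in the question, the two merges might look roughly like the
sketch below. The /Merged/* output paths, the NUTCH variable and the RUN
dry-run switch are assumptions made for illustration, not part of Nutch. The
argument order (output directory first, then the inputs) is how I recall the
usage of these tools, so check it against what bin/nutch prints for your
version before relying on it.

```shell
#!/bin/sh
# Sketch only. Input paths are taken from the question; /Merged/* is an
# assumed output location, and NUTCH / RUN are hypothetical helper variables.
NUTCH="${NUTCH:-bin/nutch}"   # assumed location of the Nutch launcher script
RUN="${RUN-echo}"             # default: only print the commands (dry run);
                              # invoke with RUN= (empty) to really execute them

merge_crawls() {
    # Merge both crawldbs into a new crawldb (assumed order: output, then inputs).
    $RUN "$NUTCH" mergedb /Merged/crawldb \
        /MainCrawler/crawldb /FocusedCrawler/crawldb
    # Merge both linkdbs into a new linkdb, same assumed argument order.
    $RUN "$NUTCH" mergelinkdb /Merged/linkdb \
        /MainCrawler/linkdb /FocusedCrawler/linkdb
}

merge_crawls
```

By default the script only prints the two commands, so the paths can be
reviewed first. The merged crawldb and linkdb are written to new directories,
which leaves both original crawls untouched.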

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


