Andrzej Bialecki wrote:

> Murat Ali Bayir wrote:
>
>> Hi, I want to know whether there is any method for merging the outputs
>> of multiple crawls. Assume that we have one main
>> crawler with time period 4T:
>> /MainCrawler/crawldb
>> /MainCrawler/segments
>> /MainCrawler/linkdb
>> Then we have a topic-specific focused crawler with time period T:
>> /FocusedCrawler/crawldb
>> /FocusedCrawler/segments
>> /FocusedCrawler/linkdb
>> Is there any way to merge these two databases?
>> Another question: do I need to merge them for
>> indexing and querying purposes? Can anyone suggest an architecture
>> for this?
>
>
> "mergedb" and "mergelinkdb" serve exactly this purpose. Yes, you need 
> to merge them if you want to index segments to form a single index 
> (and you need the merged linkdb on the searcher if you want to use 
> anchors.jsp).
>
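For reference, the merge described above might be invoked roughly as in the
sketch below. This assumes the usual argument order of the two tools (output
path first, then the inputs). The /Merged output directory and the NUTCH
variable are hypothetical names used only for illustration; the input paths
are the ones from the original question. By default the script is a dry run
that only prints the commands.

```shell
#!/bin/sh
# Sketch only: merge the crawldb and linkdb of two crawls into a new location.
# NUTCH defaults to a dry run that prints each command instead of running it;
# set NUTCH=bin/nutch (from the Nutch install directory) to actually execute.
NUTCH="${NUTCH:-echo bin/nutch}"

MAIN=/MainCrawler        # path from the original question
FOCUSED=/FocusedCrawler  # path from the original question
MERGED=/Merged           # hypothetical output directory, not from the thread

merge_crawls() {
    # mergedb <output_crawldb> <crawldb1> [<crawldb2> ...]
    $NUTCH mergedb "$MERGED/crawldb" "$MAIN/crawldb" "$FOCUSED/crawldb"

    # mergelinkdb <output_linkdb> <linkdb1> [<linkdb2> ...]
    $NUTCH mergelinkdb "$MERGED/linkdb" "$MAIN/linkdb" "$FOCUSED/linkdb"
}

merge_crawls
```

Note that this covers only the crawldb and linkdb; the segments directories
are not touched by these two commands.
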
Is it possible to do that without stopping the main crawl? Or do you have
any other architecture suggestions?
