Could you also just copy the segments out of NDFS to local disk, perform
the merges locally, and then copy the merged segments back into NDFS?

DaveG


-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 12, 2006 2:14 PM
To: [email protected]
Subject: Re: MapReduce and segment merging

Mike Alulin wrote:
> Then how do people use the new version if they need, let's say, daily
> crawls of the new/updated pages? I crawl updated pages every 24 hours,
> and if I do not merge the segments, soon I will have hundreds of them.
> What is the best solution in this case?
>
> A full recrawl is not a good option, as I have millions of documents
> and I DO know which of them were updated without requesting them.

This is a development version; nobody said it's feature complete.
Patience, my friend... or spend some effort to improve it. ;-)

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
