What is the best way to delete 404 pages from the index files? It seems that
the fetcher saves info about 404 pages; however, when Nutch builds the segment
index, it does not include that data. As a result, the searcher returns the
document link from an older version of the index file, although the document
no longer exists.
   
SegmentMergeTool does handle this, but it is too slow to run often.
  

                        