nutch-user  




intranet recrawl 0.9

2007/08/09 All, Does anyone have an updated recrawl script for 0.9? Also, does anyone have a link that describes each phase of a crawl / recrawl (for 0.9)? It looks like it changes with each version. I searched the wiki, but I am still unclear. Thanks -- Brian Demers

Intranet Recrawl Script for 0.8.0

2006/07/14 Does anyone have a good Intranet recrawl script for nutch-0.8.0? Thanks.. Matt -- Matthew Holt

Recrawl urls

2006/08/03 Hello, I was searching for a way to add new URLs to the crawl URL list and to recrawl all URLs... Can you help me? Thanks, -- Nahuel ANGELINETTI

recrawl index

2006/12/29 Hi, I'm new to Nutch. I have crawled my website. But how can I recrawl/refresh the index without deleting the crawl folder? Kind regards, Frank -- Otto, Frank

Please Help.. recrawl script.. will send out to the list when finished for 0.8.0

2006/07/20 I sent out a few emails regarding a recrawl script I wrote. However, if it'd be easier for anyone to help, can you please check that all of the below steps are the only ones that need to be taken to recrawl? Or if there is a resource online that describes manually -- Matthew Holt

Re: Recrawl urls

2009/05/14 Thanks for this information about recrawling. I am running a recrawl operation, but every time I do it, I don't get the same results as the first crawl (different documents, not the same web pages). So how can I recrawl the same pages? Maybe fix the property -- aidahaj

Recrawl error pages optimization

2007/05/05 Hi, I crawled a website. Around 500 out of 5000 pages generated errors/exceptions. I would like to recrawl only these 500 pages. The errors appear to be something similar to this: Segment#1: 0 errors Segment#2: 120 errors Segment#3: 10 errors Segment#4: 370 errors Segment#5: 0 errors Q1 -- karthik085

Re: recrawl in 1.0

2008/06/06 ://sematext.com/ -- Lucene - Solr - Nutch -- Original Message -- From: scottyd [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, June 5, 2008 2:44:21 PM Subject: recrawl in 1.0 -- I was wondering how to accomplish a recrawl in the trunk release of nutch -- ogjunk-nutch

Updating index without restarting the app server

2008/11/07 Hi, When a recrawl is done, the app server requires a restart for the new indexes to be reflected. If the folder where the recrawl is done is the one the web app points to, a folder named merge-output is created inside the index folder once the recrawl is completed. Is there any way -- shree lakshmi

difference in time between an initial crawl and recrawl with a full crawldb

2009/12/16 Hi, I just want to know the difference between a first initial crawl and a recrawl using the fetch, generate, update commands. Is there a difference in time between running an initial crawl every time (by deleting the crawl_folder) and running a recrawl without deleting the initial crawl -- BELLINI ADAM
