20060619230003 Does exist, but it does not have an index directory... Here are the subdirectories it has:

content/  fetcher/  fetchlist/  parse_data/  parse_text/

This also seems to be the case for 8 other segment directories. The rest of the segment directories (excluding the 9 above) have this file structure:

content/  fetcher/  fetchlist/  index/  index.done  parse_data/  parse_text/

What would cause these segments to act this way? Is there a sure way to fix it? Can I prevent it from happening again?

Matt

----- Original Message -----
From: "TDLN" <[EMAIL PROTECTED]>
To: <[email protected]>; "Honda-Search Administrator" <[EMAIL PROTECTED]>
Sent: Friday, June 23, 2006 11:28 AM
Subject: Re: ERROR when recrawling... can ANYONE help?

> Does /home/honda/nutch-0.7.2/crawl/segments/20060619230003/index exist at all?
>
> Can you confirm that all segments contain index directory?
>
> Rgrds, Thomas
>
>
> On 6/23/06, Honda-Search Administrator <[EMAIL PROTECTED]> wrote:
>> To recrawl I use the command:
>>
>> /home/honda/nutch-0.7.2/recrawl.sh /home/honda/nutch-0.7.2/crawl 1 2
>>
>> "crawl" is the name of my database directory.
>>
>> The script "recrawl.sh" is the standard one that comes in the package. I'm
>> pretty sure it's the same for everyone, but I've included a link to the
>> recrawl.sh script I'm using:
>>
>> http://www.honda-search.com/script.html
>>
>> As you can see I'm crawling with a depth of 1, which is intentional. I only
>> desire to recrawl the specific pages injected each night. I'm wondering if
>> the 'adddays' parameter is messing me up.
>>
>> Matt
>>
>> ----- Original Message -----
>> From: "TDLN" <[EMAIL PROTECTED]>
>> To: <[email protected]>; "Honda-Search Administrator" <[EMAIL PROTECTED]>
>> Sent: Friday, June 23, 2006 10:46 AM
>> Subject: Re: ERROR when recrawling... can ANYONE help?
>>
>>
>> > Please specify what exact sequence of commands you are using.
>> >
>> > For incremental crawling best to follow the "whole web" style process
>> > as outlined in the tutorial. The one stop crawl command cannot be used
>> > effectively for that.
>> >
>> > HTH Thomas
>> >

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
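
For anyone needing to answer the "do all segments contain an index directory?" question from this thread, a minimal sketch follows. It assumes the segments path quoted above and the layout described in this thread (an index/ subdirectory inside each indexed segment). The function name list_unindexed_segments is made up for illustration and is not part of Nutch or recrawl.sh. It only lists segments; it does not repair them.

```shell
#!/bin/sh
# List Nutch segment directories that have no index/ subdirectory.
# The default path is the one quoted in the thread; set SEGMENTS_DIR to override.
SEGMENTS_DIR="${SEGMENTS_DIR:-/home/honda/nutch-0.7.2/crawl/segments}"

# Hypothetical helper (not part of Nutch): prints one unindexed segment per line.
list_unindexed_segments() {
    for seg in "$1"/*/; do
        # Skip the unexpanded pattern when the directory is empty or missing.
        [ -d "$seg" ] || continue
        if [ ! -d "${seg}index" ]; then
            # Strip the trailing slash before printing.
            printf '%s\n' "${seg%/}"
        fi
    done
}

list_unindexed_segments "$SEGMENTS_DIR"
```

Run against the crawl described above, this should print the 9 segments reported as missing index/, which could then be re-indexed or moved aside before the next recrawl.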
