Thanks! I saw that one too, but according to Doug, it was for 0.8 only. Does anyone have
step-by-step instructions like the ones for 0.8?
Also, does anyone know why the URL total is always 1 when I run 0.8?
060308 064420  map 0%
060308 064427  map 100%
060308 064433  reduce 100%
060308 064433 Job complete: job_ljydgp
060308 064434 parsing file:/root/nutch/conf/nutch-default.xml
060308 064434 parsing file:/root/nutch/conf/nutch-site.xml
060308 064436 Statistics for CrawlDb: /user/root/crawl-20060307224144/crawldb
060308 064436 TOTAL urls:       1
060308 064436 avg score:        1.0
060308 064436 max score:        1.0
060308 064436 min score:        1.0
060308 064436 retry 0:  1
060308 064436 status 2 (DB_fetched):    1
060308 064437 CrawlDb statistics: done


From: TDLN <[EMAIL PROTECTED]>
Reply-To: [email protected]
To: [email protected]
Subject: Re: help - distributed crawl in 0.7.1
Date: Wed, 8 Mar 2006 18:00:06 +0100

Detailed distributed crawl implementation:

http://www.mail-archive.com/[email protected]/msg02270.html

I am not sure it applies to 0.7, but it has a lot of info.

Rgrds, Thomas

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
