Dennis,

I'm curious what kind of hardware your 5-machine cluster uses (CPU, RAM,
HD, etc.)?

Also, has anyone tested a cluster using servers with Intel's quad-core
Xeon X3210 processors? If so, what kind of performance boost have you
noticed over a dual-core system?

Thanks
Marc Boucher, aTerra
--
Personal Blog: http://www.nano2sol.com

On 3/14/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:

The crawl for 1M pages completed successfully.  There was an issue with
doing a copyToLocal, but that has already been filed as a HADOOP bug and
the patch will be included in 0.12.x.

Statistics for CrawlDb: crawldb
TOTAL urls:         10839170
retry 0:            10816148
retry 1:             23022

min score:      0.0090
avg score:      0.173
max score:      2119.167

status 1 (db_unfetched):        9899275
status 2 (db_fetched):          667354
status 3 (db_gone):             11195
status 4 (db_redir_temp):       219507
status 5 (db_redir_perm):       41839
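
As a quick sanity check on the figures above, the retry and status
breakdowns should each add up to the TOTAL urls count. The following is
only a minimal sketch in Python. The variable and dictionary names are
illustrative and not part of Nutch, and the numbers are copied by hand
from the statistics above.

```python
# Counts copied by hand from the CrawlDb statistics above.
# All names here are illustrative; none of this is a Nutch API.
total_urls = 10839170
retries = {0: 10816148, 1: 23022}
statuses = {
    "db_unfetched": 9899275,
    "db_fetched": 667354,
    "db_gone": 11195,
    "db_redir_temp": 219507,
    "db_redir_perm": 41839,
}

# Each breakdown should account for every URL in the CrawlDb.
retry_sum = sum(retries.values())
status_sum = sum(statuses.values())

# Share of all known URLs that are marked as fetched.
fetched_fraction = statuses["db_fetched"] / total_urls

print("retry counts match total: ", retry_sum == total_urls)
print("status counts match total:", status_sum == total_urls)
print(f"fetched share of known URLs: {fetched_fraction:.1%}")
```

Both sums match the reported total, and the fetched share comes out at
roughly 6% of the URLs known to the CrawlDb.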

Dennis Kubes

Andrzej Bialecki wrote:
> Dennis Kubes wrote:
>>
>>
>> Andrzej Bialecki wrote:
>>> Dennis Kubes wrote:
>>>> I agree there may be subtle bugs.
>>>>
>>>> I can do, say, a full dmoz crawl (~5M pages) with nutch trunk and
>>>> hadoop 0.12.1 on a small cluster of 5 machines, if this would help?
>>>> We have already
>>>>
>>>
>>> Certainly, that would be most welcome.
>>
>> I will start that up today.
>
> Thanks!
>
>>>
>>> 0.12.1 is not out the door yet. I can create a patch that uses the
>>> latest Hadoop trunk binaries, so that we could test it.
>>
>> I can just pull it down from source.  Let me know if that isn't what
>> we want.
>
> Great, please do.
>

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
