Re: Performance / cluster scaling question

2008-03-28 Thread Doug Cutting
Doug Cutting wrote: Seems like we should force things onto the same availability zone by default, now that this is available. Patch, anyone? It's already there! I just hadn't noticed. https://issues.apache.org/jira/browse/HADOOP-2410 Sorry for missing this, Chris! Doug

Re: Performance / cluster scaling question

2008-03-27 Thread Chris K Wensel
. Thanks, dhruba -Original Message- From: André Martin [mailto:[EMAIL PROTECTED] Sent: Friday, March 21, 2008 3:06 PM To: core-user@hadoop.apache.org Subject: Re: Performance / cluster scaling question After waiting a few hours (without having any load), the block number and DFS Used space

Re: Performance / cluster scaling question

2008-03-24 Thread André Martin
@hadoop.apache.org Subject: Re: Performance / cluster scaling question After waiting a few hours (without having any load), the block number and DFS Used space seem to go down... My question is: is the hardware simply too weak/slow to send the block deletion request to the datanodes in a timely

Performance / cluster scaling question

2008-03-21 Thread André Martin
Hi everyone, I ran a distributed system that consists of 50 spiders/crawlers and 8 server nodes with a Hadoop DFS cluster with 8 datanodes and a namenode... Each spider has 5 job processing / data crawling threads and puts crawled data as one complete file onto the DFS - additionally there are

RE: Performance / cluster scaling question

2008-03-21 Thread Jeff Eastman
21, 2008 2:36 PM To: core-user@hadoop.apache.org Subject: Re: Performance / cluster scaling question 3 - the default one... Jeff Eastman wrote: What's your replication factor? Jeff -Original Message- From: André Martin [mailto:[EMAIL PROTECTED] Sent: Friday, March 21

RE: Performance / cluster scaling question

2008-03-21 Thread Jeff Eastman
-user@hadoop.apache.org Subject: Re: Performance / cluster scaling question Right, I totally forgot about the replication factor... However, sometimes I even noticed ratios of 5:1 for block numbers to files... Is the delay for block deletion/reclaiming intended behavior? Jeff Eastman wrote

RE: Performance / cluster scaling question

2008-03-21 Thread dhruba Borthakur
21, 2008 3:06 PM To: core-user@hadoop.apache.org Subject: Re: Performance / cluster scaling question After waiting a few hours (without having any load), the block number and DFS Used space seem to go down... My question is: is the hardware simply too weak/slow to send the block deletion