No :-) My question is this: I set up a Hadoop cluster with 7 datanodes and one namenode. The cluster capacity (from the Hadoop web admin page) is about 700GB. From this I understand that the default disk space usage is 100GB per datanode. Please correct me if I am wrong.
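To show where the 100GB figure comes from, here is the arithmetic as a small sketch. The function name is only for illustration, and 700GB is the approximate value shown on the web admin page, so the result is an average and not necessarily a configured limit:

```python
def per_datanode_capacity_gb(cluster_capacity_gb: float, datanodes: int) -> float:
    """Average capacity per datanode, assuming it is spread evenly (an assumption)."""
    if datanodes <= 0:
        raise ValueError("datanodes must be a positive integer")
    return cluster_capacity_gb / datanodes


if __name__ == "__main__":
    # Approximate figures from my cluster: about 700GB reported, 7 datanodes.
    print(per_datanode_capacity_gb(700, 7))
```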
Best Regards.

On Fri, Oct 17, 2008 at 1:03 AM, stack <[EMAIL PROTECTED]> wrote:

> Are you asking about the below Slava?
>
> <property>
>   <name>dfs.block.size</name>
>   <value>67108864</value>
>   <description>The default block size for new files.</description>
> </property>
>
> I do not know of a 100GB configuration in hadoop/hbase?
>
> If so, if configuring for hbase, you need to add the configuration to
> hbase-site.xml or add under your hbase conf an hadoop-site.xml with
> appropriate setting. See http://wiki.apache.org/hadoop/Hbase/FAQ#12 for
> some discussion.
>
> St.Ack
>
> Slava Gorelik wrote:
>
>> Hi.Small question, little bit off topic.
>> How can i change the default 100GB datanode size to be something else ?
>>
>> Best Regards.
>>
>> On Thu, Oct 16, 2008 at 10:41 PM, stack <[EMAIL PROTECTED]> wrote:
>>
>>> Daniel Ploeg wrote:
>>>
>>>> Hi all,
>>>>
>>>> I performed a cluster rebalance on my test cluster yesterday (5 regionserver
>>>> / datanodes each with approx 400GB - total approx 2TB HDFS) and I would like
>>>> to know if the mailing lists have seen similar results to what I've seen.
>>>>
>>> I talked to the lads running hbase here at powerset. They believe they
>>> have seen something similar when they grow the cluster by some significant
>>> percentage (20-30%). The addition of new machines brings on a rebalancing
>>> and thereafter hbase runs "faster".
>>>
>>>> I had a single table with a single column family and loaded it up so that it
>>>> just about filled the entire cluster. Actually one or two of the nodes had
>>>> run out of space, yet the fifth machine only had 50% of its disks utilised
>>>> (which is why I though a rebalance was in order). There are a total of 1475
>>>> regions in the cluster. Prior to starting the rebalance the cluster only had
>>>> about 250GB left to it's disposal. After the rebalance I now have almost
>>>> 800GB free.
>>>>
>>> If 1475 regions, update to 0.18.1 (coming soon).
>>>
>>>> Furthermore, I was performing read tests prior to the rebalance and getting
>>>> a response time of approx 500ms per row (each row has 10000 column instances
>>>> of the column family which were deserialised as part of the test). After the
>>>> rebalance my read times reduced to around 340ms.
>>>>
>>> If you could have fewer columns in a family column, you'll get a bit better
>>> performance: HBASE-867.
>>>
>>> Good on you Daniel,
>>> St.Ack
