Thanks, everybody. I really appreciate what you guys have done with my question. The situation I came across is too complicated and too strange for me, so I've decided to re-install HBase and change the related configuration files. Hopefully it will get better this time. Thanks again! Best wishes~ Woo.
On 28 Jul 2011 at 1:50 PM, Nico Guba <[email protected]> wrote:
> Very interesting. What is a good value where there is not too much of a
> trade-off in performance?
>
> I'd imagine that setting this too high could create a very 'chatty'
> cluster.
>
> On 28 Jul 2011, at 00:33, Jeff Whiting wrote:
>
> > Replication needs to be higher than 1. If you have a node which is
> > running both a DataNode and an HRegionServer and then shut it down, you
> > WILL lose all the data that the DataNode was holding, because no one
> > else on the cluster has it. HBase relies on HDFS for the replication of
> > data and does NOT have its own data replication mechanism, unlike
> > Cassandra or Voldemort. If you set the HDFS replication factor to 3,
> > then when you shut down your node, 2 other nodes will have the data and
> > HBase will be able to serve that data for you.
> >
> > You can think of each DataNode as a hard drive. Having a replication
> > factor of 1 means the data is only on one hard drive, and if you unplug
> > that hard drive the data will be lost. Having a replication factor
> > greater than 1 is like having multiple hard drives in a RAID 1
> > (mirrored) array. If you unplug one of the hard drives, the data is
> > still on the other ones and nothing is lost.
> >
> > ~Jeff
> >
> > On 7/27/2011 10:35 AM, 吴限 wrote:
> >> Here is my hbase-site.xml:
> >> <configuration>
> >>   <property>
> >>     <name>hbase.cluster.distributed</name>
> >>     <value>true</value>
> >>   </property>
> >>   <property>
> >>     <name>hbase.rootdir</name>
> >>     <value>hdfs://server3.yun.com:54310/hbase</value>
> >>     <description>The directory shared by region servers.
> >>     </description>
> >>   </property>
> >>   <property>
> >>     <name>hbase.zookeeper.quorum</name>
> >>     <value>server3.yun.com</value>
> >>   </property>
> >>   <property>
> >>     <name>dfs.replication</name>
> >>     <value>1</value>
> >>   </property>
> >> </configuration>
> >>
> >> 2011/7/28 Stack <[email protected]>
> >>
> >>> On Wed, Jul 27, 2011 at 8:58 AM, 吴限 <[email protected]> wrote:
> >>>> Setup:
> >>>> - cdh3u0
> >>>> - Hadoop 0.20.2
> >>>
> >>> You are using the hadoop from cdh3u0?
> >>>
> >>>> - dfs.replication is set to 1
> >>>>
> >>> You will lose data if a machine goes away. You have two machines but
> >>> only one instance of each data block; think of it as half of your data
> >>> on one node and the rest on the other. If you kill one machine, half
> >>> your data is gone.
> >>>
> >>>> After I restarted the regionserver which I had rebooted and checked
> >>>> again, I found that some of the missing data had come back, but there
> >>>> was still some data which hadn't been found yet.
> >>>
> >>> I wonder what was going on here that we didn't see it all restored.
> >>>
> >>>> This is problematic since we are supposed to
> >>>> replicate at x1, so at least one other node should be able to
> >>>> theoretically serve the *data* that the downed regionserver can't.
> >>>>
> >>> No. The behavior you describe would come with replication of 2, not 1.
> >>>
> >>> St.Ack
> >
> > --
> > Jeff Whiting
> > Qualtrics Senior Software Engineer
> > [email protected]
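For anyone landing on this thread later: a minimal sketch of what the revised hbase-site.xml might look like after applying Jeff's and Stack's advice, i.e. raising dfs.replication from 1 to 3 (3 is also the stock HDFS default). The hostname server3.yun.com and port 54310 are kept from the config pasted above; adjust them to your own cluster. Note that setting dfs.replication here only affects files HBase itself writes; the cluster-wide default lives in hdfs-site.xml.

```xml
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://server3.yun.com:54310/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>server3.yun.com</value>
  </property>
  <property>
    <!-- Replication of 1 means each block lives on exactly one DataNode;
         losing that node loses the block. 3 tolerates two node failures. -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

With only two DataNodes in the cluster, a value of 2 is the highest that can actually be satisfied; HDFS will report blocks as under-replicated if the factor exceeds the number of live DataNodes.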
