Running with 1 replica is unusual -- and there is little motivation for running with this configuration, since it means data loss -- so few have experience with it. St.Ack
2011/7/28 Xian Woo <[email protected]>:
> Thanks, everybody. I really appreciate what you guys have done with my
> question. Indeed, the situation I came across is too complicated and
> too strange for me, so I've decided to re-install the hbase tool and
> change the related configuration files. Hope it will get better this
> time. Thanks again!
> Best wishes~
> Woo.
>
> On 28 Jul 2011 at 1:50 PM, Nico Guba <[email protected]> wrote:
>
>> Very interesting. What is a good value where there is not too much of a
>> trade-off in performance?
>>
>> I'd imagine that setting this too high could create a very 'chatty'
>> cluster.
>>
>> On 28 Jul 2011, at 00:33, Jeff Whiting wrote:
>>
>> > Replication needs to be higher than 1. If you have a node which is
>> > running both DataNode and HRegionServer and then shut it down, you
>> > WILL lose all the data that the DataNode was holding, because no one
>> > else on the cluster has it. HBase relies on HDFS for the replication
>> > of data and does NOT have its own data replication mechanism, unlike
>> > Cassandra or Voldemort. If you set the HDFS replication factor to 3,
>> > then when you shut down your node, 2 other nodes will have the data
>> > and HBase will be able to serve that data for you.
>> >
>> > You can think of each DataNode as a hard drive. Having a replication
>> > factor of 1 means the data is only on one hard drive, and if you
>> > unplug the hard drive that data will be lost. Having a replication
>> > factor greater than 1 is like having multiple hard drives in a RAID 1
>> > (mirrored) array. If you unplug one of the hard drives, the data is
>> > still on the other ones and nothing is lost.
>> >
>> > ~Jeff
>> >
>> > On 7/27/2011 10:35 AM, 吴限 wrote:
>> >> Here is my hbase-site.xml:
>> >> <configuration>
>> >> <property>
>> >> <name>hbase.cluster.distributed</name>
>> >> <value>true</value>
>> >> </property>
>> >> <property>
>> >> <name>hbase.rootdir</name>
>> >> <value>hdfs://server3.yun.com:54310/hbase</value>
>> >> <description>The directory shared by region servers.
>> >> </description>
>> >> </property>
>> >> <property>
>> >> <name>hbase.zookeeper.quorum</name>
>> >> <value>server3.yun.com</value>
>> >> </property>
>> >> <property>
>> >> <name>dfs.replication</name>
>> >> <value>1</value>
>> >> </property>
>> >>
>> >>
>> >> 2011/7/28 Stack <[email protected]>
>> >>
>> >>> On Wed, Jul 27, 2011 at 8:58 AM, 吴限 <[email protected]> wrote:
>> >>>> Setup:
>> >>>> - cdh3u0
>> >>>> - Hadoop 0.20.2
>> >>> You are using the hadoop from cdh3u0?
>> >>>
>> >>>
>> >>>> - dfs.replication is set to 1
>> >>>>
>> >>> You will lose data if a machine goes away. You have two machines but
>> >>> only one instance of each data block; think of it as half of your
>> >>> data on one node and the rest on another. If you kill one machine,
>> >>> half your data is gone.
>> >>>
>> >>>
>> >>>> After I restarted the regionserver which I had rebooted and checked
>> >>>> again, I found that some of the missing data came back, but there
>> >>>> still existed some data which hadn't been found yet.
>> >>>
>> >>> I wonder what was going on here that we didn't see it all restored.
>> >>>
>> >>>
>> >>>> This is problematic since we are supposed to
>> >>>> replicate at x1, so at least one other node should be able to
>> >>>> theoretically serve the *data* that the downed regionserver can't.
>> >>>>
>> >>> No. The behavior you describe would come with replication of 2, not 1.
>> >>>
>> >>> St.Ack
>> >>>
>> >
>> > --
>> > Jeff Whiting
>> > Qualtrics Senior Software Engineer
>> > [email protected]
>> >
>>
>
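The fix the thread converges on can be sketched as a one-property config change, using the hbase-site.xml quoted above as the template: raise dfs.replication from 1 to the usual HDFS default of 3 (the exact value is a deployment choice; this is a sketch, not a verified configuration for this cluster).

```xml
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Block replication factor. With 3, two other DataNodes
  hold a copy of each block, so one node can fail without data loss.
  </description>
</property>
```

Note that changing the configuration only affects files written afterwards; the replication of existing files can be raised with the standard HDFS shell command `hadoop fs -setrep -R 3 /hbase`.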

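Jeff's hard-drive analogy can be illustrated with a small, hypothetical simulation (plain Python, not HBase or HDFS code; the round-robin placement is a toy stand-in for HDFS's real block placement): blocks survive the loss of one node only when each block has a copy somewhere else.

```python
import itertools

def place_blocks(blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin,
    as a loose stand-in for how HDFS spreads replicas."""
    placement = {}
    node_cycle = itertools.cycle(nodes)
    for b in blocks:
        placement[b] = {next(node_cycle) for _ in range(replication)}
    return placement

def surviving_blocks(placement, dead_node):
    """Blocks still readable after one DataNode goes away."""
    return {b for b, holders in placement.items() if holders - {dead_node}}

blocks = [f"blk_{i}" for i in range(6)]
nodes = ["server1", "server2", "server3"]

# replication = 1: unplugging one "drive" loses the blocks it held
p1 = place_blocks(blocks, nodes, 1)
print(len(surviving_blocks(p1, "server1")))  # 4 of 6 blocks survive

# replication = 3: every block also lives on 2 other nodes; nothing is lost
p3 = place_blocks(blocks, nodes, 3)
print(len(surviving_blocks(p3, "server1")))  # all 6 survive
```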