[jira] Commented: (HBASE-2575) Fault scenario of dead root drive on RS causes cluster lockup

Todd Lipcon (JIRA) Wed, 19 May 2010 17:13:19 -0700

    [ https://issues.apache.org/jira/browse/HBASE-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869419#action_12869419 ]


Todd Lipcon commented on HBASE-2575:
------------------------------------

My thought for reproducing this is something like the following:
# dd if=/dev/zero of=myimage bs=1M count=1000
# losetup -f myimage
# mdadm --create /dev/md0 --level=faulty --raid-devices=1 /dev/loop1
(use whichever loop device losetup picked; /dev/loop1 is just what it happened to be here, "losetup -a" will list it)
# mkfs.ext3 /dev/md0
# mkdir /myhbase-disk
# mount /dev/md0 /myhbase-disk
# cp -a $HBASE_HOME /myhbase-disk
# start the regionserver from the copy under /myhbase-disk
# mdadm --grow /dev/md0 -l faulty -p read-persistent
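
For anyone who wants to script the steps above, here is a rough sketch. It is untested against a real cluster and makes a few assumptions that are not part of the original steps: the DRY_RUN switch, the variable and function names, and the use of "losetup -f --show" to capture the loop device instead of hard-coding /dev/loop1 are all mine. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Hypothetical helper: the DRY_RUN switch,
# variable names and function names are illustrative assumptions.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
IMAGE="${IMAGE:-myimage}"
MD_DEV="${MD_DEV:-/dev/md0}"
MOUNT_POINT="${MOUNT_POINT:-/myhbase-disk}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Build a filesystem on top of an md "faulty" device backed by a loop file.
setup_faulty_disk() {
    if [ "$DRY_RUN" != "1" ] && [ -z "${HBASE_HOME:-}" ]; then
        echo "HBASE_HOME must be set" >&2
        return 1
    fi
    run dd if=/dev/zero of="$IMAGE" bs=1M count=1000
    if [ "$DRY_RUN" = "1" ]; then
        LOOP_DEV="/dev/loopN"   # placeholder, only known once losetup runs
        printf '+ %s\n' "losetup -f --show $IMAGE"
    else
        # --show prints the loop device that -f allocated
        LOOP_DEV="$(losetup -f --show "$IMAGE")" || return 1
    fi
    run mdadm --create "$MD_DEV" --level=faulty --raid-devices=1 "$LOOP_DEV"
    run mkfs.ext3 "$MD_DEV"
    run mkdir "$MOUNT_POINT"
    run mount "$MD_DEV" "$MOUNT_POINT"
    run cp -a "${HBASE_HOME:-}" "$MOUNT_POINT"
}

# Switch the faulty device into a mode where reads fail persistently.
inject_fault() {
    run mdadm --grow "$MD_DEV" -l faulty -p read-persistent
}

# Does nothing unless called with "setup" or "fault".
case "${1:-}" in
    setup) setup_faulty_disk ;;
    fault) inject_fault ;;
esac
```

Saved as, say, repro.sh (a made-up name), the idea would be to run "DRY_RUN=0 sh repro.sh setup" as root, start the regionserver by hand from the copy under /myhbase-disk, and then run "DRY_RUN=0 sh repro.sh fault" once clients are connected. Splitting setup and fault injection keeps the manual regionserver start in between, as in the steps above.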


> Fault scenario of dead root drive on RS causes cluster lockup
> -------------------------------------------------------------
>
>                 Key: HBASE-2575
>                 URL: https://issues.apache.org/jira/browse/HBASE-2575
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>
> We performed a fault test where we physically pulled the root drive out of a 
> machine while it was on. The regionserver continued to run fine with existing 
> clients. But any new clients that tried to connect to it for RPC would not 
> work correctly. So when I started a new client, that client made no progress. 
> Despite this, the RS continued to happily heartbeat to the master, so the 
> master did not remove it from the cluster. Note that in this case, we were 
> logging to NFS, and the logs continued to be written, but no exceptions were shown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.