Re: A kernel panic makes small HBase cluster to crush?

2011-03-10 Thread Tatsuya Kawano
Hi, I suggested that he upgrade his environment to the latest version, so this time he used CDH3b4 (HBase 0.90.1) and repeated the same test procedure. He now has a new issue: HMaster aborted because it couldn't reach the host that had the kernel panic. Can anybody verify this

Re: A kernel panic makes small HBase cluster to crush?

2011-03-10 Thread Stack
On Thu, Mar 10, 2011 at 3:41 AM, Tatsuya Kawano tatsuya6...@gmail.com wrote: I suggested that he upgrade his environment to the latest version, so this time he used CDH3b4 (HBase 0.90.1) and repeated the same test procedure. He now has a new issue: HMaster aborted because it

Re: A kernel panic makes small HBase cluster to crush?

2011-03-10 Thread Tatsuya Kawano
Hi Stack, Thanks for checking this issue and filing HBASE-3617. Well, that command was supposed to make the node crash and shut down. I'll check the detailed procedure and try to reproduce this issue over the weekend. This is odd. Communication with the RegionServer was working fine up until it

A kernel panic makes small HBase cluster to crush?

2011-03-04 Thread Tatsuya Kawano
Hi, I'm trying to figure out the root cause of a crash on a small HBase cluster and I need some help from the experts here. I tried to post my question earlier, but it seems the message was blocked by the mailing list, so I pasted it here: http://pastebin.com/5xkACxMM We're having all

A kernel panic makes small HBase cluster to crush?

2011-03-04 Thread Tatsuya Kawano
Hi, I got this question on the Hadoop User Group Japan mailing list, and I need some help from the experts here. It looks like an HDFS issue, maybe append-related, but I'm not totally sure yet. The person who posted the original question is testing the HA features in HBase 0.90.0 and ASF Hadoop 0.20.2

Re: A kernel panic makes small HBase cluster to crush?

2011-03-04 Thread Tatsuya Kawano
Thanks J-D. Well, doesn't the following message imply that HDFS can accept writes when it has at least 1 data node available? error: java.io.IOException: File /hbase/Object_Speed_Test/1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/1275904589980700621 could only be replicated to 0 nodes, instead of 1

Re: A kernel panic makes small HBase cluster to crush?

2011-03-04 Thread Jean-Daniel Cryans
(heh, this thread gives me a reason to look at the HDFS code) Well, doesn't the following message imply that HDFS can accept writes when it has at least 1 data node available? error: java.io.IOException: File /hbase/Object_Speed_Test/1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/1275904589980700621

Re: A kernel panic makes small HBase cluster to crush?

2011-03-04 Thread Tatsuya Kawano
Thanks for checking the HDFS code. Also, it's strange that the region servers got corrupted reads when there are two more replicas available on HDFS. Corrupted reads? This is a loaded term; are you really saying that the region server read corrupted data from HDFS? Sorry, it was too early