Hosts unreachable

Cyril Scetbon Tue, 29 May 2012 07:53:59 -0700

Hi,

I've installed HBase with the following configuration:


12 x (rest hbase + regionserver hbase + datanode hadoop)
2 x (zookeeper + hbase master)
1 x (zookeeper + hbase master + namenode hadoop)

The OS is Ubuntu Lucid (10.04).

The issue is that when I try to load data using the REST API, some hosts become unreachable even though I can still ping them. I can no longer connect to them, and even monitoring tools stop working for a period of time. For example, I run sar on each host, and you can see that between 7:10 and 7:35 PM the host does not record any data:

Time         CPU   %user  %nice  %system  %iowait  %steal  %idle
06:45:01 PM  all    0.18   0.00     0.37     3.61    0.25  95.58
06:45:01 PM    0    0.24   0.00     0.54     6.62    0.35  92.25
06:45:01 PM    1    0.12   0.00     0.20     0.61    0.15  98.92
06:50:02 PM  all    5.69   0.00     1.79     4.23    1.94  86.36
06:50:02 PM    0    5.68   0.00     3.00     7.91    2.21  81.21
06:50:02 PM    1    5.70   0.00     0.59     0.55    1.66  91.51
06:55:01 PM  all    0.68   0.00     0.14     1.62    0.23  97.33
06:55:01 PM    0    0.87   0.00     0.20     3.19    0.31  95.44
06:55:01 PM    1    0.49   0.00     0.08     0.05    0.15  99.22
06:58:36 PM  all    0.03   0.00     0.02     0.45    0.07  99.43
06:58:36 PM    0    0.01   0.00     0.02     0.40    0.13  99.43
06:58:36 PM    1    0.04   0.00     0.01     0.51    0.00  99.43
07:05:01 PM  all    0.03   0.00     0.00     0.10    0.07  99.80
07:05:01 PM    0    0.02   0.00     0.00     0.10    0.10  99.78
07:05:01 PM    1    0.04   0.00     0.01     0.09    0.03  99.83  <--- last measure before host becomes unreachable
07:40:07 PM  all   14.72   0.00    17.93     0.02   13.31  54.02  <--- new measure after host becomes reachable
07:40:07 PM    0   29.43   0.00    35.87     0.00   26.57   8.13
07:40:07 PM    1    0.00   0.00     0.00     0.04    0.04  99.91
07:45:01 PM  all    0.55   0.00     0.25     0.04    0.27  98.89
07:45:01 PM    0    0.54   0.00     0.14     0.05    0.21  99.07
07:45:01 PM    1    0.55   0.00     0.36     0.04    0.33  98.72
07:50:01 PM  all    0.11   0.00     0.05     0.18    0.06  99.60
07:50:01 PM    0    0.12   0.00     0.06     0.13    0.09  99.60
07:50:01 PM    1    0.11   0.00     0.04     0.23    0.04  99.59
07:55:01 PM  all    0.00   0.00     0.01     0.05    0.07  99.88
07:55:01 PM    0    0.00   0.00     0.01     0.01    0.13  99.84
07:55:01 PM    1    0.00   0.00     0.00     0.08    0.00  99.91
08:05:01 PM  all    0.01   0.00     0.00     0.00    0.05  99.94
08:05:01 PM    0    0.00   0.00     0.00     0.00    0.08  99.91
08:05:01 PM    1    0.03   0.00     0.00     0.00    0.01  99.96
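
To find such holes on every host without reading the sar output by eye, something like the following could be used. It is only a minimal sketch: the function name find_gaps, the 5-minute expected interval, and the 2-minute slack are assumptions made for illustration, and it does not handle samples that wrap past midnight.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(minutes=5), slack=timedelta(minutes=2)):
    """Return (previous, next) sample pairs spaced further apart than expected + slack.

    `expected` and `slack` are assumed values, not taken from the sar configuration.
    """
    # De-duplicate: sar prints one row per CPU for each sample time.
    parsed = sorted({datetime.strptime(t, "%I:%M:%S %p") for t in timestamps})
    return [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a > expected + slack]

# Sample times copied from the sar output above.
samples = [
    "06:45:01 PM", "06:50:02 PM", "06:55:01 PM", "06:58:36 PM",
    "07:05:01 PM", "07:40:07 PM", "07:45:01 PM", "07:50:01 PM",
    "07:55:01 PM", "08:05:01 PM",
]

for before, after in find_gaps(samples):
    print(before.strftime("%I:%M:%S %p"), "->",
          after.strftime("%I:%M:%S %p"), "gap:", after - before)
```

With these thresholds it reports the hole between 07:05:01 PM and 07:40:07 PM, and also the longer-than-usual interval between 07:55:01 PM and 08:05:01 PM.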

I suppose it's caused by high load, but I don't have any proof :( Is there a known bug related to this? I had a similar issue with Cassandra that forced me to upgrade to a Linux kernel > 3.0.


thanks.

--
Cyril SCETBON
