Hi!

I am running HBase 0.19.2, r771918 on top of Hadoop 0.19.1, r745977.

When I stress the system with lots of uploaded data, it often happens that a regionserver gets overloaded and is lost by the cluster. This is understandable. However, right after that HBase becomes generally unstable and then often "loses" some of the regions, leaving the DB tables corrupt. The symptoms afterwards are:
1. A region appears in the "Regions in" list for its table.
2. That region is missing from the "Online Regions" list of the regionserver responsible for it.

In other words, it seems that the Master thinks region R belongs to regionserver X, but X does not agree. When I request data from that region, the Master directs the client to X, and X throws NotServingRegionException.
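To make the mismatch concrete, here is a minimal sketch in plain Java of the check I have in mind. It does not call any HBase API: the class name RegionAssignmentCheck, the method findUnservedRegions, and the two maps are hypothetical stand-ins for what the Master UI shows under "Regions in" and what each regionserver shows under "Online Regions". The region and server names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of HBase: compares the Master's view of
// region assignments with the regions each regionserver reports as online.
class RegionAssignmentCheck {

    // masterAssignments: region name -> regionserver the Master thinks serves it
    // onlineRegionsByServer: regionserver -> regions it reports as online
    // Returns the sorted names of regions that are assigned but not served.
    static List<String> findUnservedRegions(
            Map<String, String> masterAssignments,
            Map<String, Set<String>> onlineRegionsByServer) {
        List<String> unserved = new ArrayList<String>();
        for (Map.Entry<String, String> entry : masterAssignments.entrySet()) {
            String region = entry.getKey();
            String server = entry.getValue();
            Set<String> online = onlineRegionsByServer.get(server);
            // A lost server (no entry at all) serves nothing.
            if (online == null || !online.contains(region)) {
                unserved.add(region);
            }
        }
        Collections.sort(unserved);
        return unserved;
    }

    public static void main(String[] args) {
        // Made-up example data: the Master assigns R2 to X, but X does not list it.
        Map<String, String> master = new HashMap<String, String>();
        master.put("R1", "X");
        master.put("R2", "X");
        master.put("R3", "Y");

        Map<String, Set<String>> online = new HashMap<String, Set<String>>();
        online.put("X", new HashSet<String>(Collections.singletonList("R1")));
        online.put("Y", new HashSet<String>(Collections.singletonList("R3")));

        System.out.println("Assigned but not served: "
                + findUnservedRegions(master, online));
    }
}
```

Any region this check reports is one for which a client request would end in NotServingRegionException, as described above.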

Has anybody run into this kind of problem? Is there any remedy for it
apart from adding more power to the cluster so that the regionservers do not fail?

It is acceptable if the system becomes unavailable for some time when it gets stressed too much, but it should not lose data.

Thanks a lot!

--Kirill
