Hi!
I am running HBase 0.19.2, r771918 on top of Hadoop 0.19.1, r745977.
When I stress the system with lots of uploaded data, it often happens that a
Regionserver gets overloaded and is lost by the cluster. This is
understandable. However, right after that HBase becomes generally
unstable and then often "loses" some of the regions, leaving the DB tables
corrupt. The symptoms afterwards are:
1. A region appears in the "Regions in" list for its table.
3. That region is missing from the "Online Regions" list of the
Regionserver responsible for it.
In other words, it seems the Master thinks region R belongs to
regionserver X, but X does not agree. When I request data from that
region, the Master directs the client to X, and X throws
NotServingRegionException.
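
To make the mismatch concrete, here is a toy model of it in plain Python. It is only a sketch of the symptom and does not use any HBase API; the names `master_assignments`, `online_regions` and `find_orphaned_regions` are made up for illustration, and I am assuming the two views can be read as simple maps.

```python
# Toy model of the inconsistency described above; this is NOT HBase code.
# All names here are hypothetical and exist only for illustration.

def find_orphaned_regions(master_assignments, online_regions):
    """Return (region, server) pairs where the master's view and the
    regionserver's view disagree.

    master_assignments: dict mapping region name -> regionserver name
    online_regions: dict mapping regionserver name -> set of region names
    """
    orphaned = []
    for region, server in sorted(master_assignments.items()):
        # The server either is unknown or does not list the region as online.
        if region not in online_regions.get(server, set()):
            orphaned.append((region, server))
    return orphaned


# The master believes region R lives on regionserver X ...
master_assignments = {"R": "X", "S": "Y"}
# ... but X's "Online Regions" list does not contain R.
online_regions = {"X": set(), "Y": {"S"}}

print(find_orphaned_regions(master_assignments, online_regions))
```

With the sample data this reports region R on server X, which is exactly the state in which a request routed to X would fail.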
Has anybody run into this kind of problem? Is there any remedy for it
apart from adding more power to the cluster so the regionservers do not fail?
It is acceptable if the system becomes unavailable for some time when it
gets stressed too much, but it should not lose data.
Thanks a lot!
--Kirill