Hi,
I ran into this situation during a recent installation and thought it
might be useful to others who hit a similar situation in the future.
This isn't the only way to recover, but it is one option and it proved
to work as expected.
Regards,
Dennis
During a recent Trafodion cluster install, the daily build was broken in
such a way that much of the installation proceeded, but the Trafodion files
were not copied to each node. This system was using CDH, but I assume the
same thing would happen on HDP as well. After HBase was restarted as part
of the installation, I noticed the HBase icon was red. This will likely
not look its best in plain text, but the hbase:meta entry showed (in a red
box):
Region:        1588230740
State:         hbase:meta,,1.1588230740 state=FAILED_OPEN, ts=Mon Mar 07 07:19:00 UTC 2016 (1289s ago), server=perf-sles-2.novalocal,60020,1457335120507
RIT time (ms): 1289706
The log file of the Region Server that was assigned the hbase:meta table
contained this output:
2016-03-07 16:45:27,243 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
2016-03-07 16:45:27,249 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=hbase:meta,,1.1588230740, starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:5486)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5793)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5765)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5721)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5672)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:356)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:126)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:5475)
    ... 10 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
    ... 11 more
2016-03-07 16:45:27,250 INFO org.apache.hadoop.hbase.coordination.ZkOpenRegionCoordination: Opening of region {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 115
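The key detail in that output is the class named in the ClassNotFoundException. As a rough aid for spotting it in a long Region Server log, here is a minimal Python sketch. The function name missing_classes is my own (hypothetical), and the pattern only assumes the "Class <name> not found" wording seen in the output above; it tolerates line wrapping inside the message.

```python
import re

# Matches the "ClassNotFoundException: Class <name> not found" wording
# from the log above. \s+ also matches newlines, so a message that was
# wrapped across lines is still recognized.
_CNF_PATTERN = re.compile(
    r"ClassNotFoundException:\s+Class\s+([\w.$]+)\s+not\s+found"
)


def missing_classes(log_text):
    """Return the distinct class names reported as not found, in order seen."""
    seen = []
    for name in _CNF_PATTERN.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Fed the log text above, this would list the TransactionalRegion class once, even though the exception chain mentions it twice.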
Our installer expert confirmed that the issue was in fact that the
needed files had not been copied to each node. At that point one option
would be to re-install the previous build, or at least undo the changes made
to point to the new build. I did not try that, and I'll leave that fallback
option as a separate topic.
Instead, I tried to see whether I could get HBase to come up successfully
without properly completing the new Trafodion installation. To do that,
two HBase properties have to be reset:
. hbase.coprocessor.region.classes
. hbase.hregion.impl
I actually deleted all of the properties listed under hbase-site.xml
that Cloudera Manager showed as non-default values, but I assume only the
hbase.hregion.impl property had to be removed. Remember to save the
configuration and to remove both sets of properties. I forgot to do both of
those, and each time the restart hit the same basic error.
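Since a forgotten save or a leftover property leads to the same error on restart, it may help to check a copy of hbase-site.xml before restarting. The following is a minimal Python sketch, assuming the standard Hadoop configuration layout (property elements with name and value children). The function name trafodion_overrides and the sample text are hypothetical; the hbase.hregion.impl value is the class named in the log above, while the coprocessor class and the rootdir value are placeholders.

```python
import xml.etree.ElementTree as ET

# The two properties named above that have to be reset.
PROPERTIES_TO_RESET = (
    "hbase.coprocessor.region.classes",
    "hbase.hregion.impl",
)

# Hypothetical sample. Only the hbase.hregion.impl value comes from the
# log above; the other values are placeholders for illustration.
SAMPLE_HBASE_SITE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://example/hbase</value>
  </property>
  <property>
    <name>hbase.hregion.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion</value>
  </property>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>example.PlaceholderCoprocessor</value>
  </property>
</configuration>"""


def trafodion_overrides(xml_text):
    """Return {name: value} for each property to reset that is still set."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in PROPERTIES_TO_RESET:
            found[name] = (prop.findtext("value") or "").strip()
    return found
```

An empty result for the deployed file would suggest both properties are gone; anything else means the restart is likely to hit the same error.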
Once the configuration is properly updated, the restart succeeds, and once
the Region Server can open the hbase:meta table, all the other regions can
be opened as well. However, without Trafodion running, I would assume none
of the Trafodion tables should be acted upon. The point of this exercise
was to prove that HBase could be restarted and kept running, so that when
the Trafodion installation was started it would have a viable
Cloudera/HBase/HDFS environment to act on.