Hi all,

I have filed a JIRA for the installer based on this issue:
https://issues.apache.org/jira/browse/TRAFODION-1884

Thanks!

On Wed, Mar 9, 2016 at 8:41 AM, Liu, Ming (Ming) <[email protected]> wrote:

> Thanks, Dennis, for sharing this. We saw this issue during an expansion
> of Trafodion from 4 nodes to 5 nodes. Since the newly added node was
> empty, the META region would not be on it, so it did no harm. But the
> problem is similar: the newly added RS cannot work until we install
> Trafodion on that RS node.
>
> There are two related JIRAs: TRAFODION-1729 and TRAFODION-1730.
> We are working on them to solve the issue. Because Trafodion currently
> modifies the HBase server's hbase-site.xml to add coprocessors, the
> change affects *ALL* regions in HBase, including the META region. This is
> unnecessary and undesirable. The META region definitely does not need to
> load Trafodion coprocessors: it is a system region, Trafodion never needs
> to access it directly, and if it fails to open, the whole HBase system
> cannot work.
> Once that is fully addressed, we can remove the hbase-site.xml
> modification from the Trafodion installer, and there will be no need to
> restart HBase. In a proper installation, Trafodion should be installed on
> all RS nodes, so the coprocessor jar files should be copied to all RS
> nodes. If Trafodion is not installed on all RS nodes, there may still be
> issues; I assume the installer still needs to consider this. A better
> approach might be to store the coprocessor jar file on HDFS, but that is
> just a theory and needs further study.
>
> Thanks,
> Ming
>
> -----Original Message-----
> From: D. Markt [mailto:[email protected]]
> Sent: March 9, 2016 15:23
> To: [email protected]
> Subject: A failed Trafodion installation can lead to the hbase:meta table
> staying in the FAILED_OPEN state.
>
> Hi,
>
> I ran into this situation during a recent installation and thought it
> might be useful to share in case others hit a similar situation in the
> future. This isn't the only way to recover from the situation, but it is
> one option and was proven to work as expected.
>
> Regards,
> Dennis
>
> During a recent Trafodion cluster install the daily build was broken in
> such a way that much of the installation proceeded, but the Trafodion
> files were not copied to each node. This system was using CDH but I
> assume the following would happen for HDP as well. After HBase was
> restarted as part of the installation I noticed the HBase icon was red.
> I know this will likely not look the best in plain text, but the
> hbase:meta showed (in a red box):
>
> Region:        1588230740
> State:         hbase:meta,,1.1588230740 state=FAILED_OPEN, ts=Mon Mar 07 07:19:00 UTC 2016 (1289s ago), server=perf-sles-2.novalocal,60020,1457335120507
> RIT time (ms): 1289706
>
> Looking at the log file of the Region Server that was assigned the
> hbase:meta table, there was this output:
>
> 2016-03-07 16:45:27,243 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
> 2016-03-07 16:45:27,249 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=hbase:meta,,1.1588230740, starting to roll back the global memstore size.
> java.lang.IllegalStateException: Could not instantiate a region instance.
>     at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:5486)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5793)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5765)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5721)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5672)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:356)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:126)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
>     at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:5475)
>     ... 10 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
>     ... 11 more
> 2016-03-07 16:45:27,250 INFO org.apache.hadoop.hbase.coordination.ZkOpenRegionCoordination: Opening of region {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 115
>
> After consulting with our installer expert, the issue was in fact that
> the needed files had not been copied to each node. At that point one
> option would be to re-install the previous build or at least undo the
> changes made to point to the new build. I did not try that and I'll leave
> that fallback option as a separate topic.
>
> Instead, I took the path to see if I could get HBase to successfully
> come up without getting the new Trafodion installation properly
> completed. To do that there are two HBase properties that have to be
> reset:
>
> . hbase.coprocessor.region.classes
> . hbase.hregion.impl
>
> I actually deleted all of the properties listed under the hbase-site.xml
> that showed as non-default values by Cloudera Manager but I assume only
> the hbase.hregion.impl property had to be removed. Remember to save the
> configuration and remove both sets of properties. I forgot to do both of
> those and each time the restart hit the same basic error.
>
> Once the configuration is properly updated the restart will be
> successful and after the hbase:meta table can be opened by the Region
> Server, all the other regions will also be able to be opened. However,
> without Trafodion running I would assume none of the Trafodion tables
> should be acted upon.
> This exercise was to prove HBase could be restarted and running so that
> when the Trafodion installation was started it would have a viable
> Cloudera/HBase/HDFS environment to act on.

--
Thanks,
Amanda Moran
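
As a rough aid to the recovery described in the quoted message, the sketch below checks whether a copy of hbase-site.xml still sets either of the two properties Dennis lists. It is only an illustration under stated assumptions: the function name and the sample XML are hypothetical, the hbase.hregion.impl value is taken from the class named in the stack trace, and the coprocessor class name is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

# The two properties the quoted message says had to be reset.
PROPERTIES_TO_RESET = (
    "hbase.coprocessor.region.classes",
    "hbase.hregion.impl",
)


def find_properties_to_reset(site_xml_text):
    """Return {name: value} for each listed property present in the XML text."""
    root = ET.fromstring(site_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in PROPERTIES_TO_RESET:
            found[name] = (prop.findtext("value") or "").strip()
    return found


# Illustrative input only. The hbase.hregion.impl value is the class named in
# the stack trace; the coprocessor class name is a made-up placeholder.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.hregion.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion</value>
  </property>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>example.HypotheticalCoprocessor</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>/hbase</value>
  </property>
</configuration>
"""

# Report every listed property that is still set in the sample.
for name, value in sorted(find_properties_to_reset(SAMPLE_SITE_XML).items()):
    print(f"still set: {name} = {value}")
```

The check only reads the file. On a Cloudera-managed cluster the properties themselves would still be removed and saved through Cloudera Manager, as the message describes, before restarting HBase.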
