It happens about 10% of the time for me. I'll try to grab a jstack the next
time I see it happen and note what is preventing the graceful shutdown.
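
For context on what to look for in such a dump: a JVM only exits on its own
once every non-daemon thread has finished, so the interesting entries are the
live threads that are not marked as daemon. The sketch below is only an
illustration of that check from inside a JVM, using the standard
Thread.getAllStackTraces() call. It is not HBase code, the class name
NonDaemonThreads and the method lingering() are hypothetical, and it assumes
the hang comes from a lingering non-daemon thread, which has not been
confirmed in this thread. jstack gives the same information for an
external process.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase or Hadoop.
public class NonDaemonThreads {

    /** Returns live non-daemon threads in this JVM, excluding the calling thread. */
    static List<Thread> lingering() {
        List<Thread> result = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print each candidate with its current stack, roughly like a jstack entry.
        for (Thread t : lingering()) {
            System.out.println(t.getName() + " state=" + t.getState());
            for (StackTraceElement frame : t.getStackTrace()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Any thread this prints after the region server has logged "exiting" would be
a candidate for whatever is holding the process open.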

On Fri, Jan 5, 2018 at 12:18 PM, Ted Yu <[email protected]> wrote:

> Mike:
> Do you still see the hang from the region server?
>
> I recently successfully shut down a 5-node cluster running on top of hadoop
> 3 (after completing LoadTestTool) based on this commit:
>
> HBASE-19667 Get rid of MasterEnvironment#supportGroupCPs
>
> FYI
>
> On Wed, Jan 3, 2018 at 3:23 PM, Sergey Soldatov <[email protected]>
> wrote:
>
> > It may be something else that prevents HBase from shutting down properly.
> > The metrics system is automatically stopped and restarted by
> > JmxCacheBuster, so it may be irrelevant to your problem.
> >
> > Thanks,
> > Sergey
> >
> > On Wed, Jan 3, 2018 at 1:37 PM, Mike Drob <[email protected]> wrote:
> >
> > > Hi folks,
> > >
> > > I've been seeing intermittently on one of my testbeds that HBase fails
> > > to stop. I suspect that this somehow involves the Hadoop Metrics system
> > > having a thread that isn't getting cleaned up. This is on a fork of
> > > branch-2 testing against hadoop-3 (although not with the RC
> > > specifically). Here's a log excerpt from the shutdown sequence:
> > >
> > >
> > > 2017-12-31 01:44:07,308 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server bwiskf-2,22101,1514712293690; zookeeper connection closed.
> > > 2017-12-31 01:44:07,308 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/bwiskf-2/x.x.x.x:22101 exiting
> > > 2017-12-31 01:44:07,309 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> > > 2017-12-31 01:44:08,521 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver/bwiskf-2/x.x.x.x:22101.leaseChecker closing leases
> > > 2017-12-31 01:44:08,521 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver/bwiskf-2/x.x.x.x:22101.leaseChecker closed leases
> > > 2017-12-31 01:44:09,290 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping HBase metrics system...
> > > 2017-12-31 01:44:09,290 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: HBase metrics system stopped.
> > > 2017-12-31 01:44:09,792 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> > > 2017-12-31 01:44:09,798 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> > > 2017-12-31 01:44:09,798 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: HBase metrics system started
> > > 2017-12-31 01:45:21,905 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping HBase metrics system...
> > > 2017-12-31 01:45:21,906 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: HBase metrics system stopped.
> > > 2017-12-31 01:45:22,408 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> > > 2017-12-31 01:45:22,409 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> > > 2017-12-31 01:45:22,409 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: HBase metrics system started
> > > 2017-12-31 01:47:12,137 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@336880df
> > > 2017-12-31 01:47:12,138 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
> > > 2017-12-31 01:47:12,139 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
> > >
> > > The first shutdown was a graceful shutdown request that started around
> > > 1:43; the one at 1:47 was a shutdown via a process kill request. The
> > > metrics system looks like it starts up again every time it is stopped.
> > > I'm digging through the internals of HRegionServer trying to figure out
> > > where we interface with it, but I'm not having much luck finding the
> > > cleanup steps.
> > >
> > > Mike
> > >
> >
>
