[ 
https://issues.apache.org/jira/browse/HBASE-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721953#comment-13721953
 ] 

Jean-Marc Spaggiari commented on HBASE-8974:
--------------------------------------------

I tailed the RS log over a rolling restart, and it shows only one restart:
{code}
dimanche 28 juillet 2013, 09:17:02 (UTC-0400) Terminating regionserver
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server 
on 60020
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC 
Server listener on 60020
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 5 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC 
Server Responder
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC 
Server Responder
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 2 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 0 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 1 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 9 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 9 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 6 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 4 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 0 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 2 on 60020: exiting
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 3 on 60020: exiting
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 0 on 60020: exiting
2013-07-28 09:17:02,208 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 1 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 2 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 8 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 1 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 7 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 6 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 4 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 3 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 7 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.mortbay.log: Stopped 
[email protected]:60030
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 8 on 60020: exiting
2013-07-28 09:17:02,209 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 5 on 60020: exiting
2013-07-28 09:17:02,312 INFO 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
Closed zookeeper sessionid=0x3400251e47305dc
dimanche 28 juillet 2013, 09:17:03 (UTC-0400) Starting regionserver on node3
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 93921
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32768
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 93921
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
2013-07-28 09:17:03,676 INFO org.apache.hadoop.hbase.util.VersionInfo: HBase 
0.94.10
2013-07-28 09:17:03,676 INFO org.apache.hadoop.hbase.util.VersionInfo: 
Subversion https://svn.apache.org/repos/asf/hbase/tags/0.94.10RC0 -r 1504995
2013-07-28 09:17:03,676 INFO org.apache.hadoop.hbase.util.VersionInfo: Compiled 
by jenkins on Fri Jul 19 20:24:16 UTC 2013
2013-07-28 09:17:03,778 INFO org.apache.hadoop.hbase.util.ServerCommandLine: 
vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Oracle Corporation, 
vmVersion=23.1-b03
2013-07-28 09:17:03,778 INFO org.apache.hadoop.hbase.util.ServerCommandLine: 
vmInputArguments=[-XX:OnOutOfMemoryError=kill -9 %p, -Xmx6196m, 
-XX:+UseConcMarkSweepGC, -XX:+UseConcMarkSweepGC, 
-Dhbase.log.dir=/home/hbase/hbase-0.94.3/bin/../logs, 
-Dhbase.log.file=hbase-hbase-regionserver-node3.log, 
-Dhbase.home.dir=/home/hbase/hbase-0.94.3/bin/.., -Dhbase.id.str=hbase, 
-Dhbase.root.logger=INFO,DRFA, 
-Djava.library.path=/home/hbase/hbase-0.94.3/bin/../lib/native/Linux-amd64-64, 
-Dhbase.security.logger=INFO,DRFAS]
2013-07-28 09:17:03,998 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:03,998 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:03,999 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:03,999 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,000 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,000 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,001 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,002 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,002 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,002 INFO org.apache.hadoop.ipc.HBaseServer: Starting 
Thread-0
2013-07-28 09:17:04,009 INFO org.apache.hadoop.hbase.ipc.HBaseRpcMetrics: 
Initializing RPC Metrics with hostName=HRegionServer, port=60020
2013-07-28 09:17:04,106 INFO org.apache.hadoop.hbase.io.hfile.CacheConfig: 
Allocating LruBlockCache with maximum size 2,4g
2013-07-28 09:17:04,316 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem 
doesn't support getDefaultBlockSize
2013-07-28 09:17:04,329 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem 
doesn't support getDefaultReplication
2013-07-28 09:17:04,339 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem 
doesn't support getDefaultReplication
2013-07-28 09:17:04,339 INFO org.apache.hadoop.hbase.util.FSUtils: FileSystem 
doesn't support getDefaultBlockSize
2013-07-28 09:17:04,393 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
Initializing JVM Metrics with processName=RegionServer, 
sessionId=regionserver60020
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: revision
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: hdfsUser
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: hdfsDate
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: hdfsUrl
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: date
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: hdfsRevision
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: user
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: hdfsVersion
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: url
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: MetricsString 
added: version
2013-07-28 09:17:04,413 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2013-07-28 09:17:04,414 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2013-07-28 09:17:04,444 INFO org.mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-07-28 09:17:04,476 INFO org.apache.hadoop.http.HttpServer: Added global 
filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-07-28 09:17:04,480 INFO org.apache.hadoop.http.HttpServer: Port returned 
by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
listener on 60030
2013-07-28 09:17:04,480 INFO org.apache.hadoop.http.HttpServer: 
listener.getLocalPort() returned 60030 
webServer.getConnectors()[0].getLocalPort() returned 60030
2013-07-28 09:17:04,480 INFO org.apache.hadoop.http.HttpServer: Jetty bound to 
port 60030
2013-07-28 09:17:04,480 INFO org.mortbay.log: jetty-6.1.26
2013-07-28 09:17:04,750 INFO org.mortbay.log: Started 
[email protected]:60030
2013-07-28 09:17:04,751 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
Responder: starting
2013-07-28 09:17:04,754 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
listener on 60020: starting
2013-07-28 09:17:04,767 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 0 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 1 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 2 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 3 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 4 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 5 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 6 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 7 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 8 on 60020: starting
2013-07-28 09:17:04,768 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server 
handler 9 on 60020: starting
2013-07-28 09:17:04,769 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 0 on 60020: starting
2013-07-28 09:17:04,769 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 1 on 60020: starting
2013-07-28 09:17:04,769 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 2 on 60020: starting
2013-07-28 09:17:04,769 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 3 on 60020: starting
2013-07-28 09:17:04,769 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 4 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 5 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 6 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 7 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 8 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server 
handler 9 on 60020: starting
2013-07-28 09:17:04,770 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 0 on 60020: starting
2013-07-28 09:17:04,775 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 1 on 60020: starting
2013-07-28 09:17:04,775 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server 
handler 2 on 60020: starting
2013-07-28 09:17:07,197 ERROR 
org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics: Inconsistent 
configuration. Previous configuration for using table name in metrics: true, 
new configuration: false
2013-07-28 09:17:07,202 INFO org.apache.hadoop.hbase.util.ChecksumType: 
Checksum can use java.util.zip.CRC32
2013-07-28 09:17:28,700 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: 
Snappy native library is available
2013-07-28 09:17:28,701 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded 
the native-hadoop library
2013-07-28 09:17:28,701 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: 
Snappy native library loaded
2013-07-28 09:17:28,702 INFO org.apache.hadoop.io.compress.CodecPool: Got 
brand-new compressor
2013-07-28 09:17:28,715 INFO org.apache.hadoop.io.compress.CodecPool: Got 
brand-new decompressor
2013-07-28 09:17:31,776 INFO org.apache.hadoop.io.compress.CodecPool: Got 
brand-new decompressor
{code}

That's all I got over the entire rolling restart. So from the RS side, it 
seems that it was not restarted more than once.

[~ndimiduk], can you take a look at your RS logs too, to see whether they 
match what you are seeing?
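
If it helps to check this without reading the whole log by eye, here is a minimal sketch that counts the stop and start marker lines in a region server log. It assumes the log contains "Terminating regionserver" and "Starting regionserver" lines like the ones in the excerpt above; the function names are made up for illustration, and other versions or setups may word the markers differently.

```python
# Marker text as it appears in the log excerpt above. This is an assumption:
# other HBase versions or wrapper scripts may use different wording.
STOP_MARKER = "Terminating regionserver"
START_MARKER = "Starting regionserver"


def count_restarts(lines):
    """Return (stops, starts) counted over an iterable of log lines."""
    stops = 0
    starts = 0
    for line in lines:
        if STOP_MARKER in line:
            stops += 1
        elif START_MARKER in line:
            starts += 1
    return stops, starts


def count_restarts_in_file(path):
    """Count the markers in the log file at `path` (hypothetical helper)."""
    with open(path, errors="replace") as log:
        return count_restarts(log)
```

Running `count_restarts_in_file` on each region server's log after one pass of the script should give one stop and one start per server if each server was restarted only once.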


                
> bin/rolling-restart.sh restarts all active RS's with each iteration instead 
> of one at a time
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8974
>                 URL: https://issues.apache.org/jira/browse/HBASE-8974
>             Project: HBase
>          Issue Type: Bug
>          Components: scripts
>            Reporter: Nick Dimiduk
>
> I'm exercising the patch over on HBASE-8803 and I've noticed something in the 
> logs: it looks like {{rolling-restart.sh}} is restarting all the region 
> servers multiple times instead of just the current entry in the loop 
> iteration.
> The logic looks like this:
> {noformat}
> for each rs in active region server list:
>   unload $rs // move all regions to other RS's
>   restart all Region Servers // !?! bug?
>   reload $rs // pile 'em back on
> {noformat}
> Shouldn't that step 2 be only {{restart $rs}}?
> This is what I see in the logs. My cluster has 9 active RegionServers. Notice 
> the bit in the middle where all 9 are stopped and started again after 
> unloading the target RS.
> {noformat}
> $ time /usr/lib/hbase/bin/rolling-restart.sh --rs-only --graceful --maxthreads 30
> Gracefully restarting: hor18n39.gq1.ygridcore.net
> Disabling balancer!
> ...
> Unloading hor18n39.gq1.ygridcore.net region(s)
> ...
> Valid region move targets: 
> hor18n37.gq1.ygridcore.net,60020,1374094975268
> hor17n37.gq1.ygridcore.net,60020,1374094975264
> hor18n35.gq1.ygridcore.net,60020,1374094975327
> hor17n39.gq1.ygridcore.net,60020,1374094975281
> hor18n36.gq1.ygridcore.net,60020,1374094975254
> hor17n36.gq1.ygridcore.net,60020,1374094975277
> hor17n34.gq1.ygridcore.net,60020,1374094975291
> hor18n38.gq1.ygridcore.net,60020,1374094975259
> 13/07/17 21:44:38 INFO region_mover: Moving 330 region(s) from 
> hor18n39.gq1.ygridcore.net,60020,1374094975326 during this cycle
> 13/07/17 21:44:38 INFO region_mover: Moving region 
> b59050cf97aabcef838e3c50e93e6d13 (1 of 330) to 
> server=hor18n37.gq1.ygridcore.net,60020,1374094975268
> ...
> 13/07/17 21:54:20 INFO region_mover: Moving region 
> d00026d7cc396bb3e6ea91106cc6ab55 (329 of 330) to 
> server=hor18n37.gq1.ygridcore.net,60020,1374094975268
> 13/07/17 21:54:20 INFO region_mover: Moving region 
> a722179b33e6ece8c9cee3fba3056acd (330 of 330) to 
> server=hor17n37.gq1.ygridcore.net,60020,1374094975264
> 13/07/17 21:54:21 INFO region_mover: Wrote list of moved regions to 
> /tmp/hor18n39.gq1.ygridcore.net
> Unloaded hor18n39.gq1.ygridcore.net region(s)
> hor18n35.gq1.ygridcore.net: stopping regionserver.
> hor17n39.gq1.ygridcore.net: stopping regionserver.
> hor18n36.gq1.ygridcore.net: stopping regionserver.
> hor17n37.gq1.ygridcore.net: stopping regionserver.
> hor17n34.gq1.ygridcore.net: stopping regionserver.
> hor18n38.gq1.ygridcore.net: stopping regionserver.
> hor18n37.gq1.ygridcore.net: stopping regionserver.
> hor17n36.gq1.ygridcore.net: stopping regionserver.
> hor18n39.gq1.ygridcore.net: stopping regionserver.
> hor18n36.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor18n36.gq1.ygridcore.net.out
> hor17n36.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor17n36.gq1.ygridcore.net.out
> hor17n37.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor17n37.gq1.ygridcore.net.out
> hor18n37.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor18n37.gq1.ygridcore.net.out
> hor18n38.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor18n38.gq1.ygridcore.net.out
> hor17n34.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor17n34.gq1.ygridcore.net.out
> hor18n35.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor18n35.gq1.ygridcore.net.out
> hor18n39.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor18n39.gq1.ygridcore.net.out
> hor17n39.gq1.ygridcore.net: starting regionserver, logging to 
> /grid/0/var/log/hbase/hbase-hbase-regionserver-hor17n39.gq1.ygridcore.net.out
> Reloading hor18n39.gq1.ygridcore.net region(s)
> ...
> 13/07/17 21:54:27 INFO region_mover: Moving 330 regions to 
> hor18n39.gq1.ygridcore.net,60020,1374098064602
> 13/07/17 21:56:47 INFO region_mover: Moving region 
> 7d0a02f452c334a12026b45346a87d36 (1 of 330) to 
> server=hor18n39.gq1.ygridcore.net,60020,1374098064602 in thread 0
> 13/07/17 21:56:54 INFO region_mover: Moving region 
> af5448c90e78a8f0d935efb0b380502e (2 of 330) to 
> server=hor18n39.gq1.ygridcore.net,60020,1374098064602 in thread 1
> ...
> {noformat}
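
For reference, the behaviour the description asks for (restart only the current server in each iteration) could be sketched as below. This is only an illustration of the intended loop, not the actual script: `unload`, `restart` and `reload` are hypothetical callables standing in for whatever the script really invokes.

```python
def rolling_restart(servers, unload, restart, reload):
    """Restart region servers one at a time.

    `unload`, `restart` and `reload` are hypothetical callables that each
    take a single server name; they are placeholders for illustration.
    """
    for rs in servers:
        unload(rs)   # move all regions to the other region servers
        restart(rs)  # restart only the current server, not the whole list
        reload(rs)   # move the regions back
```

With this shape, each server is stopped and started exactly once per run, which is what a single-restart log on the RS side would be consistent with.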

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
