[ 
https://issues.apache.org/jira/browse/HBASE-24243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103591#comment-17103591
 ] 

Rushabh Shah edited comment on HBASE-24243 at 5/10/20, 2:08 AM:
----------------------------------------------------------------

[~dinesh4747] I don't see any exception message in either the master or the 
regionserver logs that you added to the description. Without additional 
information (such as thread dumps of both services and more logs), it will be 
difficult to help. Thank you!
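
A minimal sketch of how such thread dumps might be captured, assuming a JDK with jps and jstack on the PATH and that the daemons show up under their usual main class names (HMaster, HRegionServer). The helper names pid_of and dump_threads are hypothetical and not part of HBase.

```shell
#!/usr/bin/env bash
# Pick the PID of a named JVM from `jps`-style output ("PID MainClass" per line).
pid_of() { awk -v name="$1" '$2 == name {print $1; exit}'; }

# Write a thread dump of the named daemon (HMaster or HRegionServer) to a file.
dump_threads() {
  local name="$1" pid
  pid=$(jps | pid_of "$name")
  if [ -z "$pid" ]; then
    echo "no running $name found" >&2
    return 1
  fi
  jstack "$pid" > "${name}-threads-$(date +%Y%m%d-%H%M%S).txt"
}

# Example (not executed here): run on the region server node,
# as the user that owns the process.
# dump_threads HRegionServer
```

Taking two or three dumps a few seconds apart usually makes it easier to see which threads are stuck.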


was (Author: shahrs87):
[~dinesh4747] I don't see any exception message in either master or 
regionserver logs that you added in description. Without any additional 
information (like thread dump of both services), it would be difficult to help. 
Thank you !

> Unable to start HRegionserver and Master node considers as a dead region
> ------------------------------------------------------------------------
>
>                 Key: HBASE-24243
>                 URL: https://issues.apache.org/jira/browse/HBASE-24243
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: regionserver
>            Reporter: Dinesh Nithyanandam
>            Priority: Blocker
>         Attachments: site.xml
>
>
> Hi Team,
> I am currently using Apache HBase version 1.3.6. I am trying to run the 
> Master and the region server on separate nodes and have the region server 
> join the cluster dynamically, but the region server does not start and hangs 
> at "*The RegionServer is initializing*!"
> Commands used (Master and region server are on separate nodes):
> Node A - HBase Master - */opt/hbase/bin/hbase-daemon.sh --config 
> /usr/local/bin/hbase/conf start master*
> Node B - HBase Region server - */opt/hbase/bin/hbase-daemon.sh --config 
> /usr/local/bin/hbase/conf start regionserver*
> *{color:#ff0000}Please advise whether the above commands are the right way 
> to start the HBase master and region server.{color}*
> Environment - *Google Compute Engine (GCE) Instance groups/VMs*
> OS Type - *CentOS 7*
> Master running ports - *16000/tcp 16010/web*
> Region server running ports - *16020/tcp* *16030/web*
> I am also not sure how to enable reverse DNS across both machines, or 
> whether that is the problem. Please advise on how to achieve it.
> *Master logs:*
> The master logs below show the master connecting to the region server and 
> the region server's client connection to the master eventually being 
> disconnected:
>  * *{color:#ff0000}"{color}{color:#ff0000}*DEBUG 
> [RpcServer.reader=1,bindAddress=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,port=16000]
>  ipc.RpcServer: RpcServer.listener,port=16000: DISCONNECTING client 
> 10.148.6.13:45732 because read count=-1. Number of active connections: 
> 1*{color}"*
> *complete logs*
> 2020-04-22 19:38:24,812 DEBUG [RpcServer.listener,port=16000] ipc.RpcServer: 
> RpcServer.listener,port=16000: connection from 10.148.6.13:45732; # active 
> connections: 1
>  2020-04-22 19:38:24,961 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16000] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16000: callId: 0 service: 
> RegionServerStatusService methodName: RegionServerStartup size: 47 
> connection: 10.148.6.13:45732
>  2020-04-22 19:38:30,591 DEBUG 
> [*pinpoint-master-v000-rh5k:16000*.activeMasterManager] ipc.RpcClientImpl: 
> Connecting to 
> *pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020*
>  2020-04-22 19:38:31,268 *DEBUG [hconnection-0x5f02b9cb-shared--pool3-t1] 
> ipc.RpcClientImpl: Connecting to 
> pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020*
>  2020-04-22 19:38:31,478 DEBUG [ProcedureExecutor-3] ipc.RpcClientImpl: 
> Connecting to 
> pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020
>  2020-04-22 19:39:32,714 *DEBUG 
> [RpcServer.reader=1,bindAddress=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,port=16000]
>  ipc.RpcServer: RpcServer.listener,port=16000: DISCONNECTING client 
> 10.148.6.13:45732 because read count=-1. Number of active connections: 1*
>  
> *Region server logs:*
> The logs below show that the region server discovers the master on its own 
> but is unable to join the cluster:
> ===============================================================
>  
> *{color:#ff0000}2020-04-22 19:38:24,675 INFO 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  regionserver.HRegionServer: reportForDuty to 
> master=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,16000{color}*,1587584303253
>  with port=16020, startcode=1587583634667
>  2020-04-22 19:38:24,801 DEBUG 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  ipc.RpcClientImpl: Connecting to 
> pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal/10.148.6.154:16000
>  2020-04-22 19:38:28,005 INFO 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  regionserver.HRegionServer: reportForDuty to 
> master=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,16000,1587584303253
>  with port=16020, startcode=1587583634667
>  2020-04-22 19:38:28,033 INFO 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  regionserver.HRegionServer: Config from master: 
> hbase.rootdir=hdfs://10.148.6.68:9000/hbase
>  2020-04-22 19:38:28,033 INFO 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  regionserver.HRegionServer: Config from master: 
> fs.defaultFS=hdfs://10.148.6.68:9000
>  2020-04-22 19:38:28,033 INFO 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  regionserver.HRegionServer: Config from master: hbase.master.info.port=16010
> ===============================================================
>  
> 2020-04-22 19:38:24,801 DEBUG 
> [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020]
>  ipc.RpcClientImpl: Connecting to 
> pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal/10.148.6.154:16000
>  2020-04-22 19:38:30,592 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.154:53050; # active 
> connections: 1
>  2020-04-22 19:38:31,269 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.154:53052; # active 
> connections: 2
>  2020-04-22 19:38:31,479 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.154:53056; # active 
> connections: 3
>  2020-04-22 19:39:32,413 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 3 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,440 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 4 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,443 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 5 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,445 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 6 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,447 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 7 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,450 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 8 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,452 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 9 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,454 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 10 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,456 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 11 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
>  2020-04-22 19:39:32,458 DEBUG 
> [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 12 
> service: AdminService methodName: OpenRegion size: 81 connection: 
> 10.148.6.154:53050
> ===============================================================
> 2020-04-23 04:40:07,751 DEBUG 
> [RpcServer.reader=3,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020]
>  ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 
> 10.148.6.13:44272 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:17,751 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.13:44280; # active 
> connections: 1
>  2020-04-23 04:40:17,752 DEBUG 
> [RpcServer.reader=4,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020]
>  ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 
> 10.148.6.13:44280 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:27,752 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.13:44282; # active 
> connections: 1
>  2020-04-23 04:40:27,752 DEBUG 
> [RpcServer.reader=5,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020]
>  ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 
> 10.148.6.13:44282 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:37,752 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: 
> RpcServer.listener,port=16020: connection from 10.148.6.13:44284; # active 
> connections: 1
>   
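
On the reverse DNS question in the description, a quick round-trip check might look like the sketch below. It assumes getent is available (as on CentOS 7) and uses hypothetical helper names (first_addr, first_name, check_dns). It only reports what the resolver returns and does not change any configuration.

```shell
#!/usr/bin/env bash
# Parse one line of `getent hosts` output: "ADDRESS NAME [ALIAS...]".
first_addr() { awk 'NR == 1 {print $1}'; }
first_name() { awk 'NR == 1 {print $2}'; }

# Forward-resolve a hostname, reverse-resolve the resulting address,
# and succeed only if the reverse lookup returns the same name.
check_dns() {
  local host="$1" addr back
  addr=$(getent hosts "$host" | first_addr)
  if [ -z "$addr" ]; then
    echo "no forward record for $host"
    return 1
  fi
  back=$(getent hosts "$addr" | first_name)
  echo "$host -> $addr -> ${back:-<no reverse record>}"
  [ "$back" = "$host" ]
}

# Example (not executed here): run on both nodes, once per peer hostname.
# check_dns pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal
```

If the forward and reverse names differ on either node, that mismatch would be worth including alongside the logs.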



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
