Ibrar Ahmed created HBASE-27908:
-----------------------------------
Summary: Can't get connection to ZooKeeper
Key: HBASE-27908
URL: https://issues.apache.org/jira/browse/HBASE-27908
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 1.4.13
Reporter: Ibrar Ahmed
I am using an HBase cluster together with Apache Kylin. The connection between
the edge node and the HBase cluster is working.
The following log from the Kylin side shows the exception:
{code:java}
java.net.SocketTimeoutException: callTimeout=1200000, callDuration=1275361: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:178)
    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4551)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:561)
    at org.apache.hadoop.hbase.client.HTable.getTableDescriptor(HTable.java:585)
    at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3.configureIncrementalLoad(HFileOutputFormat3.java:328)
    at org.apache.kylin.storage.hbase.steps.CubeHFileJob.run(CubeHFileJob.java:101)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:144)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1618)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1638)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1795)
    at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:140)
    ... 13 more
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:971)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$400(ConnectionManager.java:566)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1567)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1609)
    ... 17 more
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1111)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:425)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:960)
    ... 20 more
{code}
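Since the innermost cause in the trace is a ZooKeeper ConnectionLoss for /hbase, one way to double-check reachability from the edge node is to probe each ZooKeeper server directly. The following is only a minimal sketch using plain sockets and ZooKeeper's "ruok" four-letter command, which a healthy server answers with "imok". The class name ZkProbe and the helper parseQuorum are hypothetical, and the default port 2181 is an assumption based on the listener address in the server log. On ZooKeeper 3.5 and later the command may have to be enabled through 4lw.commands.whitelist.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase or Kylin.
class ZkProbe {

    /** Splits "host1:2181,host2" into {host, port} pairs; entries without a port get defaultPort. */
    static List<String[]> parseQuorum(String quorum, int defaultPort) {
        List<String[]> servers = new ArrayList<>();
        for (String entry : quorum.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            String port = colon < 0 ? String.valueOf(defaultPort) : trimmed.substring(colon + 1);
            servers.add(new String[] {host, port});
        }
        return servers;
    }

    /** Sends "ruok" and returns whatever the server replies before closing the socket. */
    static String ruok(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[64];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) {
        // Pass the same value the client uses for hbase.zookeeper.quorum.
        String quorum = args.length > 0 ? args[0] : "localhost:2181";
        for (String[] server : parseQuorum(quorum, 2181)) {
            String label = server[0] + ":" + server[1];
            try {
                System.out.println(label + " -> " + ruok(server[0], Integer.parseInt(server[1]), 5000));
            } catch (IOException e) {
                System.out.println(label + " -> FAILED: " + e);
            }
        }
    }
}
```

Running this on the Kylin edge node with the quorum string from the client-side hbase-site.xml shows, per server, whether a TCP connection and a basic ZooKeeper reply are possible. A failure here would point at networking or quorum configuration rather than at HBase itself.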
The following log is from the HBase cluster master node, which accepts the
connection from the edge node (Kylin):
{code:java}
2023-06-05 10:00:30,336 [myid:0] - INFO [CommitProcessor:0:NIOServerCnxn@1056] - Closed socket connection for client /10.127.2.201:37328 which had sessionid 0x7311c000c
2023-06-05 13:14:48,346 [myid:0] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2023-06-05 13:14:48,346 [myid:0] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2023-06-05 13:22:39,872 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@222] - Accepted socket connection from /10.127.2.233:42364
2023-06-05 13:22:39,873 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@949] - Client attempting to establish new session at /10.127.2.233:42364
2023-06-05 13:22:39,874 [myid:0] - INFO [CommitProcessor:0:ZooKeeperServer@694] - Established session 0x7311c0022 with negotiated timeout 40000 for client /10.127.2.233:42364
{code}
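The excerpt above mentions two client addresses, 10.127.2.201 and 10.127.2.233, but does not say which of them is the Kylin edge node. A small sketch like the following could list every client address that appears in a ZooKeeper server log, so it can be compared against the edge node's IP. The class name ZkLogClients and the regular expression are illustrative assumptions based only on the log lines quoted here, not on any ZooKeeper API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading ZooKeeper server logs.
class ZkLogClients {

    // Matches "/a.b.c.d:port" when the slash is not preceded by a word character
    // or a dot, which skips the "0.0.0.0/0.0.0.0:2181" listener in the thread name.
    private static final Pattern CLIENT =
            Pattern.compile("(?<![\\w.])/(\\d{1,3}(?:\\.\\d{1,3}){3}):\\d+");

    /** Returns the distinct client IPs in order of first appearance. */
    static Set<String> clientAddresses(List<String> lines) {
        Set<String> addresses = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = CLIENT.matcher(line);
            while (m.find()) {
                addresses.add(m.group(1));
            }
        }
        return addresses;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ZkLogClients <zookeeper-log-file>");
            return;
        }
        for (String ip : clientAddresses(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(ip);
        }
    }
}
```

If the edge node's address never shows up in the output for the period when the Kylin job failed, the client is probably not reaching this ZooKeeper server at all.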
I have checked all the permissions in HDFS and S3. Any leads would be really
appreciated.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)