Hi Mike,

Disabling IPv6 should be a solution, but...

Br,

Bugra

On 25.04.2024 at 13:59, Mike Mills wrote:
We have resolved this by:

  - Getting the HBase source and editing ServerShutdownHandler.java to skip log
splitting for server names starting with "0:0:0:0" and to bypass the "Skip
assigning region in transition on other server" logic.
  - Swapping in the newly built hbase-server-096.1.12.jar and restarting the
master. The master then reassigned all of the offline regions.
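
A minimal sketch of the kind of name check described in the first bullet, for readers who want to see its shape. This is not the actual change made to ServerShutdownHandler.java. The class name, method name, and the example healthy hostname are hypothetical; only the "0:0:0:0" prefix comes from the message above.

```java
public final class WildcardServerNames {
    // Prefix taken from the message above; everything else is illustrative.
    private static final String WILDCARD_PREFIX = "0:0:0:0";

    private WildcardServerNames() {}

    /**
     * Returns true when a server name of the form host,port,startcode
     * begins with the wildcard-address prefix.
     */
    public static boolean isWildcardServerName(String serverName) {
        return serverName != null && serverName.startsWith(WILDCARD_PREFIX);
    }

    public static void main(String[] args) {
        String[] names = {
            "0:0:0:0:0:0:0:0,60020,1713708572030",
            "rs1.example.com,60020,1713708572030" // hypothetical healthy server
        };
        for (String name : names) {
            String action = isWildcardServerName(name)
                    ? "skip log splitting"
                    : "split logs as usual";
            System.out.println(name + " -> " + action);
        }
    }
}
```

A guard like this would sit in front of the log-splitting call, so that the handler can carry on to region assignment for the bogus server names instead of throwing.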

The hardest part was building the old HBase code with Java 6.

We built it in Eclipse 2024-03 and had to run Maven with two different
configurations:

   * One using Java 8 or higher to download dependencies, because of post-Java 6
     TLS requirements. Goal: dependency:resolve
   * One using Java 6, downloaded from Sun, to run the package goal with the -o
     (offline) switch to prevent downloads.

We also had to comment out a lot of Maven dependencies that weren't needed for
the profiles we were using. Maven wanted to download them, and a lot of those
jars don't exist anymore.

Mike

________________________________
From: Mike Mills <mikecmi...@hotmail.com>
Sent: Wednesday, April 24, 2024 10:45 AM
To: user@hbase.apache.org <user@hbase.apache.org>
Subject: Re: Regions are in State OFFLINE and are Unassignable

Found this in the master log:

2024-04-24 10:29:49,421 DEBUG [MASTER_SERVER_OPERATIONS-master:60000-4] master.DeadServer: Finished processing 0:0:0:0:0:0:0:0,60020,1713708572030
2024-04-24 10:29:49,421 ERROR [MASTER_SERVER_OPERATIONS-master:60000-4] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: 0:0:0:0:0:0:0:0,60020,1713708572030-splitting
      at org.apache.hadoop.fs.Path.initialize(Path.java:206)
      at org.apache.hadoop.fs.Path.<init>(Path.java:172)
      at org.apache.hadoop.fs.Path.<init>(Path.java:94)
      at org.apache.hadoop.fs.Path.suffix(Path.java:354)
      at org.apache.hadoop.hbase.master.MasterFileSystem.getLogDirs(MasterFileSystem.java:315)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:405)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:383)
      at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:281)
      at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:196)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: 0:0:0:0:0:0:0:0,60020,1713708572030-splitting
      at java.net.URI.checkPath(URI.java:1823)
      at java.net.URI.<init>(URI.java:745)
      at org.apache.hadoop.fs.Path.initialize(Path.java:203)

The bad hostname is probably stopping the handling of re-assigning the region 
at startup. Looks like disabling splitting during repairs is in a future 
version.
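
For anyone wondering why this name trips the URI check, here is a small sketch that reproduces the same URISyntaxException reason using only java.net.URI. The scheme-splitting logic is an approximation of what the stack trace suggests Hadoop's Path does with a bare string (text before a leading colon is treated as a scheme); it is not copied from the Hadoop source. The class name, method name, and the example hostname are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public final class SplittingPathDemo {
    private SplittingPathDemo() {}

    /**
     * Splits the string into scheme and path the way a path parser might
     * (assumption: text before the first ':' is a scheme when that colon
     * precedes any '/'), then builds a URI from the pieces.
     * Returns the URISyntaxException reason, or null if the URI is accepted.
     */
    public static String tryParse(String pathString) {
        String scheme = null;
        int start = 0;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            start = colon + 1;
        }
        String path = pathString.substring(start);
        try {
            // A URI with a scheme must have an empty path or one starting with '/'.
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        String bad = "0:0:0:0:0:0:0:0,60020,1713708572030-splitting";
        String ok = "rs1.example.com,60020,1713708572030-splitting"; // hypothetical
        System.out.println(bad + " -> " + tryParse(bad));
        System.out.println(ok + " -> " + tryParse(ok));
    }
}
```

With the wildcard name, the leading "0" ends up as the scheme and the remainder is a relative path, which java.net.URI rejects. A normal hostname contains no colon, so no scheme is extracted and the string is accepted as a plain relative path.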

We're still looking for a solution. Any thoughts would be appreciated.

Mike
________________________________
From: Mike Mills <mikecmi...@hotmail.com>
Sent: Wednesday, April 24, 2024 8:40 AM
To: user@hbase.apache.org <user@hbase.apache.org>
Subject: Regions are in State OFFLINE and are Unassignable

Hello,

We have a production system down and can't get it back up. It's an older
version: 0.96.1.1. We have 70 OFFLINE unassigned regions.

We were configuring for multiple NICs and set both
hbase.regionserver.ipc.address and hbase.master.ipc.address to 0.0.0.0.

This caused hostname lookup issues, which (we now know) was a known issue
years ago.

We now have 3 dead region servers listed with names: 0:0:0:0:0:0:0:0,60020

We've undone our changes and restarted the cluster, but the invalid server
names, 0:0:0:0:0:0:0:0,60020, are still showing up in the dead region servers
list and still have OFFLINE regions assigned to them.

We had no luck with hbck or with the shell's assign, move, and unassign commands.


Status Pages and Logs for a specific Region:

Dead Region Servers:
0:0:0:0:0:0:0:0,60020,1713708572030


Master Status Page:
d31a033cd7810e347639e12833969754    c,\x92I$\x92I$\x92I$\x92I$\x92I$\x90,1330810355551.d31a033cd7810e347639e12833969754. state=OFFLINE, ts=Wed Apr 24 08:02:53 CDT 2024 (376s ago), server=0:0:0:0:0:0:0:0,60020,1713708572030

Master Status Page Table Regions:
Region is flagged as "not deployed"

Results From hbck:
ERROR: Region { meta => c,\x92I$\x92I$\x92I$\x92I$\x92I$\x90,1330810355551.d31a033cd7810e347639e12833969754., hdfs => hdfs://gs1/hbase/data/default/c/d31a033cd7810e347639e12833969754, deployed =>  } not deployed on any region server.

Master Log Entries:
2024-04-24 08:02:53,453 INFO  [master:master:60000] master.AssignmentManager: Processing d31a033cd7810e347639e12833969754 in state: M_ZK_REGION_OFFLINE
2024-04-24 08:02:53,453 INFO  [master:master:60000] master.RegionStates: Transitioned {d31a033cd7810e347639e12833969754 state=OFFLINE, ts=1713963773352, server=null} to {d31a033cd7810e347639e12833969754 state=OFFLINE, ts=1713963773453, server=0:0:0:0:0:0:0:0,60020,1713708572030}
2024-04-24 08:02:53,695 INFO  [MASTER_SERVER_OPERATIONS-master:60000-0] handler.ServerShutdownHandler: Skip assigning region in transition on other server{d31a033cd7810e347639e12833969754 state=OFFLINE, ts=1713963773453, server=0:0:0:0:0:0:0:0,60020,1713708572030}

 From HBase Shell:
assign 'd31a033cd7810e347639e12833969754'
0 row(s) in 1.1120 seconds

Shell assign Master Log Entry:
2024-04-24 08:15:19,492 INFO  [RpcServer.handler=20,port=60000] master.AssignmentManager: Skip assigning c,\x92I$\x92I$\x92I$\x92I$\x92I$\x90,1330810355551.d31a033cd7810e347639e12833969754., it's host 0:0:0:0:0:0:0:0,60020,1713708572030 is dead but not processed yet

Any ideas how to get these regions assigned?

Thanks,
Mike

--
Buğra ÇAKIR
BEARTELL - Founder/GM

