Hi all, I ran into an issue on one of my RegionServers (RS). After a SocketException happened, it should have shut down, but 8 hours later I found it was still alive and had to use kill -9 to terminate the process.
Here is my RegionServer log.

At 01:58 AM, the SocketException happened:

    [2017-01-02T01:58:00.469+08:00] [INFO] hdfs.DFSClient : Exception in createBlockOutputStream java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:423)
        at sun.nio.ch.Net.socket(Net.java:416)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImp.java:104)

Also at 01:58 AM, the RegionServer aborted itself and began to close its regions:

    [2017-01-02T01:58:00.632+08:00] [INFO] regionserver.HRegionServer : aborting server HBASE-VENUS-149106.hadoop.local,16020,1482236933819
    [2017-01-02T01:58:00.632+08:00] [INFO] client.ConnectionManager$HConnectionImplementation : Closing zookeeper sessionid=0x456f9b55fda457b
    [2017-01-02T01:58:00.632+08:00] [INFO] regionserver.HStore : Closed f

    [2017-01-02T01:59:18.067+08:00] [INFO] regionserver.HRegionServer$MovedRegionsCleaner : Chore: MovedRegionsCleaner for region HBASE-VENUS-149106.hadoop.local,16020,1482236933819 was stopped
    [2017-01-02T01:59:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating from: hdfs://venus/hbase/oldWALs/HBASE-VENUS-149106.hadoop.local%2C16020%2C1482236933819.default.1483293299516 at position: 0
    [2017-01-02T01:59:18.225+08:00] [INFO] regionserver.Replication : Sink: age in ms of last applied edit: 0, total replicated edits: 160769427

Afterwards, it kept logging the same replication status:

    [2017-01-02T02:04:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating from: hdfs://venus/hbase/oldWALs/HBASE-VENUS-149106.hadoop.local%2C16020%2C1482236933819.default.1483293299516 at position: 0

At 8 AM:

    [2017-01-02T08:09:18.225+08:00] [INFO] regionserver.Replication : Sink: age in ms of last applied edit: 0, total replicated edits: 160769427
    [2017-01-02T08:14:18.225+08:00] [INFO] regionserver.Replication : Normal source for cluster 1: Total replicated edits: 39081044, currently replicating

Can anyone give me some tips for tracking this down? Thanks.
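Since the abort was triggered by "Too many open files", it may help to compare the process's open file descriptor count with its limit. Below is a minimal sketch, assuming a Linux host where /proc is readable for the RegionServer process. The function names (parse_soft_nofile, count_open_fds, fd_usage) are hypothetical helpers written for illustration, not part of HBase or Hadoop.

```python
import os


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit found in the text of
    /proc/<pid>/limits, or None if it is missing or unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            parts = line.split()
            # Expected shape: Max open files <soft> <hard> files
            if len(parts) < 4 or parts[3] == "unlimited":
                return None
            return int(parts[3])
    return None


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd (one entry per open descriptor)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for the given process id."""
    with open(f"/proc/{pid}/limits") as f:
        soft = parse_soft_nofile(f.read())
    return count_open_fds(pid), soft
```

Calling fd_usage(pid) with the RegionServer's pid (run as the same user or as root, since /proc/<pid>/fd is permission-protected) returns the current descriptor count and the soft limit. A count close to the limit would be consistent with the SocketException above, though it would not by itself explain why the process failed to exit.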
