shankarlingayya created HBASE-9635:
--------------------------------------

             Summary: HBase table regions are not re-assigned to the new 
region server when it comes up (when the existing region server is not able to 
handle the load) 
                 Key: HBASE-9635
                 URL: https://issues.apache.org/jira/browse/HBASE-9635
             Project: HBase
          Issue Type: Bug
          Components: master, regionserver
    Affects Versions: 0.94.11
         Environment: SuSE11
            Reporter: shankarlingayya


{noformat}
HBase table regions are not assigned to the new region server for a period of 
30 minutes (when the existing region server is not able to handle the load).


Procedure:
1. Set up a non-HA Hadoop cluster with two nodes (Node1-XX.XX.XX.XX, 
Node2-YY.YY.YY.YY)
2. Install ZooKeeper & HRegionServer on Node1
3. Install HMaster & HRegionServer on Node2
4. From Node2, create an HBase table (table name 't1' with one column family 
'cf1')
5. Perform addrecord for 99649 rows
6. Kill the Region Servers on both nodes and limit the Node1 Region Server FD 
(file descriptor) count to 600
7. Start only the Node1 Region Server ==> so that FD exhaustion can happen in 
the Node1 Region Server
8. After some 5-10 minutes, start the Node2 Region Server

===> A huge number of regions of table 't1' are in the OPENING state and are 
not getting re-assigned to the Node2 region server, which is free. 

===> When the new region server comes up, the master should detect this and 
allocate the regions that failed to open to that region server (here they stay 
in the OPENING state for 30 minutes, which will have a huge impact on any user 
application that makes use of this table)



2013-09-23 18:46:12,160 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated t1,row507465,1379937224590.2d9fad2aee78103f928d8c7fe16ba6cd.
2013-09-23 18:46:12,160 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=t1,row507465,1379937224590.2d9fad2aee78103f928d8c7fe16ba6cd., starting to roll back the global memstore size.

2013-09-23 18:50:55,284 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 309 seconds.  Will retry shortly ...
java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
        at org.apache.hadoop.ipc.Client.call(Client.java:1351)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy13.renewLease(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:188)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at $Proxy13.renewLease(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:522)
        at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:679)
        at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
        at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
        at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
        at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:97)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:523)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
        at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
        at org.apache.hadoop.ipc.Client.call(Client.java:1318)
        ... 16 more
2013-09-23 18:50:56,285 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 310 seconds.  Will retry shortly ...
java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
        at org.apache.hadoop.ipc.Client.call(Client.java:1351)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy13.renewLease(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:188)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at $Proxy13.renewLease(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:522)
        at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:679)
        at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
        at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
        at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
        at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:97)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:523)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
        at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
        at org.apache.hadoop.ipc.Client.call(Client.java:1318)
        ... 16 more
2013-09-23 18:50:57,287 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 311 seconds.  Will retry shortly ...
java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
{noformat}
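
A log excerpt like the one above can be triaged with a small script. The 
following is only a minimal sketch in Python and is not part of HBase or 
Hadoop: the function name summarize_log and the keys of the returned 
dictionary are hypothetical, and the patterns are derived solely from the log 
lines quoted in this report. It lists the regions whose open failed, the 
longest lease-renewal gap, and whether "Too many open files" was seen.

```python
import re


def summarize_log(text):
    """Summarize failed region opens and lease-renewal failures in a log excerpt."""
    # Mail clients wrap long log records, so collapse all whitespace first.
    flat = " ".join(text.split())
    # Region names end with '.', followed by ',' in the OpenRegionHandler message.
    failed_regions = re.findall(r"Failed open of region=(\S+),", flat)
    # Seconds since the last successful lease renewal, as reported by LeaseRenewer.
    lease_gaps = [
        int(s)
        for s in re.findall(
            r"Failed to renew lease for \[[^\]]*\] for (\d+) seconds", flat
        )
    ]
    return {
        "failed_regions": sorted(set(failed_regions)),
        "max_lease_gap_seconds": max(lease_gaps, default=0),
        "fd_exhaustion_seen": "Too many open files" in flat,
    }


# Sample input copied from the wrapped log lines in this report.
sample = """\
2013-09-23 18:46:12,160 ERROR
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
region=t1,row507465,1379937224590.2d9fad2aee78103f928d8c7fe16ba6cd., starting
to roll back the global memstore size.
2013-09-23 18:50:55,284 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to
renew lease for
[DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 309
seconds.  Will retry shortly ...
java.io.IOException: Failed on local exception: java.net.SocketException: Too
many open files;
"""

summary = summarize_log(sample)
print(summary)
```

Collapsing whitespace before matching keeps the sketch usable on both the 
wrapped text of this mail and the original single-line log records.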


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
