[ 
https://issues.apache.org/jira/browse/HBASE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429295#comment-13429295
 ] 

nkeywal commented on HBASE-6364:
--------------------------------

bq. Fix formatting before commit N.
Ok.

Before committing, I would be interested in feedback from Suraj. There are 
only a few lines of code, so rebasing won't be complicated if he needs some 
time to test it.

bq. I think I understand the notifying that is going on on the end of the 
addCall method. They line up w/ waits on Call and waits on the calls data 
member?
It's mainly playing with the synchronized blocks: Connection#addCall and 
Connection#setupIOstreams are both synchronized, so, on an exception during 
setupIOstreams, either:
- you were waiting just before setupIOstreams, in which case you have been 
cleaned up during the setupIOstreams exception management, or
- you were waiting before addCall, in which case you won't be added to 
the calls list.
In either case, when you later enter setupIOstreams yourself, you are filtered 
out by the test on shouldCloseConnection (a minimal sketch follows below).
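
For readers without HBaseClient open, here is a rough sketch of that pattern. 
The names mirror Connection#addCall / Connection#setupIOstreams, but the bodies 
are simplified illustrations, not the actual HBase code:

{code:java}
// Hypothetical sketch of the synchronization described above; not the real HBase code.
import java.io.IOException;
import java.util.concurrent.ConcurrentSkipListMap;

class ConnectionSketch {
  static class Call {
    final int id;
    IOException error;
    Call(int id) { this.id = id; }
  }

  private final ConcurrentSkipListMap<Integer, Call> calls = new ConcurrentSkipListMap<>();
  private boolean shouldCloseConnection = false;

  // Synchronized: a caller still waiting here when setupIOstreams fails is never
  // added to 'calls', so it cannot be left hanging.
  protected synchronized void addCall(Call call) {
    calls.put(call.id, call);
    notify(); // wake whoever waits on this connection for new calls
  }

  // Synchronized on the same monitor: callers queued behind a failing connection
  // attempt are filtered out by the shouldCloseConnection test.
  protected synchronized void setupIOstreams() {
    if (shouldCloseConnection) {
      return; // a previous attempt already failed; don't retry from here
    }
    try {
      connect();
    } catch (IOException e) {
      shouldCloseConnection = true;
      cleanupCalls(e); // fail the already-registered calls instead of blocking them
    }
  }

  private void connect() throws IOException {
    // open the socket to the region server and set up the streams (omitted)
  }

  private void cleanupCalls(IOException e) {
    for (Call call : calls.values()) {
      call.error = e; // sketch: the real code also notifies each waiting caller
    }
    calls.clear();
  }
}
{code}

The important point is that both methods share the Connection monitor, so a 
failed setup can clean up every call registered before it, and any caller 
arriving afterwards sees shouldCloseConnection.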

bq. Would it be hard making a test of this bit of code?
It's a difficult question, because there is both the behavior specific to this 
jira and the generic behavior to be tested.
1) Just for this jira: when I tested it, I added a sleep to simulate a 
connection timeout. I will soon provide a (small) set of utility functions to 
simulate this better, with real timeouts (a cheap way to provoke a real connect 
timeout in a test is sketched below). This type of test (more in the category 
of regression tests than unit tests) could perhaps be added to the integration 
tests. I had various issues during the tests; it was more difficult than 
expected.
2) Testing the HBaseClient itself would be useful, but the interesting path is 
the multithreaded one.
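
For instance, one cheap way to provoke a real connect timeout in such a test, 
without any HBase utility, is to point the socket at a non-routable address 
with a short connect timeout. A hypothetical sketch (the address and port are 
placeholders):

{code:java}
// Hypothetical sketch: provoke a real connect timeout for a test.
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ConnectTimeoutSimulation {
  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    try (Socket socket = new Socket()) {
      // 10.255.255.1 is usually non-routable, so connect() blocks until the timeout fires.
      socket.connect(new InetSocketAddress("10.255.255.1", 60020), 1000 /* ms */);
      System.out.println("unexpectedly connected");
    } catch (SocketTimeoutException expected) {
      System.out.println("connect timed out after "
          + (System.currentTimeMillis() - start) + " ms, as expected");
    }
  }
}
{code}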

bq. What speed up around recovery are you seeing N? Should we change the 
default timeout too as Suraj does above?

For the speed-up, it's hard to give a single number, as it depends on the 
number of RS. In my tests, 20% of the calls were serialized, i.e. with an 
operation on 20 RS, the fix makes it 3 times faster. But it seems that Suraj 
had much worse serialization, on a bigger cluster, so for him we could expect 
much better results, likely 20 times faster or better.
Another point is that this fix doesn't keep a list of dead RS, so we cut a 
connection attempt only if it happens while another one is already taking 
place. So if he could try it, that would be great.

For the default timeout, I think we can cut down the connect timeout. But I 
think it's safer to make it 5 seconds, so this fix remains important. I will 
work on this in another jira.
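
For reference, the client-side override from the description below can also be 
applied programmatically. A hypothetical sketch, assuming ipc.socket.timeout is 
read from the client Configuration:

{code:java}
// Hypothetical sketch of the client-side connect timeout override discussed above.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class LowConnectTimeoutConf {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    // Connect timeout used when opening a socket to a region server.
    // The default is 20000 ms; 5000 ms is the value discussed above.
    conf.setInt("ipc.socket.timeout", 5000);
    return conf;
  }
}
{code}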


bq. Do we need a 'synchronized (call)' block for the above notification ?
It's a notification on the connection itself, and addCall is synchronized, so 
it's ok. Also, 'calls' is a 'ConcurrentSkipListMap', so we can access it 
concurrently.
                
> Powering down the server host holding the .META. table causes HBase Client to 
> take excessively long to recover and connect to reassigned .META. table
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6364
>                 URL: https://issues.apache.org/jira/browse/HBASE-6364
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.6, 0.92.1, 0.94.0
>            Reporter: Suraj Varma
>            Assignee: nkeywal
>              Labels: client
>             Fix For: 0.96.0
>
>         Attachments: 6364-host-serving-META.v1.patch, 6364.v1.patch, 
> 6364.v1.patch, 6364.v2.patch, 6364.v3.patch, 6364.v3.patch, stacktrace.txt
>
>
> When a server host with a Region Server holding the .META. table is powered 
> down on a live cluster, while the HBase cluster itself detects and reassigns 
> the .META. table, connected HBase Clients take an excessively long time to 
> detect this and re-discover the reassigned .META. 
> Workaround: Decrease the ipc.socket.timeout on HBase Client side to a low  
> value (default is 20s leading to 35 minute recovery time; we were able to get 
> acceptable results with 100ms getting a 3 minute recovery) 
> This was found during some hardware failure testing scenarios. 
> Test Case:
> 1) Apply load via client app on HBase cluster for several minutes
> 2) Power down the region server holding the .META. table (i.e. power off ... 
> and keep it off)
> 3) Measure how long it takes for cluster to reassign META table and for 
> client threads to re-lookup and re-orient to the lesser cluster (minus the RS 
> and DN on that host).
> Observation:
> 1) Client threads spike up to maxThreads size ... and take over 35 mins to 
> recover (i.e. for the thread count to go back to normal) - no client calls 
> are serviced - they just back up on a synchronized method (see #2 below)
> 2) All the client app threads queue up behind the 
> oahh.ipc.HBaseClient#setupIOStreams method http://tinyurl.com/7js53dj
> After taking several thread dumps we found that the thread within this 
> synchronized method was blocked on  NetUtils.connect(this.socket, 
> remoteId.getAddress(), getSocketTimeout(conf));
> The client thread that gets the synchronized lock would try to connect to the 
> dead RS (till socket times out after 20s), retries, and then the next thread 
> gets in and so forth in a serial manner.
> Workaround:
> -------------------
> Default ipc.socket.timeout is set to 20s. We dropped this to a low number 
> (1000 ms,  100 ms, etc) on the client side hbase-site.xml. With this setting, 
> the client threads recovered in a couple of minutes by failing fast and 
> re-discovering the .META. table on a reassigned RS.
> Assumption: This ipc.socket.timeout is only ever used during the initial 
> "HConnection" setup via the NetUtils.connect and should only ever be used 
> when connectivity to a region server is lost and needs to be re-established. 
> i.e. it does not affect the normal "RPC" activity as this is just the connect 
> timeout.
> During RS GC periods, any _new_ clients trying to connect will fail and will 
> require .META. table re-lookups.
> This above timeout workaround is only for the HBase client side.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
