[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160038#comment-14160038
 ] 

Lars Hofhansl commented on HBASE-12075:
---------------------------------------

From the description, this is an implementation of the circuit breaker pattern, 
right?
Anecdotally, we had that implemented here as well, but found it problematic 
for various reasons and have since replaced it with a resource counter/limiter. 
I.e., via a simple semaphore and an acquire/release protocol we simply limit the 
number of threads that use a resource (HTable, HConnection, PhoenixConnection) 
to a number that is acceptable to us.
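
A minimal sketch of such a limiter, to make the acquire/release protocol 
concrete. This is not our actual code: the class name ResourceLimiter, the 
method withPermit, and the wait timeout are assumptions made up for 
illustration, and the guarded action stands in for any use of an HTable, 
HConnection or PhoenixConnection.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of HBase or Phoenix: caps the number of
// threads that may use a shared resource at the same time.
class ResourceLimiter {
    private final Semaphore permits;
    private final long maxWaitMillis;

    ResourceLimiter(int maxConcurrentUsers, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentUsers);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Runs the action while holding a permit. If no permit becomes available
    // within maxWaitMillis, the caller is rejected instead of piling up.
    <T> T withPermit(Supplier<T> action) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for a permit", e);
        }
        if (!acquired) {
            throw new IllegalStateException("too many threads are already using the resource");
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always give the permit back, even on failure
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Unlike a circuit breaker, this never has to decide whether a failure is 
recoverable. It only bounds how many threads can be stuck on the resource at 
once, so the rest of the application keeps its threads.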

CircuitBreaker was problematic for various reasons:
# we needed to be absolutely sure that this is a non-recoverable problem
# what if only a few region servers have issues (a): we now need to group 
exceptions by region server in order to decide whether to fail other connections
# what if only a few region servers have issues (b): the cluster is not down, yet 
client threads will hang
# apps created grouping constructs over HTable/HConnection (Phoenix in our 
case); now the circuit breaker got in the way at the wrong times, and we need to 
pull it up higher
# (there were more issues, these are off the top of my head)


> Preemptive Fast Fail
> --------------------
>
>                 Key: HBASE-12075
>                 URL: https://issues.apache.org/jira/browse/HBASE-12075
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>    Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
>            Reporter: Manukranth Kolloju
>            Assignee: Manukranth Kolloju
>         Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch
>
>
> In multi threaded clients, we use a feature developed on 0.89-fb branch 
> called Preemptive Fast Fail. This allows the client threads which would 
> potentially fail, fail fast. The idea behind this feature is that we allow, 
> among the hundreds of client threads, one thread to try and establish 
> connection with the regionserver and if that succeeds, we mark it as a live 
> node again. Meanwhile, other threads which are trying to establish connection 
> to the same server would ideally go into the timeouts which is effectively 
> unfruitful. We can in those cases return appropriate exceptions to those 
> clients instead of letting them retry.
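
The mechanism in the issue description above could be sketched roughly as 
follows. This is an illustration only and is not taken from the attached 
patches: the class FastFailTracker, its methods, and the choice to treat any 
RuntimeException as a server failure are all assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch; names are not taken from the HBASE-12075 patch.
class FastFailTracker {
    // Servers currently marked down. The flag records whether one thread
    // is already probing that server.
    private final ConcurrentHashMap<String, AtomicBoolean> failedServers =
            new ConcurrentHashMap<>();

    <T> T call(String server, Supplier<T> rpc) {
        AtomicBoolean probing = failedServers.get(server);
        boolean isProbe = false;
        if (probing != null) {
            // Server is marked down: exactly one thread may retry it,
            // every other thread fails immediately instead of timing out.
            isProbe = probing.compareAndSet(false, true);
            if (!isProbe) {
                throw new IllegalStateException("fast fail: " + server + " is marked down");
            }
        }
        try {
            T result = rpc.get();
            failedServers.remove(server); // success: mark the server live again
            return result;
        } catch (RuntimeException e) {
            // Assumption: any runtime failure marks the server as down.
            failedServers.putIfAbsent(server, new AtomicBoolean(false));
            throw e;
        } finally {
            if (isProbe) {
                probing.set(false); // let another thread probe next time
            }
        }
    }

    boolean isMarkedDown(String server) {
        return failedServers.containsKey(server);
    }
}
```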



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
