[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150055#comment-14150055
 ] 

Manukranth Kolloju commented on HBASE-12075:
--------------------------------------------

I will change it to getFailureInfo().
I used HostAndPort because ServerName also includes the start code. The 
original implementation used HServerAddress instead of HServerName.
ServerName would probably hurt here because the code tries to clear PFFE for 
servers that have come back from death/near death. When a server comes back 
from death, its ServerName is different, so the earlier ServerName would 
remain in the failuresMap. 
But, on the bright side, having the dead server name in the failuresMap is not 
going to be harmful because we have a periodic cleanup that goes and deletes 
the servers listed in the failuresMap.

So, let me change the code to use ServerName so that I don't have to 
convert ServerName to HostAndPort every time we enter PFFInterceptor. I will 
add a couple of unit tests for the PreemptiveFastFailInterceptor and resubmit 
the patch.
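
The trade-off above can be illustrated with a minimal sketch. It is not the
code from the patch: the class FailureMapSketch, its methods, and the maxAgeMs
parameter are hypothetical names, and the "host,port,startcode" key format is
assumed here only to show why a restarted server leaves a stale entry behind
until the periodic cleanup removes it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a failures map; not the HBase implementation. */
final class FailureMapSketch {
    // Key: server identity string; value: time (ms) of the last recorded failure.
    private final Map<String, Long> failuresMap = new ConcurrentHashMap<>();
    private final long maxAgeMs;

    FailureMapSketch(long maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
    }

    /** Key without start code: identical before and after a restart. */
    static String hostAndPort(String host, int port) {
        return host + ":" + port;
    }

    /** Key with start code (assumed format): changes when the server restarts. */
    static String serverName(String host, int port, long startCode) {
        return host + "," + port + "," + startCode;
    }

    void recordFailure(String serverKey, long nowMs) {
        failuresMap.put(serverKey, nowMs);
    }

    /** Called when a request to the server succeeds. */
    void clearOnSuccess(String serverKey) {
        failuresMap.remove(serverKey);
    }

    boolean isMarkedFailed(String serverKey) {
        return failuresMap.containsKey(serverKey);
    }

    /** Periodic cleanup: drops entries older than maxAgeMs. */
    void cleanup(long nowMs) {
        failuresMap.entrySet().removeIf(e -> nowMs - e.getValue() > maxAgeMs);
    }
}
```

With start-code keys, a success reported under the new key does not clear the
entry recorded under the old key; that entry stays until cleanup() runs after
maxAgeMs has elapsed. With host:port keys, the same success clears it at once.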

> Preemptive Fast Fail
> --------------------
>
>                 Key: HBASE-12075
>                 URL: https://issues.apache.org/jira/browse/HBASE-12075
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>    Affects Versions: 1.0.0
>            Reporter: Manukranth Kolloju
>            Assignee: Manukranth Kolloju
>             Fix For: 1.0.0
>
>         Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch
>
>
> In multi-threaded clients, we use a feature developed on the 0.89-fb branch 
> called Preemptive Fast Fail. It lets client threads that would potentially 
> fail, fail fast. The idea behind this feature is that we allow, among the 
> hundreds of client threads, one thread to try to establish a connection with 
> the regionserver, and if that succeeds, we mark it as a live node again. 
> Meanwhile, other threads trying to establish a connection to the same server 
> would otherwise run into timeouts, which is effectively unfruitful. In those 
> cases we can return appropriate exceptions to those clients instead of 
> letting them retry.
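
The idea in the description can be sketched roughly as below. This is only an
illustration under assumptions, not the PreemptiveFastFailInterceptor itself:
FastFailGateSketch and its method names are hypothetical. One caller wins a
compare-and-set and is allowed to probe the failed server; every other caller
is told to fail fast until the probe reports a result.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a fast-fail gate; not the HBase implementation. */
final class FastFailGateSketch {
    // Present only for servers currently considered failed.
    // The flag is true while one thread is probing the server.
    private final Map<String, AtomicBoolean> failedServers = new ConcurrentHashMap<>();

    /** Marks a server as failed after a connection error. */
    void markFailed(String server) {
        failedServers.putIfAbsent(server, new AtomicBoolean(false));
    }

    /**
     * Returns true if the caller may contact the server, either because the
     * server is not marked failed or because this caller became the single
     * probing thread. Returns false if the caller should fail fast.
     */
    boolean tryAcquire(String server) {
        AtomicBoolean probing = failedServers.get(server);
        if (probing == null) {
            return true; // server is considered live
        }
        return probing.compareAndSet(false, true); // only one thread wins
    }

    /** Reports the outcome of the probe made by the winning thread. */
    void onProbeResult(String server, boolean succeeded) {
        if (succeeded) {
            failedServers.remove(server); // live again: all threads may proceed
        } else {
            AtomicBoolean probing = failedServers.get(server);
            if (probing != null) {
                probing.set(false); // let another thread probe later
            }
        }
    }
}
```

A caller that gets false from tryAcquire() would throw an exception to its
client right away rather than wait for a connection timeout.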



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
