[ 
https://issues.apache.org/jira/browse/HBASE-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15820720#comment-15820720
 ] 

Yu Li commented on HBASE-17449:
-------------------------------

bq. if we get a rpc timeout at client side, then usually we will get an 
OutOfOrderScannerException for the next call and it does not make any 
difference than UnknownScannerException...
I'm not sure about this. AFAIK in HBASE-13662 we tried to roll back the seqId 
for recoverable failures, then in HBASE-16604 we close the current scanner 
and ask the client to open a new one? See the code in {{RSRpcServices#scan}}:
{code}
        } catch (IOException e) {
          // The scanner state might be left in a dirty state, so we will tell the Client to
          // fail this RPC and close the scanner while opening up another one from the start of
          // row that the client has last seen.
          closeScanner(region, scanner, scannerName, context);
          // scanner is closed here
          scannerClosed = true;
{code}
Please correct me if I've misunderstood anything here, thanks.

bq. Will open a new issue to discuss this. I think we need to reconsider the 
timeout configs for scan.
Yeah, maybe a sub-task will make the discussion more focused for scan while we 
could still discuss other parts here.

bq. Reading above, seems like timeout needs big cleanup
Yep, mainly the scan timeouts at the code level, and documentation for the others.

bq. Make a list of configs and statements about them and then I could write 
tests to prove they do as they say (or fix)?
Thanks for offering to help, sir [~stack]. Yes, let's do this and make the 
documentation clearer (smile).

> Add explicit document on different timeout settings
> ---------------------------------------------------
>
>                 Key: HBASE-17449
>                 URL: https://issues.apache.org/jira/browse/HBASE-17449
>             Project: HBase
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Yu Li
>            Priority: Critical
>
> Currently we have more than one timeout setting, mainly:
> * hbase.rpc.timeout
> * hbase.client.operation.timeout
> * hbase.client.scanner.timeout.period
> And in the latest branch-1 or master branch code, we will have two other 
> properties:
> * hbase.rpc.read.timeout
> * hbase.rpc.write.timeout
> However, the current refguide has no explicit instructions on the 
> differences between these timeout settings (there are explanations for each 
> property, but no instructions on when to use which).
> In my understanding, for the RPC layer timeout, i.e. for each RPC call:
> * Scan (openScanner/next): controlled by hbase.client.scanner.timeout.period
> * Other operations:
>    1. For released versions: controlled by hbase.rpc.timeout
>    2. For 1.4+ versions: read operation controlled by hbase.rpc.read.timeout, 
> write operation controlled by hbase.rpc.write.timeout, or hbase.rpc.timeout 
> if the previous two are not set.
> And hbase.client.operation.timeout is a higher-level control that counts 
> retries in, i.e. the overall limit for one user call.
> After this JIRA, I hope that when users ask questions like "What settings should 
> I use if I don't want to wait for more than 1 second for a single 
> put/get/scan.next call", we can give a neat answer.
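
The per-RPC rules listed in the description above can be sketched as a small lookup. This is only an illustration of my reading of those rules for 1.4+, not HBase code: the class {{RpcTimeoutSketch}}, the {{Op}} enum, the plain {{Map}} standing in for the client configuration, and the 60000 ms placeholder default are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of HBase: mirrors the rules in the issue description. */
final class RpcTimeoutSketch {
  enum Op { SCAN, READ, WRITE }

  /**
   * Returns the timeout (ms) applied to a single RPC of the given kind.
   * defaultMs is a caller-supplied placeholder used when nothing is configured.
   */
  static int effectiveRpcTimeoutMs(Map<String, String> conf, Op op, int defaultMs) {
    // hbase.rpc.timeout is the fallback for non-scan calls when the
    // read/write specific properties are not set.
    int rpc = getInt(conf, "hbase.rpc.timeout", defaultMs);
    switch (op) {
      case SCAN:
        // openScanner/next: controlled by the scanner timeout period.
        return getInt(conf, "hbase.client.scanner.timeout.period", defaultMs);
      case READ:
        return getInt(conf, "hbase.rpc.read.timeout", rpc);
      case WRITE:
        return getInt(conf, "hbase.rpc.write.timeout", rpc);
      default:
        throw new IllegalArgumentException("unknown op: " + op);
    }
  }

  private static int getInt(Map<String, String> conf, String key, int fallback) {
    String value = conf.get(key);
    return value == null ? fallback : Integer.parseInt(value.trim());
  }

  public static void main(String[] args) {
    // The "no more than 1 second per put/get/scan.next" question from the description.
    Map<String, String> conf = new HashMap<>();
    conf.put("hbase.rpc.timeout", "1000");
    conf.put("hbase.client.scanner.timeout.period", "1000");
    for (Op op : Op.values()) {
      System.out.println(op + " -> " + effectiveRpcTimeoutMs(conf, op, 60000) + " ms");
    }
  }
}
```

The sketch leaves out hbase.client.operation.timeout on purpose: per the description it bounds the whole user call including retries, so it sits above the per-RPC value rather than replacing it.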



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
