[jira] [Commented] (HBASE-10277) refactor AsyncProcess

Sergey Shelukhin (JIRA) Tue, 21 Jan 2014 17:16:59 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878106#comment-13878106
 ]


Sergey Shelukhin commented on HBASE-10277:
------------------------------------------

0.94 compat... HTable put is currently async but has no means of returning
errors, and flushCommits can flush multiple puts. Errors are eventually
thrown through some later put call or through flushCommits. We can either
(1) break the HTable::put interface (doesn't seem viable); (2) make puts sync
and add a separate async put (possible, but may also be surprising); or
(3) remove the old pattern from AP but keep track of all the puts inside
HTable itself, aggregating all errors only when flushCommits is called, for
example (with some client performance loss, because multiple requests would be
tracked at a higher level than in AP).
Overall, I can see merit in the scenario where you do a bunch of puts and then
flush. It could be replaced by the user issuing multi-puts explicitly, but
with the API being what it is, I don't think we can simply remove it. Maybe
the 3rd approach above is viable, if we add some javadocs/notes.
What do you think?
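
To make the 3rd approach more concrete, here is a rough sketch of what
"track the puts in HTable and aggregate errors in flushCommits" could look
like. This is not the actual HTable or AP code: BufferedPutTracker,
AsyncSubmitter and AggregatedPutException are hypothetical names made up for
illustration, rows are plain strings, and each put is assumed to map to one
future returned by the async layer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stand-in for a table client; this is NOT the real HTable API.
// Like HTable, it is assumed to be used from a single thread.
class BufferedPutTracker {

    // Hypothetical async backend, standing in for one AsyncProcess submit per put.
    interface AsyncSubmitter {
        CompletableFuture<Void> submit(String row);
    }

    // Single exception carrying every failed row, thrown only from flushCommits().
    static class AggregatedPutException extends RuntimeException {
        final List<String> failedRows;

        AggregatedPutException(List<String> failedRows, List<Throwable> causes) {
            super(failedRows.size() + " put(s) failed: " + failedRows);
            this.failedRows = failedRows;
            for (Throwable cause : causes) {
                addSuppressed(cause);
            }
        }
    }

    // One outstanding put: the row it was for and the future tracking it.
    private static final class Pending {
        final String row;
        final CompletableFuture<Void> future;

        Pending(String row, CompletableFuture<Void> future) {
            this.row = row;
            this.future = future;
        }
    }

    private final AsyncSubmitter submitter;
    private final List<Pending> pending = new ArrayList<>();

    BufferedPutTracker(AsyncSubmitter submitter) {
        this.submitter = submitter;
    }

    // Starts the put and returns immediately; errors from earlier puts are
    // never thrown from here.
    void put(String row) {
        pending.add(new Pending(row, submitter.submit(row)));
    }

    // Waits for every outstanding put, then reports all failures at once.
    void flushCommits() {
        List<String> failedRows = new ArrayList<>();
        List<Throwable> causes = new ArrayList<>();
        for (Pending p : pending) {
            try {
                p.future.join();
            } catch (CompletionException | CancellationException e) {
                failedRows.add(p.row);
                causes.add(e.getCause() != null ? e.getCause() : e);
            }
        }
        pending.clear();
        if (!failedRows.isEmpty()) {
            throw new AggregatedPutException(failedRows, causes);
        }
    }
}
```

The tradeoff mentioned above shows up directly: the wrapper holds one future
per put instead of letting the lower layer track a whole batch, which is
where the extra client-side cost would come from.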

> refactor AsyncProcess
> ---------------------
>
>                 Key: HBASE-10277
>                 URL: https://issues.apache.org/jira/browse/HBASE-10277
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-10277.patch
>
>
> AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
> callback and with reuse, and one from HCM/HTable batch call, with callback 
> and w/o reuse. In the former case (but not the latter), it also does some 
> throttling of actions on initial submit call, limiting the number of 
> outstanding actions per server.
> The latter case is relatively straightforward. The former appears to be
> error-prone due to reuse. If multiple submit calls are performed without
> waiting for the async part of the previous call to finish (which the javadoc
> claims should be safe), fields like hasError become ambiguous and can be
> used for the wrong call. Also, the success/failure callback is called based
> on the "original index" of an action in the submitted list, but with only
> one callback supplied to AP in the ctor, it's not clear which submit call
> the index belongs to if several are outstanding.
> I was going to add support for HBASE-10070 to AP, and found that it might be 
> difficult to do cleanly.
> It would be nice to normalize AP usage patterns; in particular, separate the 
> "global" part (load tracking) from per-submit-call part.
> The per-submit part can more conveniently track state such as
> initialActions, the mapping of indexes, and retry information, which is
> currently passed around between method calls.
> -I am not sure yet, but maybe sending of the original index to server in 
> "ClientProtos.MultiAction" can also be avoided.- Cannot be avoided because 
> the API to server doesn't have one-to-one correspondence between requests and 
> responses in an individual call to multi (retries/rearrangement have nothing 
> to do with it)
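
As an illustration of the split proposed in the description, the sketch below
keeps per-server load tracking in one shared object and gives each submit
call its own context for actions, indexes and errors. It is only a sketch
under assumptions: ServerLoadTracker and SubmitContext are hypothetical names
and not classes from the attached patch, and actions and servers are
represented as plain strings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// "Global" part (hypothetical): shared by all submit calls, knows only about load.
class ServerLoadTracker {
    private final Map<String, AtomicInteger> outstanding = new ConcurrentHashMap<>();
    private final int maxPerServer;

    ServerLoadTracker(int maxPerServer) {
        this.maxPerServer = maxPerServer;
    }

    // Reserves one slot on the server; returns false if the server is at its limit.
    boolean tryAcquire(String server) {
        AtomicInteger count = outstanding.computeIfAbsent(server, s -> new AtomicInteger());
        while (true) {
            int current = count.get();
            if (current >= maxPerServer) {
                return false;
            }
            if (count.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Gives back a slot previously reserved with tryAcquire().
    void release(String server) {
        AtomicInteger count = outstanding.get(server);
        if (count != null) {
            count.decrementAndGet();
        }
    }
}

// Per-submit part (hypothetical): one instance per submit call, so an
// "original index" and the error state always refer to this call only.
class SubmitContext {
    private final String[] initialActions;
    private final Throwable[] errors;

    SubmitContext(String... actions) {
        this.initialActions = actions.clone();
        this.errors = new Throwable[actions.length];
    }

    // Records a failure for the action at the given original index.
    synchronized void onFailure(int originalIndex, Throwable error) {
        errors[originalIndex] = error;
    }

    // True if any action of this submit call (and only this one) has failed.
    synchronized boolean hasError() {
        for (Throwable error : errors) {
            if (error != null) {
                return true;
            }
        }
        return false;
    }

    // Maps an original index back to the action submitted in this call.
    String actionAt(int originalIndex) {
        return initialActions[originalIndex];
    }
}
```

With this shape, two outstanding submit calls share one ServerLoadTracker but
each holds its own SubmitContext, so a failure at index 1 of the first call
cannot be mistaken for a failure in the second.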



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
