[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HBASE-10277:
-------------------------------------
Attachment: HBASE-10277.patch
Here's the initial patch... that took much longer than I thought.
The main goal was moving out the context for one call, while preserving the
strange semantics for HTable::put calls.
So most of the AP logic is moved into AsyncRequestSet, which is created per
submit/submitAll call. AP itself can now be reused not just for puts but also
for regular batch calls from HTable, and potentially even for multiple tables;
however, it cannot be reused in HCM where a custom pool is used, or in some
generic methods.
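To illustrate the split, here is a rough sketch of the idea, not the code from
the patch. The names SharedProcessor and RequestSet are hypothetical stand-ins
for AP and AsyncRequestSet, the "actions" are plain strings, and the work is
done synchronously to keep the example short (the real thing is asynchronous).
The point is only that error state and result indexes live in the per-call
object, so two submit calls on the same processor cannot confuse each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerSubmitContextSketch {

    /** Shared, reusable part (hypothetical stand-in for AP): tracks load only. */
    static class SharedProcessor {
        private final ConcurrentHashMap<String, AtomicInteger> outstandingPerServer =
            new ConcurrentHashMap<>();

        /** Every submit call gets its own context object. */
        RequestSet submit(List<String> actions, String server) {
            RequestSet set = new RequestSet(this, actions, server);
            set.run();
            return set;
        }

        void taskStarted(String server) {
            outstandingPerServer
                .computeIfAbsent(server, s -> new AtomicInteger())
                .incrementAndGet();
        }

        void taskFinished(String server) {
            outstandingPerServer.get(server).decrementAndGet();
        }

        int outstanding(String server) {
            AtomicInteger n = outstandingPerServer.get(server);
            return n == null ? 0 : n.get();
        }
    }

    /** Per-call part (hypothetical stand-in for AsyncRequestSet). */
    static class RequestSet {
        private final SharedProcessor owner;
        private final List<String> actions;
        private final String server;
        private final Object[] results;                 // indexed by position in this call
        private final List<Integer> failedIndexes = new ArrayList<>();

        RequestSet(SharedProcessor owner, List<String> actions, String server) {
            this.owner = owner;
            this.actions = actions;
            this.server = server;
            this.results = new Object[actions.size()];
        }

        /** Assumption for the demo: any action starting with "bad" fails. */
        void run() {
            for (int i = 0; i < actions.size(); i++) {
                owner.taskStarted(server);
                try {
                    String action = actions.get(i);
                    if (action.startsWith("bad")) {
                        failedIndexes.add(i);
                    } else {
                        results[i] = "ok:" + action;
                    }
                } finally {
                    owner.taskFinished(server);
                }
            }
        }

        boolean hasError() { return !failedIndexes.isEmpty(); }
        List<Integer> failedIndexes() { return failedIndexes; }
        Object result(int index) { return results[index]; }
    }

    public static void main(String[] args) {
        SharedProcessor ap = new SharedProcessor();
        RequestSet first = ap.submit(Arrays.asList("put-a", "bad-b"), "rs1");
        RequestSet second = ap.submit(Arrays.asList("put-c"), "rs1");
        System.out.println("first failed at " + first.failedIndexes());
        System.out.println("second has error: " + second.hasError());
    }
}
```

With this shape, hasError and the failed indexes are unambiguous because they
are read from the object returned by a specific submit call, while the shared
object keeps only the per-server counters.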
The (ugly) behavior for HTable where e.g. next put will give you errors from
previous put was lovingly preserved.
Also got rid of the callback that was mostly used for tests; tests can check
results without it.
I ran some perf tests using YCSB and a table with a write-dropping coproc (so
the measured perf is client only), and see a bit of perf regression for the
put-only workload. I am guessing this is due to the allocation that was added.
I wasn't able to get much useful info from YourKit though; it claims the impact
of AP on either run is negligible, for both CPU and wall clock. I may
investigate further. This perf difference will probably not be noticeable on
real requests (remains to be tested).
> refactor AsyncProcess
> ---------------------
>
> Key: HBASE-10277
> URL: https://issues.apache.org/jira/browse/HBASE-10277
> Project: HBase
> Issue Type: Improvement
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HBASE-10277.patch
>
>
> AsyncProcess currently has two patterns of usage, one from HTable flush w/o
> callback and with reuse, and one from HCM/HTable batch call, with callback
> and w/o reuse. In the former case (but not the latter), it also does some
> throttling of actions on initial submit call, limiting the number of
> outstanding actions per server.
> The latter case is relatively straightforward. The former appears to be error
> prone due to reuse - if, as javadoc claims should be safe, multiple submit
> calls are performed without waiting for the async part of the previous call
> to finish, fields like hasError become ambiguous and can be used for the
> wrong call; callback for success/failure is called based on "original index"
> of an action in submitted list, but with only one callback supplied to AP in
> ctor it's not clear to which submit call the index belongs, if several are
> outstanding.
> I was going to add support for HBASE-10070 to AP, and found that it might be
> difficult to do cleanly.
> It would be nice to normalize AP usage patterns; in particular, separate the
> "global" part (load tracking) from per-submit-call part.
> Per-submit part can more conveniently track stuff like initialActions,
> mapping of indexes and retry information, that is currently passed around the
> method calls.
> I am not sure yet, but maybe sending of the original index to server in
> "ClientProtos.MultiAction" can also be avoided.