[
https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646465#comment-13646465
]
Nicolas Liochon commented on HBASE-6295:
----------------------------------------
bq. Also, what happens to rows that are not added in AsyncProcess::submit? Not
clear on that
[~sershe] Thanks for having a look. I wrote a short summary that I will put in
the javadoc or in the hbase ref guide to explain what the code is supposed to
do.
{panel}
The puts are sent asynchronously. The interface is 100% compatible with the
HTable interface that we had in 0.94 and before.
If autoflush is set to false, writes are buffered in HTable. When the buffer
size goes beyond the value defined in "hbase.client.write.buffer", the buffer
is sent asynchronously to the server. Retries are also managed
independently. We block only:
- if the user's code calls HTable#flushCommits
- if the user's code calls HTable#close, because it implies a flushCommits
- if we run out of retries for an operation: in this case we finish all the
writes in progress and raise a single aggregated error.
- if we meet one of the flow control conditions detailed below.
It's possible to control the client stream with two parameters:
- "hbase.client.max.total.tasks": number of task that we can run
simultaneously. If the buffer goes beyond "hbase.client.write.buffer" and the
number of tasks currently in progress is greater then
"hbase.client.max.total.tasks", we block until some of the tasks finishes. This
parameter must be set accordingly with the cluster size: if there are 1000
machines in the cluster, it may make sense to have a few thousand conccurrent
tasks for some tables.
- "hbase.client.max.perregion.tasks": number of tasks in progress for the same
region. When doing a background flush, puts for a region that has already
"hbase.client.max.perregion.tasks" or more tasks in progress are skipped, and
remain in the HTable write buffer. They will be sent into a later background
flush. If, when doing a background flush, all entries are skipped, we block
until a slot becomes available.
{panel}
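For illustration, here is a minimal client-side sketch of how this is meant to be
used. The table name, column family and concrete limit values are made up; the
configuration keys are the ones described above, and the HTable calls are the
unchanged 0.94 API:
{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AsyncPutExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // Buffer size that triggers a background flush (example value).
    conf.setLong("hbase.client.write.buffer", 2 * 1024 * 1024);
    // Flow control: max tasks in flight for the whole client (example value).
    conf.setInt("hbase.client.max.total.tasks", 100);
    // Flow control: max tasks in flight per region (example value).
    conf.setInt("hbase.client.max.perregion.tasks", 2);

    HTable table = new HTable(conf, "exampleTable");  // example table name
    table.setAutoFlush(false);                        // buffer the writes in HTable
    try {
      for (int i = 0; i < 1000000; i++) {
        Put p = new Put(Bytes.toBytes("row-" + i));
        p.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(i));
        table.put(p);        // may trigger a non-blocking background flush
      }
      table.flushCommits();  // blocking point: waits for all writes in progress
    } finally {
      table.close();         // implies a flushCommits()
    }
  }
}
{noformat}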
Now that I have written this, I think I have a bug in the way I manage errors
and clearBufferOnFail: maybe the write buffer should contain only failed puts.
I will check this.
bq. Lots of the code seems to be copied from other parts of HCM, and the
original is not removed, will it be removed? Otherwise there's duplication.
I really don't know. The problem I have is that this API is public. So while
it's transparent in HTable (I change neither the interface nor its contract),
that's not the case for the methods in HConnectionManager. That's why I added
some methods: it allows keeping the existing interface of HConnectionManager
while adding the background flush. I thought about implementing the previous
synchronous interface on top of the new asynchronous methods, but I feel it
could make them more fragile. I don't have a strong opinion here; the whole
existing code could be refactored quite a lot. That's why the patch is not
final, but I can't say whether the final patch will/should remove the
duplication.
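To make the fragility concern concrete, here is a purely hypothetical sketch of
what layering the old synchronous call on top of the asynchronous machinery
could look like. The asyncProcess field and its submit/waitUntilDone/hasError/
getErrors methods are illustrative names only, not the actual patch API:
{noformat}
// Hypothetical sketch only -- not the actual HConnectionManager code.
// The idea: keep the old synchronous contract by delegating to an
// asynchronous submit and then blocking until everything is done.
public void processBatch(List<? extends Row> actions, byte[] tableName,
    Object[] results) throws IOException, InterruptedException {
  asyncProcess.submit(actions, tableName, results);  // returns immediately
  asyncProcess.waitUntilDone();                      // block: restores the synchronous semantics
  if (asyncProcess.hasError()) {
    throw asyncProcess.getErrors();                  // single aggregated error, as before
  }
}
{noformat}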
bq. RegionTooBusyException is a new one on me (I'll work on the ugly pb message
in another issue)
Thanks, [[email protected]]. In my tests, it seems the server hangs at some
point. Even if I stop the client and restart it, the server does not accept any
new operation (for something like 5 minutes). I don't know if it's related to
my changes, but it's fishy. I will do a test with a server without 6295.
> Possible performance improvement in client batch operations: presplit and
> send in background
> --------------------------------------------------------------------------------------------
>
> Key: HBASE-6295
> URL: https://issues.apache.org/jira/browse/HBASE-6295
> Project: HBase
> Issue Type: Improvement
> Components: Client, Performance
> Affects Versions: 0.95.2
> Reporter: Nicolas Liochon
> Assignee: Nicolas Liochon
> Labels: noob
> Attachments: 6295.v1.patch, 6295.v2.patch, 6295.v3.patch,
> 6295.v4.patch, 6295.v5.patch
>
>
> Today's batch algo is:
> {noformat}
> for (Operation o : List<Op>) {
>   add o to todolist
>   if (todolist > maxsize or o is the last in the list) {
>     split todolist per location
>     send split lists to region servers
>     clear todolist
>     wait
>   }
> }
> {noformat}
> We could:
> - create the final object immediately instead of an intermediate array
> - split per location immediately
> - instead of sending when the list as a whole is full, send it when there is
> enough data for a single location
> It would be:
> {noformat}
> for (Operation o : List<Op>) {
>   get location
>   add o to location.todolist
>   if (location.todolist > maxLocationSize) {
>     send location.todolist to the region server
>     clear location.todolist
>     // don't wait, continue the loop
>   }
> }
> send remaining
> wait
> {noformat}
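> A minimal Java sketch of this per-location grouping is below; the
> sendToRegionServer() and waitForAllInFlight() helpers and maxLocationSize are
> illustrative only, while HConnection#getRegionLocation and the
> Row/HRegionLocation types are the existing client classes:
> {noformat}
> // Sketch only: sendToRegionServer() and waitForAllInFlight() are made-up helpers.
> Map<HRegionLocation, List<Row>> perLocation = new HashMap<HRegionLocation, List<Row>>();
> for (Row op : ops) {
>   HRegionLocation loc = connection.getRegionLocation(tableName, op.getRow(), false);
>   List<Row> todo = perLocation.get(loc);
>   if (todo == null) {
>     todo = new ArrayList<Row>();
>     perLocation.put(loc, todo);
>   }
>   todo.add(op);
>   if (todo.size() >= maxLocationSize) {
>     sendToRegionServer(loc, todo);                 // asynchronous: do not wait here
>     perLocation.put(loc, new ArrayList<Row>());
>   }
> }
> // send the remainders, then wait once for all in-flight calls
> for (Map.Entry<HRegionLocation, List<Row>> e : perLocation.entrySet()) {
>   if (!e.getValue().isEmpty()) {
>     sendToRegionServer(e.getKey(), e.getValue());
>   }
> }
> waitForAllInFlight();
> {noformat}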
> It's not trivial to write once you add error management: the retried list must
> be shared with the operations added to the todolist. But it's doable.
> It's interesting mainly for 'big' writes.