[
https://issues.apache.org/jira/browse/HBASE-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634658#action_12634658
]
Jim Kellerman commented on HBASE-748:
-------------------------------------
Jean-Daniel Cryans - 24/Sep/08 03:59 PM
{quote}
HTable commits 23 rows to HRS against a region. Let's say that the first
one in the 23 is the 1000th in the whole batch to commit.
The region gets split after 10 rows.
At row 11, HRS will handle a NSRE.
HRS returns index 10
Back in client, the current index in the batch was at 23.
It receives 10 from HRS so it backs the index to the row that failed (index =
1010).
Client refreshes cache for that row.
Process resumes at that index, e.g. rows from 1010 to 1022 will be retried using
a fresh location.
{quote}
Ok, now I get it. I missed that part. Sorry for being dense.
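To restate the flow above as I read it, here is a rough simulation. It is only a sketch: serverCommit, resumeIndex and flush are made-up names for illustration, not code from the patch, and the "server" is just a loop that stops at the row where the split is assumed to happen.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: none of these names come from the patch.
class RetryIndexSketch {

    // Stand-in for the region server. Applies rows in order and returns the
    // offset of the first row it could not apply (the region "split" there),
    // or -1 if every row was applied. A negative splitAfter means no split.
    static int serverCommit(List<Integer> rows, int splitAfter, List<Integer> applied) {
        for (int i = 0; i < rows.size(); i++) {
            if (i == splitAfter) {
                return i; // NSRE: rows before offset i are done, row i is not
            }
            applied.add(rows.get(i));
        }
        return -1;
    }

    // Map the offset returned by the server back to an index in the whole batch.
    static int resumeIndex(int chunkStart, int failedOffset) {
        return chunkStart + failedOffset;
    }

    // Client side: send rows [start, start + count) and retry from the failed row.
    static List<Integer> flush(int start, int count, int splitAfter) {
        List<Integer> applied = new ArrayList<Integer>();
        int end = start + count;
        int index = start;
        int pendingSplit = splitAfter;
        while (index < end) {
            List<Integer> chunk = new ArrayList<Integer>();
            for (int row = index; row < end; row++) {
                chunk.add(row);
            }
            int failedOffset = serverCommit(chunk, pendingSplit, applied);
            if (failedOffset < 0) {
                index = end; // whole chunk went through
            } else {
                index = resumeIndex(index, failedOffset);
                pendingSplit = -1; // cache refreshed; the retry uses the new location
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println(flush(1000, 23, 10));
    }
}
```

With the numbers from the example (start 1000, 23 rows, split after 10), the server reports offset 10, the client resumes at 1010, and rows 1010 to 1022 go out again against the fresh location. No row is applied twice because the rows before the failed offset are not resent.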
{quote}
This actually works really well but it's not atomic if a row fails, for
example, if a value was too long.
{quote}
Well, aside from the transactional region server, I would not expect it to be
atomic across rows.
Were you thinking that there may be multiple BatchUpdates for the same row? Not
the best way for a client to behave in my opinion.
A couple of comments though.
- HTable.flushCommits() seems to ignore the row lock that can be passed to
HTable.commit(BatchUpdate, RowLock)
- Should the RowLock be associated with the BatchUpdate rather than being
supplied on commit? That would allow us to remove one commit overload, and
allow the client to associate the row lock with multiple BatchUpdates for the
same row.
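To make the second suggestion concrete, here is roughly the shape I have in mind. This is a sketch with hypothetical stand-in classes, not the real HBase BatchUpdate, RowLock or HTable; setRowLock and getRowLock are names I am assuming for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are not the real HBase classes.
class RowLockOnBatchUpdateSketch {

    static final class RowLock {
        final String row;
        final long lockId;

        RowLock(String row, long lockId) {
            this.row = row;
            this.lockId = lockId;
        }
    }

    static final class BatchUpdate {
        final String row;
        private RowLock rowLock; // null means "no explicit lock"

        BatchUpdate(String row) {
            this.row = row;
        }

        // The lock travels with the update instead of being passed to commit.
        void setRowLock(RowLock lock) {
            if (lock != null && !lock.row.equals(row)) {
                throw new IllegalArgumentException("lock is for a different row");
            }
            this.rowLock = lock;
        }

        RowLock getRowLock() {
            return rowLock;
        }
    }

    static final class Table {
        private final List<BatchUpdate> buffer = new ArrayList<BatchUpdate>();

        // Single commit method; no (BatchUpdate, RowLock) overload is needed.
        void commit(BatchUpdate update) {
            buffer.add(update);
        }

        // Returns one "row:lockId" entry per buffered update, so the lock is
        // still visible at flush time rather than being dropped.
        List<String> flushCommits() {
            List<String> sent = new ArrayList<String>();
            for (BatchUpdate u : buffer) {
                RowLock lock = u.getRowLock();
                sent.add(u.row + ":" + (lock == null ? "none" : String.valueOf(lock.lockId)));
            }
            buffer.clear();
            return sent;
        }
    }
}
```

Because the lock is a field of the update, the buffered path in flushCommits sees it too, which would also take care of the first point.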
+1 on moving checks into commit (or flushCommits). We still fail early,
although not as early as we would if the checks were done in BatchUpdate. But
as Stack points out, having BatchUpdate require a HTable or HTD would be ugly.
At least the request won't be partially processed before failing.
Last comment on the patch: remove the code that is commented out in
HTable.commit(BatchUpdate, RowLock).
> Add an efficient way to batch update many rows
> ----------------------------------------------
>
> Key: HBASE-748
> URL: https://issues.apache.org/jira/browse/HBASE-748
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: client
> Affects Versions: 0.1.3, 0.2.0
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.19.0
>
> Attachments: hbase-748-v1.patch
>
>
> HBASE-747 introduced a simple way to batch update many rows. The goal of this
> issue is to have an enhanced version that will send many rows in a single RPC
> to each region server. To do this, the client code will have to figure out which
> rows go to which server, group them accordingly, and then send them.