[ 
https://issues.apache.org/jira/browse/HBASE-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924737#action_12924737
 ] 

HBase Review Board commented on HBASE-2946:
-------------------------------------------

Message from: "Ryan Rawson" <[email protected]>


bq.  On 2010-10-24 20:05:55, Ryan Rawson wrote:
bq.  > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, 
line 495
bq.  > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line495>
bq.  >
bq.  >     yeah you cant compare against memstoreTS because if you have this in 
here you wont be able to ever increment values that were inserted into the 
future. you'd just leave them there and continually see it in the 'get' part 
and then in this code bit leave it in place and create a new KV that is masked 
by the future KV.
bq.  >     
bq.  >     It won't be possible for that insert to be part of an uncommitted 
change because of the rowlock however. So no atomic-rules will have been broken.
bq.  
bq.  Jonathan Gray wrote:
bq.      i'm not sure i follow.  am i doing it wrong?  this should be exactly 
the same as what's there.
bq.      
bq.      also, we are breaking the atomic-rules across the Increment.  Each 
column is not broken but across them it is (for reads not writes).
bq.      
bq.      It seems like we could use RWCC though for increments.  I think it's 
fine that if you're using increment you can never write data into it w/ another 
API (or really, with manual timestamps).

there is no easy way to include RWCC right now, because you have to be careful 
about 'committing' the new values before removing the old ones.  This is a bit 
tricky because normally the RWCC 'commit' happens at the HRegion level, and 
right now we are inside memstore/store.  Hopefully people won't need atomic 
reads of multi incremented values.  If one is doing counters this would be the 
case.  Javadoc should definitely yell about this.


bq.  On 2010-10-24 20:05:55, Ryan Rawson wrote:
bq.  > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, 
line 377
bq.  > <http://review.cloudera.org/r/1088/diff/2/?file=15909#file15909line377>
bq.  >
bq.  >     I'm not sure how this code is optional for your new 'upsert', here 
are some use cases that I found painful:
bq.  >     
bq.  >     - when the KV is in snapshot we can create new KVs in memstore with 
the same TS.  This means you have a dup and before we had this new 'dup' code 
it ruined counts badly.
bq.  >     - the Math.max() bit it to ensure the new KV isnt being created in 
the past a bit and accidently duplicating  a timestamp inside the snapshot.
bq.  
bq.  Jonathan Gray wrote:
bq.      what do you mean by optional?  there shouldn't be any real difference. 
 this code is basically the exact same code that was there but now pulled into 
a method that can be reused.

The original updateColumnValue code was structured like this:

- look in snapshot
- loop over kvset

Create new KV, insert it

- loop over kvset

But you only included the 2nd half.  Your new 'upsert' only includes half the 
implementation of updateColumnValue. 

This comes up in this scenario:

- increment at TS=100
- snapshot, now we have kvset={} and a KV, TS=100 in snapshot
- increment at TS=100

Without the code i have above we'll end up with 2 KVs, both of TS=100, one in 
kvset, one in snapshot, which then becomes a HFile. 

When future reads happen, if they don't see the 'correct' KV, you will end up 
with lost updates.  

In theory the 'read memstore first' and 'order duplicate TS KVs' patches should 
resolve this.  Lost updates reallllllly suck though.  I'd probably recommend 
the conservative course, because I already went thru this and it sucks.  If you 
decide that you are willing to risk it, I might recommend this particular test 
case:

- increment some data at TS=X
- snapshot at TS=X
- increment some data at TS=X, now you have 2 * KV, TS=X
- flush to hfile
- increment at TS=X, ensure you dont lose any updates
- snapshot/flush with TS=X in kvset->snapshot->hfile
- compact the 2 hfiles both with TS=X in them
- do read and make sure the value is what you expect.

By using the manual environment edge you can control the currentTimeMillis() 
returned inside HRegion and down.

This covers the worst case scenario I think, since normally one would get 
increments and you shouldn't get TS=X * 2 in a single HFile. 


- Ryan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1088/#review1641
-----------------------------------------------------------





> Increment multiple columns in a row at once
> -------------------------------------------
>
>                 Key: HBASE-2946
>                 URL: https://issues.apache.org/jira/browse/HBASE-2946
>             Project: HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>
> Currently there is no way to do multiple increments to a single row in one 
> RPC.  This jira is about adding an HTable and HRegionInterface method to 
> increment multiple columns within a single row at once.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to