> On 2010-07-22 06:57:10, stack wrote:
> > src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 384
> > <http://review.hbase.org/r/354/diff/1/?file=3014#file3014line384>
> >
> >     Could this first kv on the row in memstore have a ts in advance of 
> > 'now'?  I suppose it can't -- least it shouldn't be possible, right?
> >     
> >     If  thousands of updates a second, could this be a prob?  This logic?

so what was happening previously:
- snapshot happens
- we do an increment which found there was no 'existing' TS in memstore to 
'update' so we make a new one with 'now'.
- we end up with two KVs, one in snapshot one in memstore both with the same 
timestamp.

Another issue (which the unit test perfectly tests for) is so:
- time=1 snapshot occurs with a KV ts=1 into snapshot
- time=1 ICV happens, but now the KV in memstore _must_ have a TS=2 (or else we 
get duplicate TS - bad!)
- time=1 ICV happens, but the KV in memstore is TS=2, but now=1, so we need to 
keep the max(now,TS in memstore) or else we get potential duplicates

In both these cases if we have a Put with ts=Now+X we have problems:
- in HRegion we do a Get and we see Value=100 ts=Now+X
- in Store we do updateColumnValue with Value=101, TS=now
- If we dont clear that ts=Now+X we will put a shadowed KV, and the _next_ ICV 
will see value '100' not value '101' and we will never actually 'increment' 
until we get past Now+X.  The compromise is to clear out all KVs in memstore 
period.

We can end up with a situation where the KV in memstore leads 'real time' by a 
millisecond during snapshots.  If we had a way of comprehensively deduping 
equivalent TSs during post-hoc reads perhaps we wouldnt need this.


> On 2010-07-22 06:57:10, stack wrote:
> > src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java, line 405
> > <http://review.hbase.org/r/354/diff/1/?file=3018#file3018line405>
> >
> >     Nice test

thanks, the EnvironmentEdge stuff is paying off already.


- Ryan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/354/#review450
-----------------------------------------------------------


On 2010-07-22 01:07:55, Ryan Rawson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.hbase.org/r/354/
> -----------------------------------------------------------
> 
> (Updated 2010-07-22 01:07:55)
> 
> 
> Review request for hbase.
> 
> 
> Summary
> -------
> 
> There are tricky edge cases that were removed by HBASE-2553 (oopsy!)... flaky 
> tests have been illustrating them. This patch fixes those flaky tests to be 
> not flaky (using the EnvironmentEdgeManager thing) and also fixes them, and 
> introduces tests that cover the particular use cases slightly better as well. 
> Oh yes and and fixes the actual bug.
> 
> Without these fixes we would end up with KVs with different values with the 
> same Timestamp which causes problems.  This can happen when we get more than 
> 1 increment/millisecond and especially during a snapshot.
> 
> 
> This addresses bug HBASE-2863.
>     http://issues.apache.org/jira/browse/HBASE-2863
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/hadoop/hbase/KeyValue.java e32d683 
>   src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 44fa0c3 
>   src/main/java/org/apache/hadoop/hbase/regionserver/Store.java c1ff9f2 
>   src/main/java/org/apache/hadoop/hbase/util/ManualEnvironmentEdge.java 
> PRE-CREATION 
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 4ead02d 
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java a32eed6 
> 
> Diff: http://review.hbase.org/r/354/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ryan
> 
>

Reply via email to