I think there are some hacky fixes in there to prevent duplicate timestamps.

With the new seqid, this should not be an issue.  I think we should then be 
able to generate a single now() value and use it everywhere (MemStore, HLog).  
Duplicate timestamps won't be a problem as the seqid ordering will take care of 
it and we can still use the same logic to cleanup old values.

A separate issue is how "atomic" we want counters to be.  Currently the 
multi-column Increment operation does not utilize any of the RWCC stuff.  I 
guess here we should just use seqid now?  Is it one seqid per KV or per "row 
transaction"?

JG

> -----Original Message-----
> From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
> Sent: Wednesday, January 12, 2011 11:54 AM
> To: HBase Dev List
> Subject: incrementColumnValue, WAL and timestamp
> 
> This is how incrementColumnValue looks currently.  I have a few questions
> on it (I'm trying to add a sequence number to KeyValue -- hbase-2856).
> 
>   public long incrementColumnValue(byte [] row, byte [] family,
>       byte [] qualifier, long amount, boolean writeToWAL)
>   throws IOException {
>     ...
>     try {
>       ...
>         // Get the old value:
>         Get get = new Get(row);
>         get.addColumn(family, qualifier);
> 
> ..
>         // build the KeyValue now:
>         KeyValue newKv = new KeyValue(row, family,
>             qualifier, EnvironmentEdgeManager.currentTimeMillis(),
>             Bytes.toBytes(result));
> 
>         // Now log it:
>         if (writeToWAL) {
>           long now = EnvironmentEdgeManager.currentTimeMillis();
>           WALEdit walEdit = new WALEdit();
>           walEdit.add(newKv);
>           this.log.append(regionInfo, regionInfo.getTableDesc().getName(),
>             walEdit, now);
>         }
> 
>         // Now request the ICV to the store, this will set the timestamp
>         // appropriately depending on if there is a value in memcache or not.
>         // returns the change in the size of the memstore from operation
>         long size = store.updateColumnValue(row, family, qualifier, result);
> 
> ....
>       } finally {
>         releaseRowLock(lid);
>       }
>     } finally {
>       closeRegionOperation();
>     }
> 
> ....
> 
>     return result;
>   }
> 
> Ignoring stuff like the double
> EnvironmentEdgeManager.currentTimeMillis() call -- i'll fix that -- the thing
> that is of interest to me is this allowance that what is written to WAL is not
> necessarily the edit that makes it out to the persisted store file (because 
> the
> timestamp for the ICV is set later up in the memstore and the timestamp
> may not be 'now' but rather will be the largest timestamp currently in
> memstore).
> 
> This seems off to me.  What do you lot see as repercussions of my writing
> into memstore the newKv we made above?  Are increments going to come
> in out of order, is that what we're trying to protect against?
>  Or are we trying to protect against some errant edit that has an inflated
> timestamp up in memstore?
> 
> Thanks,
> St.Ack

Reply via email to