I think there are some hacky fixes in there to prevent duplicate timestamps.
With the new seqid, this should not be an issue. I think we should then be able to generate a single now() value and use it everywhere (MemStore, HLog). Duplicate timestamps won't be a problem as the seqid ordering will take care of it and we can still use the same logic to cleanup old values. A separate issue is how "atomic" we want counters to be. Currently the multi-column Increment operation does not utilize any of the RWCC stuff. I guess here we should just use seqid now? Is it one seqid per KV or per "row transaction"? JG > -----Original Message----- > From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack > Sent: Wednesday, January 12, 2011 11:54 AM > To: HBase Dev List > Subject: incrementColumnValue, WAL and timestamp > > This is how incrementColumnValue looks currently. I have a few questions > on it (I'm trying to add a sequence number to KeyValue -- hbase-2856). > > public long incrementColumnValue(byte [] row, byte [] family, > byte [] qualifier, long amount, boolean writeToWAL) > throws IOException { > ... > try { > ... > // Get the old value: > Get get = new Get(row); > get.addColumn(family, qualifier); > > .. > // build the KeyValue now: > KeyValue newKv = new KeyValue(row, family, > qualifier, EnvironmentEdgeManager.currentTimeMillis(), > Bytes.toBytes(result)); > > // Now log it: > if (writeToWAL) { > long now = EnvironmentEdgeManager.currentTimeMillis(); > WALEdit walEdit = new WALEdit(); > walEdit.add(newKv); > this.log.append(regionInfo, regionInfo.getTableDesc().getName(), > walEdit, now); > } > > // Now request the ICV to the store, this will set the timestamp > // appropriately depending on if there is a value in memcache or not. > // returns the change in the size of the memstore from operation > long size = store.updateColumnValue(row, family, qualifier, result); > > .... > } finally { > releaseRowLock(lid); > } > } finally { > closeRegionOperation(); > } > > .... > > return result; > } > > Ignoring stuff like the double > EnvironmentEdgeManager.currentTimeMillis() call -- i'll fix that -- the thing > that is of interest to me is this allowance that what is written to WAL is not > necessarily the edit that makes it out to the persisted store file (because > the > timestamp for the ICV is set later up in the memstore and the timestamp > may not be 'now' but rather will be the largest timestamp currently in > memstore). > > This seems off to me. What do you lot see as repercussions of my writing > into memstore the newKv we made above? Are increments going to come > in out of order, is that what we're trying to protect against? > Or are we trying to protect against some errant edit that has an inflated > timestamp up in memstore? > > Thanks, > St.Ack