Thanks for the info, Lars! In my potential use case for writing at the same timestamp, the values would always be the same anyway, so I should be fine.
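
For anyone reading along, here is a rough standalone sketch of the "first KV kept, later ones with the same TS skipped" behavior that Lars quotes below. This is not HBase code: the class `DuplicateTimestampSketch`, the `Cell` type, and the method `skipDuplicateTimestamps` are hypothetical names made up for illustration. It assumes the cells arrive already sorted in scan order for a single row/family/qualifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateTimestampSketch {

    /** Hypothetical stand-in for a KeyValue: only a timestamp and a value. */
    public static final class Cell {
        public final long timestamp;
        public final String value;

        public Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Walks the cells once, front to back, and drops every cell whose
     * timestamp equals that of the cell immediately before it. This imitates
     * the sameAsPreviousTS(...) check: the first cell seen for a timestamp
     * wins and later duplicates are skipped.
     */
    public static List<Cell> skipDuplicateTimestamps(List<Cell> cellsInScanOrder) {
        List<Cell> kept = new ArrayList<>();
        boolean havePrevious = false;
        long previousTs = 0L;
        for (Cell cell : cellsInScanOrder) {
            if (havePrevious && cell.timestamp == previousTs) {
                continue; // plays the role of returning MatchCode.SKIP
            }
            kept.add(cell);
            previousTs = cell.timestamp;
            havePrevious = true;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell(200L, "first at ts=200"),
            new Cell(200L, "second at ts=200"),
            new Cell(100L, "only one at ts=100"));
        for (Cell c : skipDuplicateTimestamps(cells)) {
            System.out.println(c.timestamp + " -> " + c.value);
        }
    }
}
```

Which of the duplicates counts as "first" depends on how the scanner orders cells with identical keys, which the sketch does not model. That ordering is exactly the part Lars says not to rely on.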

On Sat, Aug 25, 2012 at 9:12 PM, lars hofhansl <[email protected]> wrote:

> I checked the code to be sure...
>
> In ScanWildcardColumnTracker we have this:
>
>     if (sameAsPreviousTSAndType(timestamp, type)) {
>       return ScanQueryMatcher.MatchCode.SKIP;
>     }
>
> And in ExplicitColumnTracker there is this:
>
>     if (sameAsPreviousTS(timestamp)) {
>       //If duplicate, skip this Key
>       return ScanQueryMatcher.MatchCode.SKIP;
>     }
>
> I.e. the first KV is kept and the subsequent ones (with the same TS) are
> skipped.
>
> My point remains, though: Do not rely on this.
> (Though it will probably stay the way it is, because that is the most
> efficient way to handle this in forward-only scanners.)
>
> -- Lars
>
> ________________________________
> From: Tom Brown <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl
> <[email protected]>
> Sent: Saturday, August 25, 2012 4:54 PM
> Subject: Re: MemStore and prefix encoding
>
> I thought when multiple values with the same key, family, qualifier and
> timestamp were written, the one that was written latest (as determined by
> position in the store) would be read. Is that not the case?
>
> --Tom
>
> On Saturday, August 25, 2012, lars hofhansl <[email protected]> wrote:
>> The prefix encoding applies to blocks in the HFiles and in the block cache,
>> but not to the memstore.
>>
>> #1 Yes. Each column family is its own store. All stores are flushed
>> together, so having many adds overhead (especially if a few tend to hold a
>> lot of data, but the others don't, leading to very many small store files
>> that need to be compacted).
>> #2 There is only one key with the same key, column family, qualifier, and
>> timestamp (if you write multiple with the same timestamp it is undefined
>> which one you'll get back when you read the next time). So that does not
>> make sense. Writes with the same key, column family, qualifier (each with a
>> different timestamp) count towards the version limit.
>>
>> -- Lars
>>
>> ----- Original Message -----
>> From: Eric Czech <[email protected]>
>> To: user <[email protected]>
>> Cc:
>> Sent: Saturday, August 25, 2012 2:44 PM
>> Subject: MemStore and prefix encoding
>>
>> Hi everyone,
>>
>> Does prefix encoding apply to rows in MemStores or does it only apply
>> to rows on disk in HFiles? I'm trying to decide if I should still
>> favor larger values in order to not repeat keys, column families, and
>> qualifiers more than necessary and while prefix encoding seems to
>> negate that concern for storage on disk, I'm not sure if it's still
>> applicable to in-memory storage.
>>
>> Also, I had two other quick (unrelated) questions and I assume it'd be
>> less annoying if I put them all in one email:
>>
>> 1. Do column families defined for a table introduce any overhead for
>> rows that don't put any values in them? I don't think that's the case
>> but I wanted to be sure.
>>
>> 2. Do writes with the same key, column family, qualifier, and
>> timestamp count towards the version limit?
>>
>> Thanks for the help!
