Thanks for the info, Lars!

In the potential use case I have for writing at the same timestamp,
the values would always be the same anyway, so I should be good.
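
For anyone reading this thread later, here is a minimal sketch of the skip behavior Lars describes below. It is not HBase code: the Cell class and the skipDuplicateTimestamps method are hypothetical names made up for illustration, and the sketch assumes the cells of one column arrive already sorted newest-first, as a forward-only scanner would see them. It does not model which of two same-timestamp writes sorts first, which is exactly the part that should not be relied on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateTimestampSketch {

    // Hypothetical stand-in for a KeyValue: only the fields this sketch needs.
    static final class Cell {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Keeps the first cell seen for each timestamp and skips later cells that
    // carry the same timestamp. Assumes the input holds the cells of a single
    // column in scan order, so equal timestamps are adjacent.
    static List<Cell> skipDuplicateTimestamps(List<Cell> cellsInScanOrder) {
        List<Cell> kept = new ArrayList<>();
        boolean seenAny = false;
        long previousTimestamp = 0L;
        for (Cell cell : cellsInScanOrder) {
            if (seenAny && cell.timestamp == previousTimestamp) {
                continue; // same timestamp as the previous cell: skip it
            }
            kept.add(cell);
            previousTimestamp = cell.timestamp;
            seenAny = true;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> scanned = Arrays.asList(
                new Cell(20L, "a"),
                new Cell(20L, "b"),
                new Cell(10L, "c"));
        for (Cell cell : skipDuplicateTimestamps(scanned)) {
            System.out.println(cell.timestamp + " " + cell.value);
        }
    }
}
```

With the three cells in main, only the first cell at timestamp 20 and the cell at timestamp 10 are kept. If both writes at a given timestamp carry the same value, as in my case, it does not matter which one is skipped.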

On Sat, Aug 25, 2012 at 9:12 PM, lars hofhansl <[email protected]> wrote:
> I checked the code to be sure...
>
>
> In ScanWildcardColumnTracker we have this:
>
>       if (sameAsPreviousTSAndType(timestamp, type)) {
>         return ScanQueryMatcher.MatchCode.SKIP;
>       }
>
>
> And in ExplicitColumnTracker there is this:
>
>         if (sameAsPreviousTS(timestamp)) {
>           //If duplicate, skip this Key
>           return ScanQueryMatcher.MatchCode.SKIP;
>         }
>
>
> I.e. the first KV is kept and the subsequent ones (with the same TS) are 
> skipped.
>
> My point remains, though: Do not rely on this.
> (Though it will probably stay the way it is, because that is the most 
> efficient way to handle this in forward only scanners.)
>
> -- Lars
>
>
>
> ________________________________
>  From: Tom Brown <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl 
> <[email protected]>
> Sent: Saturday, August 25, 2012 4:54 PM
> Subject: Re: MemStore and prefix encoding
>
>
> I thought that when multiple values with the same key, family, qualifier, and 
> timestamp were written, the one that was written last (as determined by 
> position in the store) would be read. Is that not the case?
>
> --Tom
>
> On Saturday, August 25, 2012, lars hofhansl <[email protected]> wrote:
>> The prefix encoding applies to blocks in the HFiles and in the block cache, 
>> but not to the memstore.
>>
>>
>> #1 Yes. Each column family is its own store. All stores are flushed 
>> together, so having many column families adds overhead (especially if a few 
>> tend to hold a lot of data but the others don't, which leads to very many 
>> small store files that need to be compacted).
>> #2 There is only one value for a given key, column family, qualifier, and 
>> timestamp (if you write multiple values with the same timestamp, it is 
>> undefined which one you'll get back the next time you read). So the 
>> question does not really apply. Writes with the same key, column family, 
>> and qualifier (each with a different timestamp) count towards the version 
>> limit.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Eric Czech <[email protected]>
>> To: user <[email protected]>
>> Cc:
>> Sent: Saturday, August 25, 2012 2:44 PM
>> Subject: MemStore and prefix encoding
>>
>> Hi everyone,
>>
>> Does prefix encoding apply to rows in MemStores or does it only apply
>> to rows on disk in HFiles?  I'm trying to decide if I should still
>> favor larger values in order to not repeat keys, column families, and
>> qualifiers more than necessary and while prefix encoding seems to
>> negate that concern for storage on disk, I'm not sure if it's still
>> applicable to in-memory storage.
>>
>> Also, I had two other quick (unrelated) questions and I assume it'd be
>> less annoying if I put them all in one email:
>>
>> 1.  Do column families defined for a table introduce any overhead for
>> rows that don't put any values in them?  I don't think that's the case
>> but I wanted to be sure.
>>
>> 2.  Do writes with the same key, column family, qualifier, and
>> timestamp count towards the version limit?
>>
>> Thanks for the help!
>>
>>
