Oops. The KVs are sorted in reverse chronological order. So I was wrong. It'll 
return the newest version.

Sorry about that confusion. The book is correct.
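
For anyone reading this in the archive, here is a minimal sketch of how the two points in this thread fit together: the trackers keep the first KV they see for a given timestamp and skip the rest, and because the newest write sorts first, the newest value is the one returned. This is not the HBase code. The Cell class, the writeSeq field and the scan method are hypothetical names made up for illustration, and breaking timestamp ties by write order is an assumption based on the "reverse chronological" sorting described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SameTimestampSketch {
    // Hypothetical stand-in for a KV of a single row/family/qualifier.
    static final class Cell {
        final long timestamp;
        final long writeSeq;   // assumed: higher means written later
        final String value;

        Cell(long timestamp, long writeSeq, String value) {
            this.timestamp = timestamp;
            this.writeSeq = writeSeq;
            this.value = value;
        }
    }

    // Newest timestamp first; for equal timestamps, latest write first (assumption).
    static final Comparator<Cell> ORDER = (a, b) -> {
        int byTimestamp = Long.compare(b.timestamp, a.timestamp);
        return byTimestamp != 0 ? byTimestamp : Long.compare(b.writeSeq, a.writeSeq);
    };

    // Forward-only pass: keep the first cell per timestamp, skip later duplicates.
    static List<String> scan(List<Cell> cells) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(ORDER);

        List<String> result = new ArrayList<>();
        boolean havePrevious = false;
        long previousTimestamp = 0L;
        for (Cell cell : sorted) {
            if (havePrevious && cell.timestamp == previousTimestamp) {
                continue; // same timestamp as the previous cell: skip it
            }
            result.add(cell.value);
            havePrevious = true;
            previousTimestamp = cell.timestamp;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(100L, 1L, "old"));
        cells.add(new Cell(100L, 2L, "new"));      // same timestamp, written later
        cells.add(new Cell(90L, 3L, "earlier"));   // a genuinely older version
        System.out.println(scan(cells));
    }
}
```

The skip needs only the previous timestamp, which is why it suits a forward-only scanner: nothing has to be buffered or revisited.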

-- Lars


Tom Brown <[email protected]> wrote:

>Lars,
>
>I have been relying on the expected behavior (if I write another cell
>with the same {key, family, qualifier, version} it won't return the
>previous one) so your answer was confusing to me. I did more
>research and I found that the HBase guide specifies that behavior (see
>section 5.8.1 of http://hbase.apache.org/book.html).
>
>Have I misunderstood something? Can I rely on behavior that is
>specified in the guide?
>
>Thanks again!
>
>--Tom
>
>On Sun, Aug 26, 2012 at 6:43 AM, Eric Czech <[email protected]> wrote:
>> Thanks for the info, Lars!
>>
>> In the potential use case I have for writing at the same timestamp,
>> the values would always be the same anyways so I should be good.
>>
>> On Sat, Aug 25, 2012 at 9:12 PM, lars hofhansl <[email protected]> wrote:
>>> I checked the code to be sure...
>>>
>>>
>>> In ScanWildcardColumnTracker we have this:
>>>
>>>       if (sameAsPreviousTSAndType(timestamp, type)) {
>>>         return ScanQueryMatcher.MatchCode.SKIP;
>>>       }
>>>
>>>
>>> And in ExplicitColumnTracker there is this:
>>>
>>>         if (sameAsPreviousTS(timestamp)) {
>>>           //If duplicate, skip this Key
>>>           return ScanQueryMatcher.MatchCode.SKIP;
>>>         }
>>>
>>>
>>> I.e. the first KV is kept and the subsequent ones (with the same TS) are 
>>> skipped.
>>>
>>> My point remains, though: Do not rely on this.
>>> (Though it will probably stay the way it is, because that is the most 
>>> efficient way to handle this in forward only scanners.)
>>>
>>> -- Lars
>>>
>>>
>>>
>>> ________________________________
>>>  From: Tom Brown <[email protected]>
>>> To: "[email protected]" <[email protected]>; lars hofhansl 
>>> <[email protected]>
>>> Sent: Saturday, August 25, 2012 4:54 PM
>>> Subject: Re: MemStore and prefix encoding
>>>
>>>
>>> I thought when multiple values with the same key, family, qualifier and 
>>> timestamps were written, the one that was written latest (as determined by 
>>> position in the store) would be read. Is that not the case?
>>>
>>> --Tom
>>>
>>> On Saturday, August 25, 2012, lars hofhansl <[email protected]> wrote:
>>>> The prefix encoding applies to blocks in the HFiles and in the block 
>>>> cache, but not to the memstore.
>>>>
>>>>
>>>> #1 Yes. Each column family is its own store. All stores are flushed 
>>>> together, so having many adds overhead (especially if a few tend to hold a 
>>>> lot of data, but the others don't, leading to very many small store files 
>>>> that need to be compacted).
>>>> #2 There is only one key with the same key, column family, qualifier, and 
>>>> timestamp (if you write multiple with the same timestamp it is undefined 
>>>> which one you'll get back when you read the next time). So that does not 
>>>> make sense. Writes with the same key, column family, qualifier (each with 
>>>> a different timestamp) count towards the version limit.
>>>>
>>>> -- Lars
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Eric Czech <[email protected]>
>>>> To: user <[email protected]>
>>>> Cc:
>>>> Sent: Saturday, August 25, 2012 2:44 PM
>>>> Subject: MemStore and prefix encoding
>>>>
>>>> Hi everyone,
>>>>
>>>> Does prefix encoding apply to rows in MemStores or does it only apply
>>>> to rows on disk in HFiles?  I'm trying to decide if I should still
>>>> favor larger values in order to not repeat keys, column families, and
>>>> qualifiers more than necessary and while prefix encoding seems to
>>>> negate that concern for storage on disk, I'm not sure if it's still
>>>> applicable to in-memory storage.
>>>>
>>>> Also, I had two other quick (unrelated) questions and I assume it'd be
>>>> less annoying if I put them all in one email:
>>>>
>>>> 1.  Do column families defined for a table introduce any overhead for
>>>> rows that don't put any values in them?  I don't think that's the case
>>>> but I wanted to be sure.
>>>>
>>>> 2.  Do writes with the same key, column family, qualifier, and
>>>> timestamp count towards the version limit?
>>>>
>>>> Thanks for the help!
>>>>
>>>>
