Re: Custom preCompact RegionObserver crashes entire cluster on OOME: Heap Space

Mesika, Asaf Tue, 12 Feb 2013 06:41:53 -0800

I'm seeing very strange behavior:

If I run a scan during a major compaction, I can see both the modified Delta 
KeyValue (which contains the aggregated value, e.g. 9) and the other two delta 
columns that were used for this aggregated column (e.g. 3, 3), as if the Scan 
is exposed to the KeyValues produced mid-scan.
Could it be related to the cache somehow?


In preCompact, I am modifying the KeyValue object received from the 
InternalScanner (changing its value in place).
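
To make the suspicion concrete, here is a minimal sketch of the aliasing I have in mind. It does not use HBase at all: CellView, blockOfThreeDeltas and readerSum are hypothetical names, and the assumption (which I have not verified against the 0.94 code) is that the KeyValue handed out by the scanner is a view over a byte array that a concurrent Scan may also be reading, for example a cached block. Under that assumption, overwriting the first delta with the aggregate (9) lets a reader see 9, 3, 3, which matches what I observe.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a KeyValue: a view over a byte array it does not own.
// This is NOT the HBase class; it only models "the value lives at an offset
// inside a shared buffer".
final class CellView {
    private final byte[] buffer;
    private final int valueOffset;

    CellView(byte[] buffer, int valueOffset) {
        this.buffer = buffer;
        this.valueOffset = valueOffset;
    }

    long value() {
        return ByteBuffer.wrap(buffer, valueOffset, Long.BYTES).getLong();
    }

    // Overwrites the value bytes inside the backing array, so every view
    // of that array sees the new value.
    void overwriteValue(long v) {
        ByteBuffer.wrap(buffer, valueOffset, Long.BYTES).putLong(v);
    }

    // Returns a view over a private copy of the value bytes.
    CellView copy() {
        byte[] own = new byte[Long.BYTES];
        System.arraycopy(buffer, valueOffset, own, 0, Long.BYTES);
        return new CellView(own, 0);
    }
}

final class InPlaceMutationDemo {
    // Three delta values of 3 stored back to back, standing in for a shared block.
    static byte[] blockOfThreeDeltas() {
        return ByteBuffer.allocate(3 * Long.BYTES).putLong(3).putLong(3).putLong(3).array();
    }

    // Sum seen by a second reader (the "Scan") over the same block.
    static long readerSum(byte[] block) {
        long sum = 0;
        for (int i = 0; i < 3; i++) {
            sum += new CellView(block, i * Long.BYTES).value();
        }
        return sum;
    }

    // The "compaction" side rewrites the first delta in place.
    static long sumAfterInPlaceWrite() {
        byte[] block = blockOfThreeDeltas();
        new CellView(block, 0).overwriteValue(9);
        return readerSum(block);
    }

    // The "compaction" side copies first, then writes to its own copy.
    static long sumAfterCopyThenWrite() {
        byte[] block = blockOfThreeDeltas();
        CellView aggregate = new CellView(block, 0).copy();
        aggregate.overwriteValue(9);
        return readerSum(block);
    }

    public static void main(String[] args) {
        System.out.println(sumAfterInPlaceWrite());  // 15: reader sees 9, 3, 3
        System.out.println(sumAfterCopyThenWrite()); // 9: reader still sees 3, 3, 3
    }
}
```

If the assumption holds, copying the KeyValue before changing its value would keep the Scan from seeing the half-aggregated state, at the cost of one extra allocation per aggregate.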

On Feb 12, 2013, at 11:22 AM, Anoop Sam John wrote:

>> The question is: is it "legal" to change a KV I received from the 
>> InternalScanner before adding it to the Result, i.e. returning it from my 
>> own InternalScanner?
> 
> You can change as per your need IMO
> 
> -Anoop-
> 
> ________________________________________
> From: Mesika, Asaf [[email protected]]
> Sent: Tuesday, February 12, 2013 2:43 PM
> To: [email protected]
> Subject: Re: Custom preCompact RegionObserver crashes entire cluster on OOME: 
> Heap Space
> 
> I am trying to reduce the amount of KeyValue generated during the preCompact, 
> but I'm getting some weird behaviors.
> 
> Let me describe what I am doing in short:
> 
> We have a counters table, with the following structure:
> 
> RowKey =  A combination of field values representing group by key.
> CF = time span aggregate (Hour, Day, Month). Currently we only have Hour.
> CQ = Round-to-Hour timestamp (long).
> Value = The count
> 
> We collect raw data and update the counters table for the matched group-by 
> key and hour.
> We tried using Increment, but discovered it's very slow.
> Instead we've decided to update the counters upon compaction. We write the 
> deltas into the same row-key, but a longer column qualifier: 
> <RoundedToTheHourTS><Type><UniqueId>.
> <Type> is: Delta or Aggregate.
> Delta stands for a delta column qualifier we send from our client.
> 
> In preCompact, I create an InternalScanner which aggregates the delta 
> column qualifier values and generates a new key value with Type Aggregate: 
> <TS><A><UniqueID>
> 
> The problem with this implementation is that it consumes more memory.
> 
> Now, I've tried avoiding the creation of the Aggregate type KV, by simply 
> re-using the 1st delta column qualifier: simply changing its value in the 
> KeyValue.
> But for some reason, after a couple of minor / major compactions, I see data 
> loss when I sum the values and compare them to the expected totals.
> 
> 
> The question is: is it "legal" to change a KV I received from the 
> InternalScanner before adding it to the Result, i.e. returning it from my 
> own InternalScanner?
> 
> 
> 
> 
> 
> 
> On Feb 12, 2013, at 8:44 AM, Anoop Sam John wrote:
> 
>> Asaf,
>> Have you created a wrapper around the original InternalScanner 
>> instance created by the compaction flow?
>> 
>>> Where do the KV generated during the compaction process queue up before 
>>> being written to the disk? Is this buffer configurable?
>>> When I wrote the RegionObserver, my assumption was that the compaction 
>>> process works in a streaming fashion, so even if I generate one KV 
>>> per KV I see, it still shouldn't be a problem memory-wise.
>> 
>> There is no queuing. Your assumption is correct. Each KV is written to the 
>> writer as soon as it is produced (just as a memstore flush writes to the 
>> HFile). As Lars said, a look at your code can tell if something is going 
>> wrong. Are you using blooms?
>> 
>> -Anoop-
>> ________________________________________
>> From: Mesika, Asaf [[email protected]]
>> Sent: Tuesday, February 12, 2013 11:16 AM
>> To: [email protected]
>> Subject: Custom preCompact RegionObserver crashes entire cluster on OOME: 
>> Heap Space
>> 
>> Hi,
>> 
>> I wrote a RegionObserver which implements preCompact.
>> I activated it in pre-production, and then the entire cluster dropped dead: 
>> one RegionServer after another crashed with OutOfMemoryError: Java heap space.
>> 
>> My preCompact method generates a KeyValue for each set of Column Qualifiers 
>> it sees.
>> When I remove the coprocessor and restart the cluster, the cluster remains 
>> stable.
>> I have 8 RS, each with a 4 GB heap. There are about 9 regions (from the 
>> specific table I'm working on) per Region Server.
>> Running HBase 0.94.3
>> 
>> The crashes occur when the major compaction fires up, apparently cluster-wide.
>> 
>> 
>> My question is this: Where do the KV generated during the compaction process 
>> queue up before being written to the disk? Is this buffer configurable?
>> When I wrote the RegionObserver, my assumption was that the compaction 
>> process works in a streaming fashion, so even if I generate one KV 
>> per KV I see, it still shouldn't be a problem memory-wise.
>> 
>> Of course, I'm trying to improve my code so it generates far fewer new KVs 
>> (by simply altering the existing KVs received from the InternalScanner).
>> 
>> Thank you,
>> 
>> Asaf
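
For readers following the thread, the delta-aggregation scheme described in the quoted messages (qualifiers of the form <RoundedToTheHourTS><Type><UniqueId>, with the wrapper scanner folding all columns of one hour into a single Aggregate column) can be sketched in a streaming style roughly as below. This is only an illustration, not the poster's coprocessor and not HBase API: Column, HourlyAggregator and HourlyAggregatorDemo are hypothetical names, the input is assumed to arrive sorted by hour as qualifiers would be within a row, and reusing the first column's unique id for the aggregate is an assumption. The point is that only one running sum and one read-ahead column are held at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical decoded column: <RoundedToTheHourTS><Type><UniqueId> plus its count.
final class Column {
    final long hourTs;      // timestamp rounded to the hour
    final char type;        // 'D' = Delta, 'A' = Aggregate
    final String uniqueId;
    final long value;

    Column(long hourTs, char type, String uniqueId, long value) {
        this.hourTs = hourTs;
        this.type = type;
        this.uniqueId = uniqueId;
        this.value = value;
    }
}

// Wraps a source iterator (sorted by hourTs) and emits one Aggregate column per hour.
final class HourlyAggregator implements Iterator<Column> {
    private final Iterator<Column> source;
    private Column pending; // first column of the next hour, read ahead

    HourlyAggregator(Iterator<Column> source) {
        this.source = source;
        this.pending = source.hasNext() ? source.next() : null;
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public Column next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        long hour = pending.hourTs;
        String id = pending.uniqueId; // assumption: reuse the first column's id
        long sum = pending.value;
        pending = null;
        // Fold in every further column of the same hour (earlier aggregates and deltas alike).
        while (source.hasNext()) {
            Column c = source.next();
            if (c.hourTs != hour) {
                pending = c; // belongs to the next hour; keep it for the next call
                break;
            }
            sum += c.value;
        }
        return new Column(hour, 'A', id, sum);
    }
}

final class HourlyAggregatorDemo {
    // Convenience helper for small inputs; a real wrapper would stay an iterator.
    static List<Column> aggregate(List<Column> sortedByHour) {
        List<Column> out = new ArrayList<>();
        Iterator<Column> it = new HourlyAggregator(sortedByHour.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Column> in = new ArrayList<>();
        in.add(new Column(0L, 'D', "a", 3));
        in.add(new Column(0L, 'D', "b", 3));
        in.add(new Column(0L, 'D', "c", 3));
        in.add(new Column(3_600_000L, 'D', "d", 5));
        for (Column c : aggregate(in)) {
            System.out.println(c.hourTs + " " + c.type + " " + c.value);
        }
    }
}
```

Because each call to next() returns a newly allocated Column instead of rewriting one received from the source, this variant avoids the in-place modification discussed at the top of the thread, at the cost of one extra object per hour.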
