On 12/08/2012 11:50 AM, Bryan Beaudreault wrote:
Thanks for the responses guys. Responses inline
When you are doing the bulk load, are you pre-splitting your regions?
What OS are you using and what version of Java?
Yes, regions are pre-split. We calculated them using M/R before attempting
to bulk load the data. We've done this before with smaller sizes and it
has worked fine.
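For reference, the boundary calculation can be sketched as follows (a minimal Python sketch, not the actual M/R job — in practice the row-key sample would come out of the MapReduce pass):

```python
def split_points(sampled_keys, num_regions):
    """Pick num_regions - 1 boundary keys from a sample of row keys.

    Each boundary becomes a region start key when the table is pre-created,
    so the bulk-loaded HFiles land in already-existing regions instead of
    forcing splits during the load.
    """
    keys = sorted(sampled_keys)
    step = len(keys) / float(num_regions)
    # Take evenly spaced keys, skipping index 0: the table's implicit empty
    # start key already covers everything before the first boundary.
    return [keys[int(step * i)] for i in range(1, num_regions)]

sample = ["row%04d" % i for i in range(1000)]
split_points(sample, 4)  # → ['row0250', 'row0500', 'row0750']
```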
Centos5, java 1.6.0_27
Yes, my friend. You should look into all the benefits of the new stable
release (0.94.3), so
that is my first piece of advice.
We use CDH currently, so we are working to move to CDH 4.1.2, which is on the
0.92.x branch.
Great to hear.
On Fri, Dec 7, 2012 at 4:48 PM, Stack <[email protected]> wrote:
On Fri, Dec 7, 2012 at 1:01 PM, Bryan Beaudreault
<[email protected]>wrote:
We have a couple of tables that had thousands of regions due to the size of
the data in them. We recently changed them to have larger regions (nearly
4GB). We are trying to bulk load the data in now, but every time we do, our
servers die with an OOM.
You mean, you are reloading the data that once was in thousands of regions
instead into new regions of 4GB in size?
I'd be surprised if the actual bulk load brings on the OOME.
That's correct. The exact same data is currently live in an older table
with thousands of smaller regions. Once we get these loaded we will swap
in the new table and delete the old.
The logs seem to show that there is always a major compaction happening
when the OOM happens. This is among other normal usage from a variety of
apps in our product, so the memstores, block cache, etc are all active
during this time.
Could you turn off major compaction during the bulk load to see if that
helps?
Automatic major compactions are actually off for our cluster, it looks
like they start doing minor compactions as data is loaded in, and that is
where we first saw the OOM issues. So we tried forcing major compactions
earlier instead.
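For completeness, time-based major compactions are typically switched off with this hbase-site.xml setting (minor compactions triggered by store-file count still run; `hbase.hstore.compactionThreshold` governs those):

```xml
<!-- hbase-site.xml -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <!-- 0 disables the periodic time-based major compactions -->
  <value>0</value>
</property>
```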
I was reading through the compaction code and it doesn't look like it
should take up much memory (depending on how the Reader class works).
Yes.
Are there lots of storefiles under each region?
Yes actually, the bulk loaded data usually seems to contain approximately
5-10 files per region. Likely due to the output settings of the M/R job
that creates this data.
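A quick way to confirm those per-region file counts is to group an HDFS listing by region directory. This sketch assumes the pre-0.92 layout `/hbase/<table>/<region>/<family>/<hfile>`; adjust the path index if your layout differs:

```python
from collections import defaultdict

def storefiles_per_region(hfile_paths):
    """Count HFiles per region from a flat list of HDFS paths.

    Assumes paths of the form /hbase/<table>/<region>/<family>/<hfile>.
    """
    counts = defaultdict(int)
    for path in hfile_paths:
        parts = path.strip("/").split("/")
        region = parts[2]  # hbase / table / region / family / file
        counts[region] += 1
    return dict(counts)

paths = [
    "/hbase/t1/region-a/cf/f1",
    "/hbase/t1/region-a/cf/f2",
    "/hbase/t1/region-b/cf/f1",
]
storefiles_per_region(paths)  # → {'region-a': 2, 'region-b': 1}
```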
Does anyone with more knowledge of these internals know how bulk load
and major compaction work with regard to memory?
We are running on ec2 c1.xlarge servers with 5GB of heap, and on hbase
version 0.90.4 (I know, I know, we're working to upgrade).
How much have you given hbase?
If you look at your cluster monitoring, are you swapping?
The regionservers are carrying how many regions per server?
The RegionServers have 5GB of heap (7.5GB total memory on a c1.xlarge, of
which 1GB goes to the DataNode and the rest to the OS).
Swapping is disabled.
We have around 350 regions per RS currently. What we're doing now with this
table is part of our effort to decrease the number of regions across all
tables. We need to do it with minimal downtime though so it is slow going.
We are aiming for around 200 regions per RS.
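As a rough sanity check on those numbers (assuming the 0.90-era defaults of 0.4 for `hbase.regionserver.global.memstore.upperLimit` and 0.2 for `hfile.block.cache.size`):

```python
def heap_budget(heap_gb, memstore_frac=0.40, blockcache_frac=0.20):
    """Rough split of a RegionServer heap under the assumed defaults.

    Returns (memstore ceiling, block cache, everything else) in GB.
    """
    memstore = heap_gb * memstore_frac
    blockcache = heap_gb * blockcache_frac
    return memstore, blockcache, heap_gb - memstore - blockcache

memstore, cache, rest = heap_budget(5.0)
# With a 5 GB heap: 2.0 GB memstore ceiling, 1.0 GB block cache, and only
# 2.0 GB left for compaction readers, RPC buffers, and GC headroom --
# which is why heavy compaction on top of live traffic can push it over.
```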
Yes, it would be nice to see fewer regions per server. Have you
considered merging some adjacent
regions?
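Picking which adjacent regions to combine can be sketched as a simple greedy pass over the regions in row-key order (a hypothetical helper, not an HBase API — the actual merge would then be done with a tool such as the offline Merge utility):

```python
def pick_merge_pairs(regions, target_bytes):
    """Greedily pair adjacent regions whose combined size stays under target.

    `regions` is a list of (name, size_bytes) tuples in row-key order.
    Returns the adjacent pairs that could be merged without exceeding
    the target region size.
    """
    pairs, i = [], 0
    while i < len(regions) - 1:
        (a, size_a), (b, size_b) = regions[i], regions[i + 1]
        if size_a + size_b <= target_bytes:
            pairs.append((a, b))
            i += 2  # both regions are consumed by this merge
        else:
            i += 1
    return pairs

regions = [("r1", 1), ("r2", 2), ("r3", 5), ("r4", 1)]
pick_merge_pairs(regions, 4)  # → [('r1', 'r2')]
```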
St.Ack
10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS
INFORMATICAS...
CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci
--
Marcos Luis Ortíz Valmaseda
about.me/marcosortiz <http://about.me/marcosortiz>
@marcosluis2186 <http://twitter.com/marcosluis2186>