I won't say you're crazy, but 0.5 GB per mapper?

I would tune conservatively, as you are suggesting with 1 GB for the OS, but 
I'd also suggest targeting 80% memory utilization instead of 100%.

> From: [email protected]
> To: [email protected]
> Date: Tue, 23 Aug 2011 16:35:22 -0700
> Subject: RE: how to make tuning for hbase (every couple of days hbase region 
> server(s) crash)
> 
> So, if you use 0.5 GB / mapper and 1 GB / reducer, your total memory 
> consumption (minus hbase) on a slave node should be:
> 4 GB M/R tasks
> 1 GB OS -- just a guess
> 1 GB datanode
> 1 GB tasktracker 
> Leaving you with up to 9 GB for your region servers.  I would suggest bumping 
> your region server RAM up to 8 GB, and leaving a GB for OS caching. [I am sure 
> someone out there will tell me I am crazy]
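The budget described above can be checked with quick back-of-the-envelope arithmetic (a sketch using only the figures from this thread; the 1 GB OS/daemon numbers are the rough guesses stated above):

```python
# Back-of-the-envelope memory budget for one 16 GB slave node,
# using the task counts and heap sizes discussed in this thread.
total_ram_gb = 16

maps_per_node, map_heap_gb = 4, 0.5
reducers_per_node, reduce_heap_gb = 2, 1.0

# Total M/R task memory: 4 * 0.5 + 2 * 1.0 = 4 GB
mr_gb = maps_per_node * map_heap_gb + reducers_per_node * reduce_heap_gb

os_gb = 1           # rough guess, per the thread
datanode_gb = 1
tasktracker_gb = 1

left_for_regionserver_gb = (total_ram_gb - mr_gb - os_gb
                            - datanode_gb - tasktracker_gb)
print(left_for_regionserver_gb)  # 9.0 -> room to grow the region server heap
```

That 9 GB headroom is what motivates the suggestion of an 8 GB region server heap with 1 GB left for OS caching.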
> 
> 
> However, it is the log that is the most useful part of your email.  
> Unfortunately I haven't seen that error before.
> Are you using the Multi methods a lot in your code?
> 
> Dave
> 
> -----Original Message-----
> From: Oleg Ruchovets [mailto:[email protected]] 
> Sent: Tuesday, August 23, 2011 1:38 PM
> To: [email protected]
> Subject: Re: how to make tuning for hbase (every couple of days hbase region 
> server(s) crash)
> 
> Thank you for detailed response,
> 
> On Tue, Aug 23, 2011 at 7:49 PM, Buttler, David <[email protected]> wrote:
> 
> > Have you looked at the logs of the region servers?  That is a good first
> > place to look.
> 
> How many regions are in your system?
> 
> 
>          Region Servers
> 
> Address Start Code Load
> hadoop01 1314007529600 requests=0, regions=212, usedHeap=3171, maxHeap=3983
> hadoop02 1314007496109 requests=0, regions=207, usedHeap=2185, maxHeap=3983
> hadoop03 1314008874001 requests=0, regions=208, usedHeap=1955, maxHeap=3983
> hadoop04 1314008965432 requests=0, regions=209, usedHeap=2034, maxHeap=3983
> hadoop05 1314007496533 requests=0, regions=208, usedHeap=1970, maxHeap=3983
> hadoop06 1314008874036 requests=0, regions=208, usedHeap=1987, maxHeap=3983
> hadoop07 1314007496927 requests=0, regions=209, usedHeap=2118, maxHeap=3983
> hadoop08 1314007497034 requests=0, regions=211, usedHeap=2568, maxHeap=3983
> hadoop09 1314007497221 requests=0, regions=209, usedHeap=2148, maxHeap=3983
> master   1314008873765 requests=0, regions=208, usedHeap=2007, maxHeap=3962
> Total: servers: 10  requests=0, regions=2089
> 
> Most of the time GC succeeds in cleaning up, but every 3-4 days the used
> memory gets close to 4 GB,
> 
> and there are a lot of exceptions like this:
> 
>    org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@491fb2f4)
> from 10.11.87.73:33737: output error
> 2011-08-14 18:37:36,264 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 24 on 8041 caught: java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>         at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1387)
>         at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1339)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
> 
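Since the heap is creeping toward its 4 GB ceiling between collections, the region server's GC settings are worth a look alongside the allocation. A hypothetical hbase-env.sh fragment (these CMS flags are commonly suggested for HBase-era JVMs; they are not taken from this thread, so treat the values as a starting point):

```sh
# hbase-env.sh -- hypothetical sketch; adjust sizes for your nodes.
# CMS keeps pauses short; start concurrent collection well before the
# heap fills, and log GC activity so heap creep is visible over days.
export HBASE_HEAPSIZE=4000
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/hbase/gc.log"
```

A GC log covering one of the 3-4 day windows would show whether the heap genuinely fills or whether long pauses are causing the session timeouts.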
> >  If you are using MSLAB, it reserves 2MB/region as a buffer -- that can add
> > up when you have lots of regions.
> >
> >
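To see how that 2 MB/region buffer adds up, here is a rough estimate (assuming the 2 MB figure quoted above and the ~210 regions per server shown in the status dump earlier in the thread):

```python
# Rough MSLAB overhead estimate: 2 MB reserved per region
# (figure quoted above), ~210 regions/server (from the status dump).
mslab_chunk_mb = 2
regions_per_server = 210

overhead_mb = mslab_chunk_mb * regions_per_server
print(overhead_mb)  # 420 -> ~0.4 GB of a 4 GB heap gone before any data
```

On a 4 GB heap that is roughly 10% reserved up front, which matters when usedHeap is already in the 2-3 GB range.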
> 
> 
> 
> > Given so little information all my guesses are going to be wild, but they
> > might help:
> > 4GB may not be enough for your current load.
> 
> Have you considered changing your memory allocation, giving less to your
> > map/reduce jobs and more to HBase?
> >
> >
>   Interesting point. Can you advise on the relation between M/R memory
> allocation and HBase region server memory?
> 
>   Currently we have 512 MB per map task (4 maps per machine) and 1024 MB per
> reduce task (2 reducers per machine).
> 
> 
> > What is your key distribution like?
> 
> Are you writing to all regions equally, or are you hotspotting on one
> > region?
> >
> 
> Every day before running the job we manually allocate regions
> with lexicographic start and end keys to get a good distribution and
> prevent hot-spots.
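One way to compute evenly spaced lexicographic split points for that daily pre-allocation is sketched below (a hypothetical helper assuming a fixed-width hex keyspace; substitute your actual key format and feed the result to whatever pre-split mechanism you use):

```python
def hex_split_keys(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced hex split points over the
    keyspace 00000000 .. ffffffff (hypothetical key format)."""
    max_key = 16 ** key_width
    step = max_key // num_regions
    # Split points fall at step, 2*step, ..., (num_regions-1)*step,
    # so each region covers an equal slice of the keyspace.
    return [format(i * step, "0{}x".format(key_width))
            for i in range(1, num_regions)]

# e.g. 4 regions -> 3 split points
print(hex_split_keys(4))  # ['40000000', '80000000', 'c0000000']
```

Evenly spaced splits only prevent hot-spotting if writes are uniform over the keyspace; skewed keys need split points taken from the observed key distribution instead.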
> 
> 
> >
> > Check your cell/row sizes.  Are they really large (e.g. cells > 1 MB; rows
> > > 100 MB)?  Increasing region size should help here, but there may be an
> > issue with your RAM allocation for HBase.
> >
> >
> I'll check, but I am almost sure that we have no row > 100 MB. We changed the
> region size to 500 MB to prevent automatic splits (after a successful insert
> job we have ~200-250 MB files per region),
> and for the next day we allocate new regions.
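For reference, the split threshold is controlled by hbase.hregion.max.filesize. A hypothetical hbase-site.xml fragment approximating the ~500 MB setting mentioned above (value is in bytes; 536870912 is 512 MB):

```xml
<!-- hbase-site.xml: raise the region split threshold to ~512 MB
     so daily pre-allocated regions are not split mid-job. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>536870912</value>
</property>
```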
> 
> 
> > Are you sure that you are not overloading the machine memory? How much RAM
> > do you allocate for map reduce jobs?
> >
> >
>     512 MB -- map
>     1024 MB -- reduce
> 
> 
> > How do you distribute your processes over machines?  Does your master run
> > namenode, hmaster, jobtracker, and zookeeper, while your slaves run
> > datanode, tasktracker, and hregionserver?
> 
> 
> Exactly, we have that process distribution.
> The ordinary machines have 16 GB,
> and the master has 48 GB of RAM, so I am not sure that I understand your
> calculation; please clarify.
> 
>  If so, then your memory allocation is:
> > 4 GB for regionserver
> > 1 GB for OS
> > 1 GB for datanode
> > 1 GB for tasktracker
> > 9 GB left for M/R (6 task slots)
> > So, are you sure that all of your m/r tasks take less than 1 GB?
> >
> > Dave
> >
> > -----Original Message-----
> > From: Oleg Ruchovets [mailto:[email protected]]
> > Sent: Tuesday, August 23, 2011 2:15 AM
> > To: [email protected]
> > Subject: how to make tuning for hbase (every couple of days hbase region
> > server(s) crash)
> >
> > Hi ,
> >
> >  Our environment:
> > HBase 0.90.2 (10 machines)
> >    We have a 10-machine grid:
> >    master has 48 GB RAM
> >    slave machines have 16 GB RAM
> >    Region Server process has 4 GB RAM
> >    Zookeeper process has 2 GB RAM
> >    We have 4 maps / 2 reducers per machine
> >
> >
> > We write from M/R jobs to HBase (2 jobs a day).  For 3 months the system
> > worked without any problem, but now every 3-4 days a region server crashes.
> >   What we have done so far:
> >   1) We run major compaction manually once a day.
> >   2) We increased region size to prevent automatic splits.
> >
> > Questions:
> >   What is the right way to tune HBase?
> >   How do we debug such a problem? It is still not clear to me what the
> > root cause of the region server crashes is.
> >
> >
> >
> >   We started from this post.
> >
> > http://search-hadoop.com/m/HDoK22ikTCI/M%252FR+vs+hbase+problem+in+production&subj=M+R+vs+hbase+problem+in+production
> >
> >
> > Regards
> > Oleg.
> >
                                          
