Re: RegionServer crash without any errors (compaction?)

Ishan Chhabra Fri, 08 Nov 2013 16:54:35 -0800

Even if there is a ZooKeeper timeout due to GC, there should be logging
related to that, right?
Check your '/var/log/messages'; the kernel may have killed the process
because of OOM or for some other reason.
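
A minimal sketch of that check, assuming a syslog-style file at
/var/log/messages and the usual Linux OOM-killer wording ("invoked
oom-killer", "Out of memory: Kill process ..."). The exact text varies
between kernel versions, and the function names here are hypothetical
helpers, not part of HBase or any other tool:

```python
import re

# Phrases the Linux kernel commonly logs when the OOM killer runs.
# Assumption: wording differs between kernel versions, so treat this
# pattern as a starting point rather than an exhaustive list.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process|Killed process \d+",
    re.IGNORECASE,
)


def find_oom_events(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a syslog-style file (reading it usually requires root)."""
    with open(path, errors="replace") as fh:
        return find_oom_events(fh)
```

Calling scan_log() on the affected node and comparing the timestamps of
any hits with the time of the RegionServer restart should show whether the
kernel killed the JVM.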



On Thu, Nov 7, 2013 at 8:21 AM, Dhaval Shah <[email protected]> wrote:

> "Operation too slow" messages are generally in the .log file, while the GC
> logs (if you enabled GC logging) are in the .out file. You have a very small
> heap for a 1GB HFile size. You are probably running your region server out
> of memory. Try increasing the heap size and see if that helps.
>
> Regards,
> Dhaval
>
>
> ________________________________
>  From: John <[email protected]>
> To: [email protected]; Dhaval Shah <[email protected]>
> Sent: Thursday, 7 November 2013 11:09 AM
> Subject: Re: RegionServer crash without any errors (compaction?)
>
>
>
> There are really no other logs before that. There is an "operationTooSlow"
> message earlier, but that entry is ~50 minutes before the other:
> http://pastebin.com/EAAubqGB
>
>
>
>
> 2013/11/7 John <[email protected]>
>
> Hi,
> >
> >Thanks for your fast answer. Looking at Cloudera Manager, the percentage
> of time spent in GC increases around that time, so I think you
> are right. The max heap size is 1GB for this node. The
> hbase.hregion.max.filesize is also 1GB.
> >
> >regards
> >
> >
> >
> >
> >2013/11/7 Dhaval Shah <[email protected]>
> >
> >Did you look at your GC logs? Probably the compaction process is running
> your region server out of memory. Can you provide more details on your
> setup? Max heap size? Max Region HFile size?
> >>
> >>Regards,
> >>Dhaval
> >>
> >>
> >>________________________________
> >> From: John <[email protected]>
> >>To: [email protected]
> >>Sent: Thursday, 7 November 2013 10:51 AM
> >>Subject: RegionServer crash without any errors (compaction?)
> >>
> >>
> >>
> >>Hi,
> >>
> >>I have a cluster with 7 region servers. Some of them crash from time
> >>to time without any error message in the HBase log. When I look at the
> >>log from around that time, I find this:
> >>
> >>2013-11-07 15:29:02,511 INFO org.apache.hadoop.hbase.regionserver.Store:
> >>Starting compaction of 2 file(s) in 1 of P_SO,<
> >>http://xmlns.com/foaf/0.1/homepage
> >,1383188177383.59d0259c87c07dc666a5600ba4d6c916.
> >>i$
> >>2013-11-07 15:29:10,471 INFO
> >>org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom
> filter
> >>type for hdfs://
> >>
> pc08.pool.ifis.uni-luebeck.de:8020/hbase/P_SO/59d0259c87c07dc666a5600ba4d6c916/.tmp/f$
> >>2013-11-07 15:31:05,944 INFO org.apache.hadoop.hbase.util.VersionInfo:
> >>HBase 0.94.6-cdh4.4.0
> >>.... restart
> >>
> >>At this time 2 of the 7 RS crashed, and both had this compaction message
> >>before they crashed. I don't know exactly what compaction is, but it seems
> >>that this compaction has to do with the crash. What can I do to avoid this
> >>restart/crash?
> >>
> >>best regards
> >
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
