bq. I just don't find this "hbase.zookeeper.property.tickTime" anywhere in the code base.

Neither do I. Mind filing a JIRA to correct this in troubleshooting.xml?

bq. increase tickTime in zoo.cfg?

For a shared ZooKeeper quorum, that is what should be done.
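If you go that route, something along these lines in the shared quorum's zoo.cfg should let clients negotiate a session of up to 5 minutes. This is only a sketch: the numbers are illustrative, and it assumes ZooKeeper 3.3.0 or later, where maxSessionTimeout is available (by default the cap is 20 * tickTime):

# Illustrative values only -- adjust for your deployment.
tickTime=6000
initLimit=10
syncLimit=5
# Raise the cap on what clients may negotiate (default is 20 * tickTime).
maxSessionTimeout=300000

Keep in mind these are server-side settings, so they apply to every HBase cluster that shares the quorum.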
On Wed, Jun 5, 2013 at 5:45 PM, Ameya Kantikar <[email protected]> wrote:

> One more thing. I just don't find this "hbase.zookeeper.property.tickTime"
> anywhere in the code base.
> Also, I could not find a ZooKeeper API that takes tickTime from the client.
>
> http://zookeeper.apache.org/doc/r3.3.3/api/org/apache/zookeeper/ZooKeeper.html
> It takes a sessionTimeout value, but not tickTime.
>
> Is this even relevant anymore? hbase.zookeeper.property.tickTime?
>
> So what's the solution, increase tickTime in zoo.cfg? (and not
> hbase.zookeeper.property.tickTime in hbase-site.xml?)
>
> Ameya
>
> On Wed, Jun 5, 2013 at 3:18 PM, Ameya Kantikar <[email protected]> wrote:
>
> > Which tickTime is honored?
> >
> > The one in zoo.cfg, or hbase.zookeeper.property.tickTime in hbase-site.xml?
> >
> > My understanding now is that whichever tickTime is honored, the session
> > timeout cannot be more than 20 times that value.
> >
> > I think this is what's happening on my cluster:
> >
> > My hbase.zookeeper.property.tickTime value is 6000 ms. However, my timeout
> > value is 300000 ms, which is outside of 20 times the tickTime. Hence
> > ZooKeeper uses its syncLimit of 5 to generate 6000*5 = 30000 as the
> > timeout value for my RS sessions.
> >
> > I'll try increasing the hbase.zookeeper.property.tickTime value in
> > hbase-site.xml and will monitor my cluster over the next few days.
> >
> > Thanks Kevin & Ted for your help.
> >
> > Ameya
> >
> > On Wed, Jun 5, 2013 at 2:45 PM, Ted Yu <[email protected]> wrote:
> >
> > > bq. I thought this property in hbase-site.xml takes care of that:
> > > zookeeper.session.timeout
> > >
> > > From http://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#ch_zkSessions :
> > >
> > > The client sends a requested timeout, the server responds with the
> > > timeout that it can give the client. The current implementation requires
> > > that the timeout be a minimum of 2 times the tickTime (as set in the
> > > server configuration) and a maximum of 20 times the tickTime. The
> > > ZooKeeper client API allows access to the negotiated timeout.
> > >
> > > The above means the shared ZooKeeper quorum may return a timeout value
> > > different from that of zookeeper.session.timeout.
> > >
> > > Cheers
> > >
> > > On Wed, Jun 5, 2013 at 2:34 PM, Ameya Kantikar <[email protected]> wrote:
> > >
> > > > In zoo.cfg I have not set up this value explicitly. My zoo.cfg looks
> > > > like:
> > > >
> > > > tickTime=2000
> > > > initLimit=10
> > > > syncLimit=5
> > > >
> > > > We use a common ZooKeeper cluster for 2 of our HBase clusters. I'll
> > > > try increasing this value from zoo.cfg.
> > > > However, is it possible to set this value per cluster?
> > > > I thought this property in hbase-site.xml takes care of that:
> > > > zookeeper.session.timeout
> > > >
> > > > On Wed, Jun 5, 2013 at 1:49 PM, Kevin O'dell <[email protected]> wrote:
> > > >
> > > > > Ameya,
> > > > >
> > > > > What does your zoo.cfg say for your timeout value?
> > > > >
> > > > > On Wed, Jun 5, 2013 at 4:47 PM, Ameya Kantikar <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > We have heavy MapReduce write jobs running against our cluster.
> > > > > > Every once in a while, we see a region server going down.
> > > > > >
> > > > > > We are on: 0.94.2-cdh4.2.0, r
> > > > > >
> > > > > > We have done some tuning for heavy MapReduce jobs, and have
> > > > > > increased scanner timeouts and lease timeouts, and have also tuned
> > > > > > the memstore as follows:
> > > > > >
> > > > > > hbase.hregion.memstore.block.multiplier: 4
> > > > > > hbase.hregion.memstore.flush.size: 134217728
> > > > > > hbase.hstore.blockingStoreFiles: 100
> > > > > >
> > > > > > So now, we are still facing issues. Looking at the logs, it looks
> > > > > > like it is due to a ZooKeeper timeout. We have tuned the ZooKeeper
> > > > > > settings as follows in hbase-site.xml:
> > > > > >
> > > > > > zookeeper.session.timeout: 300000
> > > > > > hbase.zookeeper.property.tickTime: 6000
> > > > > >
> > > > > > The actual log looks like:
> > > > > >
> > > > > > 2013-06-05 11:46:40,405 WARN org.apache.hadoop.ipc.HBaseServer:
> > > > > > (responseTooSlow):
> > > > > > {"processingtimems":13468,"call":"next(6723331143689528698, 1000), rpc
> > > > > > version=1, client version=29, methodsFingerPrint=54742778","client":"
> > > > > > 10.20.73.65:41721","starttimems":1370432786933,"queuetimems":1,"class":"HRegionServer","responsesize":39611416,"method":"next"}
> > > > > >
> > > > > > 2013-06-05 11:46:54,988 INFO org.apache.hadoop.io.compress.CodecPool:
> > > > > > Got brand-new decompressor [.snappy]
> > > > > >
> > > > > > 2013-06-05 11:48:03,017 WARN org.apache.hadoop.hdfs.DFSClient:
> > > > > > DFSOutputStream ResponseProcessor exception for block
> > > > > > BP-53741567-10.20.73.56-1351630463427:blk_9026156240355850298_8775246
> > > > > > java.io.EOFException: Premature EOF: no length prefix available
> > > > > >     at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
> > > > > >     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
> > > > > >     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:656)
> > > > > >
> > > > > > 2013-06-05 11:48:03,020 WARN org.apache.hadoop.hbase.util.Sleeper:
> > > > > > *We slept 48686ms instead of 3000ms*, this is likely due to a long
> > > > > > garbage collecting pause and it's usually bad, see
> > > > > > http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > > > > >
> > > > > > 2013-06-05 11:48:03,094 FATAL
> > > > > > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region
> > > > > > server smartdeals-hbase14-snc1.snc1,60020,1370373396890: Unhandled
> > > > > > exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT
> > > > > > rejected; currently processing
> > > > > > smartdeals-hbase14-snc1.snc1,60020,1370373396890 as dead server
> > > > > >
> > > > > > (Not sure why it says 3000ms when we have the timeout at 300000ms)
> > > > > >
> > > > > > We have done some GC tuning as well. Wondering what I can tune to
> > > > > > keep the RS from going down? Any ideas?
> > > > > > This is a batch-heavy cluster, and we care less about read latency.
> > > > > > We can increase RAM a bit more, but not much (the RS already has
> > > > > > 20GB of memory).
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > > > Ameya
> > > > >
> > > > > --
> > > > > Kevin O'Dell
> > > > > Systems Engineer, Cloudera
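P.S. Since the quorum decides the final value, it may be worth verifying what it actually grants. Below is a minimal, untested sketch using only the plain ZooKeeper client API (nothing HBase-specific); the class name, connect string, and requested timeout are placeholders, not taken from your setup. It asks for the same 300000 ms and prints the negotiated result:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class NegotiatedTimeoutCheck {
  public static void main(String[] args) throws Exception {
    final CountDownLatch connected = new CountDownLatch(1);
    // Placeholder connect string -- substitute your shared quorum here.
    ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 300000,
        new Watcher() {
          @Override
          public void process(WatchedEvent event) {
            if (event.getState() == Event.KeeperState.SyncConnected) {
              connected.countDown();
            }
          }
        });
    connected.await();
    // getSessionTimeout() reports the value negotiated with the server,
    // which the quorum caps at maxSessionTimeout (20 * tickTime by default).
    System.out.println("Requested 300000 ms, negotiated "
        + zk.getSessionTimeout() + " ms");
    zk.close();
  }
}

If the printed value comes back well below 300000 ms, the quorum-side cap is what is cutting your RS sessions short, regardless of zookeeper.session.timeout in hbase-site.xml.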
