Hello J-D,
  >4 CPUs seems ok, unless you are running 2-3 MR tasks at the same time.
  I don't think we have ever run 3 MR tasks at the same time on one
server; maybe 2 sometimes, but not 3. And according to our monitoring tools,
the CPU is never busy.
  I didn't change the tick value, but I will do so right now. I would still
like to know why the timeout can be at most 20 times the tickTime. Can
you tell me?
  Thank you,
  Regards,
    LvZheng
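
For reference, here is a minimal sketch of the arithmetic behind the limit
described in the ZooKeeper documentation quoted below. It assumes the server
simply clamps the requested timeout to the range [2 * tickTime, 20 * tickTime];
the function name and its parameters are hypothetical and are not part of the
ZooKeeper or HBase API.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms, min_ticks=2, max_ticks=20):
    """Clamp a requested session timeout to the range the server allows.

    Assumption: the bounds are 2x and 20x tickTime, as stated in the
    quoted ZooKeeper documentation. This is an illustration only.
    """
    lower = min_ticks * tick_time_ms
    upper = max_ticks * tick_time_ms
    return max(lower, min(requested_ms, upper))


# Requested 240000 ms with the default tickTime of 3000 ms.
print(negotiated_timeout_ms(240000, 3000))
# J-D's suggestion: tickTime 6000 ms, timeout 120000 ms.
print(negotiated_timeout_ms(120000, 6000))
# tickTime needed for the full 240000 ms: 240000 / 20 = 12000 ms.
print(negotiated_timeout_ms(240000, 12000))
```

Under that assumption, a requested timeout of 240000 with tickTime 3000 is
capped at 60000 ms, which is consistent with a pause of almost a minute being
enough to expire the session.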


2010/3/26 Jean-Daniel Cryans <[email protected]>

> 4 CPUs seems ok, unless you are running 2-3 MR tasks at the same time.
>
> So your value for the timeout is 240000, but did you change the tick
> time? The GC pause you got seemed to last almost a minute which, if
> you did not change the tick value, matches 3000*20 = 60000 ms
> (regardless of your configured session timeout).
>
> J-D
>
> On Thu, Mar 25, 2010 at 1:07 AM, Zheng Lv <[email protected]>
> wrote:
> > Hello J-D,
> >  First, thank you for your reply.
> >  >How many CPUs do you have?
> >  Every server has 2 Dual-Core cpus.
> >  >Are you swapping?
> >  I can't tell yet from our monitoring tools, but we have now written
> > a script that records vmstat output every 2 seconds. If something goes
> > wrong again, we can check it.
> >  >Also if you are currently only using this system to batch load
> >  >data or as an analytics backend, you probably want to set the timeout
> >  >higher:
> >  But our value of this property is already 240000.
> >
> >  We will try to tune our garbage collector and see what
> > happens.
> >  Thanks again, J-D,
> >    LvZheng
> >
> > 2010/3/25 Jean-Daniel Cryans <[email protected]>
> >
> >> 2010-03-24 11:33:52,331 WARN org.apache.hadoop.hbase.util.Sleeper: We
> >> slept 54963ms, ten times longer than scheduled: 3000
> >>
> >> You had a long garbage collector pause (a "stop-the-world" pause
> >> in Java-speak) and your region server's session with ZooKeeper expired
> >> (it literally stopped responding for so long that it was
> >> considered dead). Are you swapping? How many CPUs do you have? If you
> >> are slowing down the garbage collection process, it will take more
> >> time.
> >>
> >> Also if you are currently only using this system to batch load
> >> data or as an analytics backend, you probably want to set the timeout
> >> higher:
> >>
> >>  <property>
> >>    <name>zookeeper.session.timeout</name>
> >>    <value>60000</value>
> >>    <description>ZooKeeper session timeout.
> >>      HBase passes this to the zk quorum as suggested maximum time for a
> >>      session.  See
> >>      http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
> >>      "The client sends a requested timeout, the server responds with the
> >>      timeout that it can give the client. The current implementation
> >>      requires that the timeout be a minimum of 2 times the tickTime
> >>      (as set in the server configuration) and a maximum of 20 times
> >>      the tickTime." Set the zk ticktime with
> >>      hbase.zookeeper.property.tickTime.
> >>      In milliseconds.
> >>    </description>
> >>  </property>
> >>
> >> This value can be at most 20 times the value of this property:
> >>
> >>  <property>
> >>    <name>hbase.zookeeper.property.tickTime</name>
> >>    <value>3000</value>
> >>    <description>Property from ZooKeeper's config zoo.cfg.
> >>    The number of milliseconds of each tick.  See
> >>    zookeeper.session.timeout description.
> >>    </description>
> >>  </property>
> >>
> >>
> >> So you could set tick to 6000, timeout to 120000 for a 2min timeout.
> >>
>
