Zookeeper session timeouts during RAID Checks

Srikanth R Mon, 07 Oct 2013 23:49:21 -0700

Hi zookeepers,

I am using ZooKeeper 3.4.5 in a 3-server ensemble. Its dataDir is on a
dedicated 6-disk 2.5TB RAID10 volume. Only HDFS NameNode/journal txns and
ZooKeeper txnlog/snapshots are written to this volume. The issue is that
whenever the weekly RAID check is running, clients that have 5-second
timeouts time out randomly. Has anyone seen issues like this with dataDir
on RAID before?


Also, there aren't many writes going into ZK; only hadoop-ha and the HBase
master are using the ZK services.

1. There are no CPU bottlenecks or memory/swapping issues on the boxes.
2. In the ZK strace output, there are a few random 2-3 sec intervals where no
system calls are recorded, which is weird. Most of the timeouts correspond
to these intervals, but I am not able to figure out what ZK does during
them.
3. I enabled GC logs; there are no traces of full GC during the timeouts.
Full GCs were recorded over time, but the pauses are only 0.3-0.4 secs.
I also tried the ConcMarkSweep GC without any improvement.
4. There are no network errors/timeouts.
5. At times I see a max latency of 3-4 secs in the connection stats, but avg
and min latency are 0.
6. I ran zk-latencies.py, and the latency seems to be the same with and
without the RAID check.

Here's the ZooKeeper config:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=xyz1:2888:3888
server.2=xyz2:2888:3888
server.3=xyz3:2888:3888
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
kerberos.removeHostFromPrincipal=true
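
For reference, the timeout a client actually gets depends on tickTime in the
config above. A minimal sketch of the arithmetic, assuming the documented
defaults (minSessionTimeout = 2 x tickTime, maxSessionTimeout = 20 x tickTime,
neither of which is overridden here) and assuming the Java client uses roughly
two thirds of the negotiated timeout as its read timeout. The function names
are illustrative only:

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms,
                          min_session_ms=None, max_session_ms=None):
    """Clamp a client's requested session timeout to the server's bounds.

    Defaults follow the documented ZooKeeper behaviour: the bounds are
    2 * tickTime and 20 * tickTime unless set explicitly in the config.
    """
    lo = 2 * tick_time_ms if min_session_ms is None else min_session_ms
    hi = 20 * tick_time_ms if max_session_ms is None else max_session_ms
    return max(lo, min(requested_ms, hi))


def client_read_timeout_ms(session_timeout_ms):
    # Assumption: the Java client gives up on a silent server after about
    # two thirds of the negotiated session timeout.
    return session_timeout_ms * 2 // 3


if __name__ == "__main__":
    timeout = negotiated_timeout_ms(5000, 2000)
    print(timeout, client_read_timeout_ms(timeout))
```

With tickTime=2000, a requested 5-second timeout is kept at 5000 ms. Under the
stated assumption, a server stall of a little over 3.3 seconds would already
be enough for a client to drop the connection.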

Partition:

-bash-4.1$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md2              116G   79G   32G  72% /
tmpfs                  12G     0   12G   0% /dev/shm
/dev/md0               97M   31M   61M  34% /boot
/dev/md3              2.6T  297M  2.5T   1% /data

-bash-4.1$ cat /proc/mdstat
Personalities : [raid10] [raid1]
md3 : active raid10 sdc5[2] sdd5[3] sda5[0] sdf5[5] sdb5[1] sde5[4]
      2782511616 blocks super 1.1 512K chunks 2 near-copies [6/6] [UUUUUU]
      [===================>.]  check = 95.3% (2654099584/2782511616) finish=41.5min speed=51516K/sec
      bitmap: 0/21 pages [0KB], 65536KB chunk
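
To line the timeouts up against the check, the progress line in the output
above could be sampled periodically and logged with a timestamp. A rough
sketch, assuming the /proc/mdstat layout shown here; the function names and
the regular expression are mine, and they only cover the first `check` line
found:

```python
import re

# Matches e.g. "check = 95.3% (2654099584/2782511616) finish=41.5min speed=51516K/sec".
# \s* also tolerates the line being wrapped between the counters and "finish=".
CHECK_RE = re.compile(
    r"check\s*=\s*(?P<pct>[\d.]+)%\s*\((?P<done>\d+)/(?P<total>\d+)\)"
    r"\s*finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def parse_check_progress(mdstat_text):
    """Return check progress parsed from mdstat text, or None if no check is running."""
    m = CHECK_RE.search(mdstat_text)
    if m is None:
        return None
    return {
        "percent": float(m.group("pct")),
        "blocks_done": int(m.group("done")),
        "blocks_total": int(m.group("total")),
        "finish_min": float(m.group("finish")),
        "speed_kb_per_sec": int(m.group("speed")),
    }


def read_check_progress(path="/proc/mdstat"):
    """Read the given mdstat file and parse the first check line in it."""
    with open(path) as f:
        return parse_check_progress(f.read())
```

Calling `read_check_progress()` from a loop or cron job on each server gives a
timeline of check speed that can be compared with the client timeout times.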

Here are my queries:
1. What is the best way to find out what the ZooKeeper threads are doing?
(strace hasn't helped much.)
2. There isn't much data written to/read from ZK. Why would ZK fail?
3. Is it possible to trace all the requests that come in to ZK?
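
On query 1, one low-overhead option might be to poll the server's
four-letter-word commands and log the latency counters with timestamps, so
that spikes can be matched against the RAID check. A minimal sketch, assuming
the `mntr` command is reachable on clientPort 2181; the helper names, the
polling interval, and the choice of counters are illustrative:

```python
import socket
import time


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter-word command (e.g. 'mntr') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn the tab-separated 'key<TAB>value' lines of mntr output into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not key/value pairs
            stats[key.strip()] = value.strip()
    return stats


def poll_latency(host="localhost", port=2181, interval_sec=1.0, samples=10):
    """Print timestamped latency counters a fixed number of times."""
    for _ in range(samples):
        stats = parse_mntr(four_letter_word(host, port, "mntr"))
        print(time.strftime("%H:%M:%S"),
              stats.get("zk_avg_latency"),
              stats.get("zk_max_latency"),
              stats.get("zk_outstanding_requests"))
        time.sleep(interval_sec)
```

For the thread-level view, a JVM thread dump (for example `jstack <pid>`)
taken during one of the stalls may show what the ZooKeeper threads are blocked
on, which strace alone does not.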

Please let me know if you need more info. Any help is greatly appreciated.

Thanks.
Srikanth
