Thanks for sharing, Akshay.

I think the solution should be part of the HBase reference guide.

On Fri, Jan 18, 2013 at 7:55 AM, Akshay Singh <[email protected]> wrote:

> I found the problem, so I thought I would post it here for future
> reference.
>
> The problem was an IPv6-enabled network. IPv6 was already disabled in HDFS (
> HADOOP_OPTS=-Djava.net.preferIPv4Stack=true) and in HBase (
> -Djava.net.preferIPv4Stack=true), but on some of the machines in the
> cluster IPv6 was not disabled in the kernel (through sysctl).
>
> So HBase was using IPv6 for its services on some of the hosts. My guess is
> that at the start of every workload HBase tries to resolve AAAA records,
> which eventually times out. It then falls back to the IPv4 address, and
> that's when operations start at the normal rate.
>
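One way to confirm this kind of delay from inside the JVM is to time name resolution for the cluster hosts and see which address families come back. The following is only a minimal sketch: the class name ResolveCheck and its helper methods are made up for illustration, it uses only java.net calls available on Java 6, and it does not reproduce HBase's own lookup path.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Diagnostic sketch: times name resolution in the JVM and labels each
// returned address as IPv4 or IPv6. Class and method names are illustrative.
final class ResolveCheck {

    // Label an address by its family.
    static String family(InetAddress addr) {
        return (addr instanceof Inet6Address) ? "IPv6" : "IPv4";
    }

    // Resolve a host name and report the address families and elapsed time.
    static String describe(String host) {
        long start = System.nanoTime();
        StringBuilder out = new StringBuilder(host).append(" ->");
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                out.append(' ').append(family(addr))
                   .append('=').append(addr.getHostAddress());
            }
        } catch (UnknownHostException e) {
            out.append(" unresolved");
        }
        long millis = (System.nanoTime() - start) / 1000000L;
        return out.append(" (").append(millis).append(" ms)").toString();
    }

    public static void main(String[] args) {
        // Shows whether the IPv4 preference actually reached this JVM.
        System.out.println("java.net.preferIPv4Stack="
                + System.getProperty("java.net.preferIPv4Stack"));
        // Pass ZooKeeper and region server host names as arguments.
        for (String host : args) {
            System.out.println(describe(host));
        }
    }
}
```

Run it on a client and on each region server host, once with and once without -Djava.net.preferIPv4Stack=true. A lookup that takes seconds, or IPv6 entries where none are expected, would be consistent with the diagnosis above, though it would not prove it.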
> On the same note, surprisingly, on one of the hosts disabling IPv6 through
> sysctl (persisted in sysctl.conf) was not enough to stop HBase from using
> IPv6 communication. I had to disable IPv6 in grub (default grub cmdline in
> /etc/default/grub) on this host.
>
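To check whether the kernel setting actually took effect on a host, a small sketch along these lines may help. It assumes Linux, where the net.ipv6.conf.all.disable_ipv6 sysctl is exposed under /proc/sys. The thread does not say which sysctl keys or grub option were used, so the key below is my assumption, and KernelIpv6Check is a made-up name.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Sketch only: reports whether IPv6 looks disabled in the Linux kernel.
final class KernelIpv6Check {

    // procfs view of the net.ipv6.conf.all.disable_ipv6 sysctl (assumed key).
    static final String PROC_PATH = "/proc/sys/net/ipv6/conf/all/disable_ipv6";

    // The sysctl reads "1" when IPv6 is disabled and "0" when it is enabled.
    static boolean isDisabledValue(String raw) {
        return raw != null && "1".equals(raw.trim());
    }

    // Summarize the state found at the given path:
    // "absent", "disabled", "enabled" or "unreadable".
    static String status(String path) {
        File file = new File(path);
        if (!file.exists()) {
            return "absent";
        }
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader(file));
            return isDisabledValue(in.readLine()) ? "disabled" : "enabled";
        } catch (IOException e) {
            return "unreadable";
        } finally {
            if (in != null) {
                try {
                    in.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if close fails in a diagnostic.
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("kernel IPv6: " + status(PROC_PATH));
    }
}
```

"enabled" on a host that was supposed to be IPv4-only means the sysctl change did not stick. "absent" typically means either a non-Linux host or that IPv6 was switched off on the kernel command line (for example with ipv6.disable=1, again an assumption about what the grub change was), in which case the IPv6 sysctl tree is usually not created at all.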
> Once there was *no IPv6 whatsoever* in the cluster, YCSB clients started
> operating on HBase immediately.
>
> Thanks,
> Akshay
>
>
>
> ________________________________
>  From: Akshay Singh <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Tuesday, 15 January 2013 10:36 AM
> Subject: Re: Slow start of HBase operations with YCSB, possibly because of
> zookeeper ?
>
> Thanks Samar.
>
> You are right that YCSB writes data to a single table, 'usertable', but I
> see very slow operations (on the order of 1-2 operations/second) even for
> the read/update workload, not only for inserts. By then the region has
> already been split across multiple RS, before I start my transaction
> workload.
>
> The keys are also fairly random in YCSB, so I doubt the slow operations are
> due to the table being initially limited to one region.
>
> As far as I can tell this has something to do with ZooKeeper: as I said
> in the original mail, if I increase
> "hbase.zookeeper.watcher.sync.connected.wait" (to 10 sec) I don't see the
> exceptions thrown by ZooKeeperWatcher that I see with the default value of
> 2s. I have a stand-alone ZooKeeper instance, to which all RS connect.
>
> Is there any other component I should monitor closely?
>
> Thanks,
> Akshay
>
>
>
> ________________________________
> From: samar kumar <[email protected]>
> To: [email protected]
> Sent: Tuesday, 15 January 2013 3:58 AM
> Subject: Re: Slow start of HBase operations with YCSB, possibly because of
> zookeeper ?
>
> YCSB would be writing all data to one table. So initially, when the table
> is small or just created, all the writes would go to one RS. As the table
> grows, the region is split across different RS. That would allow parallel
> writes if the keys are random, and could possibly make the writes faster.
> Samar
>
> On 15/01/13 6:34 AM, "Akshay Singh" <[email protected]> wrote:
>
> >
> >Hi hbase users,
> >
> >I am running HBase (on top of HDFS) in
> >distributed mode (on 8 VMs), and things like jps look fine on all the
> >machines in the cluster. I am also able to run the hbase shell and
> >interact with HBase through it. But when I want to benchmark my HBase
> >cluster with YCSB (Yahoo! Cloud Serving Benchmark,
> >https://github.com/brianfrankcooper/YCSB/) I see this weird problem:
> >HBase operations start slowly and then pick up later.
> >
> >Basically when I start the YCSB
> >workload from a client machine, I see these problems in chronological
> >order :
> >
> >1) ERROR zookeeper.ZooKeeperWatcher: ZK is null on connection event
> >
> >###########
> >ERROR zookeeper.ZooKeeperWatcher: ZK is null on connection event -- see stack trace for the stack trace when constructor was called on this zkw
> >java.lang.Exception: ZKW CONSTRUCTOR STACK TRACE FOR DEBUGGING
> >at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:142)
> >at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:126)
> >at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1322)
> >at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.ensureZookeeperTrackers(HConnectionManager.java:584)
> >at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:827)
> >at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:810)
> >at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:232)
> >at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:172)
> >at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:131)
> >at com.yahoo.ycsb.db.HBaseClient.getHTable(HBaseClient.java:155)
> >###########
> >
> >2) org.apache.zookeeper.ClientCnxn - Error while calling watcher
> >java.lang.NullPointerException: ZK is null
> >
> >############
> >ERROR org.apache.zookeeper.ClientCnxn - Error while calling watcher
> >java.lang.NullPointerException: ZK is null
> >at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:334)
> >at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:271)
> >at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
> >at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> >############
> >
> >3) And then finally it starts the
> >operations on HBase (which means ZooKeeper is running fine and can be
> >connected to)
> >
> >4) The operation rate remains below 10
> >ops/sec for the first 60-70 sec, and then grows gradually to reach around
> >1300 ops/sec (the normally expected number)
> >
> >Here are the actual logs :: http://pastebin.com/NC1zKwRF
> >
> >I am running
> >1) Hadoop-1.0.1
> >2) HBase-0.94.1
> >3) Zookeeper-3.3.6
> >4) Java 1.6.0_24 (openJDK-6)
> >5) OS : Ubuntu-11.10
> >6) YCSB-0.14
> >
> >What I have already tried :
> >
> >1) Checked my DNS settings (just to be
> >sure; I am using a synced /etc/hosts file): no luck
> >2) Increased
> >"hbase.zookeeper.watcher.sync.connected.wait" to 10000
> >(default: 2000). This gets rid of the "ZK is null ****" errors,
> >but the slow start remains, with no improvement.
> >
> >I am clueless as to what may be the
> >reason behind this 'slowly picking up' behavior of my set-up.
> >Please advise.
> >
> >Thanks,
> >Akshay
>
