As Lars mentioned, admin APIs like flush and compact will also slow down the client. When you restart the HBase cluster, are the clients also restarted?

Regards
Ram

-----Original Message-----
From: Lars H [mailto:[email protected]]
Sent: Tuesday, December 27, 2011 10:02 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Read speed down after long running

When you restart HBase are you also restarting the client process?
Are you using HBaseAdmin.tableExists? If so you might be running into HBASE-5073

-- Lars

Yi Liang <[email protected]> wrote:

>Hi all,
>
>We're running hbase 0.90.3 for one read intensive application.
>
>We find after long running (2 weeks or 1 month or longer), the read speed
>will become much lower.
>
>For example, a get_rows operation of thrift to fetch 20 rows (about 4k size
>every row) could take >2 seconds, sometimes even >5 seconds. When it
>happens, we can see cpu_wio keeps at about 10.
>
>But if we restart hbase (only master and regionservers) with stop-hbase.sh
>and start-hbase.sh, we can see the read speed back to normal immediately,
>which is <200 ms for every get_rows operation, and the cpu_wio drops to
>about 2.
>
>When the problem appears, there's no exception in logs, and no
>flush/compaction, nothing abnormal except a few warning logs sometimes like
>below:
>2011-12-27 15:50:20,307 WARN org.apache.hadoop.hbase.regionserver.wal.HLog:
>IPC Server handler 52 on 60020 took 1546 ms appending an edit to hlog;
>editcount=1, len~=9.8k
>
>Our cluster has 10 region servers, each with 25g heap size, 64% of which
>is used for cache. There are some m/r jobs that keep running in another
>cluster to feed data into this hbase. Every night, we do flush and major
>compaction. Usually there's no flush or compaction in the daytime.
>
>Could anybody explain why the read speed could become lower after long
>running, and why it goes back to normal immediately after restarting hbase?
>
>Any advice will be highly appreciated.
>
>Thanks,
>Yi
