Hi folks, I just want to follow up on this one more time. Is there anything funky happening in the client that "slows things down" when these methods are called, or is it a reflection of RS activity?
On 1/11/12 4:37 PM, "Doug Meil" <[email protected]> wrote:
>
>Hi dev-list,
>
>With respect to HBASE-5073 and invoking the admin API and producing
>slowdowns, was the workaround (without the patch) that the client be
>restarted, or the entire cluster? I see the patch has been back-ported to
>90.6, but I wanted to doc this if it was warranted.
>
>Also, regarding...
>
>"As Lars mentioned admin apis like flush and compact will also slow down
>the client."
>
>... in terms of "slowing down the client", is this referring to the fact
>that subsequent requests will have to contend with the increased activity
>on the RegionServers (e.g., due to compaction and the file-writing)?
>Or is there something else going on?
>
>Again, wanted to doc this if it was warranted.
>
>
>On 12/27/11 9:20 PM, "Ramkrishna S Vasudevan"
><[email protected]> wrote:
>
>>As Lars mentioned, admin APIs like flush and compact will also slow down
>>the client.
>>As part of the restart of the HBase cluster, are the clients also
>>restarted?
>>
>>Regards
>>Ram
>>
>>-----Original Message-----
>>From: Lars H [mailto:[email protected]]
>>Sent: Tuesday, December 27, 2011 10:02 PM
>>To: [email protected]
>>Cc: [email protected]
>>Subject: Re: Read speed down after long running
>>
>>When you restart HBase, are you also restarting the client process?
>>Are you using HBaseAdmin.tableExists?
>>If so, you might be running into HBASE-5073.
>>
>>-- Lars
>>
>>Yi Liang <[email protected]> schrieb:
>>
>>>Hi all,
>>>
>>>We're running HBase 0.90.3 for one read-intensive application.
>>>
>>>We find that after long running (2 weeks, 1 month, or longer), the
>>>read speed becomes much lower.
>>>
>>>For example, a get_rows operation via Thrift to fetch 20 rows (about
>>>4k per row) can take >2 seconds, sometimes even >5 seconds. When it
>>>happens, we can see cpu_wio stays at about 10.
>>>
>>>But if we restart HBase (only master and regionservers) with
>>>stop-hbase.sh and start-hbase.sh, we see the read speed back to normal
>>>immediately, which is <200 ms for every get_rows operation, and the
>>>cpu_wio drops to about 2.
>>>
>>>When the problem appears, there are no exceptions in the logs, and no
>>>flushes or compactions; nothing abnormal except a few warnings
>>>sometimes, like the one below:
>>>2011-12-27 15:50:20,307 WARN
>>>org.apache.hadoop.hbase.regionserver.wal.HLog:
>>>IPC Server handler 52 on 60020 took 1546 ms appending an edit to hlog;
>>>editcount=1, len~=9.8k
>>>
>>>Our cluster has 10 region servers, each with a 25g heap, 64% of which
>>>is used for cache. There are some M/R jobs running continuously in
>>>another cluster to feed data into this HBase. Every night, we do a
>>>flush and a major compaction. Usually there is no flush or compaction
>>>in the daytime.
>>>
>>>Could anybody explain why the read speed becomes lower after long
>>>running, and why it is back to normal immediately after restarting
>>>HBase?
>>>
>>>Any advice will be highly appreciated.
>>>
>>>Thanks,
>>>Yi
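Since the thread points at HBASE-5073 (HBaseAdmin.tableExists being expensive to call on every request), one client-side mitigation, independent of the patch, is to stop re-running the existence check per request and cache a positive answer instead. The sketch below is hypothetical, not HBase API: `CachingTableChecker` and the injected `Predicate` are stand-ins for whatever expensive check the client actually performs, assuming only that a table, once seen to exist, is not expected to disappear.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical wrapper around an expensive per-call existence check
// (e.g. something like HBaseAdmin.tableExists). Positive results are
// cached; a "missing" table is re-checked, since it may appear later.
public class CachingTableChecker {
    private final Predicate<String> expensiveCheck;
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    public CachingTableChecker(Predicate<String> expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    public boolean tableExists(String name) {
        Boolean cached = cache.get(name);
        if (cached != null) {
            return cached;          // slow path skipped entirely
        }
        boolean exists = expensiveCheck.test(name);
        if (exists) {
            cache.put(name, Boolean.TRUE);
        }
        return exists;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        CachingTableChecker checker = new CachingTableChecker(name -> {
            calls[0]++;             // count invocations of the slow path
            return name.equals("t1");
        });
        checker.tableExists("t1");
        checker.tableExists("t1"); // second call served from cache
        System.out.println("slow-path calls=" + calls[0]);
    }
}
```

The point is only that the expensive check runs once per table rather than once per request; whether caching is safe depends on whether the application ever drops and recreates tables while clients are running.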
