[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290452#comment-14290452 ]

Hudson commented on HBASE-10499:

SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #776 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/776/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException (Ram and Ted) (tedyu: rev 5484f7958f8ce929c619c377f07917f05cab0db6)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
>         Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
> One by one everything was flushed except one and that one thing remained unflushed.
> The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>         at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
>         at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
>         at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
>         at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
>         at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
>         at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
>         ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 16689
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer:
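The symptom in the report above (the hlog count climbing past maxlogs while a single region stays unflushed) can be illustrated with a toy model. This is only a sketch, not HBase code: `ToyWal`, `roll`, `mark_flushed`, `regions_to_force_flush`, and `simulate` are hypothetical names, and the model assumes the simplified rule that a log file can be dropped only once every region with edits in it has flushed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Simplified stand-in for one region server's set of log files."""
    max_logs: int
    # One entry per rolled file: the regions that still have unflushed edits in it.
    files: list = field(default_factory=list)

    def roll(self, regions_with_edits):
        """Close the current file and remember which regions wrote to it."""
        self.files.append(set(regions_with_edits))

    def mark_flushed(self, region):
        """A flushed region no longer pins any file; drop files nobody pins."""
        for pinned_by in self.files:
            pinned_by.discard(region)
        self.files = [f for f in self.files if f]

    def regions_to_force_flush(self):
        """Regions pinning the oldest files once the count exceeds max_logs."""
        excess = len(self.files) - self.max_logs
        if excess <= 0:
            return set()
        return set().union(*self.files[:excess])


def simulate(rolls, regions, stuck_region, max_logs):
    """Every region writes to every file; all but stuck_region flush promptly."""
    wal = ToyWal(max_logs=max_logs)
    for _ in range(rolls):
        wal.roll(regions)
        for region in regions:
            if region != stuck_region:
                wal.mark_flushed(region)
    return wal


if __name__ == "__main__":
    wal = simulate(rolls=5, regions=["a", "b", "stuck"],
                   stuck_region="stuck", max_logs=2)
    print("files kept:", len(wal.files))
    print("force flush:", wal.regions_to_force_flush())
```

In this model one region that never flushes keeps every file alive, so the forced-flush list shrinks to that one region while the file count keeps growing, which matches the shape of the region server log in the report.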
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290429#comment-14290429 ]

Ted Yu commented on HBASE-10499:

Enjoy your trip, Ram.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290426#comment-14290426 ]

ramkrishna.s.vasudevan commented on HBASE-10499:

Ted, was not able to +1 it on time. I tried +1 from my mobile while on travel from the mail that I received for this JIRA and found that the same got updated only today morning.
Thanks for your work and time Ted.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
>         Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290410#comment-14290410 ]

Hudson commented on HBASE-10499:

SUCCESS: Integrated in HBase-0.98 #815 (See [https://builds.apache.org/job/HBase-0.98/815/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException (Ram and Ted) (tedyu: rev 5484f7958f8ce929c619c377f07917f05cab0db6)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290311#comment-14290311 ]

ramkrishna vasudevan commented on HBASE-10499:

https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
causing RegionTooBusyException
--
10499-v6.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
to this version. Doesn't seem so to me.
has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
remained unflushed.
The ripple effect of this on the client side
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
wal.FSHLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, acc43e4b42c1a041078774f4f20a3ff5
wal.FSHLog: Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 16689
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 15868
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 20847
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 20099
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 8677
wal.FSHLog: Too many hlogs: logs=54, maxlogs=32; forcing flush of 1 regions(s): fdbb3242d3b673bbe4790a47bc30576f
regions but this region stays with the RS that has this issue. One important observation is that in HRegion.internalflushCache() we need to add a debug log here
does not happen and no logs related to flush are printed in the logs. so due to some reason this memstore.size() has become 0( I assume this). The earlier bugs were also due to similar reason.
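The successive "Too many hlogs" messages quoted in this thread shrink from 23 regions to 2 and then to 1, and the region that never leaves the list is the one that is not getting flushed. A minimal sketch of extracting that from a region server log follows. The sample lines are copied from the thread; the helper names `parse_forced_flushes` and `persistent_regions` are hypothetical, and the regular expression assumes the exact message wording shown here.

```python
import re
from typing import NamedTuple


class ForcedFlush(NamedTuple):
    logs: int           # value after "logs="
    maxlogs: int        # value after "maxlogs="
    declared: int       # region count announced in the message
    regions: frozenset  # encoded region names listed in the message


PATTERN = re.compile(
    r"Too many hlogs: logs=(\d+), maxlogs=(\d+); "
    r"forcing flush of (\d+) regions\(s\): (.*)"
)


def parse_forced_flushes(lines):
    """Return one ForcedFlush per 'Too many hlogs' line, in input order."""
    events = []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        logs, maxlogs, declared, names = match.groups()
        regions = frozenset(n.strip() for n in names.split(",") if n.strip())
        events.append(ForcedFlush(int(logs), int(maxlogs), int(declared), regions))
    return events


def persistent_regions(events):
    """Regions named in every forced-flush message, i.e. never flushed between them."""
    if not events:
        return frozenset()
    stuck = events[0].regions
    for event in events[1:]:
        stuck = stuck & event.regions
    return stuck


FIRST_REGIONS = [
    "97d8ae2f78910cc5ded5fbb1ddad8492", "d396b8a1da05c871edcb68a15608fdf2",
    "01a68742a1be3a9705d574ad68fec1d7", "1250381046301e7465b6cf398759378e",
    "127c133f47d0419bd5ab66675aff76d4", "9f01c5d25ddc6675f750968873721253",
    "29c055b5690839c2fa357cd8e871741e", "ca4e33e3eb0d5f8314ff9a870fc43463",
    "acfc6ae756e193b58d956cb71ccf0aa3", "187ea304069bc2a3c825bc10a59c7e84",
    "0ea411edc32d5c924d04bf126fa52d1e", "e2f9331fc7208b1b230a24045f3c869e",
    "d9309ca864055eddf766a330352efc7a", "1a71bdf457288d449050141b5ff00c69",
    "0ba9089db28e977f86a27f90bbab9717", "fdbb3242d3b673bbe4790a47bc30576f",
    "bbadaa1f0e62d8a8650080b824187850", "b1a5de30d8603bd5d9022e09c574501b",
    "cc6a9fabe44347ed65e7c325faa72030", "313b17dbff2497f5041b57fe13fa651e",
    "6b788c498503ddd3e1433a4cd3fb4e39", "3d71274fe4f815882e9626e1cfa050d1",
    "acc43e4b42c1a041078774f4f20a3ff5",
]

SAMPLE_LINES = [
    "2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: "
    "Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): "
    + ", ".join(FIRST_REGIONS),
    "2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: "
    "Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): "
    "fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39",
    "wal.FSHLog: Too many hlogs: logs=54, maxlogs=32; forcing flush of 1 regions(s): "
    "fdbb3242d3b673bbe4790a47bc30576f",
]

if __name__ == "__main__":
    events = parse_forced_flushes(SAMPLE_LINES)
    for event in events:
        print(event.logs, event.maxlogs, len(event.regions))
    print("never flushed:", sorted(persistent_regions(events)))
```

Intersecting the sets is a deliberately blunt heuristic: it flags a region only if it appears in every message, so on a longer log it would make sense to apply it over a sliding window of recent messages.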
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290308#comment-14290308 ]

ramkrishna vasudevan commented on HBASE-10499:

+1 on patch
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
    [ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290226#comment-14290226 ]

Hudson commented on HBASE-10499:

FAILURE: Integrated in HBase-1.0 #679 (See [https://builds.apache.org/job/HBase-1.0/679/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException (Ram and Ted) (tedyu: rev 973961b23fc6f3e4748ae7a213c9a89ff89dbb33)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290066#comment-14290066 ]

Andrew Purtell commented on HBASE-10499:

Thanks [~te...@apache.org], I applied the 0.98 patch and checked the new test and other flush tests, looks ok.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
> Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt,
> 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt,
> 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289919#comment-14289919 ]

Andrew Purtell commented on HBASE-10499:

This was reported against 0.98, shouldn't this be fixed in/for 0.98?

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289859#comment-14289859 ]

Hudson commented on HBASE-10499:

FAILURE: Integrated in HBase-1.1 #102 (See [https://builds.apache.org/job/HBase-1.1/102/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException (Ram and Ted) (tedyu: rev 3a529c04cebb4f3debdfd42fb00d3736dc2ea2fd)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289775#comment-14289775 ]

Hudson commented on HBASE-10499:

FAILURE: Integrated in HBase-TRUNK #6049 (See [https://builds.apache.org/job/HBase-TRUNK/6049/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException (Ram and Ted) (tedyu: rev 74adb11f4c504abbb6a52de72b53883ad7b952b4)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289396#comment-14289396 ]

Ted Yu commented on HBASE-10499:

Thanks for the reviews - they're really helpful.
Will commit later today.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288938#comment-14288938 ]

Hadoop QA commented on HBASE-10499:
---

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12694093/10499-v8.txt
against master branch at commit 5fbf80ee5ecb288804d2d2d042199dcd834ae848.
ATTACHMENT ID: 12694093

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12567//console

This message is automatically generated.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288818#comment-14288818 ]

Anoop Sam John commented on HBASE-10499:

+1

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt,
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch,
> HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this
> version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200
> regions. In one of the run with 0.98 server and 0.98 client I faced this
> problem like the hlogs became more and the system requested flushes for those
> many regions.
> One by one everything was flushed except one and that one thing remained
> unflushed.
The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>     at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
>     at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
>     at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
>     at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
>     at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
>     at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
>     ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s):
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2,
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e,
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253,
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463,
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84,
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e,
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69,
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f,
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b,
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e,
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1,
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s):
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 16689
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 15868
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 20847
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher]
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288336#comment-14288336 ] Hadoop QA commented on HBASE-10499: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693983/10499-v7.txt against master branch at commit 319f9bb7918af8cfe7e65f97b654f37f0d5983f3. ATTACHMENT ID: 12693983 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12559//console This message is automatically generated. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288155#comment-14288155 ]

Ted Yu commented on HBASE-10499:

From QA run #12551:
{code}
/x1/jenkins/jenkins-home/jobs/PreCommit-HBASE-Build/builds/2015-01-22_18-10-22/archive/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat-output.txt may be corrupted
	at jsync.protocol.FileSequenceReader.read(FileSequenceReader.java:45)
	at com.cloudbees.jenkins.plugins.jsync.archiver.JSyncArtifactManager.remoteSync(JSyncArtifactManager.java:145)
	at com.cloudbees.jenkins.plugins.jsync.archiver.JSyncArtifactManager.archive(JSyncArtifactManager.java:68)
	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:140)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:756)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:720)
	at hudson.model.Build$BuildExecution.post2(Build.java:182)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:669)
	at hudson.model.Run.execute(Run.java:1731)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:232)
Recording test results
[description-setter] Could not determine description.
Finished: FAILURE
{code}
The above indicates a problem with the build infrastructure.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288136#comment-14288136 ] Hadoop QA commented on HBASE-10499: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693941/10499-v6.txt against master branch at commit 319f9bb7918af8cfe7e65f97b654f37f0d5983f3. ATTACHMENT ID: 12693941 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.coprocessor.TestMasterObserver.testRegionTransitionOperations(TestMasterObserver.java:1604) at org.apache.hadoop.hbase.regionserver.TestJoinedScanners.testJoinedScanners(TestJoinedScanners.java:92) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12551//console This message is automatically generated.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287872#comment-14287872 ] Ted Yu commented on HBASE-10499: The tests reported hanging actually passed: {code} Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 82.363 sec - in org.apache.hadoop.hbase.TestAcidGuarantees {code} See also the summary: {code} +1 core tests. The patch passed unit tests in . {code} Local run of TestDataBlockEncoders, TestCacheOnWrite, TestScannerSelectionUsingTTL and TestAcidGuarantees passed too. Let me attach patch v6 again.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287856#comment-14287856 ] Hadoop QA commented on HBASE-10499: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693910/10499-v6.txt against master branch at commit 833feefbf977a8208f8f71f5f3bd9b027d54961f. ATTACHMENT ID: 12693910 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 5 zombie test(s): at org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders.testSeekingOnSample(TestDataBlockEncoders.java:205) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompactionInternals(TestCacheOnWrite.java:458) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompaction(TestCacheOnWrite.java:410) at org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection(TestScannerSelectionUsingTTL.java:118) at org.apache.hadoop.hbase.TestAcidGuarantees.testScanAtomicity(TestAcidGuarantees.java:354) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12548//console This message is automatically generated.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287418#comment-14287418 ] Ted Yu commented on HBASE-10499: Thanks for taking a look, Anoop. Comparable, Delayed and FlushQueueEntry are all interfaces. So hashcode method has to be added - it is just a matter of the method name. If there is no strong opinion, I'd like to keep the current formation.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287402#comment-14287402 ] Anoop Sam John commented on HBASE-10499: bq. int getHashcode(); Ted, do we need to add such a new method to the interface? Or can we just call hashCode() from compareTo() and make sure to implement hashCode() in FlushRegionEntry? Just asking. Patch looks good.
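The question above is whether the tie-break needs a new interface method or can go through hashCode(). The following is a minimal sketch of the second option only, not the committed patch. FlushEntryTieBreak and its Entry class are hypothetical stand-ins (they are not HBase classes), and the fixed delay field and the use of the region name's hash are assumptions made to keep the example deterministic. It orders entries by remaining delay and falls back to hashCode() only when the delays are equal.

```java
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class FlushEntryTieBreak {

  /** Hypothetical stand-in for a flush queue entry; not the HBase class. */
  static final class Entry implements Delayed {
    private final String regionName;
    private final long delayMs; // fixed remaining delay, an assumption for determinism

    Entry(String regionName, long delayMs) {
      this.regionName = regionName;
      this.delayMs = delayMs;
    }

    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(delayMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      // Order by remaining delay first.
      int byDelay = Long.compare(getDelay(TimeUnit.MILLISECONDS),
          other.getDelay(TimeUnit.MILLISECONDS));
      if (byDelay != 0) {
        return byDelay;
      }
      // Tie on delay: fall back to hashCode(). It is declared on Object, so it
      // can be called through the Delayed reference.
      return Integer.compare(hashCode(), other.hashCode());
    }

    @Override
    public int hashCode() {
      return regionName.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
      if (!(obj instanceof Entry)) {
        return false;
      }
      Entry that = (Entry) obj;
      return delayMs == that.delayMs && regionName.equals(that.regionName);
    }
  }

  public static void main(String[] args) {
    Entry a = new Entry("regionA", 0);
    Entry b = new Entry("regionB", 0);
    // Same delay, different regions: the entries no longer compare as equal.
    System.out.println("tie broken: " + (a.compareTo(b) != 0));
  }
}
```

Integer.compare is used instead of subtracting the two hash codes so that the result cannot overflow.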
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287092#comment-14287092 ]

John King commented on HBASE-10499:
-----------------------------------

+1 for patch "10499-v4.txt"

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283277#comment-14283277 ]

John King commented on HBASE-10499:
-----------------------------------

Hi Ted, your point is very reasonable.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282590#comment-14282590 ]

Ted Yu commented on HBASE-10499:
--------------------------------

Please also take a look at http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/DelayQueue.java#116

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282574#comment-14282574 ]

Ted Yu commented on HBASE-10499:
--------------------------------

The contract of the Comparable interface recommends that natural orderings be consistent with equals.

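Ted Yu's point in this message, that natural orderings should be consistent with equals, can be sketched as follows. This is only an illustration, not the HBase code: `TieBreakEntry`, its fields, and the tie-break on the region name are assumptions made for this example (the thread itself discusses a hashCode-based tie-break without showing it). The idea is that `compareTo()` returns 0 only for entries that `equals()` also treats as the same, so `DelayQueue#remove(Object)` finds the intended entry even when two regions share a delay.

```java
import java.util.Objects;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical queue entry; names and fields are illustrative only.
class TieBreakEntry implements Delayed {
  final String region;      // assumed region identifier
  final long expiryMillis;  // fixed absolute expiry keeps the demo deterministic

  TieBreakEntry(String region, long expiryMillis) {
    this.region = region;
    this.expiryMillis = expiryMillis;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(expiryMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
  }

  // Order by expiry first; break ties on the region name so that
  // compareTo() == 0 exactly when equals() is true.
  @Override
  public int compareTo(Delayed other) {
    TieBreakEntry that = (TieBreakEntry) other;
    int byExpiry = Long.compare(expiryMillis, that.expiryMillis);
    return byExpiry != 0 ? byExpiry : region.compareTo(that.region);
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof TieBreakEntry)) return false;
    TieBreakEntry that = (TieBreakEntry) obj;
    return expiryMillis == that.expiryMillis && region.equals(that.region);
  }

  @Override
  public int hashCode() {
    return Objects.hash(region, expiryMillis);
  }

  // Queues two regions with the same expiry, removes the second, and
  // returns the region name of the entry that is still queued.
  static String survivorAfterRemovingSecond() {
    long expiry = System.currentTimeMillis() + 60_000L;
    TieBreakEntry first = new TieBreakEntry("region-A", expiry);
    TieBreakEntry second = new TieBreakEntry("region-B", expiry);
    DelayQueue<TieBreakEntry> queue = new DelayQueue<>();
    queue.add(first);
    queue.add(second);
    queue.remove(second);
    return queue.peek().region;
  }

  public static void main(String[] args) {
    System.out.println("still queued: " + survivorAfterRemovingSecond());
  }
}
```

With the tie-break in place, removing the `region-B` entry leaves `region-A` queued, which is the behaviour a caller would expect.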
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282362#comment-14282362 ]

John King commented on HBASE-10499:
-----------------------------------

I think Ram's patch is simpler. And simple is always good. The regression failed because one test case in /hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java needs to be modified. The test case was designed for the old equals implementation.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282319#comment-14282319 ]

ramkrishna.s.vasudevan commented on HBASE-10499:
------------------------------------------------

Ya, but I don't think we need to calculate the hashCode after finding the delay. We should be sure that we are calculating the equals/compareTo for the same region and then find the delay. Also, finding hashCode() would involve more operations. This is my thought.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282301#comment-14282301 ]

Ted Yu commented on HBASE-10499:
--------------------------------

Thanks for the comment Ram. FlushQueueEntry#equals() calls FlushQueueEntry#compareTo(). So I think patch v4 should achieve the goal of making sure the correct region is removed from the queue.

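Why an equals() that delegates to compareTo() matters for removal can be shown with a small standalone sketch. It is not the HBase code: `DelayOnlyEntry` and its fields are hypothetical, and it assumes that compareTo() orders entries by delay alone, which this thread does not spell out. Under that assumption, `java.util.concurrent.DelayQueue#remove(Object)`, which locates its target through equals(), can drop an entry that belongs to a different region with the same delay.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a flush queue entry; not the HBase class.
class DelayOnlyEntry implements Delayed {
  final String region;      // illustrative region name
  final long expiryMillis;  // fixed absolute expiry keeps the demo deterministic

  DelayOnlyEntry(String region, long expiryMillis) {
    this.region = region;
    this.expiryMillis = expiryMillis;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(expiryMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
  }

  // Assumption for this sketch: ordering looks at the expiry time only.
  @Override
  public int compareTo(Delayed other) {
    return Long.compare(expiryMillis, ((DelayOnlyEntry) other).expiryMillis);
  }

  // equals() delegating to compareTo(): two different regions with the
  // same expiry are reported as equal.
  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null || getClass() != obj.getClass()) return false;
    return compareTo((Delayed) obj) == 0;
  }

  @Override
  public int hashCode() {
    return Long.hashCode(expiryMillis);
  }

  // Queues two regions with the same expiry, asks the queue to remove the
  // second, and returns the region name of the entry that is still queued.
  static String survivorAfterRemovingSecond() {
    long expiry = System.currentTimeMillis() + 60_000L;
    DelayOnlyEntry first = new DelayOnlyEntry("region-A", expiry);
    DelayOnlyEntry second = new DelayOnlyEntry("region-B", expiry);
    DelayQueue<DelayOnlyEntry> queue = new DelayQueue<>();
    queue.add(first);
    queue.add(second);
    queue.remove(second);  // matches the first equal element it finds
    return queue.peek().region;
  }

  public static void main(String[] args) {
    System.out.println("still queued: " + survivorAfterRemovingSecond());
  }
}
```

In this sketch the entry for `region-B` is still queued after `remove(second)`, because the queue matched and removed the `region-A` entry instead. An entry that never leaves the queue in this way is consistent with the symptom described in the issue, though the sketch makes no claim about the exact code path in MemStoreFlusher.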
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282240#comment-14282240 ]

ramkrishna.s.vasudevan commented on HBASE-10499:

For the test case failure, we may need to add the same change as in Ted's patch.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt,
> HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282236#comment-14282236 ]

Hadoop QA commented on HBASE-10499:
---

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12693025/HBASE-10499_v5.patch
  against master branch at commit 03e17168c3feab765fec26693318f4b8ae6a9468.
  ATTACHMENT ID: 12693025

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.

    {color:red}-1 core tests{color}. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12508//console

This message is automatically generated.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt,
> HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281241#comment-14281241 ]

Hadoop QA commented on HBASE-10499:
---

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12692899/10499-v4.txt
  against master branch at commit 092c91eb0fc2a6b4044183e9ece71dd03711045d.
  ATTACHMENT ID: 12692899

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12500//console

This message is automatically generated.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt,
> HBASE-10499.patch, compaction-queue.png,
> hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281208#comment-14281208 ]

John King commented on HBASE-10499:
---

Hi Ted,

In your v3 patch you used an explicit cast from Delay to FRE. I think it should be a cast to FQE.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: 10499-v2.txt, 10499-v3.txt, HBASE-10499.patch,
> compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip,
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
> workloada_0.98.dat
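
To illustrate John King's remark about the cast: reading "Delay" as java.util.concurrent.Delayed, FRE as the per-region flush entry and FQE as the common queue-entry interface (all three readings are assumptions, the patch itself is not quoted here), the concern would be that a queue holding more than one kind of entry can hand compareTo an element that is not a region entry. The sketch below is not the HBase code. QueueEntry, WakeupEntry, RegionEntry, tieBreakKey and the injected clock are hypothetical names chosen for the example.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical sketch, not HBase code: why a tie-break should cast to the shared interface.
final class FlushQueueSketch {
    /** Shared supertype of everything placed on the queue (plays the role of "FQE"). */
    interface QueueEntry extends Delayed {
        String tieBreakKey();

        @Override
        default int compareTo(Delayed other) {
            int byDelay = Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                       other.getDelay(TimeUnit.MILLISECONDS));
            if (byDelay != 0) {
                return byDelay;
            }
            // Safe: every element on the queue implements QueueEntry.
            QueueEntry otherEntry = (QueueEntry) other;
            return tieBreakKey().compareTo(otherEntry.tieBreakKey());
        }
    }

    /** Marker entry that is always due, standing in for a wake-up signal. */
    static final class WakeupEntry implements QueueEntry {
        @Override public long getDelay(TimeUnit unit) { return 0L; }
        @Override public String tieBreakKey() { return ""; }
    }

    /** Entry for one region's pending flush (plays the role of "FRE"). */
    static final class RegionEntry implements QueueEntry {
        private final String regionName;
        private final long dueAtMillis;
        private final LongSupplier clockMillis;

        RegionEntry(String regionName, long dueAtMillis, LongSupplier clockMillis) {
            this.regionName = regionName;
            this.dueAtMillis = dueAtMillis;
            this.clockMillis = clockMillis;
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(dueAtMillis - clockMillis.getAsLong(), TimeUnit.MILLISECONDS);
        }

        @Override public String tieBreakKey() { return regionName; }
    }

    /** The risky variant: assumes the other element is always a RegionEntry. */
    static int compareAssumingRegionEntry(RegionEntry self, Delayed other) {
        RegionEntry otherEntry = (RegionEntry) other; // ClassCastException for a WakeupEntry
        return self.tieBreakKey().compareTo(otherEntry.tieBreakKey());
    }

    public static void main(String[] args) {
        LongSupplier fixedClock = () -> 1_000L; // fixed time keeps the demo deterministic
        RegionEntry region =
            new RegionEntry("fdbb3242d3b673bbe4790a47bc30576f", 1_000L, fixedClock);
        WakeupEntry wakeup = new WakeupEntry();

        DelayQueue<QueueEntry> queue = new DelayQueue<>();
        queue.add(region);
        queue.add(wakeup); // equal delay, so ordering falls through to the tie-break
        System.out.println("head is wakeup: " + (queue.peek() == wakeup));

        try {
            compareAssumingRegionEntry(region, wakeup);
        } catch (ClassCastException e) {
            System.out.println("concrete cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```

The tie-break only needs something every entry type can provide, so casting to the shared interface is enough. The concrete cast works only as long as no other entry type ever reaches the comparison with an equal delay.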
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281210#comment-14281210 ] John King commented on HBASE-10499: --- Hi Ted, in your v3 patch you used an explicit cast that converts from Delayed to FRE (FlushRegionEntry). I think it should be FQE (FlushQueueEntry). > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0, 1.1.0 > > Attachments: 10499-v2.txt, 10499-v3.txt, HBASE-10499.patch, > compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281204#comment-14281204 ] Hadoop QA commented on HBASE-10499: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12692895/10499-v2.txt against master branch at commit 092c91eb0fc2a6b4044183e9ece71dd03711045d. ATTACHMENT ID: 12692895 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12498//console This message is 
automatically generated. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0, 1.1.0 > > Attachments: 10499-v2.txt, HBASE-10499.patch, compaction-queue.png, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280050#comment-14280050 ] John King commented on HBASE-10499: --- And for the problematic region, only reaching the region server's low water mark can cause the region to be dequeued from (MemStoreFlusher).regionsInQueue; i.e., typing "flush" in the HBase shell cannot resolve this issue. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0, 1.1.0 > > Attachments: HBASE-10499.patch, compaction-queue.png, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280038#comment-14280038 ] John King commented on HBASE-10499: ---
Hi all, I also hit this issue yesterday. I think Honghua Feng's analysis is totally right. The fqe entry in flushQueue may have been removed before being polled out by the MemStoreFlusher. Here is some code from MemStoreFlusher:
[CODE-BEGIN]
class MemStoreFlusher implements FlushRequester {
  ...
  private boolean flushRegion(final HRegion region, final boolean emergencyFlush) {
    synchronized (this.regionsInQueue) {
      FlushRegionEntry fqe = this.regionsInQueue.remove(region);
      if (fqe != null && emergencyFlush) {
        // Need to remove from region from delay queue. When NOT an
        // emergencyFlush, then item was removed via a flushQueue.poll.
        flushQueue.remove(fqe);
      }
    }
    ...
  }
  ...
  static class FlushRegionEntry implements FlushQueueEntry {
    ...
    @Override
    public int compareTo(Delayed other) {
      return Long.valueOf(getDelay(TimeUnit.MILLISECONDS) -
          other.getDelay(TimeUnit.MILLISECONDS)).intValue();
    }

    @Override
    public boolean equals(Object obj) {
      if (this == obj) {
        return true;
      }
      if (obj == null || getClass() != obj.getClass()) {
        return false;
      }
      Delayed other = (Delayed) obj;
      return compareTo(other) == 0;
    }
    ...
  }
}
[CODE-END]
From the code, we can tell:
1. Two FlushRegionEntry instances are considered equal exactly when they have the same delay time.
2. When emergencyFlush is true, an entry may be removed from the flushQueue.
Then, if two different fqe entries (say "A" and "B") have exactly the same delay time, and "B" is flushed as an emergencyFlush, "A" may be removed from the flushQueue instead of "B". Because RegionA.writestate.isFlushRequested() then always returns true and RegionA is still in (MemStoreFlusher).regionsInQueue, neither (MemStoreFlusher).requestFlush(HRegion r) nor (MemStoreFlusher).requestDelayedFlush(HRegion r, long randomDelay) can submit RegionA into flushQueue.
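The wrong-entry removal described above can be reproduced outside HBase with a plain java.util.concurrent.DelayQueue. The following is only a minimal sketch under stated assumptions: FlushQueueSketch, DelayOnlyEntry, RegionAwareEntry and survivorAfterRemovingB are hypothetical names, a String stands in for the HRegion reference, and the delay is a fixed number instead of being derived from the clock so the result is deterministic. RegionAwareEntry is one possible way to make equality region-aware for illustration; it is not taken from any attached patch.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class FlushQueueSketch {

  /** Hypothetical stand-in for FlushRegionEntry: equality looks at the delay only. */
  static class DelayOnlyEntry implements Delayed {
    final String region;     // stand-in for the HRegion reference (assumption)
    final long delayMillis;  // fixed delay, so the demo does not depend on the clock

    DelayOnlyEntry(String region, long delayMillis) {
      this.region = region;
      this.delayMillis = delayMillis;
    }

    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(delayMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      return Long.compare(getDelay(TimeUnit.MILLISECONDS),
          other.getDelay(TimeUnit.MILLISECONDS));
    }

    @Override
    public boolean equals(Object obj) {
      if (this == obj) return true;
      if (obj == null || getClass() != obj.getClass()) return false;
      // Same shape as the quoted code: equal whenever the delays are equal.
      return compareTo((Delayed) obj) == 0;
    }

    @Override
    public int hashCode() {
      return Long.hashCode(delayMillis);
    }
  }

  /** Hypothetical variant: equality also requires the same region. */
  static class RegionAwareEntry extends DelayOnlyEntry {
    RegionAwareEntry(String region, long delayMillis) {
      super(region, delayMillis);
    }

    @Override
    public boolean equals(Object obj) {
      if (this == obj) return true;
      if (obj == null || getClass() != obj.getClass()) return false;
      RegionAwareEntry other = (RegionAwareEntry) obj;
      return delayMillis == other.delayMillis && region.equals(other.region);
    }

    @Override
    public int hashCode() {
      return 31 * Long.hashCode(delayMillis) + region.hashCode();
    }
  }

  /** Queues a then b, asks the queue to remove b, and reports which region is left. */
  static String survivorAfterRemovingB(DelayOnlyEntry a, DelayOnlyEntry b) {
    DelayQueue<DelayOnlyEntry> queue = new DelayQueue<>();
    queue.add(a);
    queue.add(b);
    queue.remove(b);  // remove(Object) picks the first element that equals(b)
    return queue.iterator().next().region;
  }

  public static void main(String[] args) {
    // Delay-only equality: "A" is removed in place of "B", so "B" is left behind.
    System.out.println(survivorAfterRemovingB(
        new DelayOnlyEntry("A", 1000), new DelayOnlyEntry("B", 1000)));    // prints "B"
    // Region-aware equality: "B" itself is removed, so "A" is left.
    System.out.println(survivorAfterRemovingB(
        new RegionAwareEntry("A", 1000), new RegionAwareEntry("B", 1000))); // prints "A"
  }
}
```

In the first call the queue drops "A" even though "B" was passed to remove, which mirrors the situation where region A is still recorded in regionsInQueue but no longer has an entry in flushQueue.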
That's why PeriodicMemstoreFlusher and (HRegion).requestFlush() cannot work. The problematic region can only be flushed when:
1. The region server's low water mark is reached;
2. A client-side RPC arrives (Flush/Split/Merge);
3. A bulk load completes;
4. A snapshot is taken;
5. The region is closed.
... which may cause the online region to refuse write requests for a long time.
PS: Sorry, this is my first time using confluence, so my comment may look a little messy.
> In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0, 1.1.0 > > Attachments: HBASE-10499.patch, compaction-queue.png, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081442#comment-14081442 ] Andrew Purtell commented on HBASE-10499: bq. Would setting hbase.hstore.flusher.count to 2 help? (I'm running with default 1). That seems a reasonable stab in the dark [~matvey14]. It won't address the underlying issue I suspect but may move it out for you. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-10499.patch, compaction-queue.png, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076315#comment-14076315 ] Matt Kapilevich commented on HBASE-10499: - I've attached https://issues.apache.org/jira/secure/attachment/12658147/compaction-queue.png It shows that once the RS hits the condition, the compactions just begin to queue up. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.99.0 > > Attachments: HBASE-10499.patch, compaction-queue.png, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, > master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076303#comment-14076303 ]

Matt Kapilevich commented on HBASE-10499:
-----------------------------------------

We're also hitting this issue. The RegionServer is unable to flush one of the regions. I am seeing this in the logs:
{code}
2014-07-24 14:58:42,970 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed compaction: Request = regionName=users,534000,1404907131059.6eae263f1f0fd698cca73964118e16b0., storeName=ids, fileCount=3, fileSize=185.0 M, priority=6, time=2961355376528391; duration=7sec
2014-07-24 14:58:44,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 9557
2014-07-24 14:58:54,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 9316
2014-07-24 14:59:04,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 7454
2014-07-24 14:59:14,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 11197
2014-07-24 14:59:24,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 19777
2014-07-24 14:59:34,778 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 20814
2014-07-24 14:59:44,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 22542
2014-07-24 14:59:54,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 16096
2014-07-24 15:00:04,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 21794
2014-07-24 15:00:14,779 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15. after a delay of 20873
{code}
The unfortunate thing here is that the RegionServer never recovers, even after we stop doing Puts. We're on 0.96.1.1-cdh5.0.2.

Are there any workarounds for this? Would setting hbase.hstore.flusher.count to 2 help? (I'm running with the default of 1.)

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
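The log excerpt in the comment above shows the same region being re-requested by periodicFlusher every ten seconds without a flush following. A minimal sketch of how such a pattern could be spotted in a RegionServer log, assuming the message format shown in that excerpt; the class `PeriodicFlushCounter` and its method are hypothetical helpers for illustration and are not part of HBase:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log-scanning helper (not an HBase class). */
final class PeriodicFlushCounter {
    // Assumes the periodicFlusher message format seen in the log excerpt above.
    private static final Pattern REQUEST = Pattern.compile(
        "periodicFlusher requesting flush for region (\\S+) after a delay of (\\d+)");

    /** Counts periodicFlusher flush requests per region name, in first-seen order. */
    static Map<String, Integer> countRequests(Iterable<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = REQUEST.matcher(line);
            if (m.find()) {
                // group(1) is the full region name, including its trailing dot
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A region whose count keeps growing while no flush of it is ever logged is a candidate for the stuck state described in this issue; the count alone does not prove it.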
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004827#comment-14004827 ]

Andrew Purtell commented on HBASE-10499:
----------------------------------------

Where are we with this issue?

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.99.0, 0.98.3
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
>
>
> I got this while testing 0.98RC. But am not sure if it is specific to this version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
> One by one everything was flushed except one and that one thing remained unflushed. The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>         at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
>         at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
>         at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
>         at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
>         at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
>         at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
>         ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 16689
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 15868
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 20847
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910245#comment-13910245 ]

ramkrishna.s.vasudevan commented on HBASE-10499:
------------------------------------------------

[~fenghh]
The number of flushers should be the default value; I did not change it. Sorry for the late reply.
bq.But if you want to raise a JIRA to replace System.currentTimeMillis with EnvironmentEdge.currentTimeMillis
Yes, better to change it. I am not saying this JIRA is caused by that, but I wanted to make sure we change it. Will raise one.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903273#comment-13903273 ]

Feng Honghua commented on HBASE-10499:
--------------------------------------

bq.How many FlushHandlers were there?
Should be 1. From the log we can see that each flush started only after the previous one finished. Given that this occurred under a heavy write load, it seems impossible for multiple concurrent FlushHandlers to run strictly serially by chance all the time. Certainly [~ram_krish] can confirm this :-)

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903245#comment-13903245 ]

Himanshu Vashishtha commented on HBASE-10499:
---------------------------------------------

Yeah, this doesn't look like a case of the memstore size being 0. How many FlushHandlers were there?

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903197#comment-13903197 ]

Feng Honghua commented on HBASE-10499:
--------------------------------------

bq.One thing I wanted to raise a JIRA is that should flushregionentry use EnvironmentEdge.currentTimeMillis instead of System.currentTimeMillis. I don't know how far it benefits but I think that can be done.
I wonder whether this JIRA is related to that. But if you want to raise a JIRA to replace System.currentTimeMillis with EnvironmentEdge.currentTimeMillis, it's better to make the replacement for all occurrences in HBase in a systematic way, since it can be applied unconditionally everywhere, right? :-)

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
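The comment above proposes reading time through an injectable source rather than calling the system clock directly. A minimal sketch of why that helps with a delayed queue entry, under stated assumptions: `MillisClock`, `ManualClock`, and `DelayedFlushEntry` are hypothetical names made up for this illustration and are not HBase's own classes, and the entry is a simplified stand-in rather than the real flush entry.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical time source; plays the role an injectable clock would play. */
interface MillisClock {
    long currentTimeMillis();
}

/** Test clock whose time only moves when advance() is called. */
final class ManualClock implements MillisClock {
    private final AtomicLong now;

    ManualClock(long startMillis) {
        this.now = new AtomicLong(startMillis);
    }

    @Override
    public long currentTimeMillis() {
        return now.get();
    }

    void advance(long millis) {
        now.addAndGet(millis);
    }
}

/** Simplified delayed entry that reads time only through the injected clock. */
final class DelayedFlushEntry {
    private final MillisClock clock;
    private long whenToExpire;

    DelayedFlushEntry(MillisClock clock) {
        this.clock = clock;
        this.whenToExpire = clock.currentTimeMillis(); // due immediately
    }

    /** Pushes the due time delayMillis into the future, measured on the injected clock. */
    void requeue(long delayMillis) {
        whenToExpire = clock.currentTimeMillis() + delayMillis;
    }

    /** Positive while the entry is still delayed, zero or negative once it is due. */
    long remainingMillis() {
        return whenToExpire - clock.currentTimeMillis();
    }
}
```

With a manual clock, a test can step time forward deterministically and check when an entry becomes due, which is hard to do against the real system clock.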
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903181#comment-13903181 ]

Feng Honghua commented on HBASE-10499:
--------------------------------------

bq.So there is some reason why flushQueue is not picking up that region?
I think *yes*.

It's certain that RegionTooBusyException is thrown because 'memstoreSize > blockingMemStoreSize' (see my deduction above), which means it can't be the case that 'memstoreSize <= 0' (and the other problematic region, 6b788c498503ddd3e1433a4cd3fb4e39, had memstore > 0 (256.2M) when it was flushed).

Two more things are certain:
# internalFlushcache was never called;
# writestate.flushRequested was never reset to false after it was set to true.

And one thing is highly likely (but can't be proven directly from the code and logs at hand :-( ): regionsInQueue still holds an entry for region fdbb3242d3b673bbe4790a47bc30576f, since all later flush requests generated by rollWriter and periodicFlusher are rejected by the (!regionsInQueue.containsKey(r)) check.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
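The deduction in the comment above rests on two bookkeeping structures, regionsInQueue and flushQueue, getting out of step. A minimal sketch of that failure mode, under stated assumptions: `FlushQueueModel` is a hypothetical toy model and not the actual MemStoreFlusher code, and `dropQueueEntryOnly` simply forces the suspected state, since the thread does not establish how it arises.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of a flusher that tracks pending regions in both a map and a queue. */
final class FlushQueueModel {
    private final Map<String, String> regionsInQueue = new HashMap<>();
    private final Queue<String> flushQueue = new ArrayDeque<>();

    /** Mirrors the quoted guard: a request is accepted only if the region has no entry yet. */
    synchronized boolean requestFlush(String region) {
        if (!regionsInQueue.containsKey(region)) {
            regionsInQueue.put(region, region);
            flushQueue.add(region);
            return true;
        }
        return false; // already considered queued, so the request is dropped
    }

    /** Normal path: take the next region off the queue and clear its map entry. */
    synchronized String flushNext() {
        String region = flushQueue.poll();
        if (region != null) {
            regionsInQueue.remove(region);
        }
        return region;
    }

    /** Assumption for illustration: the queue entry disappears but the map entry stays. */
    synchronized void dropQueueEntryOnly(String region) {
        flushQueue.remove(region);
    }
}
```

Once the map holds an entry that the queue can no longer deliver, every later request for that region is rejected and nothing ever flushes it, which matches the symptom of repeated flush requests with no flush.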
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903168#comment-13903168 ]

ramkrishna.s.vasudevan commented on HBASE-10499:
------------------------------------------------

bq.If we suppose these two regions have the same problem regarding non-flush-able internally by regionserver, and 6b788c498503ddd3e1433a4cd3fb4e39 is successfully flushed by a sub-phase of close operation because flush as part of close doesn't bother flushQueue
This is something I had not noticed: when a flush happens via close, flushQueue is not used. By design that makes sense.

One thing I wanted to raise a JIRA for is whether flushregionentry should use EnvironmentEdge.currentTimeMillis instead of System.currentTimeMillis. I don't know how much it helps, but I think it can be done.

So there is some reason why flushQueue is not picking up that region?

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-master-ip-10-157-0-229.zip, hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, workloada_0.98.dat
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903067#comment-13903067 ] Feng Honghua commented on HBASE-10499: -- Thanks [~ram_krish]. There are *two* problematic regions: fdbb3242d3b673bbe4790a47bc30576f and 6b788c498503ddd3e1433a4cd3fb4e39. Both were requested to flush from time to time, but no flush ever happened within *[2014-02-11 08:44, 2014-02-11 09:51]*. At 2014-02-11 09:51:11, however, 6b788c498503ddd3e1433a4cd3fb4e39 was flushed due to a close operation, and I confirmed in the master log you just provided that the close was part of a move operation triggered by the master during balancing: 6b788c498503ddd3e1433a4cd3fb4e39 was flushed because it was chosen as a move target for balance and was flushed while closing. fdbb3242d3b673bbe4790a47bc30576f was not chosen as a move target by the master (this can also be confirmed in the master log), which is why it was never flushed. Suppose these two regions share the same problem of not being flushable internally by the regionserver. 6b788c498503ddd3e1433a4cd3fb4e39 was successfully flushed as a sub-phase of the close operation because a flush that is part of a close does not go through flushQueue, and the flush log shows that its memstore was 256.2M when it was flushed. That is side evidence that the memstore of fdbb3242d3b673bbe4790a47bc30576f is also > 0, and that it was not flushed only because its flush entry could not be polled out of flushQueue...
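The hypothesis above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions, not HBase code: `SketchFlusher`, `loseQueueEntry` and the byte count are hypothetical, and the "lost entry" fault is simply assumed in order to show why a close-time flush can still succeed when the queue-based path is stuck.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of a flusher; names echo the discussion but this is not the HBase class.
final class SketchFlusher {
    private final Map<String, Long> memstoreBytes = new HashMap<>();
    private final Set<String> regionsInQueue = new HashSet<>();
    private final Queue<String> flushQueue = new ArrayDeque<>();

    void write(String region, long bytes) {
        memstoreBytes.merge(region, bytes, Long::sum);
    }

    long memstoreSize(String region) {
        return memstoreBytes.getOrDefault(region, 0L);
    }

    // Queue-based path: a region that is already marked as queued is rejected.
    boolean requestFlush(String region) {
        if (!regionsInQueue.add(region)) {
            return false;
        }
        flushQueue.add(region);
        return true;
    }

    // Flush handler: takes the next queued region, if any, and flushes it.
    String flushNextQueued() {
        String region = flushQueue.poll();
        if (region != null) {
            regionsInQueue.remove(region);
            flushDirectly(region);
        }
        return region;
    }

    // Close path: flushes without consulting flushQueue or regionsInQueue.
    long close(String region) {
        return flushDirectly(region);
    }

    // Assumed fault for the demo: the queue entry becomes unreachable while its marker stays.
    void loseQueueEntry(String region) {
        flushQueue.remove(region);
    }

    private long flushDirectly(String region) {
        Long flushed = memstoreBytes.remove(region);
        return flushed == null ? 0L : flushed;
    }

    public static void main(String[] args) {
        SketchFlusher flusher = new SketchFlusher();
        String region = "6b788c498503ddd3e1433a4cd3fb4e39";
        flusher.write(region, 1024L); // arbitrary size for the demo
        flusher.requestFlush(region);
        flusher.loseQueueEntry(region);
        System.out.println("re-request accepted: " + flusher.requestFlush(region));
        System.out.println("queued flush ran for: " + flusher.flushNextQueued());
        System.out.println("bytes flushed by close: " + flusher.close(region));
    }
}
```

In this model every later flush request is rejected because the region is still marked as queued, while `close` flushes the data regardless, which matches the behaviour observed for the region that the master happened to move.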
> In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-master-ip-10-157-0-229.zip, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902978#comment-13902978 ] Feng Honghua commented on HBASE-10499: -- bq.The HRegion.shouldFlush() does not check for the size instead only checks for the time of the oldest edit. Also the HLog rollWriter() also checks if there was an oldest edit If my understanding is correct, these are by design. Both exist to flush regions with old enough edits even when their memstoreSize is below the flush threshold (reducing the number of log files at the cost of small hfiles), so it is natural that they don't check whether memstoreSize > someValue (they assume and tolerate a small memstoreSize). A region is selected for flush by rollWriter/periodicFlusher only when it does have some 'old' edits, so the region's memstoreSize *must* be > 0 (this is an implicit invariant that can be asserted, but should not be handled by check-and-exit) > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
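The "by design" argument in the comment above (age-based flush triggers deliberately ignore memstore size) can be sketched as follows. This is a minimal illustration, not the HBase implementation: `SketchFlushChecks`, its method names, and the convention that an empty memstore reports `Long.MAX_VALUE` as its oldest edit time are all assumptions made for the example.

```java
// Hypothetical helper; thresholds and names are illustrative, not HBase's API.
final class SketchFlushChecks {
    // Sketch convention (an assumption): an empty memstore reports this as its oldest edit time.
    static final long NO_EDITS = Long.MAX_VALUE;

    // Size-based trigger: only fires once the memstore exceeds the threshold.
    static boolean shouldFlushBySize(long memstoreSize, long flushThreshold) {
        return memstoreSize > flushThreshold;
    }

    // Age-based trigger: looks only at the oldest edit, never at the size.
    static boolean shouldFlushByAge(long oldestEditTimeMs, long nowMs, long maxAgeMs) {
        return oldestEditTimeMs < nowMs - maxAgeMs;
    }

    public static void main(String[] args) {
        long now = 10_000_000L;
        long maxAge = 3_600_000L;
        // A tiny memstore with one old edit: too small for the size trigger, old enough for the age trigger.
        System.out.println("by size: " + shouldFlushBySize(4096L, 128L * 1024 * 1024));
        System.out.println("by age:  " + shouldFlushByAge(1_000_000L, now, maxAge));
        // An empty memstore is never selected by age, so a selected region holds at least one edit.
        System.out.println("empty:   " + shouldFlushByAge(NO_EDITS, now, maxAge));
    }
}
```

Under this convention a region can only be selected by age when it holds at least one edit, which is the implicit invariant (memstoreSize > 0) that the comment says can be asserted rather than checked.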
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902971#comment-13902971 ] Feng Honghua commented on HBASE-10499: --
bq.mainly due to the equals and compareTo method in FlushRegionEntry
I agree with you on this: equals/compareTo should use something such as region.hashCode() to keep entries distinct when getDelay of two entries is equal (so that equals/compareTo do not return true/0), since flushQueue (a DelayQueue) internally uses a priority queue to hold flush entries.
bq.I will try to get the master logs. The cluster is stopped now but i think the logs should be available.
Thanks. I just want the master logs to confirm that the other problematic region *6b788c498503ddd3e1433a4cd3fb4e39* was closed because the master selected it to move out during balancing.
Below are the facts about region fdbb3242d3b673bbe4790a47bc30576f that I am sure of after digging yesterday:
# Its memstoreSize > 0 (RegionTooBusyException is thrown because of this)
# internalFlushcache is never called
# writestate.flushRequested is true (this prevents the 'Flush requested on' log from being printed for each write operation before RegionTooBusyException is thrown)
Below are highly likely but not yet confirmed:
# regionsInQueue contains a map entry for this region (which rejects the flush entries generated by rollWriter/periodicFlusher)
# no flush entry was ever polled out of flushQueue for this region, or one was polled out once but failed to trigger internalFlushcache and to remove the corresponding map entry from regionsInQueue (the former is more likely)
> In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, 
t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed.
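The equals/compareTo suggestion made in the comment above can be sketched with a plain `java.util.concurrent.DelayQueue`. This is a hedged illustration rather than the FlushRegionEntry code or the eventual patch: `SketchFlushEntry` is a hypothetical stand-in, and it breaks ties on the region name instead of region.hashCode() so that the ordering stays deterministic.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a flush queue entry; not the HBase class.
final class SketchFlushEntry implements Delayed {
    private final String regionName;   // stands in for the region reference
    private final long whenToExpireMs; // absolute time at which the entry becomes due

    SketchFlushEntry(String regionName, long whenToExpireMs) {
        this.regionName = regionName;
        this.whenToExpireMs = whenToExpireMs;
    }

    String regionName() {
        return regionName;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(whenToExpireMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        if (!(other instanceof SketchFlushEntry)) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
        }
        SketchFlushEntry that = (SketchFlushEntry) other;
        int byTime = Long.compare(this.whenToExpireMs, that.whenToExpireMs);
        // Tie-breaker: entries for different regions never compare as equal.
        return byTime != 0 ? byTime : this.regionName.compareTo(that.regionName);
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof SketchFlushEntry)) {
            return false;
        }
        SketchFlushEntry that = (SketchFlushEntry) obj;
        return whenToExpireMs == that.whenToExpireMs && regionName.equals(that.regionName);
    }

    @Override
    public int hashCode() {
        return 31 * regionName.hashCode() + Long.hashCode(whenToExpireMs);
    }

    public static void main(String[] args) {
        long due = System.currentTimeMillis() - 1; // both entries are already due
        SketchFlushEntry a = new SketchFlushEntry("fdbb3242d3b673bbe4790a47bc30576f", due);
        SketchFlushEntry b = new SketchFlushEntry("6b788c498503ddd3e1433a4cd3fb4e39", due);
        DelayQueue<SketchFlushEntry> queue = new DelayQueue<>();
        queue.add(a);
        queue.add(b);
        queue.remove(a); // removes exactly the entry for region a
        System.out.println("still queued: " + queue.poll().regionName());
    }
}
```

With the tie-breaker in place, two entries that become due at the same instant remain distinct, so `remove` and `equals` act on the intended region only.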
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902948#comment-13902948 ] ramkrishna.s.vasudevan commented on HBASE-10499: bq.Can you enable "hbase.regionserver.servlet.show.queuedump" so that you can get dump of flushQueue ? This cluster is stopped now, so I can enable this the next time I run. [~fenghh] Thanks for the analysis. I was actually analysing another part of the code to see whether, as you said, there is any chance that the flushQueue is not able to pick up the region, mainly due to the equals and compareTo methods in FlushRegionEntry, but I could not find any proper evidence. What I was thinking was that from 2014-02-11 08:44:32,433 to 2014-02-11 08:44:34,033 there were a lot of flush requests, so maybe there was some problem with the way the region was picked up based on the time-based comparison in the queue. But I later doubted this, because a flush was requested every time. HRegion.shouldFlush() does not check the size; it only checks the time of the oldest edit. The HLog rollWriter() also checks whether there was an oldest edit. This is what made me think the flush size may be calculated wrongly. I will try to get the master logs. The cluster is stopped now, but I think the logs should be available. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. 
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902830#comment-13902830 ] Ted Yu commented on HBASE-10499: @Ram: Can you enable "hbase.regionserver.servlet.show.queuedump" so that you can get dump of flushQueue ? > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902732#comment-13902732 ] Feng Honghua commented on HBASE-10499: -- bq.But a manual designated move for this problematic region can succeed since preflush/flush of closing trigger flush directly without polling from flushQueue A manual flush (from client/shell) also works here, since it does not poll from flushQueue either, and it is more lightweight because the region is not closed and re-opened > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902724#comment-13902724 ] Feng Honghua commented on HBASE-10499: --
Some progress and conclusions so far for the problematic region *fdbb3242d3b673bbe4790a47bc30576f* (some logs/info are missing, so I can only deduce from the code and the logs at hand):
# internalFlushcache() is never called, so the root cause cannot be a miscalculated memstoreSize (the 'memstoreSize <= 0' check is inside internalFlushcache)
# On the contrary, memstoreSize should be > 0, and there is a flushEntry in regionsInQueue, but it never pops up to be executed
# The reason it wasn't moved to another RS is not that it can't be flushed, but that HMaster didn't choose it to move out during balance... (this needs the master log to confirm)

Below is the deduction process, a bit long...
# HRegion's private field 'updatesLock': its writeLock().lock() is called only in internalFlushcache(), and *after* the log below.
{code}
LOG.debug("Started memstore flush for " + this + ", current region memstore size " +...
{code}
For the problematic region, this log is never printed, so writeLock().lock() is never called => the call of lock(updatesLock.readLock(), xxx) in write operations won't fail, which means RegionTooBusyException *can't* be thrown in the method below
{code}
private void lock(final Lock lock, final int multiplier)
    throws RegionTooBusyException, InterruptedIOException {
  try {
    final long waitTime = Math.min(maxBusyWaitDuration,
        busyWaitDuration * Math.min(multiplier, maxBusyWaitMultiplier));
    if (!lock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {
      throw new RegionTooBusyException(
          "failed to get a lock in " + waitTime + " ms. " +
          "regionName=" + (this.getRegionInfo() == null ? "unknown" :
{code}
# RegionTooBusyException occurs in two places, so it *must* be thrown here:
{code}
private void checkResources() throws RegionTooBusyException {
  // If catalog region, do not impose resource constraints or block updates.
  if (this.getRegionInfo().isMetaRegion()) return;
  if (this.memstoreSize.get() > this.blockingMemStoreSize) {
    requestFlush();
    throw new RegionTooBusyException("Above memstore limit, " +
        "regionName=" + (this.getRegionInfo() == null ? "unknown" :
{code}
*this means memstoreSize > 0 (so it also shouldn't be <= 0 in internalFlushcache())*
# In the code above, requestFlush() is called each time before RegionTooBusyException is thrown. requestFlush()'s implementation:
{code}
private void requestFlush() {
  if (this.rsServices == null) {
    return;
  }
  synchronized (writestate) {
    if (this.writestate.isFlushRequested()) {
      return;
    }
    writestate.flushRequested = true;
  }
  // Make request outside of synchronize block; HBASE-818.
  this.rsServices.getFlushRequester().requestFlush(this);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Flush requested on " + this);
  }
}
{code}
requestFlush is called as many times as RegionTooBusyException is thrown, but the "Flush requested on" log for this region is printed only once... so the method must return *before* the LOG.debug(). Since rsServices is not null, it must be because *this.writestate.isFlushRequested() is true* => *writestate.flushRequested must be true* each time RegionTooBusyException is thrown
# The "Flush requested on" log is printed once for the problematic region, which means 'flushRequested' was set to true at some point. Besides region initialization, HRegion.writestate.flushRequested is set back to false only in the code below
{code}
public boolean flushcache() throws IOException {
  ...
  try {
    boolean result = internalFlushcache(status);
    if (coprocessorHost != null) {
      status.setStatus("Running post-flush coprocessor hooks");
      coprocessorHost.postFlush();
    }
    status.markComplete("Flush successful");
    return result;
  } finally {
    synchronized (writestate) {
      writestate.flushing = false;
      this.writestate.flushRequested = false;
      writestate.notifyAll();
    }
  }
}
{code}
This means the 'try' block above is never executed; otherwise flushRequested would have been set to false in the finally block. The HRegion.flushcache method is the critical path for flush tasks coming out of flushQueue (the other places that call internalFlushcache are split log replay and region close) ==> *internalFlushcache() is never called for the flush task of this problematic region from flushQueue*. This also means the root cause is not 'memstoreSize <= 0' (which is checked within internalFlushcache), so adding a log before that statement won't help diagnosis even if it reproduces next time :-)
# 'forcing flush of' / 'periodicFlusher requesting flush for' means requestFlush/requestDelayedFlush are called, they all d
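The flushRequested reasoning in the comment above can be replayed with a small model. This is a sketch under stated assumptions, not HRegion itself: `SketchRegion`, its counter and the plain `ArrayDeque` are hypothetical, and the flush handler is simply never run in order to mimic an entry that is never polled.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model of the flushRequested gate; not the HBase HRegion class.
final class SketchRegion {
    private final Object writestate = new Object();
    private final Queue<SketchRegion> flushQueue;
    private boolean flushRequested = false;
    private int forwardedRequests = 0;

    SketchRegion(Queue<SketchRegion> flushQueue) {
        this.flushQueue = flushQueue;
    }

    // Mirrors the quoted requestFlush(): returns early while a request is outstanding.
    void requestFlush() {
        synchronized (writestate) {
            if (flushRequested) {
                return;
            }
            flushRequested = true;
        }
        forwardedRequests++; // corresponds to the single "Flush requested on" log line
        flushQueue.add(this);
    }

    // Mirrors the quoted flushcache(): the flag is reset only in the finally block.
    void flushcache() {
        try {
            // internalFlushcache(...) would run here
        } finally {
            synchronized (writestate) {
                flushRequested = false;
            }
        }
    }

    int forwardedRequests() {
        return forwardedRequests;
    }

    boolean isFlushRequested() {
        synchronized (writestate) {
            return flushRequested;
        }
    }

    public static void main(String[] args) {
        SketchRegion region = new SketchRegion(new ArrayDeque<>());
        for (int i = 0; i < 54; i++) {
            region.requestFlush(); // one call per rejected write
        }
        System.out.println("requests forwarded to the queue: " + region.forwardedRequests());
        System.out.println("flushRequested still set: " + region.isFlushRequested());
    }
}
```

As long as nothing calls `flushcache`, every later `requestFlush` returns early, which is consistent with the "Flush requested on" line appearing only once in the log.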
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902679#comment-13902679 ]
Feng Honghua commented on HBASE-10499:
bq. 'hbase.hstore.flusher.count', if you did configure it
It can be deduced from the log that this configuration is 1 (the default, not configured).
> In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1, 0.99.0 > > Attachments: HBASE-10499.patch, > hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, > workloada_0.98.dat > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
The ripple effect of this on the client side > {code} > com.yahoo.ycsb.DBException: > org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed > 54 actions: RegionTooBusyException: 54 times, > at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245) > at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73) > at com.yahoo.ycsb.ClientThread.run(Client.java:307) > Caused by: > org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed > 54 actions: RegionTooBusyException: 54 times, > at > org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187) > at > org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171) > at > org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897) > at > org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961) > at > org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225) > at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232) > ... 
2 more > {code} > On one of the RS > {code} > 2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too > many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): > 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, > 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, > 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, > 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, > acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, > 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, > d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, > 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, > bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, > cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, > 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, > acc43e4b42c1a041078774f4f20a3ff5 > .. > 2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too > many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): > fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39 > {code} > {code} > 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] > regionserver.HRegionServer: regionserver60020.periodicFlusher requesting > flush for region > usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a > delay of 16689 > 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] > regionserver.HRegionServer: regionserver60020.periodicFlusher requesting > flush for region > usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a > delay of 15868 > 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] > regionserver.HRegionServer: regionserver60020.periodicFlusher requesting > flush for region > usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. 
after a > delay of 20847 > 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] > regionserver.HRegionServer: regionserver60020.periodicFlusher requesting > flush for region > usertable,user6264
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902670#comment-13902670 ]
Feng Honghua commented on HBASE-10499:
[~ram_krish], could you help provide the information below? Thanks.
# The (active) master log file covering '2014-02-11 08:40:00' to '2014-02-11 10:00:00'
# The time (from the client-side log) at which the client started receiving 'RegionTooBusyException'
# 'hbase.hstore.flusher.count', if you did configure it
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901287#comment-13901287 ]
ramkrishna.s.vasudevan commented on HBASE-10499:
bq. The problematic region can't be moved to other RS is because region need to be flushed before moving out, but the problematic region can't be successfully flushed
Yes, that is the reason. I like the idea in your previous comment of using HBASE-9873. It may be worth pursuing and should be useful.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901286#comment-13901286 ]
Feng Honghua commented on HBASE-10499:
bq. I restarted another RS and there were region movements with other regions but this region stays with the RS that has this issue
The problematic region can't be moved to another RS because a region needs to be flushed before it is moved out, and the problematic region can't be flushed successfully... :-)
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901279#comment-13901279 ]
Feng Honghua commented on HBASE-10499:
Some additional thoughts triggered by this jira:

A single problematic region that persistently fails to flush can result in too many hlog files, since its edits are scattered across almost all hlog files. Even though other regions are flushed successfully and their edits in the hlog files are stale, those hlog files still can't be archived because they are 'polluted' by the edits of that single problematic region. 'Polluted' means that most of the edits, say 99%, have been flushed to hfiles and are stale, but the file is still non-archivable because the remaining 1% of edits, from the problematic region, have not been flushed to an hfile and are still 'effective'.

In this case the periodicFlusher can't help, although by design it aims to eliminate 'polluted' hlog files caused by a region with sparse but steady writes (its writes are scattered across almost all hlog files, but a flush is never triggered for it because its total size doesn't exceed the flush threshold) by checking whether the region has gone unflushed for a long time. The reason is that the unflushed region can itself be a problematic region that the periodicFlusher can't flush either, as this jira shows, so it can't eliminate the 'polluted' hlog files as desired.

Too many hlog files can have further bad effects, such as more frequent unnecessary flush checks (always)... Once the number of hlog files exceeds the threshold that triggers a forced flush, it never gets a chance to drop below that threshold again, no matter how frequently other regions are flushed, or whether the periodicFlusher triggers flushes for other regions or for this problematic region, and even though this problematic region eventually refuses/blocks writes to it... This is all because the older hlog files (once the threshold number is exceeded) all contain some edits from this problematic region.

But the idea of a background hlog compaction thread introduced by HBASE-9873 can help reduce the number of hlog files even when there are problematic, un-flushable regions: it reads the original hlog files, keeps only the edits of these problematic regions (and of regions not flushed yet), writes them to new hlog files, and then archives the original hlog files, which reduces the number of hlog files. Since the problematic regions will refuse writes after some time (as this jira shows), the amount of this region's data left in hlog files won't grow without bound, and such problematic regions won't have such a bad ripple effect on the whole system.
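The 'pollution' argument in this comment can be illustrated with a toy model. This is a hedged sketch, not how FSHLog actually tracks sequence ids: `HlogPinningSketch`, `archivable`, `countArchivable`, the region names "A" and "B", and the per-file map of "highest sequence id written per region" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HlogPinningSketch {
    /** A log file is archivable only if every region's edits in it are already flushed. */
    static boolean archivable(Map<String, Long> maxSeqIdByRegion,
                              Map<String, Long> flushedSeqId) {
        for (Map.Entry<String, Long> e : maxSeqIdByRegion.entrySet()) {
            long flushed = flushedSeqId.getOrDefault(e.getKey(), -1L);
            if (flushed < e.getValue()) {
                return false; // this region still has unflushed edits in the file
            }
        }
        return true;
    }

    static int countArchivable(List<Map<String, Long>> logs,
                               Map<String, Long> flushedSeqId) {
        int n = 0;
        for (Map<String, Long> log : logs) {
            if (archivable(log, flushedSeqId)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // 40 log files; each holds many edits from healthy region "A"
        // and a single edit from stuck region "B" (hypothetical names).
        List<Map<String, Long>> logs = new ArrayList<>();
        for (long i = 0; i < 40; i++) {
            Map<String, Long> log = new HashMap<>();
            log.put("A", i * 100 + 99);
            log.put("B", i * 100 + 1);
            logs.add(log);
        }

        Map<String, Long> flushed = new HashMap<>();
        flushed.put("A", 3999L); // "A" has flushed everything; "B" has never flushed
        System.out.println(countArchivable(logs, flushed)); // prints 0

        flushed.put("B", 3901L); // once "B" finally flushes, every file is released
        System.out.println(countArchivable(logs, flushed)); // prints 40
    }
}
```

In the model, one unflushed region with a single edit per file is enough to pin all 40 files, regardless of how thoroughly the other region has flushed. A compaction pass along the lines described for HBASE-9873 would sidestep this by copying only the still-unflushed edits into new files and archiving the originals.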
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901269#comment-13901269 ]
Anoop Sam John commented on HBASE-10499:
{code}
2014-02-11 08:44:32,881 DEBUG [RpcServer.handler=1,port=60020] regionserver.HRegion: Flush requested on usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f.
{code}
Almost sure that this is because of a flush request from batchMutate(), as we reached the flush size after doing the puts. This flush request itself was not carried through: there is no trace of this region even starting to flush. So the possibility is that memstoreSize somehow got miscalculated?!
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901260#comment-13901260 ]
ramkrishna.s.vasudevan commented on HBASE-10499:
That is why I suspect that the memstoreSize is being calculated wrongly. Once a flush is requested, the flusher thread would try to handle it, and if it is not able to flush it would just ignore the request; there are no logs in that flow, if you look. Again, it is a guess. Or is there some other reason why the region is not flushed from the queue? Maybe that is not possible.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901256#comment-13901256 ]

Feng Honghua commented on HBASE-10499:
--------------------------------------

Thanks for checking. Yes, I noticed that regions are selected and forced to flush because there are too many hlog files, but that only confirms that the write load is really heavy. The problematic region was also asked to flush by periodicFlusher because it contains old-enough edits, which further confirms that this region had not been flushed for quite a long time. :-(
The question is why a flush is triggered but never completes...

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, workloada_0.98.dat
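One way to narrow down which regions never complete their flush is to intersect the region lists of successive "Too many hlogs" lines: a region that appears in every forced-flush request is a candidate for the stuck one. The following is only a sketch under assumptions. The class and method names are hypothetical, the regular expression is written against the log wording shown in this issue (other HBase versions may word the line differently), and the first sample line is abridged to three regions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: finds regions named in every
// "forcing flush of N regions(s): ..." line of a region server log.
final class StuckRegionFinder {
    // Assumption: the line format matches the FSHLog message quoted in this issue.
    private static final Pattern FORCED_FLUSH =
        Pattern.compile("forcing flush of \\d+ regions\\(s\\): (.*)$");

    // Returns the encoded region names listed on one log line, or an empty set
    // if the line is not a forced-flush line.
    static Set<String> forcedRegions(String logLine) {
        Set<String> regions = new LinkedHashSet<>();
        Matcher m = FORCED_FLUSH.matcher(logLine);
        if (m.find()) {
            for (String name : m.group(1).split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    regions.add(trimmed);
                }
            }
        }
        return regions;
    }

    // Intersects the region sets of all forced-flush lines, ignoring other lines.
    static Set<String> repeatedlyForced(List<String> logLines) {
        Set<String> result = null;
        for (String line : logLines) {
            Set<String> regions = forcedRegions(line);
            if (regions.isEmpty()) {
                continue;
            }
            if (result == null) {
                result = new LinkedHashSet<>(regions);
            } else {
                result.retainAll(regions);
            }
        }
        return result == null ? new LinkedHashSet<>() : result;
    }

    public static void main(String[] args) {
        // Sample lines abridged from the log in this issue (first region list shortened).
        List<String> lines = Arrays.asList(
            "2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: "
                + "logs=38, maxlogs=32; forcing flush of 3 regions(s): "
                + "97d8ae2f78910cc5ded5fbb1ddad8492, fdbb3242d3b673bbe4790a47bc30576f, "
                + "6b788c498503ddd3e1433a4cd3fb4e39",
            "2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: "
                + "logs=53, maxlogs=32; forcing flush of 2 regions(s): "
                + "fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39");
        System.out.println(repeatedlyForced(lines));
    }
}
```

Run against the two sample lines, the intersection keeps only the two regions that also show up in the periodicFlusher messages. This narrows the search but does not by itself explain why the flush never completes.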
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901181#comment-13901181 ]

Feng Honghua commented on HBASE-10499:
--------------------------------------

bq. Because we can see that the region is requested for a flush but it does not happen and no logs related to flush are printed in the logs. So due to some reason this memstore.size() has become 0 (I assume this)

Reasonable guess, but it only explains why no flush happens, not why RegionTooBusyException is thrown by this same region (it should be the same region, right?). In fact, 'memstoreSize.get() <= 0' contradicts the condition under which RegionTooBusyException is thrown. The exception is thrown here (the other place is when the region cannot be locked for write). It is impossible for both "memstoreSize.get() <= 0" and "this.memstoreSize.get() > this.blockingMemStoreSize" to hold at the same time...
{code}
if (this.memstoreSize.get() > this.blockingMemStoreSize) {
  requestFlush();
  throw new RegionTooBusyException("Above memstore limit, " +
    ...memstoreSize=" + memstoreSize.get()
{code}
Could you attach the detailed client-side log with the RegionTooBusyException stack/info? It would be great if it included the region info and memstoreSize :-)

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
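The contradiction Feng points out can be checked mechanically. The sketch below is not the HRegion code: the class and method names are hypothetical stand-ins for the two conditions quoted in the comment, and it assumes that blockingMemStoreSize is non-negative (the 256 MB value is an arbitrary illustration, not a value taken from this cluster).

```java
// Hypothetical stand-ins for the two conditions discussed in the comment above.
final class MemstoreConditionSketch {
    // Condition under which, per the quoted guess, a requested flush would do nothing.
    static boolean flushWouldBeSkipped(long memstoreSize) {
        return memstoreSize <= 0;
    }

    // Condition from the quoted snippet that leads to RegionTooBusyException.
    static boolean wouldThrowTooBusy(long memstoreSize, long blockingMemStoreSize) {
        return memstoreSize > blockingMemStoreSize;
    }

    public static void main(String[] args) {
        // Assumed, illustrative blocking size; any non-negative value gives the same result.
        long blockingMemStoreSize = 256L * 1024 * 1024;
        long[] samples = {-1L, 0L, 1L, blockingMemStoreSize, blockingMemStoreSize + 1};
        for (long size : samples) {
            boolean skipped = flushWouldBeSkipped(size);
            boolean tooBusy = wouldThrowTooBusy(size, blockingMemStoreSize);
            // With blockingMemStoreSize >= 0, size <= 0 implies size <= blockingMemStoreSize,
            // so the two flags can never both be true.
            System.out.println("memstoreSize=" + size
                + " flushSkipped=" + skipped + " tooBusy=" + tooBusy);
        }
    }
}
```

Under that assumption, a region that throws RegionTooBusyException because of the memstore limit must have a positive memstoreSize, so an empty memstore cannot be the whole explanation for the missing flush.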
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900447#comment-13900447 ]

Hudson commented on HBASE-10499:
--------------------------------

SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #143 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/143/])
HBASE-10499-In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException- Adding a log alone (ramkrishna: rev 1567894)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900428#comment-13900428 ]

Hudson commented on HBASE-10499:
--------------------------------

SUCCESS: Integrated in HBase-0.98 #155 (See [https://builds.apache.org/job/HBase-0.98/155/])
HBASE-10499-In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException- Adding a log alone (ramkrishna: rev 1567894)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900253#comment-13900253 ]

ramkrishna.s.vasudevan commented on HBASE-10499:
------------------------------------------------

Committed to 0.98 alone. Will try reproducing this.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10499.patch, hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898838#comment-13898838 ]

ramkrishna.s.vasudevan commented on HBASE-10499:
------------------------------------------------

I have not been able to reproduce this again.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10499
>                 URL: https://issues.apache.org/jira/browse/HBASE-10499
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.98.1
>
>         Attachments: hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898030#comment-13898030 ] Ted Yu commented on HBASE-10499: bq. I have logs and thread dumps taken during this time Please attach them. > In write heavy scenario one of the regions does not get flushed causing > RegionTooBusyException > -- > > Key: HBASE-10499 > URL: https://issues.apache.org/jira/browse/HBASE-10499 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 0.98.1 > > > I got this while testing 0.98RC. But am not sure if it is specific to this > version. Doesn't seem so to me. > Also it is something similar to HBASE-5312 and HBASE-5568. > Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 > regions. In one of the run with 0.98 server and 0.98 client I faced this > problem like the hlogs became more and the system requested flushes for those > many regions. > One by one everything was flushed except one and that one thing remained > unflushed. 
The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>     at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
>     at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
>     at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 54 actions: RegionTooBusyException: 54 times,
>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
>     at org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
>     at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
>     at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
>     ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s):
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2,
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e,
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253,
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463,
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84,
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e,
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69,
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f,
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b,
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e,
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1,
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s):
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 16689
> 2014-02-11 09:42:44,237 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 15868
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a delay of 20847
> 2014-02-11 09:42:54,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a delay of 20099
> 2014-02-11 09:43:04,238 INFO [regionserver60020.periodicFlusher] regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
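One way to spot the regions that never leave the forced-flush list is to intersect the region names across successive "Too many hlogs" lines, as the two quoted lines above allow by eye (fdbb3242d3b673bbe4790a47bc30576f and 6b788c498503ddd3e1433a4cd3fb4e39 appear in both). The following is only a rough sketch of that idea for offline log inspection. It is not part of HBase, the class and method names are hypothetical, and it assumes each log message has been joined onto a single line in the format shown in the quoted log.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading region server logs offline; not an HBase class.
class StuckRegionFinder {
    // Captures the comma-separated region list that follows "forcing flush of N regions(s):".
    private static final Pattern FORCED_FLUSH =
        Pattern.compile("forcing flush of \\d+ regions\\(s\\): (.*)$");

    /** Encoded region names in one log line; empty if it is not a forced-flush line. */
    static Set<String> regionsIn(String logLine) {
        Set<String> regions = new LinkedHashSet<>();
        Matcher m = FORCED_FLUSH.matcher(logLine.trim());
        if (m.find()) {
            for (String name : m.group(1).split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    regions.add(trimmed);
                }
            }
        }
        return regions;
    }

    /** Regions named in every forced-flush line, i.e. candidates for "never flushed". */
    static Set<String> regionsInEveryLine(List<String> logLines) {
        Set<String> common = null;
        for (String line : logLines) {
            Set<String> regions = regionsIn(line);
            if (regions.isEmpty()) {
                continue; // skip lines that are not forced-flush messages
            }
            if (common == null) {
                common = new LinkedHashSet<>(regions);
            } else {
                common.retainAll(regions);
            }
        }
        return common == null ? new LinkedHashSet<>() : common;
    }

    public static void main(String[] args) {
        // Region lists shortened from the quoted log for brevity.
        List<String> lines = Arrays.asList(
            "2014-02-11 08:45:58,714 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: "
                + "logs=38, maxlogs=32; forcing flush of 3 regions(s): 97d8ae2f78910cc5ded5fbb1ddad8492, "
                + "fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39",
            "2014-02-11 08:47:49,580 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: "
                + "logs=53, maxlogs=32; forcing flush of 2 regions(s): "
                + "fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39");
        System.out.println(regionsInEveryLine(lines));
    }
}
```

A region that shows up in this intersection is only a suspect: it may simply be receiving writes faster than it is flushed, so the result still has to be checked against the flush messages for that region.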
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897787#comment-13897787 ]

ramkrishna.s.vasudevan commented on HBASE-10499:

In HBASE-5568 and HBASE-5312, the same region was flushed multiple times and regions were being split. Nothing like that happens here. The region in question has not been flushed even once, and no splits or compactions have happened on it.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.98.1
>
> I got this while testing 0.98RC. But am not sure if it is specific to this version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
> One by one everything was flushed except one and that one thing remained unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897780#comment-13897780 ]

ramkrishna.s.vasudevan commented on HBASE-10499:

I have logs and thread dumps taken during this time. I can attach them here if needed.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.98.1
>
> I got this while testing 0.98RC. But am not sure if it is specific to this version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
> One by one everything was flushed except one and that one thing remained unflushed.
[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
[ https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897779#comment-13897779 ]

ramkrishna.s.vasudevan commented on HBASE-10499:

I am not sure whether this can also occur in 0.96 and trunk. I think it is possible in 0.96, but with the recent HLog disruptor changes in trunk I am not sure. It may also be possible in 0.94. I don't have a solution in hand, other than adding log messages for the case where memstoreSize could be zero. Will check more on this.

> In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.98.0, 0.98.1
>
> I got this while testing 0.98RC. But am not sure if it is specific to this version. Doesn't seem so to me.
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 regions. In one of the run with 0.98 server and 0.98 client I faced this problem like the hlogs became more and the system requested flushes for those many regions.
> One by one everything was flushed except one and that one thing remained unflushed.