[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290452#comment-14290452
 ] 

Hudson commented on HBASE-10499:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #776 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/776/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException (Ram and Ted) (tedyu: rev 
5484f7958f8ce929c619c377f07917f05cab0db6)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
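
For context on the two files above: MemStoreFlusher pairs a DelayQueue of
pending flush requests with a per-region map used to drop duplicate requests.
Below is a minimal sketch of that pattern, not the committed patch; the class
and method names are invented for illustration, and only the
regionsInQueue/flushQueue pairing mirrors the real class.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in for MemStoreFlusher's request bookkeeping.
public class FlushRequestQueue {
  // One entry per region awaiting flush; used to drop duplicate requests.
  private final Map<String, FlushEntry> regionsInQueue = new ConcurrentHashMap<>();
  // Entries become visible to the flusher thread once their delay expires.
  private final DelayQueue<FlushEntry> flushQueue = new DelayQueue<>();

  public void requestFlush(String regionName, long delayMs) {
    FlushEntry entry = new FlushEntry(regionName, delayMs);
    // Only enqueue if no request for this region is already pending.
    // If the map and the queue ever disagree (an entry left in the map
    // with no matching queue entry), every later request for that region
    // is silently dropped and the region never flushes -- the failure
    // mode this issue describes.
    if (regionsInQueue.putIfAbsent(regionName, entry) == null) {
      flushQueue.add(entry);
    }
  }

  public FlushEntry takeNext() throws InterruptedException {
    FlushEntry entry = flushQueue.take();
    regionsInQueue.remove(entry.regionName); // keep map and queue in sync
    return entry;
  }

  static final class FlushEntry implements Delayed {
    final String regionName;
    private final long whenNanos;

    FlushEntry(String regionName, long delayMs) {
      this.regionName = regionName;
      this.whenNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
    }

    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(whenNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      return Long.compare(getDelay(TimeUnit.NANOSECONDS),
          other.getDelay(TimeUnit.NANOSECONDS));
    }
  }
}
{code}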


> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
> Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 
> 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 
> 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing a 0.98 RC, but I am not sure if it is specific to 
> this version; it doesn't seem so to me.  
> It is also similar to HBASE-5312 and HBASE-5568.
> Using 10 threads I do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the runs with a 0.98 server and a 0.98 client I faced 
> this problem: the hlogs kept growing and the system requested flushes for 
> that many regions.
> One by one everything was flushed except one region, which remained 
> unflushed.  The ripple effect of this on the client side:
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> ..
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> ..
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 8677
> ..
> wal.FSHLog: Too many hlogs: logs=54, maxlogs=32; forcing flush of 1 
> regions(s): fdbb3242d3b673bbe4790a47bc30576f
> {code}
> ... regions but this region stays with the RS that has this issue.  One 
> important observation is that in HRegion.internalFlushcache() we need to 
> add a debug log here ... the flush does not happen and no logs related to 
> flush are printed in the logs. So for some reason this memstore.size() has 
> become 0 (I assume this).  The earlier bugs were also due to a similar 
> reason.
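
The description's closing observation concerns the early-out in
HRegion.internalFlushcache(): when the accounted memstore size reads zero,
the flush is skipped with no log output. A hedged sketch of that shape, and
of the debug line the reporter asks for, follows; the class below is a
stand-in (logging via slf4j for the sketch), and only the method name and
the memstore-size check come from the description.

{code}
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Simplified model of the early-out in HRegion.internalFlushcache().
class RegionFlushModel {
  private static final Logger LOG = LoggerFactory.getLogger(RegionFlushModel.class);

  private final String regionName;
  private final AtomicLong memstoreSize = new AtomicLong();

  RegionFlushModel(String regionName) {
    this.regionName = regionName;
  }

  boolean flushIfNeeded() {
    long size = memstoreSize.get();
    if (size <= 0) {
      // Without a log line here, a region whose accounted size has
      // (wrongly) dropped to zero is skipped silently on every flush
      // request, which is what made this issue hard to diagnose.
      LOG.debug("Skipping flush for {} because memstore size is {}",
          regionName, size);
      return false;
    }
    // ... snapshot the memstore and write the snapshot out to an HFile ...
    memstoreSize.set(0);
    return true;
  }
}
{code}

With such a line in place, a region whose size accounting has wrongly dropped
to zero shows up in the logs on every periodicFlusher request instead of
disappearing silently.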

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290429#comment-14290429
 ] 

Ted Yu commented on HBASE-10499:


Enjoy your trip, Ram.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290426#comment-14290426
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


Ted, I was not able to +1 it in time. I tried to +1 from my mobile while 
travelling, from the mail I received for this JIRA, and found that it only 
got updated this morning.  Thanks for your work and time, Ted.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290410#comment-14290410
 ] 

Hudson commented on HBASE-10499:


SUCCESS: Integrated in HBase-0.98 #815 (See 
[https://builds.apache.org/job/HBase-0.98/815/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException (Ram and Ted) (tedyu: rev 
5484f7958f8ce929c619c377f07917f05cab0db6)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread ramkrishna vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290311#comment-14290311
 ] 

ramkrishna vasudevan commented on HBASE-10499:
--




[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread ramkrishna vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290308#comment-14290308
 ] 

ramkrishna vasudevan commented on HBASE-10499:
--

+1 on patch



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290226#comment-14290226
 ] 

Hudson commented on HBASE-10499:


FAILURE: Integrated in HBase-1.0 #679 (See 
[https://builds.apache.org/job/HBase-1.0/679/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException (Ram and Ted) (tedyu: rev 
973961b23fc6f3e4748ae7a213c9a89ff89dbb33)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290066#comment-14290066
 ] 

Andrew Purtell commented on HBASE-10499:


Thanks [~te...@apache.org], I applied the 0.98 patch and checked the new test 
and the other flush tests; looks ok.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289919#comment-14289919
 ] 

Andrew Purtell commented on HBASE-10499:


This was reported against 0.98, shouldn't this be fixed in/for 0.98?


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289859#comment-14289859
 ] 

Hudson commented on HBASE-10499:


FAILURE: Integrated in HBase-1.1 #102 (See 
[https://builds.apache.org/job/HBase-1.1/102/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException (Ram and Ted) (tedyu: rev 
3a529c04cebb4f3debdfd42fb00d3736dc2ea2fd)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289775#comment-14289775
 ] 

Hudson commented on HBASE-10499:


FAILURE: Integrated in HBase-TRUNK #6049 (See 
[https://builds.apache.org/job/HBase-TRUNK/6049/])
HBASE-10499 In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException (Ram and Ted) (tedyu: rev 
74adb11f4c504abbb6a52de72b53883ad7b952b4)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289396#comment-14289396
 ] 

Ted Yu commented on HBASE-10499:


Thanks for the reviews - they're really helpful.

Will commit later today.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288938#comment-14288938
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12694093/10499-v8.txt
  against master branch at commit 5fbf80ee5ecb288804d2d2d042199dcd834ae848.
  ATTACHMENT ID: 12694093

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12567//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288818#comment-14288818
 ] 

Anoop Sam John commented on HBASE-10499:


+1


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288336#comment-14288336
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693983/10499-v7.txt
  against master branch at commit 319f9bb7918af8cfe7e65f97b654f37f0d5983f3.
  ATTACHMENT ID: 12693983

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12559//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288155#comment-14288155
 ] 

Ted Yu commented on HBASE-10499:


From QA run #12551:
{code}
 
/x1/jenkins/jenkins-home/jobs/PreCommit-HBASE-Build/builds/2015-01-22_18-10-22/archive/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat-output.txt
 may be corrupted
at jsync.protocol.FileSequenceReader.read(FileSequenceReader.java:45)
at 
com.cloudbees.jenkins.plugins.jsync.archiver.JSyncArtifactManager.remoteSync(JSyncArtifactManager.java:145)
at 
com.cloudbees.jenkins.plugins.jsync.archiver.JSyncArtifactManager.archive(JSyncArtifactManager.java:68)
at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:140)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:756)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:720)
at hudson.model.Build$BuildExecution.post2(Build.java:182)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:669)
at hudson.model.Run.execute(Run.java:1731)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:232)
Recording test results
[description-setter] Could not determine description.
Finished: FAILURE
{code}
The above indicates a problem with the build infrastructure.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288136#comment-14288136
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693941/10499-v6.txt
  against master branch at commit 319f9bb7918af8cfe7e65f97b654f37f0d5983f3.
  ATTACHMENT ID: 12693941

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hbase.coprocessor.TestMasterObserver.testRegionTransitionOperations(TestMasterObserver.java:1604)
at 
org.apache.hadoop.hbase.regionserver.TestJoinedScanners.testJoinedScanners(TestJoinedScanners.java:92)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12551//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287872#comment-14287872
 ] 

Ted Yu commented on HBASE-10499:


The tests that were reported as hanging actually passed:
{code}
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 82.363 sec - in 
org.apache.hadoop.hbase.TestAcidGuarantees
{code}
See also the summary:
{code}
+1 core tests. The patch passed unit tests in .
{code}
Local runs of TestDataBlockEncoders, TestCacheOnWrite, 
TestScannerSelectionUsingTTL and TestAcidGuarantees passed as well.

Let me attach patch v6 again.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287856#comment-14287856
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693910/10499-v6.txt
  against master branch at commit 833feefbf977a8208f8f71f5f3bd9b027d54961f.
  ATTACHMENT ID: 12693910

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 5 zombie test(s):   
at 
org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders.testSeekingOnSample(TestDataBlockEncoders.java:205)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompactionInternals(TestCacheOnWrite.java:458)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompaction(TestCacheOnWrite.java:410)
at 
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection(TestScannerSelectionUsingTTL.java:118)
at 
org.apache.hadoop.hbase.TestAcidGuarantees.testScanAtomicity(TestAcidGuarantees.java:354)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12548//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287418#comment-14287418
 ] 

Ted Yu commented on HBASE-10499:


Thanks for taking a look, Anoop.
Comparable, Delayed and FlushQueueEntry are all interfaces, so a 
hashcode-style method has to be declared somewhere - it is just a matter of 
the method name.

If there is no strong opinion, I'd like to keep the current formulation.
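
For readers following the thread, here is a rough sketch of that formulation - simplified, with assumed names, and not the code from the attached patches: the queue-entry interface declares a hash-based method that compareTo() uses as a tie-breaker, so two distinct entries with the same delay never compare as equal.
{code}
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Sketch only: assumed names, not the actual patch.
interface FlushQueueEntry extends Delayed {
  // Declared on the interface so compareTo() can break ties without
  // casting to a concrete class.
  int getHashcode();
}

class FlushRegionEntry implements FlushQueueEntry {
  private final String regionName;
  private final long whenToExpire;

  FlushRegionEntry(String regionName, long delayMs) {
    this.regionName = regionName;
    this.whenToExpire = System.currentTimeMillis() + delayMs;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(whenToExpire - System.currentTimeMillis(),
        TimeUnit.MILLISECONDS);
  }

  @Override
  public int getHashcode() {
    return regionName.hashCode();
  }

  @Override
  public int compareTo(Delayed other) {
    int ret = Long.compare(getDelay(TimeUnit.MILLISECONDS),
        other.getDelay(TimeUnit.MILLISECONDS));
    if (ret != 0) {
      return ret;
    }
    // Tie-breaker: two distinct regions queued with the same deadline
    // must not compare as 0, or one of them can be treated as a
    // duplicate and never get flushed (hash collisions aside).
    return Integer.compare(getHashcode(), ((FlushQueueEntry) other).getHashcode());
  }
}
{code}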


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287402#comment-14287402
 ] 

Anoop Sam John commented on HBASE-10499:


bq. int getHashcode();
Ted, do we need to add such a new method to the interface? Or just call 
hashCode() from compareTo() and make sure to implement hashCode() in 
FlushRegionEntry?  Just asking.

Patch looks good.
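
For comparison, a sketch of the alternative Anoop describes (again with assumed names, not patch code): since hashCode() exists on every Object, the tie-break works without widening the interface.
{code}
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Alternative sketch: no new interface method; compareTo() falls back
// to plain hashCode(), which the entry overrides with a stable
// per-region value. (equals() should be kept consistent as well.)
class FlushRegionEntry implements Delayed {
  private final String regionName;
  private final long whenToExpire;

  FlushRegionEntry(String regionName, long delayMs) {
    this.regionName = regionName;
    this.whenToExpire = System.currentTimeMillis() + delayMs;
  }

  @Override
  public long getDelay(TimeUnit unit) {
    return unit.convert(whenToExpire - System.currentTimeMillis(),
        TimeUnit.MILLISECONDS);
  }

  @Override
  public int compareTo(Delayed other) {
    int ret = Long.compare(getDelay(TimeUnit.MILLISECONDS),
        other.getDelay(TimeUnit.MILLISECONDS));
    return ret != 0 ? ret : Integer.compare(hashCode(), other.hashCode());
  }

  @Override
  public int hashCode() {
    return regionName.hashCode();
  }
}
{code}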


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287092#comment-14287092
 ] 

John King commented on HBASE-10499:
---

+1 for patch "10499-v4.txt"


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283277#comment-14283277
 ] 

John King commented on HBASE-10499:
---

Hi Ted, your point is very reasonable.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282590#comment-14282590
 ] 

Ted Yu commented on HBASE-10499:


Please also take a look at 
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/DelayQueue.java#116
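
For context, a minimal, self-contained illustration (not HBase code) of the behaviour that source implements: take() releases an element only once its delay has expired, and the relative ordering of elements comes entirely from compareTo().
{code}
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Illustrative only: elements become available in compareTo() order,
// each one no earlier than its own deadline.
public class DelayQueueDemo {
  static class Task implements Delayed {
    final String name;
    final long readyAt;

    Task(String name, long delayMs) {
      this.name = name;
      this.readyAt = System.currentTimeMillis() + delayMs;
    }

    @Override
    public long getDelay(TimeUnit unit) {
      return unit.convert(readyAt - System.currentTimeMillis(),
          TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      return Long.compare(readyAt, ((Task) other).readyAt);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    DelayQueue<Task> q = new DelayQueue<>();
    q.put(new Task("later", 200));
    q.put(new Task("sooner", 50));
    System.out.println(q.take().name); // "sooner" - shortest delay first
    System.out.println(q.take().name); // "later"
  }
}
{code}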


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282574#comment-14282574
 ] 

Ted Yu commented on HBASE-10499:


The contract of the Comparable interface recommends that natural orderings be 
consistent with equals.
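
What can go wrong when that recommendation is ignored is easy to demonstrate 
with the JDK's canonical offender, BigDecimal, whose compareTo() ignores scale 
while equals() does not (standalone check, not HBase code):

{code}
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.TreeSet;

public class ConsistencyDemo {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("1.0");
    BigDecimal b = new BigDecimal("1.00");

    System.out.println(a.compareTo(b)); // 0     -- same numeric value
    System.out.println(a.equals(b));    // false -- different scale

    // Sorted collections deduplicate by compareTo(), hashed ones by equals(),
    // so the two disagree about how many distinct elements exist.
    TreeSet<BigDecimal> sorted = new TreeSet<BigDecimal>();
    sorted.add(a);
    sorted.add(b);
    HashSet<BigDecimal> hashed = new HashSet<BigDecimal>();
    hashed.add(a);
    hashed.add(b);
    System.out.println(sorted.size()); // 1
    System.out.println(hashed.size()); // 2
  }
}
{code}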




[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282362#comment-14282362
 ] 

John King commented on HBASE-10499:
---

I think Ram's patch is simpler, and simple is always good.
The regression failed because one test case in 
/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFlushRegionEntry.java
needs to be modified: it was designed for the old equals() implementation.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282319#comment-14282319
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


Ya, but I don't think we need to calculate the hashCode after finding the 
delay.  We should first be sure that the equals()/compareTo() check is for the 
same region, and only then find the delay.  Computing hashCode() would also 
involve more operations.  This is my thought. 
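
A minimal sketch of the ordering Ram describes (class and field names below 
are hypothetical, not the committed patch): establish that the two entries 
refer to the same region before ever looking at the delay, so the removal path 
needs no hashCode() computation at all.

{code}
// Hypothetical flush-queue entry; illustrates the check order only.
final class FlushEntrySketch {
  final String regionName;  // region this flush request targets
  final long whenToExpire;  // absolute deadline the delay is derived from

  FlushEntrySketch(String regionName, long whenToExpire) {
    this.regionName = regionName;
    this.whenToExpire = whenToExpire;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof FlushEntrySketch)) return false;
    FlushEntrySketch other = (FlushEntrySketch) obj;
    // Cheap region check first; the delay only matters once the regions match.
    return regionName.equals(other.regionName)
        && whenToExpire == other.whenToExpire;
  }

  @Override
  public int hashCode() {
    // Consistent with equals(), even though the remove() path never calls it.
    return regionName.hashCode();
  }
}
{code}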


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282301#comment-14282301
 ] 

Ted Yu commented on HBASE-10499:


Thanks for the comment, Ram.

FlushQueueEntry#equals() calls FlushQueueEntry#compareTo().
So I think patch v4 should achieve the goal of making sure the correct region 
is removed from the queue.
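
In schematic form, the delegation Ted describes (hypothetical class, not the 
exact v4 patch): defining equals() in terms of compareTo() means the two can 
never disagree about which queued entry matches.

{code}
// Hypothetical Comparable entry; shows the equals-delegates-to-compareTo idiom.
final class QueueEntrySketch implements Comparable<QueueEntrySketch> {
  final String regionName;
  final long sequence;

  QueueEntrySketch(String regionName, long sequence) {
    this.regionName = regionName;
    this.sequence = sequence;
  }

  @Override
  public int compareTo(QueueEntrySketch other) {
    int byRegion = regionName.compareTo(other.regionName);
    return byRegion != 0 ? byRegion : Long.compare(sequence, other.sequence);
  }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof QueueEntrySketch)) return false;
    // Single source of truth: equal iff the natural ordering says "same".
    return compareTo((QueueEntrySketch) obj) == 0;
  }

  @Override
  public int hashCode() {
    return 31 * regionName.hashCode() + (int) (sequence ^ (sequence >>> 32));
  }
}
{code}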


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282240#comment-14282240
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


For the test case failure, we may need to add the change as in Ted's patch.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282236#comment-14282236
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12693025/HBASE-10499_v5.patch
  against master branch at commit 03e17168c3feab765fec26693318f4b8ae6a9468.
  ATTACHMENT ID: 12693025

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12508//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281241#comment-14281241
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692899/10499-v4.txt
  against master branch at commit 092c91eb0fc2a6b4044183e9ece71dd03711045d.
  ATTACHMENT ID: 12692899

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12500//console

This message is automatically generated.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281208#comment-14281208
 ] 

John King commented on HBASE-10499:
---

Hi Ted,
In your v3 patch you used a cast that converts from Delay to FRE; I think it 
should be FQE.



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281204#comment-14281204
 ] 

Hadoop QA commented on HBASE-10499:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692895/10499-v2.txt
  against master branch at commit 092c91eb0fc2a6b4044183e9ece71dd03711045d.
  ATTACHMENT ID: 12692895

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12498//console

This message is automatically generated.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280050#comment-14280050
 ] 

John King commented on HBASE-10499:
---

For the problematic region, only when the region server's low water mark is 
reached will the region be dequeued from (MemStoreFlusher).regionsInQueue; 
i.e., typing "flush" in the HBase shell cannot resolve this issue.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread John King (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280038#comment-14280038
 ] 

John King commented on HBASE-10499:
---

Hi, all,

I also hit this issue yesterday. I think Honghua Feng's analysis is exactly 
right: the fqe entry in flushQueue may have been removed before being polled 
out by the MemStoreFlusher.

Here is some code from MemStoreFlusher:

{code}
class MemStoreFlusher implements FlushRequester {

  ...

  private boolean flushRegion(final HRegion region, final boolean
      emergencyFlush) {
    synchronized (this.regionsInQueue) {
      FlushRegionEntry fqe = this.regionsInQueue.remove(region);
      if (fqe != null && emergencyFlush) {
        // Need to remove from region from delay queue.  When NOT an
        // emergencyFlush, then item was removed via a flushQueue.poll.
        flushQueue.remove(fqe);  // NOTE: remove() matches by equals(), see below
      }
    }

    ...

  }

  ...

  static class FlushRegionEntry implements FlushQueueEntry {

    ...

    @Override
    public int compareTo(Delayed other) {
      return Long.valueOf(getDelay(TimeUnit.MILLISECONDS) -
        other.getDelay(TimeUnit.MILLISECONDS)).intValue();
    }

    @Override
    public boolean equals(Object obj) {
      if (this == obj) {
        return true;
      }
      if (obj == null || getClass() != obj.getClass()) {
        return false;
      }
      // NOTE: equality is derived from compareTo, i.e. from the remaining
      // delay alone; entries for two different regions with the same delay
      // compare as "equal".
      Delayed other = (Delayed) obj;
      return compareTo(other) == 0;
    }

    ...
  }
}
{code}

From the code, we can tell:

1. Two FlushRegionEntry instances are considered equal whenever they have the 
same remaining delay.
2. When emergencyFlush is true, an entry may be removed from the flushQueue.

Then, if two different fqe (say "A" and "B") have exactly the same delay time, 
and "B" was flushed as an emergencyFlush,
"A" may be removed from the flushQueue instead of "B".

Because RegionA.writestate.isFlushRequested() always returns true and RegionA 
is still in (MemStoreFlusher).regionsInQueue,
neither (MemStoreFlusher).requestFlush(HRegion r) nor 
(MemStoreFlusher).requestDelayedFlush(HRegion r, long randomDelay) can submit 
RegionA into flushQueue. That's why PeriodicMemstoreFlusher and 
(HRegion).requestFlush() cannot work.

The problematic region can only be flushed when:

1. the region server's low water mark is reached;
2. a client-side RPC arrives (Flush/Split/Merge);
3. a bulk load completes;
4. a snapshot is taken;
5. the region is closed;
...

which may cause the online region to refuse write requests for a long time.

PS: Sorry, it's my first time using Confluence; my comment may seem a little 
messy.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-07-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081442#comment-14081442
 ] 

Andrew Purtell commented on HBASE-10499:


bq. Would setting hbase.hstore.flusher.count to 2 help? (I'm running with 
default 1).

That seems a reasonable stab in the dark, [~matvey14]. It won't address the 
underlying issue, I suspect, but it may move it out for you.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-07-28 Thread Matt Kapilevich (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076315#comment-14076315
 ] 

Matt Kapilevich commented on HBASE-10499:
-

I've attached 
https://issues.apache.org/jira/secure/attachment/12658147/compaction-queue.png 

It shows that once the RS hits the condition, the compactions just begin to 
queue up.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-07-28 Thread Matt Kapilevich (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076303#comment-14076303
 ] 

Matt Kapilevich commented on HBASE-10499:
-

We're also hitting this issue. The RegionServer is unable to flush one of the 
regions. I am seeing this in the logs:

{code}
2014-07-24 14:58:42,970 INFO 
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed compaction: 
Request = 
regionName=users,534000,1404907131059.6eae263f1f0fd698cca73964118e16b0.,
 storeName=ids, fileCount=3, fileSize=185.0 M, priority=6, 
time=2961355376528391; duration=7sec
2014-07-24 14:58:44,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 9557
2014-07-24 14:58:54,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 9316
2014-07-24 14:59:04,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 7454
2014-07-24 14:59:14,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 11197
2014-07-24 14:59:24,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 19777
2014-07-24 14:59:34,778 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 20814
2014-07-24 14:59:44,779 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 22542
2014-07-24 14:59:54,779 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 16096
2014-07-24 15:00:04,779 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 21794
2014-07-24 15:00:14,779 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
regionserver60020.periodicFlusher requesting flush for region 
users,870800,1404907131062.2a8dd55c1011e92f4e32d705196f1e15.
 after a delay of 20873
{code}

The unfortunate thing here is that the RegionServer never recovers, even after 
we stop doing Puts.

We're on 0.96.1.1-cdh5.0.2.

Are there any workarounds for this? Would setting hbase.hstore.flusher.count 
to 2 help? (I'm running with the default of 1.)
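
(For reference, a sketch of how that setting would look, assuming the standard 
hbase-site.xml configuration mechanism; hbase.hstore.flusher.count controls 
the number of memstore flush handler threads:)

{code}
<!-- hbase-site.xml: raise the number of memstore flush handler threads -->
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>2</value>
</property>
{code}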

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-05-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004827#comment-14004827
 ] 

Andrew Purtell commented on HBASE-10499:


Where are we with this issue?

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910245#comment-13910245
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


[~fenghh]
The number of flushers should be the default value; I did not change that. 
Sorry for the late reply.
bq.But if you want to raise a JIRA to replace System.currentMillis with 
EnvironmentEdge.currentMillis
Yes, better to change it. I am not saying this JIRA is caused by that, but I 
just wanted to ensure we change it. Will raise one.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903273#comment-13903273
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.How many FlushHandlers were there?
It should be 1: from the log we can see that each flush started only after the 
previous one finished, and considering this occurred under a heavy write 
scenario, it seems impossible for multiple concurrent FlushHandlers to have 
run strictly serially by chance all the time. Certainly [~ram_krish] can 
confirm this :-)

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903245#comment-13903245
 ] 

Himanshu Vashishtha commented on HBASE-10499:
-

Yeah, it doesn't look like an issue of the memstore size being 0.

How many FlushHandlers were there?

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903197#comment-13903197
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.One thing I wanted to raise a JIRA is that should flushregionentry use 
EnvironmentEdge.currentMillis instead of System.currentMillis. I don't know 
how far it benefits but I think that can be done.
I wonder whether this JIRA is related to that. But if you want to raise a JIRA 
to replace System.currentMillis with EnvironmentEdge.currentMillis, it's 
better to do such a replacement for all occurrences in HBase in a systematic 
way, since the replacement can be done unconditionally with no difference in 
behavior, right? :-)
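
(As an aside, a minimal sketch of what the injectable clock buys, assuming the 
0.98-era utility classes EnvironmentEdgeManager and ManualEnvironmentEdge; the 
benefit is deterministic time in tests:)

{code}
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
import org.apache.hadoop.hbase.util.ManualEnvironmentEdge;

public class ClockSketch {
  public static void main(String[] args) {
    // Production code reads the injectable clock rather than calling
    // System.currentTimeMillis() directly...
    long before = EnvironmentEdgeManager.currentTimeMillis();

    // ...so a test can freeze or advance time deterministically.
    ManualEnvironmentEdge edge = new ManualEnvironmentEdge();
    edge.setValue(123456789L);
    EnvironmentEdgeManager.injectEdge(edge);
    long frozen = EnvironmentEdgeManager.currentTimeMillis();  // 123456789

    EnvironmentEdgeManager.reset();  // restore the default (system) clock
    System.out.println(before + " vs " + frozen);
  }
}
{code}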

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903181#comment-13903181
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.So there is some reason why flushQueue is not picking up that region?
I think *yes*. It's certain that RegionTooBusyException is thrown because 
'memstoreSize > blockingMemStoreSize' (see my deduction above), which means it 
can't be the case that 'memstoreSize <= 0' (and the previously problematic 
region 6b788c498503ddd3e1433a4cd3fb4e39 had a memstore > 0 (256.2M) when 
flushing).

Two other things are certain:
# internalFlushcache was never called;
# writestate.flushRequested was never reset to false after it was set to true.

And one thing that is highly likely (but which I can't prove directly from the 
code and logs at hand :-() is that regionsInQueue still holds an entry for 
region fdbb3242d3b673bbe4790a47bc30576f, since all later flush requests 
generated by rollWriter and periodicFlusher are rejected by 
(!regionsInQueue.containsKey(r))
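
(For context, a paraphrased and simplified sketch of that rejection path, 
based on the 0.98-era MemStoreFlusher.requestFlush; this is not a verbatim 
excerpt:)

{code}
// Paraphrased, simplified sketch of MemStoreFlusher.requestFlush (0.98-era).
public void requestFlush(HRegion r) {
  synchronized (this.regionsInQueue) {
    // If an entry for this region is still present in regionsInQueue, the
    // request is silently dropped. This is what strands the region once its
    // FlushRegionEntry has been removed from flushQueue while the
    // regionsInQueue entry was left behind.
    if (!this.regionsInQueue.containsKey(r)) {
      FlushRegionEntry fqe = new FlushRegionEntry(r);
      this.regionsInQueue.put(r, fqe);
      this.flushQueue.add(fqe);
    }
  }
}
{code}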

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903168#comment-13903168
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


bq.If we suppose these two regions had the same internal non-flushable problem 
in the regionserver, and 6b788c498503ddd3e1433a4cd3fb4e39 was successfully 
flushed by a sub-phase of the close operation because a flush done as part of 
close doesn't bother with flushQueue
This is one thing I had not noticed: when a flush happens via close, the 
flushQueue is not used.  By design that makes sense.  
One thing I wanted to raise a JIRA for is whether FlushRegionEntry should use 
EnvironmentEdgeManager.currentTimeMillis() instead of 
System.currentTimeMillis().  I don't know how much it benefits, but I think it 
can be done.
So there is some reason why flushQueue is not picking up that region? 
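
As a minimal sketch of what that change would buy (using HBase's 
ManualEnvironmentEdge test utility; illustrative only, not a patch):
{code}
// Sketch only: with EnvironmentEdgeManager the clock becomes injectable, so a
// test can fast-forward FlushRegionEntry's delay instead of sleeping.
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
import org.apache.hadoop.hbase.util.ManualEnvironmentEdge;

public class FlushDelayClockSketch {
  public static void main(String[] args) {
    ManualEnvironmentEdge clock = new ManualEnvironmentEdge();
    EnvironmentEdgeManager.injectEdge(clock);
    try {
      long createTime = EnvironmentEdgeManager.currentTimeMillis();
      clock.incValue(30 * 1000L);   // advance the clock 30s without sleeping
      long elapsed = EnvironmentEdgeManager.currentTimeMillis() - createTime;
      assert elapsed >= 30 * 1000L;
    } finally {
      EnvironmentEdgeManager.reset();
    }
  }
}
{code}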


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903067#comment-13903067
 ] 

Feng Honghua commented on HBASE-10499:
--

Thanks [~ram_krish]

There are *two* problematic regions: fdbb3242d3b673bbe4790a47bc30576f and 
6b788c498503ddd3e1433a4cd3fb4e39 --- both were requested to flush from time to 
time, but no flush ever happened within *[2014-02-11 08:44,  2014-02-11 09:51]*

But at 2014-02-11 09:51:11, 6b788c498503ddd3e1433a4cd3fb4e39 was flushed due 
to a close operation, and I can confirm from the master log you just provided 
that the close was part of a move operation triggered by the master during 
balancing --- 6b788c498503ddd3e1433a4cd3fb4e39 was flushed because it was 
chosen as a move candidate for balance and was flushed while closing.

fdbb3242d3b673bbe4790a47bc30576f, on the other hand, was not chosen as a move 
candidate by the master (this can also be confirmed in the master log), which 
is why it was never flushed at all.

If we suppose these two regions had the same internal non-flushable problem in 
the regionserver, then 6b788c498503ddd3e1433a4cd3fb4e39 was successfully 
flushed only by a sub-phase of the close operation, because a flush done as 
part of close doesn't bother with flushQueue; the flush log shows that its 
memstore was 256.2M when it was flushed. That is side evidence that 
fdbb3242d3b673bbe4790a47bc30576f's memstore is also > 0: it isn't flushed 
simply because its flush entry can't be polled out of flushQueue...
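
For context, the close path flushes the memstore directly, without going 
through the flush queue. A rough paraphrase of the 0.98-era HRegion close path 
(names from memory, not the verbatim source):
{code}
// Paraphrase of the flush steps in HRegion.doClose() (0.98-era). Note there is
// no MemStoreFlusher / flushQueue involvement: internalFlushcache is invoked
// directly on the closing thread.
if (!abort && worthPreFlushing()) {
  status.setStatus("Pre-flushing region before close");
  internalFlushcache(status);   // best-effort flush while still taking writes
}
this.closing.set(true);
lock.writeLock().lock();        // block new updates
try {
  if (this.memstoreSize.get() > 0) {
    internalFlushcache(status); // final flush; memstore must be empty on close
  }
  // ... close stores, etc.
} finally {
  lock.writeLock().unlock();
}
{code}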


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902978#comment-13902978
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.HRegion.shouldFlush() does not check the size; it only checks the time of 
the oldest edit. The HLog rollWriter() likewise only checks whether there is 
an old enough edit
If my understanding is correct, these are by design. Both exist to flush 
regions with old enough edits even when their memstoreSize is below the flush 
threshold (to reduce the number of log files, at the cost of small hfiles), so 
it's natural that they don't check whether memstoreSize > someValue (they 
assume and tolerate a small memstoreSize). And a region will be selected for 
flush by rollWriter/periodicFlusher only when it does have some 'old' edits, 
so the region's memstoreSize *must not* be <= 0 (this is an implicit invariant 
that could be asserted, but should not be a check-and-exit)
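
For reference, a rough paraphrase of the check being discussed (the 0.98-era 
HRegion.shouldFlush(); from memory, not the verbatim source):
{code}
// Paraphrase, not verbatim: the periodic flush is triggered purely by the age
// of the oldest edit, never by memstore size -- size-based flushes are
// requested separately when a write crosses the flush threshold.
boolean shouldFlush() {
  if (flushCheckInterval <= 0) {            // periodic flushing disabled
    return false;
  }
  long now = EnvironmentEdgeManager.currentTimeMillis();
  if (now - getLastFlushTime() < flushCheckInterval) {
    return false;                           // flushed recently enough
  }
  for (Store s : getStores().values()) {
    if (s.timeOfOldestEdit() < now - flushCheckInterval) {
      return true;                          // some edit is old enough: flush
    }
  }
  return false;
}
{code}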


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902971#comment-13902971
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.mainly due to the equals and compareTo methods in FlushRegionEntry
I agree with you on this: equals/compareTo should use something such as 
region.hashCode() to keep entries distinct when the getDelay of two entries is 
equal (so that equals/compareTo don't return true/0), since flushQueue (a 
DelayQueue) internally uses a priority queue to host the flush entries.
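
A minimal sketch of that kind of tie-breaking (illustrative only, not the 
committed fix; FlushRegionEntry and getDelay are assumed from the discussion 
above):
{code}
// Illustrative sketch: break delay ties with a stable per-entry identity so
// two distinct regions never compare as equal. hashCode() stands in for any
// such identity (the region's hash, a sequence number, etc.).
@Override
public int compareTo(Delayed other) {
  int ret = Long.compare(getDelay(TimeUnit.MILLISECONDS),
                         other.getDelay(TimeUnit.MILLISECONDS));
  if (ret != 0) {
    return ret;
  }
  // Equal delays: fall back to identity so ordering/equality stay consistent.
  return Integer.compare(this.hashCode(), other.hashCode());
}
{code}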
bq.I will try to get the master logs. The cluster is stopped now but I think 
the logs should be available.
Thanks. I just want the master logs to confirm that the other problematic 
region *6b788c498503ddd3e1433a4cd3fb4e39* was closed because the master 
selected it to move out while doing balance.

Below are the facts I can be sure of for region 
fdbb3242d3b673bbe4790a47bc30576f after digging yesterday:
# Its memstoreSize > 0  (RegionTooBusyException is thrown due to this fact)
# internalFlushcache was never called
# writestate.flushRequested is true (which suppresses the 'Flush requested on' 
log for each write operation before RegionTooBusyException is thrown)

Below are highly probable but not yet confirmed:
# regionsInQueue contains a map entry for this region (which rejects the flush 
entries generated by rollWriter/periodicFlusher)
# no flush entry for this region was ever polled out of flushQueue, or one was 
polled out once but failed to trigger internalFlushcache and remove the 
corresponding map entry from regionsInQueue (the former is more likely)


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902948#comment-13902948
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


bq.Can you enable "hbase.regionserver.servlet.show.queuedump" so that you can 
get a dump of the flushQueue?
This cluster is stopped now.  Next time I run it I can enable this.
[~fenghh]
Thanks for the analysis.  I was actually analysing another part of the code to 
see if there is any chance that the flushQueue is not able to pick up the 
region, as you said, mainly due to the equals and compareTo methods in 
FlushRegionEntry.  But I could not find any proper evidence.  What I was 
thinking was that from 2014-02-11 08:44:32,433 to 2014-02-11 08:44:34,033 
there were a lot of flush requests, so maybe there was some problem with the 
way the region was picked up based on the time-based comparison in the queue.  
But I later dropped that idea, because a flush request was made every time.
HRegion.shouldFlush() does not check the size; it only checks the time of the 
oldest edit. The HLog rollWriter() likewise only checks whether there is an 
old enough edit.  This is what made me think the flush size might be 
calculated wrong.  
I will try to get the master logs.  The cluster is stopped now but I think the 
logs should be available.
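
To make the suspected hazard concrete, here is a small standalone demo (the 
Entry class is hypothetical, not HBase code) of what goes wrong when a 
DelayQueue's entries derive equality from the delay alone:
{code}
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for FlushRegionEntry: ordering and equality are
// derived from the wake-up time alone, which is the suspected hazard.
class Entry implements Delayed {
  final String region;
  final long whenMs;   // absolute wake-up time

  Entry(String region, long whenMs) { this.region = region; this.whenMs = whenMs; }

  @Override public long getDelay(TimeUnit unit) {
    return unit.convert(whenMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
  }
  @Override public int compareTo(Delayed o) {
    return Long.compare(whenMs, ((Entry) o).whenMs);           // delay-only ordering
  }
  @Override public boolean equals(Object o) {
    return o instanceof Entry && whenMs == ((Entry) o).whenMs; // delay-only equality!
  }
  @Override public int hashCode() { return Long.hashCode(whenMs); }
}

public class DelayTieDemo {
  public static void main(String[] args) {
    long when = System.currentTimeMillis() + 1000;
    DelayQueue<Entry> q = new DelayQueue<>();
    q.add(new Entry("regionA", when));
    q.add(new Entry("regionB", when));
    // remove() and contains() go through equals(): asking about regionA can
    // match regionB's entry, so any bookkeeping keyed on "is my entry still
    // queued?" becomes unreliable.
    q.remove(new Entry("regionA", when));
    System.out.println(q.size());   // 1, but which region survived is undefined
  }
}
{code}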


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902830#comment-13902830
 ] 

Ted Yu commented on HBASE-10499:


@Ram:
Can you enable "hbase.regionserver.servlet.show.queuedump" so that you can get 
a dump of the flushQueue?


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902732#comment-13902732
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.But a manual designated move for this problematic region can succeed, since 
the preflush/flush during closing triggers the flush directly without polling 
from flushQueue
A manual flush (from client/shell) is also OK here (it doesn't poll from 
flushQueue either) and is more lightweight, since it avoids closing and 
re-opening the region.
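
A minimal sketch of such a manual flush from a Java client, assuming the 
0.98-era HBaseAdmin.flush() API (the region name is taken from the logs above):
{code}
// Sketch only: trigger a flush by RPC. On the regionserver this ends up in
// HRegion.flushcache() directly, so -- if the analysis above is right -- it
// sidesteps the stuck flushQueue entry entirely.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ManualFlush {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // flush() accepts a table name or a full region name
      admin.flush("usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f.");
    } finally {
      admin.close();
    }
  }
}
{code}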


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902724#comment-13902724
 ] 

Feng Honghua commented on HBASE-10499:
--

Some progress and conclusions so far for the problematic region 
*fdbb3242d3b673bbe4790a47bc30576f* (some logs/info are missing, so I can only 
deduce from the code and logs at hand):
# internalFlushcache() is never called, so the root cause can't be a 
mis-calculated memstoreSize (the 'memstoreSize <= 0' check is within 
internalFlushcache)
# On the contrary, memstoreSize should be > 0, and there is a flush entry in 
regionsInQueue, but it never pops up to be executed
# The reason it wasn't moved to another RS is not that it can't be flushed, 
but that HMaster didn't choose it to move out during balance... (this needs 
the master log to confirm)

Below is the deduction process, a bit long...
# HRegion's private field 'updatesLock': its writeLock().lock() is only called 
in internalFlushcache(), and only *after* the log below.
{code}
LOG.debug("Started memstore flush for " + this + ", current region memstore 
size " +...
{code}
For the problematic region, the above log is never printed, so 
writeLock().lock() is never called => the call to lock(updatesLock.readLock(), 
xxx) in write operations can't fail, which means RegionTooBusyException 
*can't* be thrown in the method below
{code}
private void lock(final Lock lock, final int multiplier)
    throws RegionTooBusyException, InterruptedIOException {
  try {
    final long waitTime = Math.min(maxBusyWaitDuration,
        busyWaitDuration * Math.min(multiplier, maxBusyWaitMultiplier));
    if (!lock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {
      throw new RegionTooBusyException(
          "failed to get a lock in " + waitTime + " ms. " +
          "regionName=" + (this.getRegionInfo() == null ? "unknown" : ...
{code}
# RegionTooBusyException occurs in two places, so it *must* be thrown here:
{code}
private void checkResources()
    throws RegionTooBusyException {
  // If catalog region, do not impose resource constraints or block updates.
  if (this.getRegionInfo().isMetaRegion()) return;

  if (this.memstoreSize.get() > this.blockingMemStoreSize) {
    requestFlush();
    throw new RegionTooBusyException("Above memstore limit, " +
        "regionName=" + (this.getRegionInfo() == null ? "unknown" : ...
{code}
*this means memstoreSize > 0 (which also means it can't be <= 0 in 
internalFlushcache())*
# In the above code, requestFlush() is called each time before 
RegionTooBusyException is thrown; requestFlush()'s implementation is:
{code}
private void requestFlush() {
  if (this.rsServices == null) {
    return;
  }
  synchronized (writestate) {
    if (this.writestate.isFlushRequested()) {
      return;
    }
    writestate.flushRequested = true;
  }
  // Make request outside of synchronize block; HBASE-818.
  this.rsServices.getFlushRequester().requestFlush(this);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Flush requested on " + this);
  }
}
{code}
requestFlush is called as many times as RegionTooBusyException is thrown, but 
the "Flush requested on" log for this region is printed only once... so on 
later calls this method must return *before* the LOG.debug(). Since rsServices 
is not null, it must be because *this.writestate.isFlushRequested() is true* 
=> *writestate.flushRequested stays true* each time RegionTooBusyException is 
thrown
# The "Flush requested on" log is printed once for the problematic region, 
which means 'flushRequested' was set to true at some point. And 
HRegion.writestate.flushRequested is set back to false only in the code below 
(apart from region initialization)
{code}
public boolean flushcache() throws IOException {
  ...
  try {
boolean result = internalFlushcache(status);

if (coprocessorHost != null) {
  status.setStatus("Running post-flush coprocessor hooks");
  coprocessorHost.postFlush();
}

status.markComplete("Flush successful");
return result;
  } finally {
synchronized (writestate) {
  writestate.flushing = false;
  this.writestate.flushRequested = false;
  writestate.notifyAll();
}
  }
}
{code}
This means the 'try' block above is never entered; otherwise flushRequested 
would have been set to false in the finally block. The HRegion.flushcache 
method above is the critical path for a flush task coming out of flushQueue 
(the other places calling internalFlushcache are split-log replay and region 
close) ==> *internalFlushcache() is never called for this problematic region's 
flush task from flushQueue*. This also means the root cause is not 
'memstoreSize <= 0' (which is checked within internalFlushcache), so adding a 
log before that statement won't help diagnosis even if it reproduces next 
time :-)
# 'forcing flush of' / 'periodicFlusher requesting flush for' means 
requestFlush/requestDelayedFlush are called, they all d
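
For completeness, the consumer side that a stuck entry never reaches, 
paraphrased from the 0.98-era MemStoreFlusher flusher thread (names from 
memory, not verbatim):
{code}
// Paraphrase of the flusher-thread loop. A FlushRegionEntry only results in a
// real flush if it is polled here; an entry that never surfaces from the
// DelayQueue leaves its region "pending" in regionsInQueue forever.
while (!server.isStopped()) {
  FlushQueueEntry fqe = flushQueue.poll(threadWakeFrequency, TimeUnit.MILLISECONDS);
  if (fqe == null || fqe instanceof WakeupFlushThread) {
    // Woken by global memstore pressure (or timeout): maybe flush the
    // biggest region, then go back to polling.
    continue;
  }
  // flushRegion() removes the region from regionsInQueue and then calls
  // HRegion.flushcache(), which is the only path that resets
  // writestate.flushRequested back to false.
  flushRegion((FlushRegionEntry) fqe);
}
{code}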

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902679#comment-13902679
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.'hbase.hstore.flusher.count', if you configured it
It can be deduced from the log that this configuration is 1 (the default; it 
was not configured)


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902670#comment-13902670
 ] 

Feng Honghua commented on HBASE-10499:
--

[~ram_krish], could you provide the information below? Thanks.
# The (active) master log file covering '2014-02-11 08:40:00' to 
'2014-02-11 10:00:00'
# The start time (from the client-side log) when the client received 
'RegionTooBusyException'
# 'hbase.hstore.flusher.count', if you configured it


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901287#comment-13901287
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


bq.The problematic region can't be moved to another RS because a region needs 
to be flushed before moving out, but the problematic region can't be 
successfully flushed
Yes, that is the reason.  
I like the idea in your previous comment of using HBASE-9873.  It may be worth 
pursuing; it should be useful.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901286#comment-13901286
 ] 

Feng Honghua commented on HBASE-10499:
--

bq.I restarted another RS and there were region movements with other regions 
but this region stays with the RS that has this issue
The problematic region can't be moved to another RS because a region needs to 
be flushed before moving out, but the problematic region can't be successfully 
flushed... :-)


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901279#comment-13901279
 ] 

Feng Honghua commented on HBASE-10499:
--

Some additional thoughts triggered by this jira:

A single problematic region that fails to flush can by itself result in too many hlog files, since its edits are scattered across almost all hlog files. Even though other regions are flushed successfully and their edits in those hlog files become stale, the hlog files still can't be archived because they are 'polluted' by the edits of that single problematic region. 'Polluted' means most of the edits, say 99%, have been flushed to hfiles and are stale, but the file is still non-archivable because the remaining 1% of edits, from the problematic region, have not been flushed to an hfile and are still 'effective'.

The periodicFlusher can't help in this case, although by design it aims to eliminate 'polluted' hlog files caused by a region with sparse but steady writes (its writes are scattered across almost all hlog files, yet no flush is triggered because the region's total size never exceeds the flush threshold) by checking whether a region has gone unflushed for a long time. The reason is that the unflushed region can be a problematic region, which the periodicFlusher can't flush either, as this jira shows, so it can't eliminate the 'polluted' hlog files as intended. Too many hlog files in turn have further bad effects, such as more frequent unnecessary flush checks...

Once the number of hlog files exceeds the threshold that triggers forceful flushes, that number will never drop back below the threshold, no matter how frequently other regions are flushed, and no matter whether the periodicFlusher triggers flushes for other regions or for the problematic region, even after the problematic region eventually refuses/blocks writes... This is all because the older hlog files (once the count exceeds the threshold) all contain some edits from this problematic region.

But the background hlog compaction thread introduced by HBASE-9873 can help reduce the number of hlog files even when there are problematic un-flushable regions: it reads the original hlog files, keeps only the edits of the problematic regions (and of regions not yet flushed), writes them to new hlog files, and then archives the original hlog files, which reduces the hlog file count. Since a problematic region will refuse writes after some time (as this jira shows), the data left for it in the hlog files won't grow without bound, and such problematic regions won't have that bad a ripple effect on the whole system.
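
A minimal sketch of that filtering step, assuming hypothetical HLogReader/HLogWriter interfaces (this is an illustration of the idea, not the HBASE-9873 implementation):
{code}
import java.io.IOException;
import java.util.Set;

// Hypothetical stand-ins for WAL reader/writer APIs -- names are illustrative,
// not the real HBase interfaces.
interface HLogEntry { String getEncodedRegionName(); }
interface HLogReader { HLogEntry next() throws IOException; } // null at EOF
interface HLogWriter { void append(HLogEntry entry) throws IOException; }

class HLogCompactionSketch {
  /**
   * Copies only the still-'effective' edits -- those belonging to regions that
   * have not been flushed yet -- into a new hlog. Afterwards the original,
   * 'polluted' hlog holds no effective edits and can be archived.
   */
  static void compact(HLogReader reader, HLogWriter writer,
      Set<String> unflushedRegions) throws IOException {
    HLogEntry entry;
    while ((entry = reader.next()) != null) {
      if (unflushedRegions.contains(entry.getEncodedRegionName())) {
        writer.append(entry); // keep the ~1% that is still effective
      }
      // stale edits (already persisted in hfiles) are simply dropped
    }
  }
}
{code}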


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901269#comment-13901269
 ] 

Anoop Sam John commented on HBASE-10499:


{code}
2014-02-11 08:44:32,881 DEBUG [RpcServer.handler=1,port=60020] 
regionserver.HRegion: Flush requested on 
usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f.
{code}
Almost sure this is because of the flush request from batchMutate(), as we reached the flush size after doing the puts. That flush request itself was never carried through; there is no trace of this region even starting to flush. So a possibility is that memstoreSize got miscalculated somehow?!
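
For context, here is a hedged sketch of the accounting being described (simplified and illustrative, not the actual 0.98 HRegion code): the put path only requests a flush when the counter crosses the threshold, and the request is handled asynchronously, so a dropped request leaves no trace.
{code}
import java.util.concurrent.atomic.AtomicLong;

class FlushRequestSketch {
  final AtomicLong memstoreSize = new AtomicLong();
  final long memstoreFlushSize = 128L * 1024 * 1024; // a typical default

  void afterBatchMutate(long addedBytes) {
    long newSize = memstoreSize.addAndGet(addedBytes);
    if (newSize > memstoreFlushSize) {
      requestFlush(); // asynchronous: nothing is logged if it is never served
    }
  }

  void requestFlush() {
    // enqueue this region for the background flusher thread (illustrative)
  }
}
{code}
Under this model, if memstoreSize were ever decremented incorrectly, the flusher could later observe a value at or below zero and decline to flush, which would match the silence in the logs.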


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901260#comment-13901260
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


That is where I suspect that the memstoreSize is being calculated wrongly. Once a flush is requested, the flusher thread would try to handle it, and if it is not able to flush it just ignores the request; as you can see, there are no logs in that flow. Again, it is a guess.
Or is there some other reason why the region is not flushed from the queue? Probably not possible.
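
A hedged sketch of that silent-ignore path (simplified; not the actual MemStoreFlusher code):
{code}
import java.util.concurrent.BlockingQueue;

class FlusherLoopSketch {
  interface FlushRequest {
    long pendingMemstoreBytes();
    void flush();
  }

  void run(BlockingQueue<FlushRequest> queue) throws InterruptedException {
    while (true) {
      FlushRequest req = queue.take();
      if (req.pendingMemstoreBytes() <= 0) {
        continue; // request dropped without any log line -- the silent path
      }
      req.flush();
    }
  }
}
{code}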


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901256#comment-13901256
 ] 

Feng Honghua commented on HBASE-10499:
--

Thanks for checking. Yes, I noticed that regions are selected and forced to flush due to too many hlog files, but that only confirms the write load is really heavy. The problematic region was also asked to flush by the periodicFlusher, since it contained old-enough edits, which further confirms this region had not been flushed for quite a long time. :-(
The question is why a flush is triggered but never completes...


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-13 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901181#comment-13901181
 ] 

Feng Honghua commented on HBASE-10499:
--

bq. Because we can see that the region is requested for a flush but it does not happen and no logs related to flush are printed in the logs. so due to some reason this memstore.size() has become 0 (I assume this)
A reasonable guess, but it only explains why no flush happens, not why RegionTooBusyException is thrown by this same region (it should be the same region, right?). In fact, 'memstoreSize.get() <= 0' contradicts the condition under which RegionTooBusyException is thrown. The exception is thrown here (the other place is when locking the region for write fails), and it's impossible for both "memstoreSize.get() <= 0" and "this.memstoreSize.get() > this.blockingMemStoreSize" to hold at the same time...
{code}
if (this.memstoreSize.get() > this.blockingMemStoreSize) {
  requestFlush();
  throw new RegionTooBusyException("Above memstore limit, "
      + /* ... */ "memstoreSize=" + memstoreSize.get() /* ... */);
}
{code}
It would help if you could attach the detailed client-side log for the RegionTooBusyException stack/info; it would be great if it includes the region info and memstoreSize. :-)


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900447#comment-13900447
 ] 

Hudson commented on HBASE-10499:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #143 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/143/])
HBASE-10499-In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException- Adding a log alone (ramkrishna: rev 1567894)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900428#comment-13900428
 ] 

Hudson commented on HBASE-10499:


SUCCESS: Integrated in HBase-0.98 #155 (See 
[https://builds.apache.org/job/HBase-0.98/155/])
HBASE-10499-In write heavy scenario one of the regions does not get flushed 
causing RegionTooBusyException- Adding a log alone (ramkrishna: rev 1567894)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java



[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900253#comment-13900253
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


Committed to 0.98 alone. Will try reproducing this.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898838#comment-13898838
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


Not able to reproduce this again.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898030#comment-13898030
 ] 

Ted Yu commented on HBASE-10499:


bq. I have logs and thread dumps taken during this time

Please attach them.

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897787#comment-13897787
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


In HBASE-5568 and HBASE-5312 there were multiple flushes on the same region, and region splits were happening. Here no such thing happens: the region in question has not been flushed even once, and no splits or compactions have happened on it.


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897780#comment-13897780
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


I have logs and thread dumps taken during this time. If needed, I can attach them here.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1
>

[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897779#comment-13897779
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


I am not sure whether this could occur in 0.96 and trunk as well. I think it is 
possible in 0.96, but with the recent HLog disruptor changes in trunk I am not 
sure. It may also be possible in 0.94.
I don't have a solution in hand yet, except adding log messages for the case 
where memstoreSize could be zero. I will check more on this.
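
Something like the sketch below is what I mean. This is only an illustration, not the eventual patch; Region, name() and memstoreSizeBytes() are stand-ins for whatever the real flush-request path sees. The point is just to log the suspicious state (flush wanted, memstore reports zero) instead of dropping it silently:

{code}
// Illustrative sketch only -- not the actual HBase code path.
final class FlushDiagnostics {

  interface Region {                 // minimal stand-in for HRegion
    String name();
    long memstoreSizeBytes();
  }

  /** Returns true if the flush request should proceed. */
  static boolean checkBeforeFlush(Region r) {
    long size = r.memstoreSizeBytes();
    if (size <= 0) {
      // The state suspected in this issue: the log roller keeps asking
      // for a flush, but the region claims an empty memstore, so nothing
      // is ever written out and the old HLogs stay pinned.
      System.err.println("WARN: flush requested for region " + r.name()
          + " but memstoreSize=" + size + "; region may never flush");
      return false;
    }
    return true;
  }
}
{code}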

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.0, 0.98.1
>