[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---
Assignee: Ted Yu  (was: ramkrishna.s.vasudevan)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
> Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 
> 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 
> 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [

Re: [jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread ramkrishna vasudevan
+1 on patch
On Jan 22, 2015 9:12 PM, "Ted Yu (JIRA)"  wrote:
>
>
>  [
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
>
> Ted Yu updated HBASE-10499:
> ---
> Attachment: 10499-v6.txt
>
> > In write heavy scenario one of the regions does not get flushed causing
RegionTooBusyException
> >
--
> >
> > Key: HBASE-10499
> > URL: https://issues.apache.org/jira/browse/HBASE-10499
> > Project: HBase
> >  Issue Type: Bug
> >Affects Versions: 0.98.0
> >Reporter: ramkrishna.s.vasudevan
> >Assignee: ramkrishna.s.vasudevan
> >Priority: Critical
> > Fix For: 2.0.0, 1.1.0
> >
> > Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt,
10499-v6.txt, HBASE-10499.patch, HBASE-10499_v5.patch,
compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip,
hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log,
master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump,
workloada_0.98.dat
> >
> >
> > I got this while testing 0.98RC.  But am not sure if it is specific to
this version.  Doesn't seem so to me.
> > Also it is something similar to HBASE-5312 and HBASE-5568.
> > Using 10 threads i do writes to 4 RS using YCSB. The table created has
200 regions.  In one of the run with 0.98 server and 0.98 client I faced
this problem like the hlogs became more and the system requested flushes
for those many regions.
> > One by one everything was flushed except one and that one thing
remained unflushed.  The ripple effect of this on the client side
> > {code}
> > com.yahoo.ycsb.DBException:
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
54 actions: RegionTooBusyException: 54 times,
> > at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> > at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> > at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> > Caused by:
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
54 actions: RegionTooBusyException: 54 times,
> > at
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> > at
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> > at
org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> > at
org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> > at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> > at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> > ... 2 more
> > {code}
> > On one of the RS
> > {code}
> > 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog:
Too many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s):
97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2,
01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e,
127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253,
29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463,
acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84,
0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e,
d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69,
0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f,
bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b,
cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e,
6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1,
acc43e4b42c1a041078774f4f20a3ff5
> > ..
> > 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog:
Too many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s):
fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> > {code}
> > {code}
> > 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher]
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
flush for region
usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a
delay of 16689
> > 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher]
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
flush for region
usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a
delay of 15868
> > 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher]
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
flush for region
usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a
delay of 20847
> > 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher]
regionserver.HRegionServer: regionserver60020.periodicFlusher requesting
flus

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Fix Version/s: 0.98.10
   1.0.0

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
> Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 
> 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 
> 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver6002

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-0.98.txt

Patch for 0.98 adds the TestFlushRegionEntry which was absent in 0.98

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-0.98.txt, 10499-1.0.txt, 10499-v2.txt, 
> 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 
> 10499-v8.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-1.0.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-1.0.txt, 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 
> 10499-v6.txt, 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch, 
> HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regi

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch, 
> HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver6

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v8.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 
> 10499-v6.txt, 10499-v7.txt, 10499-v8.txt, HBASE-10499.patch, 
> HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.pe

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v7.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 
> 10499-v6.txt, 10499-v7.txt, HBASE-10499.patch, HBASE-10499_v5.patch, 
> compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher req

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v6.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 
> 10499-v6.txt, HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flu

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-22 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v6.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 10499-v6.txt, 
> HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---
Attachment: HBASE-10499_v5.patch

Going thro the comments again and the excellent analysis by [~fenghh] and 
[~johnking0099] I can see that the log that comes out of global heap pressure
{code}
2014-02-11 08:44:30,580 INFO  [Thread-18] regionserver.MemStoreFlusher: Flush 
of region usertable,user1854,1392107806972.e001ae4de867efc4c4c092e7e5a204e1. 
due to global heap pressure
{code}
Does not appear for the problematic regions.  So this would ensure that we have 
not entered the removal from FQE for these regions.
{code}
  if (fqe != null && emergencyFlush) {
// Need to remove from region from delay queue.  When NOT an
// emergencyFlush, then item was removed via a flushQueue.poll.
flushQueue.remove(fqe);
 }
{code}
which means they had been removed from FQE but not from regionQueue in such a 
way that FQE was wrongly mapped to an element which was not removed from the 
regionQueue.  So if region B had to be flushed and it was removed from 
regionQueue, but from FQE region A got removed.
So the delay is calculated 
{code}
@Override
public long getDelay(TimeUnit unit) {
  return unit.convert(this.whenToExpire - 
EnvironmentEdgeManager.currentTime(),
  TimeUnit.MILLISECONDS);
}
{code}
where whenToExpire is updated in two places, creation of the FRE which is 
actually synchronized with the regionQueue.  So Ideally we would not have two 
regions with the same whenToExpire (considering there is no precision issue)  
and in requeue().   The requeue() could update the whenToExpire to the same ts. 
But here it has not happened.  
If we are going to change the implementation of the equals() in FQE I would 
suggest compare if the region is the same and then go for the getDelay check in 
the compareTo()

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 
> HBASE-10499.patch, HBASE-10499_v5.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v4.txt

@John:
I get your point :-)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, 10499-v4.txt, 
> HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> user

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v3.txt

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, 10499-v3.txt, HBASE-10499.patch, 
> compaction-queue.png, hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Status: Patch Available  (was: Open)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e14

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2015-01-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10499:
---
Attachment: 10499-v2.txt

Tentative patch which uses region's hashcode as tie-breaker.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10499-v2.txt, HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> us

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-12-17 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10499:
--
Fix Version/s: (was: 1.0.0)
   1.1.0

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-12-02 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10499:
--
Fix Version/s: (was: 0.99.2)
   2.0.0
   1.0.0

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 1.0.0, 2.0.0
>
> Attachments: HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> userta

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-10-11 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10499:
--
Fix Version/s: (was: 0.99.1)
   0.99.2

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.2
>
> Attachments: HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b7

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-09-09 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10499:
--
Fix Version/s: (was: 0.99.0)
   0.99.1

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.1
>
> Attachments: HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b7

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-07-28 Thread Matt Kapilevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kapilevich updated HBASE-10499:


Attachment: compaction-queue.png

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.0
>
> Attachments: HBASE-10499.patch, compaction-queue.png, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4c

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-05-21 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10499:
---

Fix Version/s: (was: 0.98.3)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.0
>
> Attachments: HBASE-10499.patch, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> del

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-04-30 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10499:
---

Fix Version/s: (was: 0.98.2)
   0.98.3

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.0, 0.98.3
>
> Attachments: HBASE-10499.patch, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503dd

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-03-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10499:
---

Fix Version/s: (was: 0.98.1)
   0.98.2

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.99.0, 0.98.2
>
> Attachments: HBASE-10499.patch, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503dd

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-17 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10499:
-

Attachment: rs_576f.log
rs_4e39.log
master_576f.log
master_4e39.log

log for 6b788c498503ddd3e1433a4cd3fb4e39 in regionserver and master, log for 
fdbb3242d3b673bbe4790a47bc30576f in regionserver and master are attached for 
reference.

You can find the similarity(before master balance) and difference(after master 
balance) between these two regions.

6b788c498503ddd3e1433a4cd3fb4e39 would be the problematic region if it wasn't 
chosen as a move target when master did balance...

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1, 0.99.0
>
> Attachments: HBASE-10499.patch, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
> master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 201

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-16 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Attachment: hbase-root-master-ip-10-157-0-229.zip

Master logs that you asked for [~fenghh].  For the problematic region i could 
not find anything specific there.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1, 0.99.0
>
> Attachments: HBASE-10499.patch, 
> hbase-root-master-ip-10-157-0-229.zip, 
> hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-14 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Attachment: workloada_0.98.dat

Check this .. I don't have the client logs. I have only the exception that i 
got on the client side.
If you see the logs attached on the server you could see that the no of hlog 
was greater than 32.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1, 0.99.0
>
> Attachments: HBASE-10499.patch, 
> hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump, 
> workloada_0.98.dat
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlus

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10499:
---

Fix Version/s: 0.99.0

+1 for adding logging for this case. Let's get it in, rerun the test, and see 
if we can reproduce this issue.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1, 0.99.0
>
> Attachments: HBASE-10499.patch, 
> hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> 2014-

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Attachment: HBASE-10499.patch

Adding a patch that adds a log in case of memstore size is 0.

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1
>
> Attachments: HBASE-10499.patch, 
> hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, t2.dump
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> 2014-02-11 09:43:04,238 INFO  [regions

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Attachment: hbase-root-regionserver-ip-10-93-128-92.zip
t2.dump
t1.dump

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1
>
> Attachments: hbase-root-regionserver-ip-10-93-128-92.zip, t1.dump, 
> t2.dump
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> 2014-02-11 09:43:04,238 INFO  [regionserver600

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Fix Version/s: (was: 0.98.0)

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> 2014-02-11 09:43:04,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d

[jira] [Updated] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-11 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10499:
---

Affects Version/s: 0.98.0

> In write heavy scenario one of the regions does not get flushed causing 
> RegionTooBusyException
> --
>
> Key: HBASE-10499
> URL: https://issues.apache.org/jira/browse/HBASE-10499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.98.1
>
>
> I got this while testing 0.98RC.  But am not sure if it is specific to this 
> version.  Doesn't seem so to me.  
> Also it is something similar to HBASE-5312 and HBASE-5568.
> Using 10 threads i do writes to 4 RS using YCSB. The table created has 200 
> regions.  In one of the run with 0.98 server and 0.98 client I faced this 
> problem like the hlogs became more and the system requested flushes for those 
> many regions.
> One by one everything was flushed except one and that one thing remained 
> unflushed.  The ripple effect of this on the client side
> {code}
> com.yahoo.ycsb.DBException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
> at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
> at com.yahoo.ycsb.ClientThread.run(Client.java:307)
> Caused by: 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
> 54 actions: RegionTooBusyException: 54 times,
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
> at 
> org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
> at 
> org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
> at 
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
> at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
> ... 2 more
> {code}
> On one of the RS
> {code}
> 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
> 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
> 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
> 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
> 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
> acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
> 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
> d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
> 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
> bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
> cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
> 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
> acc43e4b42c1a041078774f4f20a3ff5
> ..
> 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
> many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
> fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
> {code}
> {code}
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 16689
> 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 15868
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
> delay of 20847
> 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
> delay of 20099
> 2014-02-11 09:43:04,238 INFO  [regionserver60020.periodicFlusher] 
> regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
> flush for region 
> usertable,user3654,1392107806977.fdbb3242d3b673bb