[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Marc Spaggiari updated HBASE-18451:
----------------------------------------
    Description: 
If you run a big job every 4 hours, impacting many tables (they have 150 
regions per server), ad the end all the regions might have some data to be 
flushed, and we want, after one hour, trigger a periodic flush. That's totally 
fine.

Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
"randomDelay" to the delayed flush, that way we spread them away.

RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, 
which is very good.

However, because we don't check if there is already a request in the queue, 10 
seconds after, we create a new request, with a new randomDelay.

If you generate a randomDelay every 10 seconds, at some point, you will end up 
having a small one, and the flush will be triggered almost immediatly.

As a result, instead of spreading all the flush within the next 5 minutes, you 
end-up getting them all way more quickly. Like within the first minute. Which 
not only feed the queue to to many flush requests, but also defeats the purpose 
of the randomDelay.

{code}
    @Override
    protected void chore() {
      final StringBuffer whyFlush = new StringBuffer();
      for (Region r : this.server.onlineRegions.values()) {
        if (r == null) continue;
        if (((HRegion)r).shouldFlush(whyFlush)) {
          FlushRequester requester = server.getFlushRequester();
          if (requester != null) {
            long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + 
MIN_DELAY_TIME;
            LOG.info(getName() + " requesting flush of " +
              r.getRegionInfo().getRegionNameAsString() + " because " +
              whyFlush.toString() +
              " after random delay " + randomDelay + "ms");
            //Throttle the flushes by putting a delay. If we don't throttle, 
and there
            //is a balanced write-load on the regions in a table, we might end 
up
            //overwhelming the filesystem with too many flushes at once.
            requester.requestDelayedFlush(r, randomDelay, false);
          }
        }
      }
    }
{code}


{code}
2017-07-24 18:44:33,338 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 270785ms
2017-07-24 18:44:43,328 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 200143ms
2017-07-24 18:44:53,954 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 191082ms
2017-07-24 18:45:03,528 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 92532ms
2017-07-24 18:45:14,201 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 238780ms
2017-07-24 18:45:24,195 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 35390ms
2017-07-24 18:45:33,362 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 283034ms
2017-07-24 18:45:43,933 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 84328ms
2017-07-24 18:45:53,866 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 72291ms
2017-07-24 18:46:03,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 62658ms
2017-07-24 18:46:14,084 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 87491ms
2017-07-24 18:46:23,538 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 290170ms
2017-07-24 18:46:33,353 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 228541ms
2017-07-24 18:46:43,359 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 211814ms
2017-07-24 18:46:54,150 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 155167ms
2017-07-24 18:47:03,546 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 115318ms
2017-07-24 18:47:13,350 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 216662ms
2017-07-24 18:47:23,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 96018ms
2017-07-24 18:47:33,891 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 144751ms
2017-07-24 18:47:43,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 152461ms
2017-07-24 18:47:53,832 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 97352ms
2017-07-24 18:48:03,348 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 263910ms
2017-07-24 18:48:13,353 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 151793ms
2017-07-24 18:48:23,340 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 61926ms
2017-07-24 18:48:33,724 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 188901ms
2017-07-24 18:48:44,038 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 233148ms
2017-07-24 18:48:53,624 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 6206ms
2017-07-24 18:49:03,378 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 240946ms
2017-07-24 18:49:04,125 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Flushing 1/1 column families, memstore=160 B
2017-07-24 18:49:04,182 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=6, memsize=160, hasBloomFilter=true, into tmp file 
hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/.tmp/3609340d58ae4ca190e62c3b0fe415f4
2017-07-24 18:49:04,208 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/f/3609340d58ae4ca190e62c3b0fe415f4,
 entries=1, sequenceid=6, filesize=1015
2017-07-24 18:49:04,212 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~160 B/160, currentsize=0 B/0 for region 
testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. in 87ms, 
sequenceid=6, compaction requested=false
{code}

  was:
If you run a big job every 4 hours, impacting many tables (they have 150 
regions per server), ad the end all the regions might have some data to be 
flushed, and we want, after one hour, trigger a periodic flush. That's totally 
fine.

Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
"randomDelay" to the delayed flush, that way we spread them away.

RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, 
which is very good.

However, because we don't check if there is already a request in the queue, 10 
seconds after, we create a new request, with a new randomDelay.

If you generate a randomDelay every 10 seconds, at some point, you will end up 
having a small one, and the flush will be triggered almost immediatly.

As a result, instead of spreading all the flush within the next 5 minutes, you 
end-up getting them all way more quickly. Like within the first minute. Which 
defeats the purpose of the randomDelay.

{code}
    @Override
    protected void chore() {
      final StringBuffer whyFlush = new StringBuffer();
      for (Region r : this.server.onlineRegions.values()) {
        if (r == null) continue;
        if (((HRegion)r).shouldFlush(whyFlush)) {
          FlushRequester requester = server.getFlushRequester();
          if (requester != null) {
            long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + 
MIN_DELAY_TIME;
            LOG.info(getName() + " requesting flush of " +
              r.getRegionInfo().getRegionNameAsString() + " because " +
              whyFlush.toString() +
              " after random delay " + randomDelay + "ms");
            //Throttle the flushes by putting a delay. If we don't throttle, 
and there
            //is a balanced write-load on the regions in a table, we might end 
up
            //overwhelming the filesystem with too many flushes at once.
            requester.requestDelayedFlush(r, randomDelay, false);
          }
        }
      }
    }
{code}


{code}
2017-07-24 18:44:33,338 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 270785ms
2017-07-24 18:44:43,328 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 200143ms
2017-07-24 18:44:53,954 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 191082ms
2017-07-24 18:45:03,528 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 92532ms
2017-07-24 18:45:14,201 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 238780ms
2017-07-24 18:45:24,195 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 35390ms
2017-07-24 18:45:33,362 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 283034ms
2017-07-24 18:45:43,933 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 84328ms
2017-07-24 18:45:53,866 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 72291ms
2017-07-24 18:46:03,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 62658ms
2017-07-24 18:46:14,084 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 87491ms
2017-07-24 18:46:23,538 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 290170ms
2017-07-24 18:46:33,353 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 228541ms
2017-07-24 18:46:43,359 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 211814ms
2017-07-24 18:46:54,150 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 155167ms
2017-07-24 18:47:03,546 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 115318ms
2017-07-24 18:47:13,350 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 216662ms
2017-07-24 18:47:23,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 96018ms
2017-07-24 18:47:33,891 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 144751ms
2017-07-24 18:47:43,329 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 152461ms
2017-07-24 18:47:53,832 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 97352ms
2017-07-24 18:48:03,348 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 263910ms
2017-07-24 18:48:13,353 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 151793ms
2017-07-24 18:48:23,340 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 61926ms
2017-07-24 18:48:33,724 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 188901ms
2017-07-24 18:48:44,038 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 233148ms
2017-07-24 18:48:53,624 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 6206ms
2017-07-24 18:49:03,378 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
has an old edit so flush to free WALs after random delay 240946ms
2017-07-24 18:49:04,125 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Flushing 1/1 column families, memstore=160 B
2017-07-24 18:49:04,182 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=6, memsize=160, hasBloomFilter=true, into tmp file 
hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/.tmp/3609340d58ae4ca190e62c3b0fe415f4
2017-07-24 18:49:04,208 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/f/3609340d58ae4ca190e62c3b0fe415f4,
 entries=1, sequenceid=6, filesize=1015
2017-07-24 18:49:04,212 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~160 B/160, currentsize=0 B/0 for region 
testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. in 87ms, 
sequenceid=6, compaction requested=false
{code}


> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-18451
>                 URL: https://issues.apache.org/jira/browse/HBASE-18451
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.0.0-alpha-1
>            Reporter: Jean-Marc Spaggiari
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), ad the end all the regions might have some data to be 
> flushed, and we want, after one hour, trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, that way we spread them away.
> RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, 
> which is very good.
> However, because we don't check if there is already a request in the queue, 
> 10 seconds after, we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point, you will end 
> up having a small one, and the flush will be triggered almost immediatly.
> As a result, instead of spreading all the flush within the next 5 minutes, 
> you end-up getting them all way more quickly. Like within the first minute. 
> Which not only feed the queue to to many flush requests, but also defeats the 
> purpose of the randomDelay.
> {code}
>     @Override
>     protected void chore() {
>       final StringBuffer whyFlush = new StringBuffer();
>       for (Region r : this.server.onlineRegions.values()) {
>         if (r == null) continue;
>         if (((HRegion)r).shouldFlush(whyFlush)) {
>           FlushRequester requester = server.getFlushRequester();
>           if (requester != null) {
>             long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + 
> MIN_DELAY_TIME;
>             LOG.info(getName() + " requesting flush of " +
>               r.getRegionInfo().getRegionNameAsString() + " because " +
>               whyFlush.toString() +
>               " after random delay " + randomDelay + "ms");
>             //Throttle the flushes by putting a delay. If we don't throttle, 
> and there
>             //is a balanced write-load on the regions in a table, we might 
> end up
>             //overwhelming the filesystem with too many flushes at once.
>             requester.requestDelayedFlush(r, randomDelay, false);
>           }
>         }
>       }
>     }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 283034ms
> 2017-07-24 18:45:43,933 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 84328ms
> 2017-07-24 18:45:53,866 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 72291ms
> 2017-07-24 18:46:03,329 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 62658ms
> 2017-07-24 18:46:14,084 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 87491ms
> 2017-07-24 18:46:23,538 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 290170ms
> 2017-07-24 18:46:33,353 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 228541ms
> 2017-07-24 18:46:43,359 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 211814ms
> 2017-07-24 18:46:54,150 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 155167ms
> 2017-07-24 18:47:03,546 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 115318ms
> 2017-07-24 18:47:13,350 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 216662ms
> 2017-07-24 18:47:23,329 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 96018ms
> 2017-07-24 18:47:33,891 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 144751ms
> 2017-07-24 18:47:43,329 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 152461ms
> 2017-07-24 18:47:53,832 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 97352ms
> 2017-07-24 18:48:03,348 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 263910ms
> 2017-07-24 18:48:13,353 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 151793ms
> 2017-07-24 18:48:23,340 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 61926ms
> 2017-07-24 18:48:33,724 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 188901ms
> 2017-07-24 18:48:44,038 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 233148ms
> 2017-07-24 18:48:53,624 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 6206ms
> 2017-07-24 18:49:03,378 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 240946ms
> 2017-07-24 18:49:04,125 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Flushing 1/1 column families, memstore=160 B
> 2017-07-24 18:49:04,182 INFO 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
> sequenceid=6, memsize=160, hasBloomFilter=true, into tmp file 
> hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/.tmp/3609340d58ae4ca190e62c3b0fe415f4
> 2017-07-24 18:49:04,208 INFO org.apache.hadoop.hbase.regionserver.HStore: 
> Added 
> hdfs://hbasetest1.domainname.com:8020/hbase/data/default/testflush/578c27d2eb7ef0ad437bf2ff38c053ae/f/3609340d58ae4ca190e62c3b0fe415f4,
>  entries=1, sequenceid=6, filesize=1015
> 2017-07-24 18:49:04,212 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished memstore flush of ~160 B/160, currentsize=0 B/0 for region 
> testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. in 87ms, 
> sequenceid=6, compaction requested=false
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to