[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables
[ https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104582#comment-16104582 ] Zheng Hu commented on HBASE-16466: -- [~sukuna...@gmail.com], I have one question about the test you provided. Did you run the MR job on the same HDFS cluster for the source/peer HBase clusters and the YARN cluster? It seems that when the source HBase cluster, peer HBase cluster, and YARN cluster are located on three different HDFS clusters, there is a problem. When restoring the snapshot into tmpdir, we create the region with the following code (HRegion#createHRegion):
{code}
public static HRegion createHRegion(final HRegionInfo info, final Path rootDir,
    final Configuration conf, final TableDescriptor hTableDescriptor,
    final WAL wal, final boolean initialize) throws IOException {
  LOG.info("creating HRegion " + info.getTable().getNameAsString() + " HTD == " + hTableDescriptor
      + " RootDir = " + rootDir + " Table name == " + info.getTable().getNameAsString());
  FileSystem fs = FileSystem.get(conf); // <--- Here the code uses the fs.defaultFS configuration to create the region.
  Path tableDir = FSUtils.getTableDir(rootDir, info.getTable());
  HRegionFileSystem.createRegionOnFileSystem(conf, fs, tableDir, info);
  HRegion region = HRegion.newHRegion(tableDir, wal, fs, conf, info, hTableDescriptor, null);
  if (initialize) region.initialize(null);
  return region;
}
{code}
When the source cluster and peer cluster are located on two different file systems, their fs.defaultFS values differ, so at least one side will fail when restoring its snapshot into tmpdir. After I applied the following fix, it works fine for me:
{code}
- FileSystem fs = FileSystem.get(conf);
+ FileSystem fs = rootDir.getFileSystem(conf);
{code}
Looking forward to your reply. Thanks.
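The difference can be sketched without Hadoop on the classpath. A minimal plain-Java illustration (class, method names, and cluster addresses below are mine for the demo, not HBase API) of why resolving the filesystem from the path's own URI fixes the multi-cluster case: `FileSystem.get(conf)` always answers with the filesystem named by fs.defaultFS, while `rootDir.getFileSystem(conf)` resolves from the path's scheme and authority.

```java
import java.net.URI;

public class FsResolution {
    // Pretend default filesystem for this process (assumption for the demo).
    static final String DEFAULT_FS = "hdfs://yarn-cluster:8020";

    // Mimics FileSystem.get(conf): ignores the path, returns the default FS.
    static String resolveFromDefault(String path) {
        return DEFAULT_FS;
    }

    // Mimics rootDir.getFileSystem(conf): uses the path's scheme://authority.
    static String resolveFromPath(String path) {
        URI u = URI.create(path);
        return u.getScheme() + "://" + u.getAuthority();
    }

    public static void main(String[] args) {
        String sourceRoot = "hdfs://source-cluster:8020/hbase/.tmpdir";
        // The default-FS lookup points at the wrong cluster entirely,
        // so the restore would create region files on the YARN cluster's HDFS.
        System.out.println(resolveFromDefault(sourceRoot)); // hdfs://yarn-cluster:8020
        System.out.println(resolveFromPath(sourceRoot));    // hdfs://source-cluster:8020
    }
}
```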
> HBase snapshots support in VerifyReplication tool to reduce load on live > HBase cluster with large tables > > > Key: HBASE-16466 > URL: https://issues.apache.org/jira/browse/HBASE-16466 > Project: HBase > Issue Type: Improvement > Components: hbase >Affects Versions: 0.98.21 >Reporter: Sukumar Maddineni >Assignee: Maddineni Sukumar > Fix For: 2.0.0 > > Attachments: HBASE-16466.branch-1.3.001.patch, HBASE-16466.v1.patch, > HBASE-16466.v2.patch, HBASE-16466.v3.patch, HBASE-16466.v4.patch, > HBASE-16466.v5.patch > > > As of now the VerifyReplication tool runs using normal HBase scanners. If > you want to run VerifyReplication multiple times on a live production > cluster with large tables, it creates extra load on the HBase layer. If we > implement snapshot-based support, then on both source and target we can read > data from snapshots, which reduces the load on HBase. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18469) Correct RegionServer metric of totalRequestCount
Shibin Zhang created HBASE-18469: Summary: Correct RegionServer metric of totalRequestCount Key: HBASE-18469 URL: https://issues.apache.org/jira/browse/HBASE-18469 Project: HBase Issue Type: Bug Affects Versions: 1.2.0 Reporter: Shibin Zhang Priority: Minor When I fetched the metrics, I found these three metrics appear inconsistent: "totalRequestCount" : 17541, "readRequestCount" : 17483, "writeRequestCount" : 1633, -- This message was sent by Atlassian JIRA (v6.4.14#64029)
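The inconsistency the report points at is visible by simple arithmetic on the values shown (a throwaway check, not HBase code; whether totalRequestCount should bound read + write depends on how multi-action RPCs are counted):

```java
public class MetricSanity {
    public static void main(String[] args) {
        long total = 17541, reads = 17483, writes = 1633;
        long readPlusWrite = reads + writes; // 19116
        // The sum of read and write request counts exceeds the total request
        // count, which is the discrepancy behind this report.
        System.out.println(readPlusWrite + " > " + total + " : " + (readPlusWrite > total));
    }
}
```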
[jira] [Updated] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Singh Chouhan updated HBASE-15134: --- Attachment: HBASE-15134.branch-1.001.patch Fixed trailing whitespaces. > Add visibility into Flush and Compaction queues > --- > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 > Project: HBase > Issue Type: New Feature > Components: Compaction, metrics, regionserver >Reporter: Elliott Clark >Assignee: Abhishek Singh Chouhan > Fix For: 2.0.0 > > Attachments: HBASE-15134.branch-1.001.patch, > HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, > HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, > HBASE-15134.patch, HBASE-15134.patch > > > On busy spurts we can see regionservers start to see large queues for > compaction. It's really hard to tell if the server is queueing a lot of > compactions for the same region, lots of compactions for lots of regions, or > just falling behind. > For flushes much the same. There can be flushes in the queue that aren't being > run because of delayed flushes. There's no way to know from the metrics how > many flushes are for each region, or how many are delayed. > We should either add more metrics around this (num per region, max per > region, min per region) or add a UI page that has the list of compactions > and flushes. > Or both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
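The kind of visibility the issue asks for can be sketched in a few lines (illustrative plain Java, not HBase code; the class and method names are mine): given the regions of the queued compaction requests, derive per-region counts so that "many compactions for one region" vs "one compaction each for many regions" becomes distinguishable from the metrics.

```java
import java.util.*;
import java.util.stream.*;

public class QueueVisibility {
    // Count queued compaction requests per region name.
    static Map<String, Long> perRegionCounts(List<String> queuedRegions) {
        return queuedRegions.stream()
            .collect(Collectors.groupingBy(r -> r, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> queue = Arrays.asList("r1", "r1", "r1", "r2");
        Map<String, Long> counts = perRegionCounts(queue);
        System.out.println(counts);                           // per-region counts (order may vary)
        System.out.println(Collections.max(counts.values())); // max per region
    }
}
```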
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104606#comment-16104606 ] Abhishek Singh Chouhan commented on HBASE-15134: Pushed to branch-1.4+. > Add visibility into Flush and Compaction queues > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104608#comment-16104608 ] Hadoop QA commented on HBASE-15134: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} HBASE-15134 does not apply to branch-1. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-15134 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879310/HBASE-15134.branch-1.001.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7823/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated. > Add visibility into Flush and Compaction queues > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-18451 started by nihed mbarek. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want, after one hour, to trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush; that way we spread them out. > RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check whether there is already a request in the queue, > 10 seconds later we create a new request with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point you will end > up with a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes over the next 5 minutes, > you end up getting them all much more quickly, like within the first minute. > This not only floods the queue with too many flush requests but also defeats the > purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae.
> {code}
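The collapse described in the report can be reproduced with a small standalone simulation (plain Java; my own approximation of the chore's behavior, not HBase code): every 10 seconds a fresh random delay is drawn and a new delayed request is queued, so the flush effectively fires at the earliest scheduled request, i.e. roughly the minimum of many draws rather than one uniform draw over the full 5-minute range.

```java
import java.util.Random;

public class FlushDelayDemo {
    static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // 5 minutes, as in the chore
    static final int CHORE_PERIOD_MS = 10_000;          // chore wakes every 10 s

    // Simulate one region: each chore tick queues another delayed request with
    // a fresh random delay; the earliest scheduled request wins.
    static long effectiveFlushTime(Random rnd) {
        long earliest = Long.MAX_VALUE;
        for (long now = 0; now < RANGE_OF_DELAY_MS; now += CHORE_PERIOD_MS) {
            long delay = rnd.nextInt(RANGE_OF_DELAY_MS);
            earliest = Math.min(earliest, now + delay);
            if (earliest <= now) break; // a queued request has already fired
        }
        return earliest;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long sum = 0;
        int runs = 1000;
        for (int i = 0; i < runs; i++) sum += effectiveFlushTime(rnd);
        // A single draw would average ~150 s; with a re-draw every 10 s the
        // effective flush time lands far earlier, as the report describes.
        System.out.println("mean effective flush time (ms): " + (sum / runs));
    }
}
```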
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch Patch provided, with a refactor so that requestDelayedFlush and requestFlush return a boolean. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451
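The idea behind such a fix can be sketched as follows (hypothetical code, not the attached patch; the class name and bookkeeping are mine): have the requester report via a boolean whether a request was actually enqueued, and skip regions that already have a pending delayed flush so the original random delay is preserved.

```java
import java.util.HashSet;
import java.util.Set;

public class DedupFlushRequester {
    private final Set<String> pending = new HashSet<>();

    // Returns true only if a new delayed flush was queued for this region;
    // a region already in the queue keeps its first random delay.
    public synchronized boolean requestDelayedFlush(String regionName, long delayMs) {
        if (!pending.add(regionName)) {
            return false; // already queued: do not re-draw the delay
        }
        // ...schedule the real flush after delayMs; on completion call done()
        return true;
    }

    // Called when the region's flush has run, allowing future requests.
    public synchronized void done(String regionName) {
        pending.remove(regionName);
    }
}
```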
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: In Progress) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451
[jira] [Updated] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Singh Chouhan updated HBASE-15134: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: (was: 2.0.0) 2.0.0-alpha-2 1.5.0 1.4.0 3.0.0 Status: Resolved (was: Patch Available) > Add visibility into Flush and Compaction queues > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104606#comment-16104606 ] Abhishek Singh Chouhan edited comment on HBASE-15134 at 7/28/17 8:08 AM: - Pushed to branch-1.4+. Thanks for the reviews !! was (Author: abhishek.chouhan): Pushed to branch-1.4+. > Add visibility into Flush and Compaction queues > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104618#comment-16104618 ] Hadoop QA commented on HBASE-18451: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-18451 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879311/0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7824/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104623#comment-16104623 ] Abhishek Singh Chouhan commented on HBASE-15134: Hadoop QA ran after the whitespace-fixed patch had been reattached and already pushed to the branch, hence the error. > Add visibility into Flush and Compaction queues > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Open (was: Patch Available) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: > 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want, after one hour, to trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush; that way we spread them out. > RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check whether there is already a request in the queue, > 10 seconds later we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point you will end > up with a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes over the next 5 minutes, > you end up getting them all much more quickly, like within the first minute. > This not only floods the queue with too many flush requests, but also defeats the > purpose of the randomDelay. 
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 35390ms
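The fix the issue title suggests — inspect the queue before adding a delayed flush request — can be sketched as follows. This is a minimal model of the idea, not the actual HBASE-18451 patch: remember which regions already have a delayed flush queued and skip them on later chore runs, so each region keeps its first randomDelay instead of having it re-rolled every 10 seconds. The class name, the tracking set, and the constant values are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/**
 * Sketch of HBASE-18451's proposed behavior (hypothetical helper,
 * not the real PeriodicMemstoreFlusher): deduplicate delayed flush
 * requests per region so the chore doesn't re-roll randomDelay on
 * every 10-second pass.
 */
public class DelayedFlushDeduper {
    static final int MIN_DELAY_MS = 3000;        // assumed MIN_DELAY_TIME
    static final int RANGE_OF_DELAY_MS = 300000; // 5 minutes, per the description

    private final Set<String> pending = new HashSet<>();
    private final Random rand = new Random();

    /**
     * Returns the delay scheduled for the region, or -1 if a flush
     * request for it is already in the queue (the dedup case).
     */
    public long requestDelayedFlush(String regionName) {
        if (!pending.add(regionName)) {
            return -1; // already queued: keep the original randomDelay
        }
        return MIN_DELAY_MS + rand.nextInt(RANGE_OF_DELAY_MS);
    }

    /** Called when the queued flush actually runs for the region. */
    public void onFlushCompleted(String regionName) {
        pending.remove(regionName);
    }
}
```

With this check in place, the chore in the logs above would schedule one delay for `testflush` and then skip it on subsequent passes until that flush fires.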
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104649#comment-16104649 ] Hadoop QA commented on HBASE-18451: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-18451 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879313/0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7825/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated.
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: In Progress (was: Patch Available)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: (was: 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: (was: 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch)
[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private
[ https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104677#comment-16104677 ] Duo Zhang commented on HBASE-18446: --- OK, got it. It seems we need to introduce an interface for {{StoreFileReader}} if we want to hide the implementation details from CP users. A big refactoring... > Mark StoreFileScanner as IA.Private > --- > > Key: HBASE-18446 > URL: https://issues.apache.org/jira/browse/HBASE-18446 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Duo Zhang > Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2 > > > I do not see any reason why it is marked as IA.LimitedPrivate. It is not > referenced in any CPs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
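A minimal sketch of the refactoring pattern the comment describes (the names are illustrative, not the actual HBase API): coprocessor-facing code compiles against a small audience-limited interface, while the concrete reader stays IA.Private and remains free to change.

```java
public class InterfaceExtraction {
    /** Hypothetical audience-limited view that CP users could depend on. */
    interface StoreFileReaderView {
        long getEntryCount();
    }

    /** Implementation detail; can be refactored without breaking CP users. */
    static final class StoreFileReaderImpl implements StoreFileReaderView {
        private final long entryCount;
        StoreFileReaderImpl(long entryCount) { this.entryCount = entryCount; }
        @Override public long getEntryCount() { return entryCount; }
    }

    /** Factory returns only the interface, so the impl type never leaks to CPs. */
    static StoreFileReaderView open(long entryCount) {
        return new StoreFileReaderImpl(entryCount);
    }

    public static void main(String[] args) {
        System.out.println(open(42).getEntryCount());
    }
}
```

The cost Duo Zhang alludes to is that every call site that currently names the concrete class has to be retargeted at the interface, hence "a big refactoring".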
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: ISSUE.patch
> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
> Issue Type: Bug
> Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: ISSUE.patch
>
>
> If you run a big job every 4 hours that impacts many tables (150 regions per server), at the end all the regions might have some data to be flushed, and we want to trigger a periodic flush after one hour. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush; that way we spread the flushes out.
> RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.
> However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, you end up getting them all much more quickly, like within the first minute.
> This not only floods the queue with too many flush requests, but also defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500
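The collapse described in the report can be sketched with a small simulation (a hypothetical model, not HBase code): every chore period enqueues another delayed request with a fresh random delay, all requests stay queued, and the flush fires at the earliest (enqueue time + delay). The effective trigger time ends up far below the intended average of half of RANGE_OF_DELAY.

```java
import java.util.Random;

public class FlushDelaySim {
    static final int RANGE_OF_DELAY = 5 * 60 * 1000; // 5 min spread, as in the chore
    static final int CHORE_PERIOD = 10 * 1000;       // chore wakes up every 10 s

    /**
     * Models the buggy behaviour: each chore run enqueues ANOTHER delayed
     * request with a fresh random delay, and the flush fires at the earliest
     * (enqueue time + delay) over all queued requests.
     */
    static long effectiveTrigger(Random rnd) {
        long earliest = Long.MAX_VALUE;
        // Requests keep being enqueued until one of them has already fired.
        for (long t = 0; t < earliest; t += CHORE_PERIOD) {
            long delay = rnd.nextInt(RANGE_OF_DELAY);
            earliest = Math.min(earliest, t + delay);
        }
        return earliest;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long sum = 0;
        int trials = 1000;
        for (int i = 0; i < trials; i++) sum += effectiveTrigger(rnd);
        // A single roll would average RANGE_OF_DELAY / 2 = 150000 ms; re-enqueuing
        // every 10 s pulls the effective trigger far earlier, matching the logs above.
        System.out.println("avg effective trigger: " + (sum / trials) + " ms");
    }
}
```

This is why inspecting the queue before adding another delayed request restores the intended 5-minute spread: only the first roll should count.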
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: In Progress)
[jira] [Commented] (HBASE-17131) Avoid livelock caused by HRegion#processRowsWithLocks
[ https://issues.apache.org/jira/browse/HBASE-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104679#comment-16104679 ] Hudson commented on HBASE-17131: SUCCESS: Integrated in Jenkins build HBase-1.2-JDK8 #170 (See [https://builds.apache.org/job/HBase-1.2-JDK8/170/]) HBASE-17131 Avoid livelock caused by HRegion#processRowsWithLocks (chia7712: rev 670e9431d40d35df4802bc0445012271ee904efc) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide3.java
> Avoid livelock caused by HRegion#processRowsWithLocks
> -
>
> Key: HBASE-17131
> URL: https://issues.apache.org/jira/browse/HBASE-17131
> Project: HBase
> Issue Type: Bug
> Components: regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.6
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.2.7
>
> Attachments: HBASE-17131.branch-1.2.v0.patch, HBASE-17131.branch-1.3.v0.patch, HBASE-17131.branch-1.v0.patch, HBASE-17131.v0.patch
>
>
> {code:title=HRegion.java|borderStyle=solid}
> try {
>   // STEP 2. Acquire the row lock(s)
>   acquiredRowLocks = new ArrayList(rowsToLock.size());
>   for (byte[] row : rowsToLock) {
>     // Attempt to lock all involved rows, throw if any lock times out
>     // use a writer lock for mixed reads and writes
>     acquiredRowLocks.add(getRowLockInternal(row, false));
>   }
>   // STEP 3. Region lock
>   lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size());
>   locked = true;
>   boolean success = false;
>   long now = EnvironmentEdgeManager.currentTime();
>   try {
> {code}
> We should lock all involved rows in the second try-finally. Otherwise, we won't release the previous locks if any subsequent lock times out. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
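The rationale for moving the row-lock loop inside the try-finally can be illustrated with a generic acquire-all-or-release-all sketch (a standalone example using `java.util.concurrent.locks`, not the HBase code itself): if any acquisition times out, the locks already taken must be released before propagating, or they leak and later callers livelock waiting for them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

public class AcquireAll {
    /**
     * Acquire every lock or none: if any tryLock times out, release the locks
     * already held before propagating, so a partial acquisition never leaks.
     */
    static List<Lock> acquireAll(List<Lock> locks, long timeoutMs) {
        List<Lock> acquired = new ArrayList<>();
        try {
            for (Lock l : locks) {
                boolean ok;
                try {
                    ok = l.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    ok = false;
                }
                if (!ok) {
                    throw new IllegalStateException("lock acquisition timed out");
                }
                acquired.add(l);
            }
            List<Lock> result = acquired;
            acquired = null; // success: ownership transfers to the caller
            return result;
        } finally {
            if (acquired != null) {                       // failure path only
                for (Lock held : acquired) held.unlock(); // roll back partial acquisition
            }
        }
    }
}
```

The HBASE-17131 patch achieves the same effect by pulling the row-lock loop into the try block whose finally releases `acquiredRowLocks`.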
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104681#comment-16104681 ] Hadoop QA commented on HBASE-18451: ---
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879318/ISSUE.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7826/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |
This message was automatically generated.
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: (was: ISSUE.patch)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: ISSUE.patch
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: Open) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: ISSUE.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), ad the end all the regions might have some data to be > flushed, and we want, after one hour, trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, that way we spread them away. > RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds after, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediatly. > As a result, instead of spreading all the flush within the next 5 minutes, > you end-up getting them all way more quickly. Like within the first minute. > Which not only feed the queue to to many flush requests, but also defeats the > purpose of the randomDelay. 
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion) r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " +
>             whyFlush.toString() + " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and
>         // there is a balanced write-load on the regions in a table, we might
>         // end up overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of te
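The failure mode described above (a fresh randomDelay drawn every 10 seconds collapsing the intended 5-minute spread) can be reproduced with a small standalone simulation. This is illustrative code, not HBase source: the class and method names are invented, the constants mirror the chore interval and RANGE_OF_DELAY from the report, and MIN_DELAY_TIME is assumed to be 0.

```java
import java.util.Random;

// Hypothetical simulation of the reported bug: the flusher chore fires every
// 10 s and, without a queue check, draws a fresh random delay each tick. The
// flush actually happens at the earliest (requestTime + delay) ever scheduled.
public class FlushDelaySim {
    static final long CHORE_PERIOD_MS = 10_000;     // chore interval from the logs
    static final long RANGE_OF_DELAY_MS = 300_000;  // 5 minutes, per the report
    static final long MIN_DELAY_MS = 0;             // assumed value of MIN_DELAY_TIME

    /** Earliest scheduled flush time when a new delay is drawn every chore tick. */
    static long effectiveFlushTime(Random rnd) {
        long earliest = Long.MAX_VALUE;
        for (long t = 0; t < RANGE_OF_DELAY_MS; t += CHORE_PERIOD_MS) {
            long delay = MIN_DELAY_MS + (long) (rnd.nextDouble() * RANGE_OF_DELAY_MS);
            earliest = Math.min(earliest, t + delay);
            if (earliest <= t) break;  // the flush already fired before this tick
        }
        return earliest;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long sum = 0;
        int trials = 10_000;
        for (int i = 0; i < trials; i++) {
            sum += effectiveFlushTime(rnd);
        }
        System.out.println("mean effective delay: " + (sum / trials) + " ms");
    }
}
```

A single uniform draw over 5 minutes would flush after ~150 s on average; re-drawing every tick pulls the mean effective delay well below that, because it behaves like the minimum over many draws, which matches the "way more quickly" behavior in the logs.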
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104694#comment-16104694 ]

Hadoop QA commented on HBASE-18451:
-----------------------------------

(x) *{color:red}-1 overall{color}*

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |

|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879321/ISSUE.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7827/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
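The remedy the issue title proposes, inspecting the queue before adding a delayed flush request, can be sketched as a small guard that remembers which regions already have a pending request. The names below (`PendingFlushTracker`, `tryMarkPending`) are hypothetical and are not the attached patch or the actual HBase API; a real fix would consult the flusher's own request queue instead of a side table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: track regions that already have a delayed flush queued
// so the chore keeps the original randomDelay instead of re-rolling it.
public class PendingFlushTracker {
    // region name -> absolute time (ms) at which its queued flush is due
    private final Map<String, Long> pending = new ConcurrentHashMap<>();

    /** Returns true if a new delayed flush should be requested for this region. */
    public boolean tryMarkPending(String regionName, long nowMs, long delayMs) {
        // putIfAbsent: only the first request wins while one is still pending.
        Long due = pending.putIfAbsent(regionName, nowMs + delayMs);
        if (due == null) {
            return true;                       // nothing queued yet; request it
        }
        if (nowMs >= due) {
            pending.put(regionName, nowMs + delayMs);
            return true;                       // previous request has fired; re-arm
        }
        return false;                          // already queued; keep its delay
    }

    /** Called when the flush for a region actually runs. */
    public void clear(String regionName) {
        pending.remove(regionName);
    }
}
```

With this guard in front of `requestDelayedFlush`, the chore's 10-second re-checks become no-ops while a request is outstanding, so each region keeps the single delay it was first assigned and the 5-minute spread is preserved.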
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nihed mbarek updated HBASE-18451:
---------------------------------
    Attachment: HBASE-18451.master.patch

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nihed mbarek updated HBASE-18451:
---------------------------------
    Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nihed mbarek updated HBASE-18451:
---------------------------------
    Status: Open  (was: Patch Available)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nihed mbarek updated HBASE-18451:
---------------------------------
    Attachment: (was: ISSUE.patch)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nihed mbarek updated HBASE-18451:
---------------------------------
    Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed
> flush request
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Open (was: Patch Available) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), ad the end all the regions might have some data to be > flushed, and we want, after one hour, trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, that way we spread them away. > RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds after, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediatly. > As a result, instead of spreading all the flush within the next 5 minutes, > you end-up getting them all way more quickly. Like within the first minute. > Which not only feed the queue to to many flush requests, but also defeats the > purpose of the randomDelay. 
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 35390ms > 2017-07-24 18:45:33,362 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
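The logs above show the chore re-requesting a flush for the same region every ~10 seconds, each time with a fresh randomDelay. One way to implement what the issue title proposes is to remember which regions already have a delayed flush queued and skip re-requesting them until that flush runs. The sketch below is illustrative only, under assumed names (`PendingFlushTracker`, `requestIfAbsent`); it is not the HBase API or the attached patch.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: track regions that already have a delayed flush queued,
// so the chore keeps the first randomDelay instead of re-rolling it every pass.
public class PendingFlushTracker {
    static final int RANGE_OF_DELAY = 5 * 60 * 1000; // 5 minutes, as in the chore
    static final int MIN_DELAY_TIME = 0;             // assumed minimum delay

    private final Set<String> pending = new HashSet<>();
    private final Random random = new Random();

    /** Returns the delay to schedule, or -1 if a request is already queued. */
    public synchronized long requestIfAbsent(String regionName) {
        if (!pending.add(regionName)) {
            return -1; // already queued: do not replace the original randomDelay
        }
        return random.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    }

    /** Called when the queued flush actually runs, allowing a future request. */
    public synchronized void completed(String regionName) {
        pending.remove(regionName);
    }

    public static void main(String[] args) {
        PendingFlushTracker t = new PendingFlushTracker();
        System.out.println(t.requestIfAbsent("region-1") >= 0); // scheduled
        System.out.println(t.requestIfAbsent("region-1") == -1); // 10s later: suppressed
        t.completed("region-1");
        System.out.println(t.requestIfAbsent("region-1") >= 0); // can queue again
    }
}
```

With such a guard, the effective delay for a region stays the first draw from the 5-minute range, rather than the minimum of many draws, which is what collapses the spread into the first minute.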
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104706#comment-16104706 ] Hadoop QA commented on HBASE-18451: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-18451 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879322/HBASE-18451.master.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7828/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated.
[jira] [Created] (HBASE-18470) `RetriesExhaustedWithDetailsException#getDesc` describe is not right
Benedict Jin created HBASE-18470: Summary: `RetriesExhaustedWithDetailsException#getDesc` describe is not right Key: HBASE-18470 URL: https://issues.apache.org/jira/browse/HBASE-18470 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0-alpha-1 Reporter: Benedict Jin The description returned by `RetriesExhaustedWithDetailsException#getDesc` is ` org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: FailedServerException: 3 times, `; there is an unnecessary trailing ', '. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
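A trailing ", " like the one reported usually comes from appending the separator after every element while building the message. The snippet below is an illustration of that pattern and a standard fix, not the actual HBase code; `buggyDesc`/`fixedDesc` are made-up names for demonstration.

```java
import java.util.List;

// Illustrative only: how a trailing ", " appears, and how joining between
// elements (rather than after each one) removes it.
public class DescDemo {
    // Buggy pattern: appends the separator after every element,
    // leaving ", " at the end of the string.
    static String buggyDesc(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(", ");
        }
        return sb.toString();
    }

    // Fixed: String.join inserts the delimiter only between elements.
    static String fixedDesc(List<String> parts) {
        return String.join(", ", parts);
    }

    public static void main(String[] args) {
        List<String> parts = List.of("FailedServerException: 3 times");
        System.out.println("[" + buggyDesc(parts) + "]"); // trailing ", " present
        System.out.println("[" + fixedDesc(parts) + "]"); // no trailing separator
    }
}
```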
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: (was: HBASE-18451.master.patch)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Attachment: HBASE-18451.master.patch
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nihed mbarek updated HBASE-18451: - Status: Open (was: Patch Available) > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), ad the end all the regions might have some data to be > flushed, and we want, after one hour, trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, that way we spread them away. > RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds after, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediatly. > As a result, instead of spreading all the flush within the next 5 minutes, > you end-up getting them all way more quickly. Like within the first minute. > Which not only feed the queue to to many flush requests, but also defeats the > purpose of the randomDelay. 
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 35390ms > 2017-07-24 18:45:33,362 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
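The fix direction named in the issue title above — inspect the queue before adding another delayed flush request — can be sketched outside HBase with a set of regions that already have a pending request. Everything below (class and method names included) is an illustrative stand-in, not the actual patch.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Illustrative sketch of de-duplicating delayed flush requests: once a
// region has a request queued, later chore runs keep the original random
// delay instead of rolling a new (possibly much smaller) one.
public class DedupingFlushRequester {
    static final int RANGE_OF_DELAY = 5 * 60 * 1000; // 5 minutes, as in the chore
    static final int MIN_DELAY_TIME = 0;

    private final Set<String> pending = new HashSet<>();
    private final Random random = new Random();

    /** Returns the chosen delay in ms, or -1 if a request was already queued. */
    public synchronized long requestDelayedFlush(String regionName) {
        if (!pending.add(regionName)) {
            return -1; // already queued: do not generate a fresh randomDelay
        }
        return random.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    }

    /** Called when the queued flush actually runs (or is cancelled). */
    public synchronized void flushCompleted(String regionName) {
        pending.remove(regionName);
    }
}
```

With a guard like this, a chore firing every 10 seconds can hold at most one outstanding delay per region, so the spread over RANGE_OF_DELAY is preserved.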
[jira] [Commented] (HBASE-17131) Avoid livelock caused by HRegion#processRowsWithLocks
[ https://issues.apache.org/jira/browse/HBASE-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104723#comment-16104723 ] Hudson commented on HBASE-17131: SUCCESS: Integrated in Jenkins build HBase-1.3-JDK8 #224 (See [https://builds.apache.org/job/HBase-1.3-JDK8/224/]) HBASE-17131 Avoid livelock caused by HRegion#processRowsWithLocks (chia7712: rev f18f916f050cf4dc106543d3dc7c6d2f78077661) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide3.java > Avoid livelock caused by HRegion#processRowsWithLocks > - > > Key: HBASE-17131 > URL: https://issues.apache.org/jira/browse/HBASE-17131 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.6 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0, 1.4.0, 1.3.2, 1.2.7 > > Attachments: HBASE-17131.branch-1.2.v0.patch, > HBASE-17131.branch-1.3.v0.patch, HBASE-17131.branch-1.v0.patch, > HBASE-17131.v0.patch > > > {code:title=HRegion.java|borderStyle=solid} > try { > // STEP 2. Acquire the row lock(s) > acquiredRowLocks = new ArrayList(rowsToLock.size()); > for (byte[] row : rowsToLock) { > // Attempt to lock all involved rows, throw if any lock times out > // use a writer lock for mixed reads and writes > acquiredRowLocks.add(getRowLockInternal(row, false)); > } > // STEP 3. Region lock > lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : > acquiredRowLocks.size()); > locked = true; > boolean success = false; > long now = EnvironmentEdgeManager.currentTime(); > try { > {code} > We should lock all involved rows in the second try-finally. Otherwise, we > won’t release the previous locks if any subsequent lock times out. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
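The locking discipline the last paragraph asks for — acquire every lock inside the try-finally, and release whatever was already acquired if any later lock times out — can be sketched with plain java.util.concurrent locks. The class and method names below are illustrative, not the HRegion code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class AllOrNothingLocking {
    /**
     * Tries to acquire all locks in order. If any tryLock times out, the
     * finally block releases the locks already held, so nothing is leaked.
     * On success the caller owns every lock and must unlock them itself.
     */
    public static boolean lockAll(List<ReentrantLock> locks, long timeoutMs)
            throws InterruptedException {
        List<ReentrantLock> acquired = new ArrayList<>();
        boolean success = false;
        try {
            for (ReentrantLock lock : locks) {
                if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false; // finally rolls back the partial acquisition
                }
                acquired.add(lock);
            }
            success = true;
            return true;
        } finally {
            if (!success) {
                for (ReentrantLock lock : acquired) {
                    lock.unlock();
                }
            }
        }
    }
}
```

The bug described above is exactly the inverse shape: the row locks were taken before entering the try-finally, so a timeout partway through left the earlier locks held forever.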
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104744#comment-16104744 ] Hudson commented on HBASE-15134: FAILURE: Integrated in Jenkins build HBase-1.4 #826 (See [https://builds.apache.org/job/HBase-1.4/826/]) HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 92780371080a341d0b6f98307a0ea176db327c5a) * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java * (edit) hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java * (edit) hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java > Add visibility into Flush and Compaction queues > --- > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 > Project: HBase > Issue Type: New Feature > Components: Compaction, metrics, regionserver >Reporter: Elliott Clark >Assignee: Abhishek Singh Chouhan > Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2 > > Attachments: HBASE-15134.branch-1.001.patch, > HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, > HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, > HBASE-15134.patch, HBASE-15134.patch > > > On busy 
spurts we can see regionservers start to see large queues for > compaction. It's really hard to tell if the server is queueing a lot of > compactions for the same region, lots of compactions for lots of regions, or > just falling behind. > For flushes much the same. There can be flushes in queue that aren't being > run because of delayed flushes. There's no way to know from the metrics how > many flushes are for each region, how many are delayed. Etc. > We should add either more metrics around this ( num per region, max per > region, min per region ) or add on a UI page that has the list of compactions > and flushes. > Or both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18437) Revoke access permissions of a user from a table does not work as expected
[ https://issues.apache.org/jira/browse/HBASE-18437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104826#comment-16104826 ] Anoop Sam John commented on HBASE-18437: bq.if (Bytes.toString(perm.getUser()).equals(Bytes.toString(userPerm.getUser( { The permsList is obtained for this user, so why check the user again? Sorry, I am not getting it. Or do you have to check the table details? bq.perm.setActions(leftActions.toArray(new Permission.Action[leftActions.size()])); Should we create a new UserPermission instance rather than adding this setter? It seems that by design the actions have to be final (even though they are not marked so) > Revoke access permissions of a user from a table does not work as expected > -- > > Key: HBASE-18437 > URL: https://issues.apache.org/jira/browse/HBASE-18437 > Project: HBase > Issue Type: Bug > Components: security >Affects Versions: 1.1.12 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Attachments: HBASE-18437.patch > > > A table for which a user was granted 'RW' permission. Now when we want to > revoke only its 'W' permission, the code removes the user itself from that table's > permissions. > Below is the test code which reproduces the issue.
> {noformat} > @Test(timeout = 18) > public void testRevokeOnlySomePerms() throws Throwable { > TableName name = TableName.valueOf("testAgain"); > HTableDescriptor htd = new HTableDescriptor(name); > HColumnDescriptor hcd = new HColumnDescriptor("cf"); > htd.addFamily(hcd); > createTable(TEST_UTIL, htd); > TEST_UTIL.waitUntilAllRegionsAssigned(name); > try (Connection conn = ConnectionFactory.createConnection(conf)) { > AccessControlClient.grant(conn, name, USER_RO.getShortName(), null, > null, Action.READ, Action.WRITE); > ListMultimap tablePermissions = > AccessControlLists.getTablePermissions(conf, name); > // hbase user and USER_RO have permissions > assertEquals(2, tablePermissions.size()); > AccessControlClient.revoke(conn, name, USER_RO.getShortName(), null, > null, Action.WRITE); > tablePermissions = AccessControlLists.getTablePermissions(conf, name); > List userPerm = > tablePermissions.get(USER_RO.getShortName()); > assertEquals(1, userPerm.size()); > } finally { > deleteTable(TEST_UTIL, name); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
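The leftActions computation referenced in the review comment above — keep only the granted actions that were not revoked — reduces to a set difference. A stand-in sketch with a plain enum (not the real Permission.Action, and not the patch itself):

```java
import java.util.EnumSet;

public class RevokeActions {
    // Stand-in for org.apache.hadoop.hbase.security.access.Permission.Action.
    enum Action { READ, WRITE, EXEC, CREATE, ADMIN }

    /**
     * Returns the actions left after a revoke. Building a new set rather
     * than mutating the granted one matches the review suggestion to create
     * a new UserPermission instead of adding a setter.
     */
    static EnumSet<Action> leftActions(EnumSet<Action> granted, EnumSet<Action> revoked) {
        EnumSet<Action> left = EnumSet.copyOf(granted);
        left.removeAll(revoked);
        return left;
    }
}
```

Revoking 'W' from an 'RW' grant should leave {READ}; the reported bug is that the user's table entry was dropped entirely instead.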
[jira] [Created] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.
Thomas Martens created HBASE-18471: -- Summary: Deleted qualifier re-appearing after multiple puts. Key: HBASE-18471 URL: https://issues.apache.org/jira/browse/HBASE-18471 Project: HBase Issue Type: Bug Components: Deletes, hbase, scan Affects Versions: 1.3.0 Reporter: Thomas Martens The qualifier of a deleted row (with keep deleted cells true) re-appears after re-inserting the same row multiple times (with different timestamp) with an empty qualifier. Scenario: # Put row with family and qualifier (timestamp 1). # Delete entire row (timestamp 2). # Put same row again with family without qualifier (timestamp 3). A scan (latest version) returns the row with family without qualifier, version 3 (which is correct). # Put the same row again with family without qualifier (timestamp 4). A scan (latest version) returns multiple rows: * the row with family without qualifier, version 4 (which is correct). * the row with family with qualifier, version 1 (which is wrong). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18142) Deletion of a cell deletes the previous versions too
[ https://issues.apache.org/jira/browse/HBASE-18142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104838#comment-16104838 ] Sahil Aggarwal commented on HBASE-18142: Noob question: Looking at _deleteall_internal in table.rb, if the row name is a Hash then we call _deleterows_internal, but there we don't honor the timestamp provided in the row hash and use the latest timestamp instead. Could this be the reason we are deleting all the versions of all the cells in that row? > Deletion of a cell deletes the previous versions too > > > Key: HBASE-18142 > URL: https://issues.apache.org/jira/browse/HBASE-18142 > Project: HBase > Issue Type: Bug > Components: API >Reporter: Karthick > Labels: beginner > > When I tried to delete a cell using its timestamp in the HBase Shell, the > previous versions of the same cell also got deleted. But when I tried the > same using the Java API, the previous versions were not deleted and I could > retrieve the previous values. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java > see this file to fix the issue. This method (public Delete addColumns(final > byte [] family, final byte [] qualifier, final long timestamp)) only deletes > the current version of the cell. The previous versions are not deleted. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
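The behavioral difference being reported — deleting one version at an exact timestamp versus deleting all versions up to a timestamp — can be modeled with a sorted map of timestamp to value. This is a simulation of cell-version bookkeeping, not HBase client code:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class CellVersions {
    // timestamp -> value; the newest version is the highest timestamp.
    private final NavigableMap<Long, String> versions = new TreeMap<>();

    public void put(long ts, String value) { versions.put(ts, value); }

    /** One exact version, akin to Delete.addColumn with a timestamp. */
    public void deleteVersion(long ts) { versions.remove(ts); }

    /** All versions with ts <= the given one, akin to Delete.addColumns. */
    public void deleteVersionsUpTo(long ts) {
        versions.headMap(ts, true).clear(); // headMap is a live view of the map
    }

    public String latest() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }
}
```

In the shell-vs-API confusion above, one path behaves like deleteVersion (older versions survive and become visible again) and the other like deleteVersionsUpTo (everything at or below the timestamp disappears).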
[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.
[ https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Martens updated HBASE-18471: --- Attachment: HBaseDmlTest.java > Deleted qualifier re-appearing after multiple puts. > --- > > Key: HBASE-18471 > URL: https://issues.apache.org/jira/browse/HBASE-18471 > Project: HBase > Issue Type: Bug > Components: Deletes, hbase, scan >Affects Versions: 1.3.0 >Reporter: Thomas Martens > Attachments: HBaseDmlTest.java > > > The qualifier of a deleted row (with keep deleted cells true) re-appears > after re-inserting the same row multiple times (with different timestamp) > with an empty qualifier. > Scenario: > # Put row with family and qualifier (timestamp 1). > # Delete entire row (timestamp 2). > # Put same row again with family without qualifier (timestamp 3). > A scan (latest version) returns the row with family without qualifier, > version 3 (which is correct). > # Put the same row again with family without qualifier (timestamp 4). > A scan (latest version) returns multiple rows: > * the row with family without qualifier, version 4 (which is correct). > * the row with family with qualifier, version 1 (which is wrong). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private
[ https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104860#comment-16104860 ] Anoop Sam John commented on HBASE-18446: Ya then StoreFileReader is what should be exposed. As Duo said not the impl class but an interface. Then why expose StoreFileScanner and StoreFile? > Mark StoreFileScanner as IA.Private > --- > > Key: HBASE-18446 > URL: https://issues.apache.org/jira/browse/HBASE-18446 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Duo Zhang > Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2 > > > Do not see any reason why it is marked as IA.LimitedPrivate. It is not > referenced in any CPs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.
[ https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Martens updated HBASE-18471: --- Description: The qualifier of a deleted row (with keep deleted cells true) re-appears after re-inserting the same row multiple times (with different timestamp) with an empty qualifier. Scenario: # Put row with family and qualifier (timestamp 1). # Delete entire row (timestamp 2). # Put same row again with family without qualifier (timestamp 3). A scan (latest version) returns the row with family without qualifier, version 3 (which is correct). # Put the same row again with family without qualifier (timestamp 4). A scan (latest version) returns multiple rows: * the row with family without qualifier, version 4 (which is correct). * the row with family with qualifier, version 1 (which is wrong). There is a test scenario attached. output: 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml 13:42:58,592 [main] client.HBaseAdmin - Created test_dml Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with timestamp: '1' Scan printout => Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue' Delete row: 'myRow' Scan printout => Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '3' Scan printout => Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 'myValue' Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '4' Scan printout => Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 'myValue' Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue' was: The qualifier of a deleted row (with keep deleted cells true) re-appears after re-inserting the same row multiple times (with different timestamp) with 
an empty qualifier. Scenario: # Put row with family and qualifier (timestamp 1). # Delete entire row (timestamp 2). # Put same row again with family without qualifier (timestamp 3). A scan (latest version) returns the row with family without qualifier, version 3 (which is correct). # Put the same row again with family without qualifier (timestamp 4). A scan (latest version) returns multiple rows: * the row with family without qualifier, version 4 (which is correct). * the row with family with qualifier, version 1 (which is wrong). > Deleted qualifier re-appearing after multiple puts. > --- > > Key: HBASE-18471 > URL: https://issues.apache.org/jira/browse/HBASE-18471 > Project: HBase > Issue Type: Bug > Components: Deletes, hbase, scan >Affects Versions: 1.3.0 >Reporter: Thomas Martens > Attachments: HBaseDmlTest.java > > > The qualifier of a deleted row (with keep deleted cells true) re-appears > after re-inserting the same row multiple times (with different timestamp) > with an empty qualifier. > Scenario: > # Put row with family and qualifier (timestamp 1). > # Delete entire row (timestamp 2). > # Put same row again with family without qualifier (timestamp 3). > A scan (latest version) returns the row with family without qualifier, > version 3 (which is correct). > # Put the same row again with family without qualifier (timestamp 4). > A scan (latest version) returns multiple rows: > * the row with family without qualifier, version 4 (which is correct). > * the row with family with qualifier, version 1 (which is wrong). > There is a test scenario attached. 
> output: > 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml > 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml > 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml > 13:42:58,592 [main] client.HBaseAdmin - Created test_dml > Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with > timestamp: '1' > Scan printout => > Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', > Value: 'myValue' > Delete row: 'myRow' > Scan printout => > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '3' > Scan printout => > Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '4' > Scan printout => > Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', > Value: 'myValue' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.
[ https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Martens updated HBASE-18471: --- Description: The qualifier of a deleted row (with keep deleted cells true) re-appears after re-inserting the same row multiple times (with different timestamp) with an empty qualifier. Scenario: # Put row with family and qualifier (timestamp 1). # Delete entire row (timestamp 2). # Put same row again with family without qualifier (timestamp 3). A scan (latest version) returns the row with family without qualifier, version 3 (which is correct). # Put the same row again with family without qualifier (timestamp 4). A scan (latest version) returns multiple rows: * the row with family without qualifier, version 4 (which is correct). * the row with family with qualifier, version 1 (which is wrong). There is a test scenario attached. output: 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml 13:42:58,592 [main] client.HBaseAdmin - Created test_dml Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with timestamp: '1' Scan printout => Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue' Delete row: 'myRow' Scan printout => Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '3' Scan printout => Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 'myValue' Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '4' Scan printout => Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 'myValue' {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue'{color} was: The qualifier of a deleted row (with keep deleted cells true) re-appears after re-inserting the same row multiple times (with 
different timestamp) with an empty qualifier. Scenario: # Put row with family and qualifier (timestamp 1). # Delete entire row (timestamp 2). # Put same row again with family without qualifier (timestamp 3). A scan (latest version) returns the row with family without qualifier, version 3 (which is correct). # Put the same row again with family without qualifier (timestamp 4). A scan (latest version) returns multiple rows: * the row with family without qualifier, version 4 (which is correct). * the row with family with qualifier, version 1 (which is wrong). There is a test scenario attached. output: 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml 13:42:58,592 [main] client.HBaseAdmin - Created test_dml Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with timestamp: '1' Scan printout => Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue' Delete row: 'myRow' Scan printout => Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '3' Scan printout => Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 'myValue' Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: '4' Scan printout => Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 'myValue' Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', Value: 'myValue' > Deleted qualifier re-appearing after multiple puts. 
> --- > > Key: HBASE-18471 > URL: https://issues.apache.org/jira/browse/HBASE-18471 > Project: HBase > Issue Type: Bug > Components: Deletes, hbase, scan >Affects Versions: 1.3.0 >Reporter: Thomas Martens > Attachments: HBaseDmlTest.java > > > The qualifier of a deleted row (with keep deleted cells true) re-appears > after re-inserting the same row multiple times (with different timestamp) > with an empty qualifier. > Scenario: > # Put row with family and qualifier (timestamp 1). > # Delete entire row (timestamp 2). > # Put same row again with family without qualifier (timestamp 3). > A scan (latest version) returns the row with family without qualifier, > version 3 (which is correct). > # Put the same row again with family without qualifier (timestamp 4). > A scan (latest version) returns multiple rows: > * the row with family without qualifier, version 4 (which is correct). > * the row with family with qualifier, version 1 (which is wrong). > There is a test scenario attached. > output: > 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml > 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml > 13:42:57,256 [main] client.HBaseAd
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104868#comment-16104868 ] Hudson commented on HBASE-15134: FAILURE: Integrated in Jenkins build HBase-2.0 #248 (See [https://builds.apache.org/job/HBase-2.0/248/]) HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 12b9a151e6338297b253ca2e005eda22b1f2da4e) * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java * (edit) hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java * (edit) hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java > Add visibility into Flush and Compaction queues > --- > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 > Project: HBase > Issue Type: New Feature > Components: Compaction, metrics, regionserver >Reporter: Elliott Clark >Assignee: Abhishek Singh Chouhan > Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2 > > Attachments: HBASE-15134.branch-1.001.patch, > HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, > HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, > HBASE-15134.patch, HBASE-15134.patch > > > On busy spurts we 
can see regionservers start to see large queues for > compaction. It's really hard to tell if the server is queueing a lot of > compactions for the same region, lots of compactions for lots of regions, or > just falling behind. > For flushes much the same. There can be flushes in queue that aren't being > run because of delayed flushes. There's no way to know from the metrics how > many flushes are for each region, how many are delayed. Etc. > We should add either more metrics around this ( num per region, max per > region, min per region ) or add on a UI page that has the list of compactions > and flushes. > Or both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private
[ https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104871#comment-16104871 ] Duo Zhang commented on HBASE-18446: --- {quote} but with local index we need to do a full HFile scan and should be able to find whether the Cell belongs to the child region based on the actual data row key {quote} What happens if the file is compacted? > Mark StoreFileScanner as IA.Private > --- > > Key: HBASE-18446 > URL: https://issues.apache.org/jira/browse/HBASE-18446 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Duo Zhang > Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2 > > > Do not see any reason why it is marked as IA.LimitedPrivate. It is not > referenced in any CPs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private
[ https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104876#comment-16104876 ] Anoop Sam John commented on HBASE-18446: For compaction purposes also, each region's compaction would have selected the corresponding Half file and its reader for doing the compaction work (scan for compaction). As of the 2.0 design, we won't archive the compacted-away files immediately after the compaction. The old scans will continue using them. See CompactedHFilesDischarger. The new compacted files will have proper data as per the split daughter regions. > Mark StoreFileScanner as IA.Private > --- > > Key: HBASE-18446 > URL: https://issues.apache.org/jira/browse/HBASE-18446 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Duo Zhang > Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2 > > > Do not see any reason why it is marked as IA.LimitedPrivate. It is not > referenced in any CPs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104880#comment-16104880 ] Hadoop QA commented on HBASE-18451: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac 
{color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 9s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}119m 11s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:bdc94b1 | | JIRA Issue | HBASE-18451 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879325/HBASE-18451.master.patch | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 963f4e251072 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 2d06a06 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7829/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7829/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >
[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private
[ https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104881#comment-16104881 ] Duo Zhang commented on HBASE-18446: --- Oh, it seems there is no problem. The compaction will also use the replaced StoreFileReader, so it can read the index and write it to the new StoreFile. > Mark StoreFileScanner as IA.Private > --- > > Key: HBASE-18446 > URL: https://issues.apache.org/jira/browse/HBASE-18446 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Duo Zhang > Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2 > > > Do not see any reason why it is marked as IA.LimitedPrivate. It is not > referenced in any CPs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104884#comment-16104884 ] Jean-Marc Spaggiari commented on HBASE-18451: - You got it! ;) LGTM. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want to trigger a periodic flush after one hour. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, so that the flushes are spread out. > RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds later, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point you will end > up with a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes over the next 5 minutes, > you end up getting them all much more quickly, like within the first minute. > This not only floods the queue with too many flush requests, but also defeats the > purpose of the randomDelay.
> {code} > @Override > protected void chore() { > final StringBuffer whyFlush = new StringBuffer(); > for (Region r : this.server.onlineRegions.values()) { > if (r == null) continue; > if (((HRegion)r).shouldFlush(whyFlush)) { > FlushRequester requester = server.getFlushRequester(); > if (requester != null) { > long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + > MIN_DELAY_TIME; > LOG.info(getName() + " requesting flush of " + > r.getRegionInfo().getRegionNameAsString() + " because " + > whyFlush.toString() + > " after random delay " + randomDelay + "ms"); > //Throttle the flushes by putting a delay. If we don't throttle, > and there > //is a balanced write-load on the regions in a table, we might > end up > //overwhelming the filesystem with too many flushes at once. > requester.requestDelayedFlush(r, randomDelay, false); > } > } > } > } > {code} > {code} > 2017-07-24 18:44:33,338 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 270785ms > 2017-07-24 18:44:43,328 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 200143ms > 2017-07-24 18:44:53,954 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 
because f > has an old edit so flush to free WALs after random delay 191082ms > 2017-07-24 18:45:03,528 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 92532ms > 2017-07-24 18:45:14,201 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 238780ms > 2017-07-24 18:45:24,195 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting > flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f > has an old edit so flush to free WALs after random delay 35390ms > 2017-07-24 18:45:33,362 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > hbasetest2.domainname.com,
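The fix the issue calls for — inspecting the queue before adding another delayed flush request — can be sketched as a small self-contained model. This is a hypothetical simplification (the class and method names are invented here, and the real FlushRequester plumbing is omitted), not the actual HBASE-18451 patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the proposed behavior: a region with a pending delayed flush
// is not re-enqueued, so it keeps its first random delay instead of getting
// a fresh (possibly much smaller) one on every 10-second chore run.
class DedupingFlushRequester {
    private final Map<String, Long> pendingDelays = new ConcurrentHashMap<>();

    /** Queue a delayed flush unless one is already pending for this region. */
    boolean requestDelayedFlush(String regionName, long delayMs) {
        // putIfAbsent returns null only when no request was already pending.
        return pendingDelays.putIfAbsent(regionName, delayMs) == null;
    }

    /** Invoked when the delayed flush actually runs, freeing the slot. */
    void onFlushExecuted(String regionName) {
        pendingDelays.remove(regionName);
    }

    public static void main(String[] args) {
        DedupingFlushRequester requester = new DedupingFlushRequester();
        // First chore run queues the flush with its random delay.
        System.out.println(requester.requestDelayedFlush("region-a", 270785)); // true
        // Later chore runs see the pending request and leave the delay alone.
        System.out.println(requester.requestDelayedFlush("region-a", 35390));  // false
        // Once the flush has run, a new request may be queued again.
        requester.onFlushExecuted("region-a");
        System.out.println(requester.requestDelayedFlush("region-a", 92532));  // true
    }
}
```

In the real chore, a check like this would guard the unconditional requester.requestDelayedFlush(r, randomDelay, false) call shown in the quoted code.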
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104888#comment-16104888 ] Jean-Marc Spaggiari commented on HBASE-18451: - Thanks Anoop. > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: nihed mbarek > Attachments: HBASE-18451.master.patch
[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues
[ https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104915#comment-16104915 ] Hudson commented on HBASE-15134: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3449 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3449/]) HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 2d06a06ba4bbd2f64e28be5973eb1d447114bedc) * (edit) hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java * (edit) hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * (edit) hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java > Add visibility into Flush and Compaction queues > --- > > Key: HBASE-15134 > URL: https://issues.apache.org/jira/browse/HBASE-15134 > Project: HBase > Issue Type: New Feature > Components: Compaction, metrics, regionserver >Reporter: Elliott Clark >Assignee: Abhishek Singh Chouhan > Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2 > > Attachments: HBASE-15134.branch-1.001.patch, > HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, > HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, > HBASE-15134.patch, HBASE-15134.patch > > > 
During busy spurts we can see regionservers build up large queues for > compaction. It's really hard to tell whether the server is queueing many > compactions for the same region, compactions for many different regions, or > just falling behind. > It is much the same for flushes. There can be flushes in the queue that aren't being > run because of delayed flushes. There's no way to know from the metrics how > many flushes are pending for each region, or how many are delayed. > We should either add more metrics around this (num per region, max per > region, min per region) or add a UI page that lists the compactions > and flushes. > Or both. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18304) Start enforcing upperbounds on dependencies
[ https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Penzes updated HBASE-18304: - Attachment: HBASE-18304.master.001.patch > Start enforcing upperbounds on dependencies > --- > > Key: HBASE-18304 > URL: https://issues.apache.org/jira/browse/HBASE-18304 > Project: HBase > Issue Type: Task > Components: build, dependencies >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Tamas Penzes > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-18304.master.001.patch > > > would be nice to get this going before our next major version. > http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18374) RegionServer Metrics improvements
[ https://issues.apache.org/jira/browse/HBASE-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Singh Chouhan updated HBASE-18374: --- Attachment: HBASE-18374.master.005.patch Added putbatch metrics. > RegionServer Metrics improvements > - > > Key: HBASE-18374 > URL: https://issues.apache.org/jira/browse/HBASE-18374 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Abhishek Singh Chouhan >Assignee: Abhishek Singh Chouhan > Fix For: 3.0.0 > > Attachments: HBASE-18374.branch-1.001.patch, > HBASE-18374.branch-1.001.patch, HBASE-18374.branch-1.002.patch, > HBASE-18374.master.001.patch, HBASE-18374.master.002.patch, > HBASE-18374.master.003.patch, HBASE-18374.master.004.patch, > HBASE-18374.master.005.patch > > > At the RS level we have latency metrics for mutates/puts and deletes that are > updated per batch (i.e. at the end of the entire batch op, if it contains a put/delete, > we update the respective metric), in contrast with append/increment/get metrics > that are updated per op. This is a bit ambiguous, since the delete and put > metrics are updated for multi-row mutations that happen to contain a > put/delete. We should rename the metrics (e.g. delete_batch) and/or add better > descriptions. Also we should add metrics for single delete client operations > that come through the RSRpcServer.mutate path. We should also add metrics for > checkAndPut and checkAndDelete. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18304) Start enforcing upperbounds on dependencies
[ https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Penzes updated HBASE-18304: - Status: Patch Available (was: In Progress) > Start enforcing upperbounds on dependencies > --- > > Key: HBASE-18304 > URL: https://issues.apache.org/jira/browse/HBASE-18304 > Project: HBase > Issue Type: Task > Components: build, dependencies >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Tamas Penzes > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-18304.master.001.patch > > > would be nice to get this going before our next major version. > http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.
[ https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Martens updated HBASE-18471: --- Affects Version/s: 1.3.1 > Deleted qualifier re-appearing after multiple puts. > --- > > Key: HBASE-18471 > URL: https://issues.apache.org/jira/browse/HBASE-18471 > Project: HBase > Issue Type: Bug > Components: Deletes, hbase, scan >Affects Versions: 1.3.0, 1.3.1 >Reporter: Thomas Martens > Attachments: HBaseDmlTest.java > > > The qualifier of a deleted row (with keep deleted cells true) re-appears > after re-inserting the same row multiple times (with different timestamp) > with an empty qualifier. > Scenario: > # Put row with family and qualifier (timestamp 1). > # Delete entire row (timestamp 2). > # Put same row again with family without qualifier (timestamp 3). > A scan (latest version) returns the row with family without qualifier, > version 3 (which is correct). > # Put the same row again with family without qualifier (timestamp 4). > A scan (latest version) returns multiple rows: > * the row with family without qualifier, version 4 (which is correct). > * the row with family with qualifier, version 1 (which is wrong). > There is a test scenario attached. 
> output: > 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml > 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml > 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml > 13:42:58,592 [main] client.HBaseAdmin - Created test_dml > Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with > timestamp: '1' > Scan printout => > Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', > Value: 'myValue' > Delete row: 'myRow' > Scan printout => > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '3' > Scan printout => > Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '4' > Scan printout => > Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: > 'myQualifier', Value: 'myValue'{color} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18304) Start enforcing upperbounds on dependencies
[ https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104988#comment-16104988 ] Mike Drob commented on HBASE-18304: --- Hi [~tamaas], when I said to exclude the protobuf dep I meant to exclude it from the configuration, not the actual dependency tree. I think we can use the mechanism in MENFORCER-273 to do this. > Start enforcing upperbounds on dependencies > --- > > Key: HBASE-18304 > URL: https://issues.apache.org/jira/browse/HBASE-18304 > Project: HBase > Issue Type: Task > Components: build, dependencies >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Tamas Penzes > Labels: beginner > Fix For: 2.0.0 > > Attachments: HBASE-18304.master.001.patch > > > would be nice to get this going before our next major version. > http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
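For reference, the rule linked in the description is enabled through standard maven-enforcer-plugin configuration along these lines. This is a generic sketch (the execution id and the protobuf exclude are illustrative), not the actual HBASE-18304 patch; the per-rule excludes mechanism is the MENFORCER-273 feature mentioned in the comment above:

```xml
<!-- Sketch: enable requireUpperBoundDeps in a pom.xml build section.
     The exclude shows how an artifact can be skipped by the check
     without touching the actual dependency tree. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-upper-bound-deps</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <requireUpperBoundDeps>
            <excludes>
              <exclude>com.google.protobuf:protobuf-java</exclude>
            </excludes>
          </requireUpperBoundDeps>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```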
[jira] [Commented] (HBASE-18142) Deletion of a cell deletes the previous versions too
[ https://issues.apache.org/jira/browse/HBASE-18142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105160#comment-16105160 ] Chia-Ping Tsai commented on HBASE-18142: bq. could this be the reason we are deleting all the versions of all the cells in that row? Not exactly. The _deleterows_internal calls _createdelete_internal to get the Delete object. The _createdelete_internal creates the Delete object through [Delete#addColumns|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java#L258]. The purpose of Delete#addColumns is to *delete all versions of the specified column with a timestamp less than or equal to the specified timestamp*. > Deletion of a cell deletes the previous versions too > > > Key: HBASE-18142 > URL: https://issues.apache.org/jira/browse/HBASE-18142 > Project: HBase > Issue Type: Bug > Components: API >Reporter: Karthick > Labels: beginner > > When I tried to delete a cell using its timestamp in the HBase Shell, the > previous versions of the same cell also got deleted. But when I tried the > same using the Java API, the previous versions were not deleted and I could > retrieve the previous values. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java > see this file to fix the issue. This method (public Delete addColumns(final > byte [] family, final byte [] qualifier, final long timestamp)) only deletes > the current version of the cell. The previous versions are not deleted. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
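The semantics quoted in the comment can be illustrated with a toy in-memory model (plain Java, not the HBase client API): an addColumns-style delete masks every version of the column with a timestamp at or below the given one, while an addColumn-style delete removes only the exact version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of one column's versions, keyed by timestamp. This illustrates
// the two delete semantics discussed above; it is not HBase code.
class VersionedColumn {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    /** Delete#addColumns semantics: drop every version with timestamp <= ts. */
    void deleteAllVersionsUpTo(long ts) {
        // headMap is a live view, so clear() mutates the backing map.
        versions.headMap(ts, true).clear();
    }

    /** Delete#addColumn semantics: drop only the version at exactly ts. */
    void deleteExactVersion(long ts) {
        versions.remove(ts);
    }

    List<Long> remainingTimestamps() {
        return new ArrayList<>(versions.keySet());
    }
}
```

With versions at timestamps 1, 2, 3, deleteAllVersionsUpTo(2) leaves only timestamp 3, whereas deleteExactVersion(2) leaves 1 and 3 — which matches the shell-vs-API difference the reporter observed.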
[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff
[ https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105247#comment-16105247 ] Mike Drob commented on HBASE-12387: --- Did the proposed DISCUSS thread ever happen? > committer guidelines should include patch signoff > - > > Key: HBASE-12387 > URL: https://issues.apache.org/jira/browse/HBASE-12387 > Project: HBase > Issue Type: Task > Components: documentation >Reporter: Sean Busbey >Assignee: Sean Busbey > > Right now our guide for committers apply patches has them use {{git am}} > without a signoff flag. This works okay, but it misses adding the > "signed-off-by" blurb in the commit message. > Those messages make it easier to see at a glance with e.g. {{git log}} which > committer applied the patch. > this section: > {quote} > The directive to use git format-patch rather than git diff, and not to use > --no-prefix, is a new one. See the second example for how to apply a patch > created with git diff, and educate the person who created the patch. > {code} > $ git checkout -b HBASE- > $ git am ~/Downloads/HBASE--v2.patch > $ git checkout master > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary or ask the submitter to do it > $ git pull --rebase # Better safe than sorry > $ git push origin master > $ git checkout branch-1 > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary > $ git pull --rebase # Better safe than sorry > $ git push origin branch-1 > $ git branch -D HBASE- > {code} > {quote} > Should be > {quote} > The directive to use git format-patch rather than git diff, and not to use > --no-prefix, is a new one. See the second example for how to apply a patch > created with git diff, and educate the person who created the patch. > Note that the {{--signoff}} flag to {{git am}} will insert a line in the > commit message that the patch was checked by your author string. 
This > addition to your inclusion as the commit's committer makes your participation > more prominent to users browsing {{git log}}. > {code} > $ git checkout -b HBASE- > $ git am --signoff ~/Downloads/HBASE--v2.patch > $ git checkout master > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary or ask the submitter to do it > $ git pull --rebase # Better safe than sorry > $ git push origin master > $ git checkout branch-1 > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary > $ git pull --rebase # Better safe than sorry > $ git push origin branch-1 > $ git branch -D HBASE- > {code} > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18472) Add guava license and update supplemental-models.xml
Yi Liang created HBASE-18472: Summary: Add guava license and update supplemental-models.xml Key: HBASE-18472 URL: https://issues.apache.org/jira/browse/HBASE-18472 Project: HBase Issue Type: Bug Reporter: Yi Liang Assignee: Yi Liang Priority: Blocker When I run mvn clean install -DskipTests on my local machine, it always shows the error below {quote} [WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed with message: License errors detected, for more detail find ERROR in hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105253#comment-16105253 ] Mike Drob commented on HBASE-18472: --- Hi [~easyliangjob] - is this on a specific branch? Possibly related to HBASE-17908 but I haven't seen this failure happen locally. > Add guava license and update supplemental-models.xml > > > Key: HBASE-18472 > URL: https://issues.apache.org/jira/browse/HBASE-18472 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang >Assignee: Yi Liang >Priority: Blocker > > When I run mvn clean install -DskipTests on my local machine, lt always shows > error below > {quote} > WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed > with message: > License errors detected, for more detail find ERROR in > hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE > Failed to execute goal > org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce > (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have > failed. Look above for specific messages explaining why the rule failed. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105254#comment-16105254 ] Yi Liang commented on HBASE-18472: -- The error on my machine was caused by HBASE-16351; it seems we did not add the guava license into supplemental-models.xml in HBASE-17908. Hi [~mdrob], have you seen this error when you run mvn install? I think this error has been here for a while; it is strange that no one has reported it. Just want to make sure it doesn't only happen on my machine. > Add guava license and update supplemental-models.xml > > > Key: HBASE-18472 > URL: https://issues.apache.org/jira/browse/HBASE-18472 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang >Assignee: Yi Liang >Priority: Blocker > > When I run mvn clean install -DskipTests on my local machine, lt always shows > error below > {quote} > WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed > with message: > License errors detected, for more detail find ERROR in > hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE > Failed to execute goal > org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce > (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have > failed. Look above for specific messages explaining why the rule failed. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liang updated HBASE-18472: - Attachment: HBASE-18472-master-v1.patch > Add guava license and update supplemental-models.xml > > > Key: HBASE-18472 > URL: https://issues.apache.org/jira/browse/HBASE-18472 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang >Assignee: Yi Liang >Priority: Blocker > Attachments: HBASE-18472-master-v1.patch > > > When I run mvn clean install -DskipTests on my local machine, lt always shows > error below > {quote} > WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed > with message: > License errors detected, for more detail find ERROR in > hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE > Failed to execute goal > org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce > (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have > failed. Look above for specific messages explaining why the rule failed. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liang updated HBASE-18472: - Fix Version/s: 3.0.0 2.0.0 Affects Version/s: 3.0.0 2.0.0-alpha-1 Status: Patch Available (was: Open) > Add guava license and update supplemental-models.xml > > > Key: HBASE-18472 > URL: https://issues.apache.org/jira/browse/HBASE-18472 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-1, 3.0.0 >Reporter: Yi Liang >Assignee: Yi Liang >Priority: Blocker > Fix For: 2.0.0, 3.0.0 > > Attachments: HBASE-18472-master-v1.patch > > > When I run mvn clean install -DskipTests on my local machine, lt always shows > error below > {quote} > WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed > with message: > License errors detected, for more detail find ERROR in > hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE > Failed to execute goal > org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce > (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have > failed. Look above for specific messages explaining why the rule failed. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105260#comment-16105260 ] Yi Liang commented on HBASE-18472: -- I tried it on the master branch. > Add guava license and update supplemental-models.xml > > > Key: HBASE-18472 > URL: https://issues.apache.org/jira/browse/HBASE-18472 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.0-alpha-1 >Reporter: Yi Liang >Assignee: Yi Liang >Priority: Blocker > Fix For: 2.0.0, 3.0.0 > > Attachments: HBASE-18472-master-v1.patch > > > When I run mvn clean install -DskipTests on my local machine, lt always shows > error below > {quote} > WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed > with message: > License errors detected, for more detail find ERROR in > hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE > Failed to execute goal > org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce > (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have > failed. Look above for specific messages explaining why the rule failed. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18374) RegionServer Metrics improvements
[ https://issues.apache.org/jira/browse/HBASE-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105289#comment-16105289 ] Hadoop QA commented on HBASE-18374: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 30m 11s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}111m 44s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:bdc94b1 | | JIRA Issue | HBASE-18374 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879350/HBASE-18374.master.005.patch | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 2134b41544f1 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 2d06a06 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7830/testReport/ | | modules | C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: . | | Console output | https://b
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105295#comment-16105295 ] Mike Drob commented on HBASE-18472: --- HBASE-16351 made the error reporting friendlier; it shouldn't have exposed any new errors. Can you reproduce this with a clean workspace and m2 repository? Guava 11.0.2 inherits its license from guava-parent https://repo1.maven.org/maven2/com/google/guava/guava-parent/11.0.2/guava-parent-11.0.2.pom which has
{code}
<license>
  <name>The Apache Software License, Version 2.0</name>
  <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
  <distribution>repo</distribution>
</license>
{code}
and that looks 100% correct. Can you post more of the snippet from the ERROR message you see inside of the generated LICENSE file? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105296#comment-16105296 ] Sean Busbey commented on HBASE-18472: - This shouldn't be needed. guava 11 has a parent of guava-parent 11, which lists a license of "Apache Software License, Version 2.0". Guava was expressly removed as a part of HBASE-18202 since that license name got handled fine. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105299#comment-16105299 ] Yi Liang commented on HBASE-18472: -- OK, let me try on a clean workspace and m2 repository. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105317#comment-16105317 ] Hadoop QA commented on HBASE-18472: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 8s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 8s{color} | {color:green} hbase-resource-bundle in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 8s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 25s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:bdc94b1 | | JIRA Issue | HBASE-18472 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879373/HBASE-18472-master-v1.patch | | Optional Tests | asflicense javac javadoc unit xml | | uname | Linux 362f41f6c671 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 2d06a06 | | Default Java | 1.8.0_131 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7832/testReport/ | | modules | C: hbase-resource-bundle U: hbase-resource-bundle | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7832/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff
[ https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105323#comment-16105323 ] Sean Busbey commented on HBASE-12387: - It didn't. Since the ref guide includes a reference to a DISCUSS thread about the old attribution approach of "HBASE-121334 foo bar thing (contributor)", we should have one before we push on this. Mind starting the thread [~mdrob]? If you don't think you have enough context, let me know and I'll do it. > committer guidelines should include patch signoff > - > > Key: HBASE-12387 > URL: https://issues.apache.org/jira/browse/HBASE-12387 > Project: HBase > Issue Type: Task > Components: documentation >Reporter: Sean Busbey >Assignee: Sean Busbey > > Right now our guide for committers applying patches has them use {{git am}} > without a signoff flag. This works okay, but it misses adding the > "signed-off-by" blurb in the commit message. > Those messages make it easier to see at a glance with e.g. {{git log}} which > committer applied the patch. > this section: > {quote} > The directive to use git format-patch rather than git diff, and not to use > --no-prefix, is a new one. See the second example for how to apply a patch > created with git diff, and educate the person who created the patch. > {code} > $ git checkout -b HBASE- > $ git am ~/Downloads/HBASE--v2.patch > $ git checkout master > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary or ask the submitter to do it > $ git pull --rebase # Better safe than sorry > $ git push origin master > $ git checkout branch-1 > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary > $ git pull --rebase # Better safe than sorry > $ git push origin branch-1 > $ git branch -D HBASE- > {code} > {quote} > Should be > {quote} > The directive to use git format-patch rather than git diff, and not to use > --no-prefix, is a new one. 
See the second example for how to apply a patch > created with git diff, and educate the person who created the patch. > Note that the {{--signoff}} flag to {{git am}} will insert a line in the > commit message that the patch was checked by your author string. This, in > addition to your inclusion as the commit's committer, makes your participation > more prominent to users browsing {{git log}}. > {code} > $ git checkout -b HBASE- > $ git am --signoff ~/Downloads/HBASE--v2.patch > $ git checkout master > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary or ask the submitter to do it > $ git pull --rebase # Better safe than sorry > $ git push origin master > $ git checkout branch-1 > $ git pull --rebase > $ git cherry-pick > # Resolve conflicts if necessary > $ git pull --rebase # Better safe than sorry > $ git push origin branch-1 > $ git branch -D HBASE- > {code} > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
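The effect of {{--signoff}} can be demonstrated in a throwaway repository. The names, e-mail addresses, and paths below are purely illustrative:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
cd upstream

# The contributor authors a change and exports it with git format-patch.
git config user.name "Contributor Name"
git config user.email "contributor@example.org"
git commit -q --allow-empty -m "initial commit"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "HBASE-12387 example change"
git format-patch -1 -o .. >/dev/null
git reset -q --hard HEAD~1

# The committer applies the patch with --signoff.
git config user.name "Committer Name"
git config user.email "committer@example.org"
git am -q --signoff ../0001-HBASE-12387-example-change.patch

# The author line stays the contributor's; a Signed-off-by trailer records
# the committer who applied the patch.
git log -1 --format='%an%n%B'
```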
[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liang updated HBASE-18472: - Resolution: Invalid Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml
[ https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105390#comment-16105390 ] Yi Liang commented on HBASE-18472: -- It seems something was messed up in my local m2; it works fine now. Sorry for taking up your time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18469) Correct RegionServer metric of totalRequestCount
[ https://issues.apache.org/jira/browse/HBASE-18469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105407#comment-16105407 ] Josh Elser commented on HBASE-18469: [~zhangshibin], do you plan on inspecting this further yourself? I don't think there's much one of us could do with the information you provided. > Correct RegionServer metric of totalRequestCount > -- > > Key: HBASE-18469 > URL: https://issues.apache.org/jira/browse/HBASE-18469 > Project: HBase > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Shibin Zhang >Priority: Minor > > When I got the metrics, I found that these three metrics may have some errors, > as follows: > "totalRequestCount" : 17541, > "readRequestCount" : 17483, > "writeRequestCount" : 1633, -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18409) Migrate Client Metrics from codahale to hbase-metrics
[ https://issues.apache.org/jira/browse/HBASE-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ronald Macmaster updated HBASE-18409: - Fix Version/s: 3.0.0 Affects Version/s: (was: 2.0.0-alpha-1) 3.0.0 Status: Patch Available (was: Open) The patch refactors the original MetricsConnection in the hbase-client module to report metrics via the Hadoop metrics2 system. Originally, metrics were reported privately through a codahale JMXReporter in the MetricsConnection class. The MetricsConnection class also recorded metrics using the codahale metrics classes rather than the hbase-metrics classes. These classes prove to be inflexible for the extensibility and customization that hbase-client needs. Now, the MetricsConnection delegates metric updates to the metrics2 system. It does this through the addition of two new classes, MetricsClientSource and MetricsClientSourceImpl in the hbase-hadoop-compat and hbase-hadoop2-compat modules respectively. The new model closely resembles the architecture for collecting and reporting metrics from the Zookeeper client, master, and region server daemons. The patch unifies the concept of metrics reporting behind a single API. Once the native infrastructure for metrics reporting via hbase-metrics is completed, metrics2 sources and sinks can be phased out accordingly. > Migrate Client Metrics from codahale to hbase-metrics > - > > Key: HBASE-18409 > URL: https://issues.apache.org/jira/browse/HBASE-18409 > Project: HBase > Issue Type: Improvement > Components: Client, java, metrics >Affects Versions: 3.0.0 >Reporter: Ronald Macmaster > Labels: newbie > Fix For: 3.0.0 > > Original Estimate: 168h > Remaining Estimate: 168h > > Currently, the metrics for hbase-client are tailored for reporting via a > client-side JMX server. > The MetricsConnection handles the metrics management and reporting via the > metrics platform from codahale. 
> This approach worked well for hbase-1.3.1 when the metrics platform was still > relatively young, but it could be improved by using the new > hbase-metrics-api. > Now that we have an actual hbase-metrics-api that master, regionserver, > zookeeper, and other daemons use, it would be good to also allow the client > to leverage the metrics-api. > Then, the client could also report its metrics via Hadoop's metrics2 if > desired or through another platform that utilizes the hbase-metrics-api. > If left alone, client metrics will continue to be only barely visible through > a client-side JMX server. > The migration to the new metrics-api could be done by simply changing the > Metrics data types from codahale types to hbase-metrics types without > changing the metrics signatures of MetricsConnection unless completely > necessary. > The codahale MetricsRegistry would also have to be exchanged for a > hbase-metrics MetricsRegistry. > I found this to be a necessary change after attempting to implement my own > Reporter to use within the MetricsConnection class. > I was attempting to create a HadoopMetrics2Reporter that extends the codahale > ScheduledReporter and reports the MetricsConnection metrics to Hadoop's > metrics2 system. > The already existing infrastructure in the hbase-metrics and > hbase-metrics-api projects could be easily leveraged for a cleaner solution. > If completed successfully, users could instead access their client-side > metrics through the hbase-metrics-api. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
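The source-delegation pattern described above can be sketched roughly as follows. The class names (MetricsClientSource, MetricsClientSourceImpl) come from the patch summary, but every method shown is a hypothetical illustration, not the actual patch API:

```java
// Hedged sketch of delegating client metrics to a source interface,
// as the HBASE-18409 patch summary describes. All method names below
// are assumptions for illustration only.
interface MetricsClientSource {
    void updateRpc(long durationMs); // assumed update hook
}

class MetricsClientSourceImpl implements MetricsClientSource {
    private long rpcCount = 0;
    private long rpcTotalMs = 0;

    @Override
    public void updateRpc(long durationMs) {
        rpcCount++;
        rpcTotalMs += durationMs;
    }

    long getRpcCount() { return rpcCount; }
    long getRpcTotalMs() { return rpcTotalMs; }
}

// The connection-side class delegates updates to the source instead of
// recording them with codahale types directly.
class MetricsConnectionSketch {
    private final MetricsClientSource source;

    MetricsConnectionSketch(MetricsClientSource source) {
        this.source = source;
    }

    void onRpcComplete(long durationMs) {
        source.updateRpc(durationMs);
    }
}
```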
[jira] [Updated] (HBASE-18409) Migrate Client Metrics from codahale to hbase-metrics
[ https://issues.apache.org/jira/browse/HBASE-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ronald Macmaster updated HBASE-18409: - Attachment: 0001-HBASE-18409-MetricsConnection-client-metrics-migration.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18473) VC.listLabels() erroneously closes any connection
Lars George created HBASE-18473: --- Summary: VC.listLabels() erroneously closes any connection Key: HBASE-18473 URL: https://issues.apache.org/jira/browse/HBASE-18473 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.11, 1.2.6, 1.3.1 Reporter: Lars George In HBASE-13358 the {{VisibilityClient.listLabels()}} was amended to take in a connection from the caller, which totally makes sense. But the patch forgot to remove the unconditional call to {{connection.close()}} in the {{finally}} block:
{code}
finally {
  if (table != null) {
    table.close();
  }
  if (connection != null) {
    connection.close();
  }
}
{code}
Remove the second {{if}} completely. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
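A minimal, self-contained sketch of the intended behavior after the fix. The {{Resource}} class below is a hypothetical stand-in for HBase's {{Table}} and {{Connection}} types, not the real API: the helper closes only the table it opened itself and leaves the caller-supplied connection open.

```java
// Hedged sketch of the HBASE-18473 fix. Resource is a hypothetical
// stand-in for HBase's Table/Connection; only the close-ownership
// pattern is the point here.
class ListLabelsSketch {
    static class Resource {
        boolean closed = false;
        void close() { closed = true; }
    }

    // Mirrors the resource handling of listLabels() after the fix:
    // the table is closed, the caller's connection is not.
    static void listLabels(Resource connection, Resource table) {
        try {
            // ... issue the listLabels request through the table ...
        } finally {
            if (table != null) {
                table.close();
            }
            // No connection.close() here -- the connection belongs to the
            // caller and may still be in use elsewhere.
        }
    }
}
```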
[jira] [Commented] (HBASE-17908) Upgrade guava
[ https://issues.apache.org/jira/browse/HBASE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105463#comment-16105463 ] Yi Liang commented on HBASE-17908: -- Never mind my above comments; something in my mvn repository was messed up. It works fine now > Upgrade guava > - > > Key: HBASE-17908 > URL: https://issues.apache.org/jira/browse/HBASE-17908 > Project: HBase > Issue Type: Sub-task > Components: dependencies >Reporter: Balazs Meszaros >Assignee: stack >Priority: Critical > Fix For: 2.0.0 > > Attachments: 0001-HBASE-17908-Upgrade-guava.022.patch, > HBASE-17908.master.001.patch, HBASE-17908.master.002.patch, > HBASE-17908.master.003.patch, HBASE-17908.master.004.patch, > HBASE-17908.master.005.patch, HBASE-17908.master.006.patch, > HBASE-17908.master.007.patch, HBASE-17908.master.008.patch, > HBASE-17908.master.009.patch, HBASE-17908.master.010.patch, > HBASE-17908.master.011.patch, HBASE-17908.master.012.patch, > HBASE-17908.master.013.patch, HBASE-17908.master.013.patch, > HBASE-17908.master.014.patch, HBASE-17908.master.015.patch, > HBASE-17908.master.015.patch, HBASE-17908.master.016.patch, > HBASE-17908.master.017.patch, HBASE-17908.master.018.patch, > HBASE-17908.master.019.patch, HBASE-17908.master.020.patch, > HBASE-17908.master.021.patch, HBASE-17908.master.021.patch, > HBASE-17908.master.022.patch, HBASE-17908.master.023.patch, > HBASE-17908.master.024.patch, HBASE-17908.master.025.patch, > HBASE-17908.master.026.patch, HBASE-17908.master.027.patch, > HBASE-17908.master.028.patch > > > Currently we are using guava 12.0.1, but the latest version is 21.0. > Upgrading guava is always a hassle because it is not always backward > compatible with itself. > Currently I think there are two approaches: > 1. Upgrade guava to the newest version (21.0) and shade it. > 2. Upgrade guava to a version which does not break our builds (15.0). 
> If we can update it, some dependencies should be removed: > commons-collections, commons-codec, ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18304) Start enforcing upperbounds on dependencies
[ https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105482#comment-16105482 ] Hadoop QA commented on HBASE-18304: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 23s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 58s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 40s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 56s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac 
{color} | {color:green} 5m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 8s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 21s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.1. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 34s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.2. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 47s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.3. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 3s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.4. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 16s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.5. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 10s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 47s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 27s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 4s{color} | {color:green} hbase-spark in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hbase-spark-it in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}139m 51s{color} | {color:green} root in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}281m 46s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.client.TestScanWithoutFetchingData | | | org.apache.hadoop.hbase.mapreduce.TestWALPlayer | | | org.apache.hadoop.hbase.coprocessor.TestHTableWrapper | | | org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence | | | org.apache.hadoop.hbase.mapreduce.TestTa
[jira] [Commented] (HBASE-18024) HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists
[ https://issues.apache.org/jira/browse/HBASE-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105532#comment-16105532 ] Josh Elser commented on HBASE-18024: Any update on those test failures, [~esteban]? Anything I can do to help out?

> HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists
> ---
>
> Key: HBASE-18024
> URL: https://issues.apache.org/jira/browse/HBASE-18024
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment, regionserver
> Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Attachments: HBASE-18024.001.patch
>
> When a RegionServer attempts to open a region, during initialization the RS tries to open the {{/data///.hregioninfo}} file; however, if the {{.hregioninfo}} file doesn't exist, the RegionServer will create a new one in {{HRegionFileSystem#checkRegionInfoOnFilesystem}}. A side effect of that is that tools like hbck will incorrectly assume an inconsistency due to the presence of this new {{.hregioninfo}} file.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18025) CatalogJanitor should collect outdated RegionStates from the AM
[ https://issues.apache.org/jira/browse/HBASE-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105540#comment-16105540 ] Josh Elser commented on HBASE-18025: Took a glance at v3 and the problem you described makes sense. I'm a bit scared to +1 because I know how tricky this state management is :)

bq. A problem that we have observed is when region replicas are being used and there is a split, the region replica from the parent doesn't get collected from the region states, and when the balancer tries to assign the old parent region replica, this will cause the RegionServer to create a new HRI with the details of the parent, causing an inconsistency

Is this something that reliably happens and is possible to capture in a test?

> CatalogJanitor should collect outdated RegionStates from the AM
> ---
>
> Key: HBASE-18025
> URL: https://issues.apache.org/jira/browse/HBASE-18025
> Project: HBase
> Issue Type: Bug
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Attachments: HBASE-18025.001.patch, HBASE-18025.002.patch, HBASE-18025.003.patch
>
> I don't think this will matter in the long run for HBase 2, but at least in branch-1 and the current master we keep copies of the region states in multiple places in the master, and these copies include information like the HRI. A problem that we have observed is when region replicas are being used and there is a split, the region replica from the parent doesn't get collected from the region states, and when the balancer tries to assign the old parent region replica, this will cause the RegionServer to create a new HRI with the details of the parent, causing an inconsistency (see HBASE-18024).

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18024) HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists
[ https://issues.apache.org/jira/browse/HBASE-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105547#comment-16105547 ] Esteban Gutierrez commented on HBASE-18024: --- Got a fix for the test in TestWALMonotonicallyIncreasingSeqId, which is fine. What is interesting is TestStoreFileRefresherChore, which exercises what was uncovered while troubleshooting the issue that led to this JIRA. After my change, HRegion.initialize() will not attempt to re-create the region-info file, but throwing an exception instead will cause the test to fail since the region replica cannot be instantiated. One option I've been considering is to give HRegion#initialize() an optional argument for initializing the region on the filesystem, in order to skip writeRegionInfoOnFilesystem.

> HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists
> ---
>
> Key: HBASE-18024
> URL: https://issues.apache.org/jira/browse/HBASE-18024
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment, regionserver
> Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Attachments: HBASE-18024.001.patch
>
> When a RegionServer attempts to open a region, during initialization the RS tries to open the {{/data///.hregioninfo}} file; however, if the {{.hregioninfo}} file doesn't exist, the RegionServer will create a new one in {{HRegionFileSystem#checkRegionInfoOnFilesystem}}. A side effect of that is that tools like hbck will incorrectly assume an inconsistency due to the presence of this new {{.hregioninfo}} file.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
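The option Esteban sketches — an overload of {{initialize()}} whose flag skips re-creating the region-info file — could take roughly the following shape. This is a simplified, self-contained illustration only, not HBase code: {{ModelRegion}}, its fields, and the thrown exception type are hypothetical stand-ins for HRegion / HRegionFileSystem behavior.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for HRegion, modeling the proposed optional flag.
class ModelRegion {
  final AtomicBoolean regionInfoWritten = new AtomicBoolean(false);
  final boolean regionInfoExists; // does .hregioninfo already exist on disk?

  ModelRegion(boolean regionInfoExists) {
    this.regionInfoExists = regionInfoExists;
  }

  // Existing behavior: always (re-)creates the region-info file if missing.
  void initialize() {
    initialize(true);
  }

  // Proposed overload: callers (e.g. region-replica tests) can opt out of
  // writing the region-info file instead of silently re-creating it.
  void initialize(boolean writeRegionInfoOnFilesystem) {
    if (!regionInfoExists) {
      if (writeRegionInfoOnFilesystem) {
        // stands in for HRegionFileSystem#checkRegionInfoOnFilesystem's write
        regionInfoWritten.set(true);
      } else {
        throw new IllegalStateException(
            ".hregioninfo missing and re-creation disabled");
      }
    }
  }
}
```

With this shape the default path keeps today's behavior, while tests that instantiate region replicas pass {{false}} and surface the missing file instead of masking it.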
[jira] [Commented] (HBASE-18025) CatalogJanitor should collect outdated RegionStates from the AM
[ https://issues.apache.org/jira/browse/HBASE-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105557#comment-16105557 ] Esteban Gutierrez commented on HBASE-18025: --- Yeah, it happens reliably. In TestCatalogJanitor the behavior cannot be captured, but I created another test for CatalogJanitor where I can monitor that the RegionStates are not cleared up after a split. I was a little skeptical about ServerManager#removeRegion, but I think it makes sense to clean up storeFlushedSequenceIdsByRegion and flushedSequenceIdByRegion after a split or a merge.

> CatalogJanitor should collect outdated RegionStates from the AM
> ---
>
> Key: HBASE-18025
> URL: https://issues.apache.org/jira/browse/HBASE-18025
> Project: HBase
> Issue Type: Bug
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Attachments: HBASE-18025.001.patch, HBASE-18025.002.patch, HBASE-18025.003.patch
>
> I don't think this will matter in the long run for HBase 2, but at least in branch-1 and the current master we keep copies of the region states in multiple places in the master, and these copies include information like the HRI. A problem that we have observed is when region replicas are being used and there is a split, the region replica from the parent doesn't get collected from the region states, and when the balancer tries to assign the old parent region replica, this will cause the RegionServer to create a new HRI with the details of the parent, causing an inconsistency (see HBASE-18024).

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17908) Upgrade guava
[ https://issues.apache.org/jira/browse/HBASE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105571#comment-16105571 ] stack commented on HBASE-17908: --- Thanks for reporting and taking a look [~easyliangjob]

> Upgrade guava
> -
>
> Key: HBASE-17908
> URL: https://issues.apache.org/jira/browse/HBASE-17908
> Project: HBase
> Issue Type: Sub-task
> Components: dependencies
> Reporter: Balazs Meszaros
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: 0001-HBASE-17908-Upgrade-guava.022.patch, HBASE-17908.master.001.patch, HBASE-17908.master.002.patch, HBASE-17908.master.003.patch, HBASE-17908.master.004.patch, HBASE-17908.master.005.patch, HBASE-17908.master.006.patch, HBASE-17908.master.007.patch, HBASE-17908.master.008.patch, HBASE-17908.master.009.patch, HBASE-17908.master.010.patch, HBASE-17908.master.011.patch, HBASE-17908.master.012.patch, HBASE-17908.master.013.patch, HBASE-17908.master.013.patch, HBASE-17908.master.014.patch, HBASE-17908.master.015.patch, HBASE-17908.master.015.patch, HBASE-17908.master.016.patch, HBASE-17908.master.017.patch, HBASE-17908.master.018.patch, HBASE-17908.master.019.patch, HBASE-17908.master.020.patch, HBASE-17908.master.021.patch, HBASE-17908.master.021.patch, HBASE-17908.master.022.patch, HBASE-17908.master.023.patch, HBASE-17908.master.024.patch, HBASE-17908.master.025.patch, HBASE-17908.master.026.patch, HBASE-17908.master.027.patch, HBASE-17908.master.028.patch
>
> Currently we are using guava 12.0.1, but the latest version is 21.0. Upgrading guava is always a hassle because it is not always backward compatible with itself.
> Currently I think there are two approaches:
> 1. Upgrade guava to the newest version (21.0) and shade it.
> 2. Upgrade guava to a version which does not break our builds (15.0).
> If we can update it, some dependencies should be removed: commons-collections, commons-codec, ...
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks
Andrew Purtell created HBASE-18474: -- Summary: HRegion#doMiniBatchMutation is acquiring read row locks Key: HBASE-18474 URL: https://issues.apache.org/jira/browse/HBASE-18474 Project: HBase Issue Type: Bug Reporter: Andrew Purtell -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks
[ https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18474: --- Description: Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 1.
{code}
// If we haven't got any rows in our batch, we should block to
// get the next one.
RowLock rowLock = null;
try {
  rowLock = getRowLockInternal(mutation.getRow(), true);
} catch (TimeoutIOException e) {
  // We will retry when other exceptions, but we should stop if we timeout.
  throw e;
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
{code}
Other code paths that apply mutations are acquiring write locks.
In HRegion#append
{code}
try {
  rowLock = getRowLockInternal(row, false);
  assert rowLock != null;
  ...
{code}
In HRegion#doIn
{code}
try {
  rowLock = getRowLockInternal(increment.getRow(), false);
  ...
{code}
In HRegion#checkAndMutate
{code}
// Lock row - note that doBatchMutate will relock this row if called
RowLock rowLock = getRowLockInternal(get.getRow(), false);
// wait for all previous transactions to complete (with lock held)
mvcc.await();
{code}
What doMiniBatchMutation is doing looks wrong.

> HRegion#doMiniBatchMutation is acquiring read row locks
> ---
>
> Key: HBASE-18474
> URL: https://issues.apache.org/jira/browse/HBASE-18474
> Project: HBase
> Issue Type: Bug
> Reporter: Andrew Purtell
>
> Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 1.
> {code}
> // If we haven't got any rows in our batch, we should block to
> // get the next one.
> RowLock rowLock = null;
> try {
>   rowLock = getRowLockInternal(mutation.getRow(), true);
> } catch (TimeoutIOException e) {
>   // We will retry when other exceptions, but we should stop if we timeout.
>   throw e;
> } catch (IOException ioe) {
>   LOG.warn("Failed getting lock in batch put, row="
>       + Bytes.toStringBinary(mutation.getRow()), ioe);
> }
> if (rowLock == null) {
>   // We failed to grab another lock
>   break; // stop acquiring more rows for this batch
> } else {
>   acquiredRowLocks.add(rowLock);
> }
> {code}
> Other code paths that apply mutations are acquiring write locks.
> In HRegion#append
> {code}
> try {
>   rowLock = getRowLockInternal(row, false);
>   assert rowLock != null;
>   ...
> {code}
> In HRegion#doIn
> {code}
> try {
>   rowLock = getRowLockInternal(increment.getRow(), false);
>   ...
> {code}
> In HRegion#checkAndMutate
> {code}
> // Lock row - note that doBatchMutate will relock this row if called
> RowLock rowLock = getRowLockInternal(get.getRow(), false);
> // wait for all previous transactions to complete (with lock held)
> mvcc.await();
> {code}
> What doMiniBatchMutation is doing looks wrong.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks
[ https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18474: --- Description: Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 1.
{code}
// If we haven't got any rows in our batch, we should block to
// get the next one.
RowLock rowLock = null;
try {
  rowLock = getRowLockInternal(mutation.getRow(), true);
} catch (TimeoutIOException e) {
  // We will retry when other exceptions, but we should stop if we timeout.
  throw e;
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
{code}
Other code paths that apply mutations are acquiring write locks.
In HRegion#append
{code}
try {
  rowLock = getRowLockInternal(row, false);
  assert rowLock != null;
  ...
{code}
In HRegion#doIn
{code}
try {
  rowLock = getRowLockInternal(increment.getRow(), false);
  ...
{code}
In HRegion#checkAndMutate
{code}
// Lock row - note that doBatchMutate will relock this row if called
RowLock rowLock = getRowLockInternal(get.getRow(), false);
// wait for all previous transactions to complete (with lock held)
mvcc.await();
{code}
In HRegion#processRowsWithLocks
{code}
// 2. Acquire the row lock(s)
acquiredRowLocks = new ArrayList<RowLock>(rowsToLock.size());
for (byte[] row : rowsToLock) {
  // Attempt to lock all involved rows, throw if any lock times out
  // use a writer lock for mixed reads and writes
  acquiredRowLocks.add(getRowLockInternal(row, false));
}
{code}
and so on.
What doMiniBatchMutation is doing looks wrong.

was:
Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 1.
{code}
// If we haven't got any rows in our batch, we should block to
// get the next one.
RowLock rowLock = null;
try {
  rowLock = getRowLockInternal(mutation.getRow(), true);
} catch (TimeoutIOException e) {
  // We will retry when other exceptions, but we should stop if we timeout.
  throw e;
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
{code}
Other code paths that apply mutations are acquiring write locks.
In HRegion#append
{code}
try {
  rowLock = getRowLockInternal(row, false);
  assert rowLock != null;
  ...
{code}
In HRegion#doIn
{code}
try {
  rowLock = getRowLockInternal(increment.getRow(), false);
  ...
{code}
In HRegion#checkAndMutate
{code}
// Lock row - note that doBatchMutate will relock this row if called
RowLock rowLock = getRowLockInternal(get.getRow(), false);
// wait for all previous transactions to complete (with lock held)
mvcc.await();
{code}
What doMiniBatchMutation is doing looks wrong.

> HRegion#doMiniBatchMu
[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff
[ https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105609#comment-16105609 ] Mike Drob commented on HBASE-12387: --- It looks like this got included almost verbatim already at some point (see example 60) in https://hbase.apache.org/book.html#committing.patches

> committer guidelines should include patch signoff
> -
>
> Key: HBASE-12387
> URL: https://issues.apache.org/jira/browse/HBASE-12387
> Project: HBase
> Issue Type: Task
> Components: documentation
> Reporter: Sean Busbey
> Assignee: Sean Busbey
>
> Right now our guide for committers applying patches has them use {{git am}} without a signoff flag. This works okay, but it misses adding the "signed-off-by" blurb in the commit message. Those messages make it easier to see at a glance with e.g. {{git log}} which committer applied the patch.
> this section:
> {quote}
> The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch.
> {code}
> $ git checkout -b HBASE-
> $ git am ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick
> # Resolve conflicts if necessary
> $ git pull --rebase # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}
> Should be
> {quote}
> The directive to use git format-patch rather than git diff, and not to use --no-prefix, is a new one. See the second example for how to apply a patch created with git diff, and educate the person who created the patch.
> Note that the {{--signoff}} flag to {{git am}} will insert a line in the commit message that the patch was checked by your author string. This, in addition to your inclusion as the commit's committer, makes your participation more prominent to users browsing {{git log}}.
> {code}
> $ git checkout -b HBASE-
> $ git am --signoff ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick
> # Resolve conflicts if necessary
> $ git pull --rebase # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18466) [C++] Support handling exception in RpcTestServer
[ https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HBASE-18466: -- Attachment: HBASE-18466.001.patch

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
> Issue Type: Sub-task
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
> In order to simulate various errors from servers, exceptions should be handled properly. The idea is to zip the exception into hbase::Response in RpcTestService, serialize the response to folly::IOBuf, and write it down the pipeline.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18466) [C++] Support handling exception in RpcTestServer
[ https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HBASE-18466: -- Status: Patch Available (was: Open)

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
> Issue Type: Sub-task
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
> In order to simulate various errors from servers, exceptions should be handled properly. The idea is to zip the exception into hbase::Response in RpcTestService, serialize the response to folly::IOBuf, and write it down the pipeline.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18466) [C++] Support handling exception in RpcTestServer
[ https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105643#comment-16105643 ] Xiaobing Zhou commented on HBASE-18466: --- Posted v1:
# fixed a broken-promise issue when ResponseHeader::set_allocated_exception(pb::ExceptionResponse) is called without new-style allocation of pb::ExceptionResponse.

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
> Issue Type: Sub-task
> Reporter: Xiaobing Zhou
> Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
> In order to simulate various errors from servers, exceptions should be handled properly. The idea is to zip the exception into hbase::Response in RpcTestService, serialize the response to folly::IOBuf, and write it down the pipeline.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks
[ https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105647#comment-16105647 ] stack commented on HBASE-18474: --- Our row locks are read/write since 1.2. Whenever there is a modification on a row, we take a read lock. Concurrent threads updating a row are allowed. mvcc ensures ongoing reads always get a consistent row 'view'. The exceptions are read/modify/write operations such as increment/append/checkAndPut. These need to 'read' and then update the value they just read. To ensure the row 'view' doesn't change between read and write (mvcc is column family scope only), r/m/w ops take a write lock. [~apurtell] FYI sir.

> HRegion#doMiniBatchMutation is acquiring read row locks
> ---
>
> Key: HBASE-18474
> URL: https://issues.apache.org/jira/browse/HBASE-18474
> Project: HBase
> Issue Type: Bug
> Reporter: Andrew Purtell
>
> Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 1.
> {code}
> // If we haven't got any rows in our batch, we should block to
> // get the next one.
> RowLock rowLock = null;
> try {
>   rowLock = getRowLockInternal(mutation.getRow(), true);
> } catch (TimeoutIOException e) {
>   // We will retry when other exceptions, but we should stop if we timeout.
>   throw e;
> } catch (IOException ioe) {
>   LOG.warn("Failed getting lock in batch put, row="
>       + Bytes.toStringBinary(mutation.getRow()), ioe);
> }
> if (rowLock == null) {
>   // We failed to grab another lock
>   break; // stop acquiring more rows for this batch
> } else {
>   acquiredRowLocks.add(rowLock);
> }
> {code}
> Other code paths that apply mutations are acquiring write locks.
> In HRegion#append
> {code}
> try {
>   rowLock = getRowLockInternal(row, false);
>   assert rowLock != null;
>   ...
> {code}
> In HRegion#doIn
> {code}
> try {
>   rowLock = getRowLockInternal(increment.getRow(), false);
>   ...
> {code}
> In HRegion#checkAndMutate
> {code}
> // Lock row - note that doBatchMutate will relock this row if called
> RowLock rowLock = getRowLockInternal(get.getRow(), false);
> // wait for all previous transactions to complete (with lock held)
> mvcc.await();
> {code}
> In HRegion#processRowsWithLocks
> {code}
> // 2. Acquire the row lock(s)
> acquiredRowLocks = new ArrayList<RowLock>(rowsToLock.size());
> for (byte[] row : rowsToLock) {
>   // Attempt to lock all involved rows, throw if any lock times out
>   // use a writer lock for mixed reads and writes
>   acquiredRowLocks.add(getRowLockInternal(row, false));
> }
> {code}
> and so on.
> What doMiniBatchMutation is doing looks wrong.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
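The policy stack describes maps onto standard read/write-lock semantics, which the boolean second argument to getRowLockInternal selects. A self-contained illustration using the plain JDK ReentrantReadWriteLock (not HBase's actual row-lock implementation): plain mutations share the read side, so concurrent threads may hold it together, while an increment-style read-modify-write op needs the exclusive write side.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockDemo {
  // Returns three observations about a modeled row lock:
  // [0] a second "plain put" acquires the shared lock concurrently,
  // [1] an "increment" fails to get the exclusive lock while a reader holds,
  // [2] the exclusive lock is acquirable once all readers release.
  public static boolean[] demo() throws InterruptedException {
    final ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();
    rowLock.readLock().lock(); // thread A: a plain put holds the shared lock

    final boolean[] seen = new boolean[2];
    Thread b = new Thread(() -> {
      // thread B: a second plain put can take the shared lock at the same time
      seen[0] = rowLock.readLock().tryLock();
      if (seen[0]) {
        rowLock.readLock().unlock();
      }
      // thread B: an increment needs the write lock; excluded while A reads
      seen[1] = rowLock.writeLock().tryLock();
      if (seen[1]) {
        rowLock.writeLock().unlock();
      }
    });
    b.start();
    b.join();

    rowLock.readLock().unlock(); // thread A releases its plain-put lock
    boolean writerAfter = rowLock.writeLock().tryLock();
    if (writerAfter) {
      rowLock.writeLock().unlock();
    }
    return new boolean[] { seen[0], seen[1], writerAfter };
  }
}
```

In the quoted HRegion code, passing {{true}} to getRowLockInternal corresponds to the shared (read) side and {{false}} to the exclusive (write) side, which is why doMiniBatchMutation's {{true}} stands out against the {{false}} used by append, increment, checkAndMutate, and processRowsWithLocks.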