[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632810#comment-16632810 ] Hudson commented on HBASE-18451: Results for branch branch-2.1 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.branch-1.002.patch, HBASE-18451.branch-1.002.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.003.patch, > HBASE-18451.master.004.patch, HBASE-18451.master.004.patch, > HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), at the end all the regions might have some data to be flushed, and we want, after one hour, to trigger a periodic flush. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush; that way we spread them out. RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.
> However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, you end up getting them all much more quickly, like within the first minute. This not only feeds the queue with too many flush requests, but also defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion) r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of
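The collapse the reporter describes can be reproduced outside HBase. The sketch below is plain Java, not HBase code; the class name, the simulation loop, and the use of the 5-minute/10-second constants are my assumptions. It draws a fresh uniform delay every 10-second chore period and fires at the earliest scheduled time, so the effective delay is the minimum over all draws made before the flush fires, which lands far below the intended average of RANGE_OF_DELAY / 2 (150 s).

```java
import java.util.Random;

public class FlushDelayCollapse {
    static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // 5 minutes, as in the report
    static final int CHORE_PERIOD_MS = 10 * 1000;       // chore re-runs every 10 s

    // Time until the flush actually fires when a fresh random delay is queued
    // on every chore run and the smallest pending request wins.
    public static long effectiveDelayMs(Random rnd) {
        long now = 0;
        long earliestFire = Long.MAX_VALUE;
        while (now < earliestFire) {
            long delay = rnd.nextInt(RANGE_OF_DELAY_MS);
            earliestFire = Math.min(earliestFire, now + delay);
            now += CHORE_PERIOD_MS;
        }
        return earliestFire;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long sum = 0;
        int trials = 1000;
        for (int i = 0; i < trials; i++) {
            sum += effectiveDelayMs(rnd);
        }
        // Average lands well under the intended 150 s mean delay.
        System.out.println("avg effective delay: " + (sum / trials) + " ms");
    }
}
```

Checking the queue for an existing pending request before adding a new one, as the issue title suggests, removes the repeated draws and restores the intended spread.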
[jira] [Commented] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632811#comment-16632811 ] Hudson commented on HBASE-19418: Results for branch branch-2.1 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632814#comment-16632814 ] Hudson commented on HBASE-21196: Results for branch branch-2.1 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). 
This may impact the performance of the system severely, as all new ops directed to that server will have to go to ZK first to get the meta table address and then read meta to get the location of the table region, since the cache becomes empty after every HTableMultiplexer put.
> From the logs below, one can see that after every put the cached region locations are cleared. As a side effect, before every put the client needs to contact ZK, get the meta table location, and read meta to get the region locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 param: region= testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 totalTime: 0
> 2018-09-13 22:21:15,516 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan table=hbase:meta, startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): Advancing internal small scanner to startKey at 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up meta region location in ZK, connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that the string "Removed all cached region
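The cost pattern in those logs can be mimicked with a toy cache. This is a hedged illustration, not HBase code: `MetaCacheDemo`, its field names, and the lookup counter are all hypothetical stand-ins; the counter stands in for the ZK + hbase:meta round trips the trace shows before each put.

```java
import java.util.HashMap;
import java.util.Map;

public class MetaCacheDemo {
    public int metaLookups = 0; // stand-in for ZK + hbase:meta round trips
    private final Map<String, String> regionLocations = new HashMap<>();

    // Resolve a row to a region location, consulting "meta" only on a cache miss.
    public String locate(String row) {
        return regionLocations.computeIfAbsent(row, r -> {
            metaLookups++;
            return "region-server-1"; // dummy location
        });
    }

    // One put; `buggy` models the cache being cleared after every operation,
    // which is what the issue reports HTableMultiplexer effectively doing.
    public void put(String row, boolean buggy) {
        locate(row);
        if (buggy) {
            regionLocations.clear();
        }
    }

    public static void main(String[] args) {
        MetaCacheDemo healthy = new MetaCacheDemo();
        MetaCacheDemo buggy = new MetaCacheDemo();
        for (int i = 0; i < 1000; i++) {
            healthy.put("row", false);
            buggy.put("row", true);
        }
        System.out.println("healthy lookups: " + healthy.metaLookups); // 1
        System.out.println("buggy lookups:   " + buggy.metaLookups);   // 1000
    }
}
```

With a warm cache one lookup serves all 1000 puts; with the cache cleared per put, every put pays the full location-lookup cost, which is the severity the reporter describes.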
[jira] [Commented] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632812#comment-16632812 ] Hudson commented on HBASE-21207: Results for branch branch-2.1 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8 > > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, > HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, > HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, > edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In the Master UI, we can see region server details like requests per second and the number of regions, etc. Similarly, for tables we can see online regions and offline regions.
> It will help ops people in determining hot-spotting if we provide sorting functionality in the UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632813#comment-16632813 ] Hudson commented on HBASE-20857: Results for branch branch-2.1 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > JMX - add Balancer status = enabled / disabled > -- > > Key: HBASE-20857 > URL: https://issues.apache.org/jira/browse/HBASE-20857 > Project: HBase > Issue Type: Improvement > Components: API, master, metrics, REST, tooling, Usability >Reporter: Hari Sekhon >Assignee: Kiran Kumar Maturi >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-20857.branch-1.4.001.patch, > HBASE-20857.branch-1.4.002.patch > > > Add HBase Balancer enabled/disabled status to the JMX API on the HMaster. > Right now the HMaster will give a warning near the top of the HMaster UI if the balancer is disabled, but scraping this for monitoring integration is not nice; it should be available in the JMX API, as there is already a Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
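What the issue asks for can be prototyped with standard JMX. This is a sketch under assumptions: the MBean interface, the `BalancerEnabled` attribute, and the `example:type=BalancerStatus` ObjectName are all hypothetical illustrations, not HBase's actual `Master,sub=Balancer` bean. Standard MBean naming turns the `isBalancerEnabled()` getter into a readable `BalancerEnabled` attribute that a monitoring client can poll instead of scraping the UI.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class BalancerStatusDemo {

    // Hypothetical management interface; by the standard MBean convention its
    // binary name (BalancerStatusDemo$BalancerStatusMBean) matches the class below.
    public interface BalancerStatusMBean {
        boolean isBalancerEnabled();
    }

    public static class BalancerStatus implements BalancerStatusMBean {
        private volatile boolean enabled = true;

        public void setEnabled(boolean enabled) { this.enabled = enabled; }

        @Override
        public boolean isBalancerEnabled() { return enabled; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical ObjectName; HBase's real bean lives under its own domain.
        ObjectName name = new ObjectName("example:type=BalancerStatus");
        BalancerStatus status = new BalancerStatus();
        server.registerMBean(status, name);

        // A monitoring client reads the attribute rather than parsing the UI warning.
        System.out.println(server.getAttribute(name, "BalancerEnabled"));
        status.setEnabled(false);
        System.out.println(server.getAttribute(name, "BalancerEnabled"));
    }
}
```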
[jira] [Updated] (HBASE-21248) Implement exponential backoff when retrying for ModifyPeerProcedure
[ https://issues.apache.org/jira/browse/HBASE-21248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21248: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to branch-2.1+. Thanks [~zghaobac] for reviewing. > Implement exponential backoff when retrying for ModifyPeerProcedure > --- > > Key: HBASE-21248 > URL: https://issues.apache.org/jira/browse/HBASE-21248 > Project: HBase > Issue Type: Bug > Components: proc-v2, Replication >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1 > > Attachments: HBASE-21248-v1.patch, HBASE-21248-v2.patch, > HBASE-21248.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
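For reference, the general shape of capped exponential backoff reads as follows. This is a generic sketch, not the actual ModifyPeerProcedure patch; the class name and the base/cap constants are illustrative.

```java
public class RetryBackoff {
    // Delay for the given retry attempt: base * 2^attempt, capped at maxMs.
    public static long backoffMs(int attempt, long baseMs, long maxMs) {
        // Clamp the shift so large attempt counts cannot overflow to negative.
        long delay = baseMs << Math.min(attempt, 30);
        return Math.min(delay, maxMs);
    }

    public static void main(String[] args) {
        // Delays grow 1s, 2s, 4s, ... until pinned at the 60s cap.
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + " -> "
                + backoffMs(attempt, 1000, 60_000) + " ms");
        }
    }
}
```

Production implementations usually add random jitter on top of the doubled delay so that many retrying procedures do not wake up in lockstep.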
[jira] [Commented] (HBASE-21248) Implement exponential backoff when retrying for ModifyPeerProcedure
[ https://issues.apache.org/jira/browse/HBASE-21248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632789#comment-16632789 ] Hadoop QA commented on HBASE-21248:
| (/) *{color:green}+1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 55s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}133m 47s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}184m 41s{color} | {color:black} {color} |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941791/HBASE-21248-v2.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 1c8061c68f26 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / ab6ec1f9e4 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14536/testReport/ |
| Max. process+thread count | 5156 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14536/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated.
> Implement exponential backoff
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632794#comment-16632794 ] Hudson commented on HBASE-21196: Results for branch branch-2 [build #1317 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). 
This may impact the performance of the system severely, as all new ops directed to that server will have to go to ZK first to get the meta table address and then read meta to get the location of the table region, since the cache becomes empty after every HTableMultiplexer put.
> From the logs below, one can see that after every put the cached region locations are cleared. As a side effect, before every put the client needs to contact ZK, get the meta table location, and read meta to get the region locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: 137 connection: 127.0.0.1:42338 param: region= testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 totalTime: 0
> 2018-09-13 22:21:15,516 TRACE [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan table=hbase:meta, startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): Advancing internal small scanner to startKey at 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up meta region location in ZK, connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that the string "Removed all cached region locations that
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632790#comment-16632790 ] Hudson commented on HBASE-18451: Results for branch branch-2 [build #1317 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.branch-1.002.patch, HBASE-18451.branch-1.002.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.003.patch, > HBASE-18451.master.004.patch, HBASE-18451.master.004.patch, > HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), at the end all the regions might have some data to be flushed, and we want, after one hour, to trigger a periodic flush. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush; that way we spread them out. RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.
> However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, you end up getting them all much more quickly, like within the first minute. This not only feeds the queue with too many flush requests, but also defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion) r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of
[jira] [Commented] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632792#comment-16632792 ] Hudson commented on HBASE-21207: Results for branch branch-2 [build #1317 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8 > > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, > HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, > HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, > edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In the Master UI, we can see region server details like requests per second and the number of regions, etc. Similarly, for tables we can see online regions and offline regions.
> Providing sort functionality in the UI will help ops people determine hot > spotting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632791#comment-16632791 ] Hudson commented on HBASE-19418: Results for branch branch-2 [build #1317 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
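The jittered periodic flush described above can be sketched in a few lines — a minimal illustration of making the delay range configurable rather than hard-coded, with hypothetical class and config-key names (not HBase's actual identifiers):

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of a configurable RANGE_OF_DELAY for periodic memstore flushes.
// The config key and class name below are illustrative assumptions,
// not the identifiers used by the actual HBase patch.
public class FlushJitter {
    // Hypothetical configuration key; defaults to the historical 5 minutes.
    static final String RANGE_OF_DELAY_KEY =
        "hbase.regionserver.periodicflush.rangeofdelay.ms";
    static final long DEFAULT_RANGE_OF_DELAY_MS = 5 * 60 * 1000L;

    private final long rangeOfDelayMs;

    FlushJitter(long configuredRangeMs) {
        // Fall back to the default when the value is unset (<= 0).
        this.rangeOfDelayMs =
            configuredRangeMs > 0 ? configuredRangeMs : DEFAULT_RANGE_OF_DELAY_MS;
    }

    // Pick a random delay in [0, rangeOfDelayMs) so regions with many
    // CFs do not all queue their periodic flushes at the same instant.
    long nextDelayMs() {
        return ThreadLocalRandom.current().nextLong(rangeOfDelayMs);
    }
}
```

With a larger configured range, a server carrying many regions spreads its periodic flushes over a wider window instead of cramming them into five minutes.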
[jira] [Commented] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632793#comment-16632793 ] Hudson commented on HBASE-20857: Results for branch branch-2 [build #1317 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1317//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > JMX - add Balancer status = enabled / disabled > -- > > Key: HBASE-20857 > URL: https://issues.apache.org/jira/browse/HBASE-20857 > Project: HBase > Issue Type: Improvement > Components: API, master, metrics, REST, tooling, Usability >Reporter: Hari Sekhon >Assignee: Kiran Kumar Maturi >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-20857.branch-1.4.001.patch, > HBASE-20857.branch-1.4.002.patch > > > Add HBase Balancer enabled/disabled status to JMX API on HMaster. > Right now the HMaster will give a warning near the top of the HMaster UI if the > balancer is disabled, but scraping this for monitoring integration is not > nice; it should be available in the JMX API, as there is already a > Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
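The requested status attribute can be illustrated with a plain JMX Standard MBean — a minimal sketch only; HBase's real balancer metrics go through its Hadoop metrics2 layer, and the object name and attribute below are assumptions, not what the patch registers:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch: expose a balancer-enabled flag as a JMX attribute, so monitoring
// systems can read it instead of scraping the HMaster UI. All names here
// are illustrative, not HBase's actual metrics wiring.
public class BalancerStatus {
    // Standard MBean contract: interface name = implementation name + "MBean".
    public interface StatusMBean {
        boolean isBalancerEnabled();
    }

    public static class Status implements StatusMBean {
        private volatile boolean enabled = true;
        public boolean isBalancerEnabled() { return enabled; }
        public void setEnabled(boolean e) { this.enabled = e; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Status status = new Status();
        // Hypothetical object name mirroring the existing Master,sub=Balancer bean.
        ObjectName name =
            new ObjectName("Hadoop:service=HBase,name=Master,sub=Balancer");
        server.registerMBean(status, name);
        // A monitoring client reads the attribute derived from isBalancerEnabled():
        Object enabled = server.getAttribute(name, "BalancerEnabled");
        System.out.println("BalancerEnabled=" + enabled);
    }
}
```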
[jira] [Commented] (HBASE-21245) Add exponential backoff when retrying for sync replication related procedures
[ https://issues.apache.org/jira/browse/HBASE-21245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632788#comment-16632788 ] Hadoop QA commented on HBASE-21245: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 17s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 25s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 16s{color} | {color:red} hbase-server: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 26s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 58s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}122m 44s{color} | {color:green} hbase-server in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}170m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21245 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941793/HBASE-21245.master.001.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 02ace6e83779 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / ab6ec1f9e4 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/14537/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | Test Results |
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632786#comment-16632786 ] Hudson commented on HBASE-20952: Results for branch HBASE-20952 [build #2 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/2/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/2//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/2//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/2//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Re-visit the WAL API > > > Key: HBASE-20952 > URL: https://issues.apache.org/jira/browse/HBASE-20952 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Josh Elser >Priority: Major > Attachments: 20952.v1.txt > > > Take a step back from the current WAL implementations and think about what an > HBase WAL API should look like. What are the primitive calls that we require > to guarantee durability of writes with a high degree of performance? > The API needs to take the current implementations into consideration. We > should also have a mind for what is happening in the Ratis LogService (but > the LogService should not dictate what HBase's WAL API looks like RATIS-272). > Other "systems" inside of HBase that use WALs are replication and > backup Replication has the use-case for "tail"'ing the WAL which we > should provide via our new API. B doesn't do anything fancy (IIRC). 
We > should make sure all consumers are generally going to be OK with the API we > create. > The API may be "OK" (or OK in part). We also need to consider other methods > which were "bolted" on, such as {{AbstractFSWAL}} and > {{WALFileLengthProvider}}. Other corners of "WAL use" (like the > {{WALSplitter}}) should also be looked at so that they use the WAL APIs only. > We also need to make sure that adequate interface audience and stability > annotations are chosen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
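The "primitive calls" question above can be made concrete with a sketch of a minimal WAL surface: append/sync on the write side for durability, plus a tail primitive for the replication use-case. Every name below is hypothetical — this is not the actual HBase WAL interface, just one possible shape, with a toy in-memory implementation to show the contract:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical minimal WAL API sketch: the durability primitives plus a
// tailing read path. Not the real HBase interfaces.
public class WalApiSketch {
    /** One ordered record in the log. */
    public static class Entry {
        final long sequenceId;
        final byte[] payload;
        Entry(long sequenceId, byte[] payload) {
            this.sequenceId = sequenceId;
            this.payload = payload;
        }
    }

    /** Write side: append assigns a sequence id; sync makes everything
     *  up to (and including) that id durable. */
    public interface WriteAheadLog extends Closeable {
        long append(byte[] payload) throws IOException;
        void sync(long sequenceId) throws IOException;
    }

    /** Read side, for replication: tail entries from a given sequence id. */
    public interface WalReader extends Closeable {
        Iterator<Entry> tail(long fromSequenceId) throws IOException;
    }

    /** Toy in-memory implementation; "durability" is only simulated. */
    public static class InMemoryWal implements WriteAheadLog, WalReader {
        private final List<Entry> entries = new ArrayList<>();
        private long nextId = 1;
        private long durableUpTo = 0;

        public synchronized long append(byte[] payload) {
            long id = nextId++;
            entries.add(new Entry(id, payload));
            return id;
        }
        public synchronized void sync(long sequenceId) {
            durableUpTo = Math.max(durableUpTo, sequenceId); // stand-in for fsync
        }
        public synchronized Iterator<Entry> tail(long fromSequenceId) {
            List<Entry> out = new ArrayList<>();
            for (Entry e : entries) {
                if (e.sequenceId >= fromSequenceId) out.add(e);
            }
            return out.iterator();
        }
        public void close() {}
    }
}
```

A surface like this would let replication consume the log through `tail` without depending on file-level details such as `WALFileLengthProvider`.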
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632778#comment-16632778 ] Hudson commented on HBASE-21196: Results for branch branch-2.0 [build #879 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/879/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/879//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/879//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/879//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). 
This may impact the > performance of the system severely as all new ops directed to that server > will have to go to zk first to get the meta table address and then get the > location of the table region as it will become empty after every > htablemultiplexer put. > From the logs below, one can see after every other put the cached region > locations are cleared. As a side effect of this, before every put the server > needs to contact zk and get meta table location and read meta to get region > locations of the table. > {noformat} > 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): > Removed all cached region locations that map to > root1-thinkpad-t440p,35811,1536857446588 > 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] > client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for > root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] > ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" > request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 executing as root1 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 param: region= > testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., > row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { > associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 > totalTime: 0 > 2018-09-13 22:21:15,516 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, > 
count=0, allocations=1 > 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, > callTime: 2ms > 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan > table=hbase:meta, > startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99 > 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): > Advancing internal small scanner to startKey at > 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99' > 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up > meta region location in ZK, > connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f > {noformat} > From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see > that the string "Removed all cached region locations that map" and "Looking > up meta region location in
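The failure mode in the logs above — a successful put wiping every cached region location for the server — can be modeled with a small sketch. The guard on the null-table path mirrors the direction of the fix (only drop the whole cache on a genuine server-level failure); all identifiers are illustrative, not HBase's actual MetaCache code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of a per-server meta cache: tableName -> startRow -> location.
// Illustrates the bug described above: a cleanup path given a null table
// name that clears everything, even after a successful operation.
public class MetaCacheSketch {
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

    void cacheLocation(String table, String startRow, String location) {
        cache.computeIfAbsent(table, t -> new ConcurrentHashMap<>())
             .put(startRow, location);
    }

    int cachedCount() {
        return cache.values().stream().mapToInt(Map::size).sum();
    }

    // Buggy behavior: null table meant "drop everything for this server",
    // even on success. Guarded behavior: success + null table is a no-op,
    // so later ops do not have to re-read hbase:meta via ZooKeeper.
    void clearCache(String table, boolean operationFailed) {
        if (table == null) {
            if (operationFailed) {
                cache.clear(); // only wipe on a real server-level failure
            }
            return; // keep the cache warm on success
        }
        cache.remove(table);
    }
}
```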
[jira] [Updated] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21213: -- Attachment: HBASE-21213.branch-2.1.010.patch > [hbck2] bypass leaves behind state in RegionStates when assign/unassign > --- > > Key: HBASE-21213 > URL: https://issues.apache.org/jira/browse/HBASE-21213 > Project: HBase > Issue Type: Bug > Components: amv2, hbck2 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.1.1 > > Attachments: HBASE-21213.branch-2.1.001.patch, > HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, > HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, > HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, > HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch, > HBASE-21213.branch-2.1.009.patch, HBASE-21213.branch-2.1.010.patch > > > This is a follow-on from HBASE-21083 which added the 'bypass' functionality. > On bypass, there is more state to be cleared if we are allow new Procedures > to be scheduled. > For example, here is a bypass: > {code} > 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: > pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, > bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null > to finish it > 2018-09-20 05:45:44,022 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, > state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec > {code} > ... 
but then when I try to assign the bypassed region later, I get this: > {code} > 2018-09-20 05:46:31,435 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is > already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, > location=ve1233.halxg.cloudera.com,22101,1537397961664 > 2018-09-20 05:46:31,510 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, > state=ROLLEDBACK, > exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via > AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: > There is already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > exec-time=473msec > {code} > ... which is a long-winded way of saying the Unassign Procedure still exists > still in RegionStateNodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21250) Refactor WALProcedureStore and add more comments for better understanding the implementation
[ https://issues.apache.org/jira/browse/HBASE-21250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632756#comment-16632756 ] Duo Zhang commented on HBASE-21250: --- I have to say that it is really difficult to understand... And it seems that there is a bug? In the resetTo method of ProcedureStoreTracker, we do not clear the old BitSetNode map before adding the new ones, while in resetToProto we call reset first to clear the old map. But in resetToProto we do not update the minProcId and maxProcId. It may be OK, as this will only be called when restarting and it seems we will use another data structure to calculate the minProcId and maxProcId, but there is no comment to say this... > Refactor WALProcedureStore and add more comments for better understanding the > implementation > > > Key: HBASE-21250 > URL: https://issues.apache.org/jira/browse/HBASE-21250 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > > The implementation is complicated and lacks comments to say how it works. > {code} > /** > * WAL implementation of the ProcedureStore. > * @see ProcedureWALPrettyPrinter for printing content of a single WAL. > * @see #main(String[]) to parse a directory of MasterWALProcs. > */ > {code} > I think at least we can move subclasses to separate files to make the class > smaller, and add more comments to describe what is going on here. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
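The resetTo inconsistency described in that comment can be reduced to a toy model: if the node map is not cleared before repopulating, stale BitSetNodes survive the reset. The tracker below is heavily simplified and hypothetical — it only demonstrates why the clear() matters, and is not the real ProcedureStoreTracker:

```java
import java.util.BitSet;
import java.util.TreeMap;

// Toy procedure tracker: proc ids are grouped into 64-wide BitSet "nodes"
// keyed by their base id, loosely mimicking BitSetNode. Illustrative only.
public class TrackerSketch {
    final TreeMap<Long, BitSet> nodes = new TreeMap<>();

    void insert(long procId) {
        long base = (procId / 64) * 64;
        nodes.computeIfAbsent(base, b -> new BitSet(64)).set((int) (procId - base));
    }

    // Replace our state with another tracker's state. Without the clear(),
    // nodes from the old map that the other tracker does not cover would
    // survive the reset -- the bug suspected in the comment above.
    void resetTo(TrackerSketch other) {
        nodes.clear();
        other.nodes.forEach((base, bits) -> nodes.put(base, (BitSet) bits.clone()));
    }

    boolean contains(long procId) {
        long base = (procId / 64) * 64;
        BitSet bits = nodes.get(base);
        return bits != null && bits.get((int) (procId - base));
    }
}
```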
[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632739#comment-16632739 ] Hadoop QA commented on HBASE-21213: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} branch-2.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 45s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 25s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 48s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 45s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 11s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} branch-2.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 17m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 6s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 17s{color} | {color:red} hbase-server: The patch generated 1 new + 326 unchanged - 1 fixed = 327 total (was 327) {color} | | {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 7s{color} | {color:red} The patch generated 3 new + 20 unchanged - 2 fixed = 23 total (was 22) {color} | | {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange} 0m 3s{color} | {color:orange} The patch generated 1 new + 41 unchanged - 0 fixed = 42 total (was 41) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s{color} | {color:red} hbase-procedure generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 59s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 56s{color} | {color:green} hbase-procedure
[jira] [Created] (HBASE-21254) Need to find a way to limit the number of proc wal files
Duo Zhang created HBASE-21254: - Summary: Need to find a way to limit the number of proc wal files Key: HBASE-21254 URL: https://issues.apache.org/jira/browse/HBASE-21254 Project: HBase Issue Type: Sub-task Components: proc-v2 Reporter: Duo Zhang Fix For: 3.0.0, 2.2.0 For the regionserver, we have a max wal file limit: if we reach it, we trigger a flush on specific regions so that we can delete old wal files. But for proc wals we do not have this mechanism, and it will be worse after HBASE-21233: if there is an old procedure which cannot make progress and does not persist its state, we need to keep the old proc wal file forever... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
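The mechanism this issue asks for — capping live proc wal files the way regionserver WALs are capped — might look like the sketch below: on each roll, the oldest files over the limit are returned so the laggard procedures holding them can be forced to re-persist their state. Entirely illustrative; the eventual HBase solution may differ:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a proc-wal cap by analogy with the regionserver max-wal check.
// All names are hypothetical.
public class ProcWalRoller {
    private final int maxWalFiles;
    private final Deque<Long> liveWalIds = new ArrayDeque<>();

    ProcWalRoller(int maxWalFiles) { this.maxWalFiles = maxWalFiles; }

    // Called on each wal roll. Returns the wal ids (oldest first) that must
    // be retired -- i.e. whose procedures should be forced to persist their
    // state so the files become deletable -- to get back under the limit.
    Deque<Long> rollWal(long newWalId) {
        liveWalIds.addLast(newWalId);
        Deque<Long> mustRetire = new ArrayDeque<>();
        while (liveWalIds.size() > maxWalFiles) {
            mustRetire.addLast(liveWalIds.removeFirst());
        }
        return mustRetire;
    }

    int liveCount() { return liveWalIds.size(); }
}
```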
[jira] [Updated] (HBASE-21245) Add exponential backoff when retrying for sync replication related procedures
[ https://issues.apache.org/jira/browse/HBASE-21245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21245: --- Status: Patch Available (was: Open) > Add exponential backoff when retrying for sync replication related procedures > - > > Key: HBASE-21245 > URL: https://issues.apache.org/jira/browse/HBASE-21245 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21245.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21245) Add exponential backoff when retrying for sync replication related procedures
[ https://issues.apache.org/jira/browse/HBASE-21245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21245: --- Attachment: HBASE-21245.master.001.patch > Add exponential backoff when retrying for sync replication related procedures > - > > Key: HBASE-21245 > URL: https://issues.apache.org/jira/browse/HBASE-21245 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21245.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21248) Implement exponential backoff when retrying for ModifyPeerProcedure
[ https://issues.apache.org/jira/browse/HBASE-21248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21248: -- Attachment: HBASE-21248-v2.patch > Implement exponential backoff when retrying for ModifyPeerProcedure > --- > > Key: HBASE-21248 > URL: https://issues.apache.org/jira/browse/HBASE-21248 > Project: HBase > Issue Type: Bug > Components: proc-v2, Replication >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1 > > Attachments: HBASE-21248-v1.patch, HBASE-21248-v2.patch, > HBASE-21248.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
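The backoff these retry patches add can be sketched as a capped exponential delay with optional jitter: each failed attempt doubles the wait, up to a maximum, so a stuck procedure stops hammering the scheduler. The base/cap constants and method names below are illustrative, not the values used in the actual patches:

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of capped exponential backoff for procedure retries.
// Constants are assumptions for illustration.
public class RetryBackoff {
    static final long BASE_DELAY_MS = 1_000L;
    static final long MAX_DELAY_MS = 60_000L;

    // attempt is 0-based: 0 -> 1s, 1 -> 2s, 2 -> 4s, ... capped at 60s.
    static long backoffMs(int attempt) {
        long delay = BASE_DELAY_MS << Math.min(attempt, 30); // bound the shift
        return Math.min(delay, MAX_DELAY_MS);
    }

    // Jittered variant in [d/2, d], so many procedures retrying after the
    // same failure do not wake up in lock-step.
    static long jitteredBackoffMs(int attempt) {
        long d = backoffMs(attempt);
        return d / 2 + ThreadLocalRandom.current().nextLong(d / 2 + 1);
    }
}
```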
[jira] [Commented] (HBASE-21248) Implement exponential backoff when retrying for ModifyPeerProcedure
[ https://issues.apache.org/jira/browse/HBASE-21248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632703#comment-16632703 ] Duo Zhang commented on HBASE-21248: --- Seems I pushed the patch to master by accident... Reverted, let me try the pre commit again. > Implement exponential backoff when retrying for ModifyPeerProcedure > --- > > Key: HBASE-21248 > URL: https://issues.apache.org/jira/browse/HBASE-21248 > Project: HBase > Issue Type: Bug > Components: proc-v2, Replication >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1 > > Attachments: HBASE-21248-v1.patch, HBASE-21248.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632692#comment-16632692 ] stack commented on HBASE-21213: --- .009 Fixes the TestShell failure and adds an EXPENSIVE recursive call as discussed above. Seems like I need it in my tests on this big cluster. Need to figure out the why but in meantime, it should let me fixup extremes. > [hbck2] bypass leaves behind state in RegionStates when assign/unassign > --- > > Key: HBASE-21213 > URL: https://issues.apache.org/jira/browse/HBASE-21213 > Project: HBase > Issue Type: Bug > Components: amv2, hbck2 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.1.1 > > Attachments: HBASE-21213.branch-2.1.001.patch, > HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, > HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, > HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, > HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch, > HBASE-21213.branch-2.1.009.patch > > > This is a follow-on from HBASE-21083 which added the 'bypass' functionality. > On bypass, there is more state to be cleared if we are allow new Procedures > to be scheduled. > For example, here is a bypass: > {code} > 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: > pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, > bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null > to finish it > 2018-09-20 05:45:44,022 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, > state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec > {code} > ... 
but then when I try to assign the bypassed region later, I get this: > {code} > 2018-09-20 05:46:31,435 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is > already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, > location=ve1233.halxg.cloudera.com,22101,1537397961664 > 2018-09-20 05:46:31,510 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, > state=ROLLEDBACK, > exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via > AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: > There is already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > exec-time=473msec > {code} > ... which is a long-winded way of saying the Unassign Procedure still exists > in RegionStateNodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21067) Backport HBASE-17519 (Rollback the removed cells) to branch-1.3
[ https://issues.apache.org/jira/browse/HBASE-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632690#comment-16632690 ] Hudson commented on HBASE-21067: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #487 (See [https://builds.apache.org/job/HBase-1.3-IT/487/]) HBASE-21067 Backport HBASE-17519 (Rollback the removed cells) (apurtell: rev 171f8f066ec072475ae4454e9b3f5d545cee73a3) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRollbackFromClient.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3 > --- > > Key: HBASE-21067 > URL: https://issues.apache.org/jira/browse/HBASE-21067 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 1.3.3 > > Attachments: HBASE-21067.branch-1.3.001.patch, > HBASE-21067.branch-1.3.002.patch > > > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3, which > handles rollback of append/increment completely in case of failure -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-17519) Rollback the removed cells
[ https://issues.apache.org/jira/browse/HBASE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632691#comment-16632691 ] Hudson commented on HBASE-17519: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #487 (See [https://builds.apache.org/job/HBase-1.3-IT/487/]) HBASE-21067 Backport HBASE-17519 (Rollback the removed cells) (apurtell: rev 171f8f066ec072475ae4454e9b3f5d545cee73a3) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRollbackFromClient.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultMemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java > Rollback the removed cells > -- > > Key: HBASE-17519 > URL: https://issues.apache.org/jira/browse/HBASE-17519 > Project: HBase > Issue Type: Bug >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > Fix For: 1.4.0 > > Attachments: HBASE-17519.branch-1.v0.patch, > HBASE-17519.branch-1.v1.patch, HBASE-17519.branch-1.v1.patch, > HBASE-17519.branch-1.v2.patch, HBASE-17519.branch-1.v2.patch, > HBASE-17519.branch-1.v2.patch > > > The Store#upsert removes the old cells but we don’t rollback the removed > cells when failing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
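The rollback the issue asks for can be sketched in isolation: remember which cells an upsert removed, and put them back if the write fails. This is a hypothetical, simplified standalone sketch (plain strings in a TreeSet rather than the actual HBase MemStore/CellSet types), not the patch itself:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class UpsertRollbackSketch {
    // Hypothetical stand-in for the memstore's sorted cell set.
    final TreeSet<String> cells = new TreeSet<>();

    // Upsert that records what it removed so a failure can be undone.
    void upsert(String newCell, String oldCell, boolean simulateFailure) {
        List<String> removed = new ArrayList<>();
        if (cells.remove(oldCell)) {
            removed.add(oldCell);
        }
        try {
            if (simulateFailure) {
                throw new RuntimeException("write failed");
            }
            cells.add(newCell);
        } catch (RuntimeException e) {
            // Rollback: restore the removed cells so readers still see them.
            cells.addAll(removed);
            throw e;
        }
    }
}
```

The key design point is that removal and insertion are treated as one atomic step from the reader's perspective: nothing that was visible before a failed upsert may disappear because of it.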
[jira] [Updated] (HBASE-21067) Backport HBASE-17519 (Rollback the removed cells) to branch-1.3
[ https://issues.apache.org/jira/browse/HBASE-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21067: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Memstore, store, and append/increment tests pass. As did the test flagged in the precommit. Committing to branch-1.3. > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3 > --- > > Key: HBASE-21067 > URL: https://issues.apache.org/jira/browse/HBASE-21067 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 1.3.3 > > Attachments: HBASE-21067.branch-1.3.001.patch, > HBASE-21067.branch-1.3.002.patch > > > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3, which > handles rollback of append/increment completely in case of failure -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21194) Add TestCopyTable which exercises MOB feature
[ https://issues.apache.org/jira/browse/HBASE-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21194: --- Labels: mob (was: ) > Add TestCopyTable which exercises MOB feature > - > > Key: HBASE-21194 > URL: https://issues.apache.org/jira/browse/HBASE-21194 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > Labels: mob > > Currently TestCopyTable doesn't cover table(s) with MOB feature enabled. > We should add variant that enables MOB on the table being copied and verify > that MOB content is copied correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers
[ https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632670#comment-16632670 ] Hadoop QA commented on HBASE-21247: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 2s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for instructions. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 59s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 25s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 30s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 35s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 38s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}195m 31s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}251m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21247 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941726/21247.v4.txt | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux f3d0aa6d488f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 6bc7089f9e | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14534/testReport/ | | Max. process+thread count | 4275 (vs. ulimit of 1) | |
[jira] [Commented] (HBASE-21067) Backport HBASE-17519 (Rollback the removed cells) to branch-1.3
[ https://issues.apache.org/jira/browse/HBASE-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632663#comment-16632663 ] Andrew Purtell commented on HBASE-21067: Let me try to commit this after some local checks. > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3 > --- > > Key: HBASE-21067 > URL: https://issues.apache.org/jira/browse/HBASE-21067 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Major > Fix For: 1.3.3 > > Attachments: HBASE-21067.branch-1.3.001.patch, > HBASE-21067.branch-1.3.002.patch > > > Backport HBASE-17519 (Rollback the removed cells) to branch-1.3, which > handles rollback of append/increment completely in case of failure -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632662#comment-16632662 ] Andrew Purtell commented on HBASE-21196: Sure thing Ted! > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). This may impact the > performance of the system severely as all new ops directed to that server > will have to go to zk first to get the meta table address and then get the > location of the table region as it will become empty after every > htablemultiplexer put. > From the logs below, one can see after every other put the cached region > locations are cleared. As a side effect of this, before every put the server > needs to contact zk and get meta table location and read meta to get region > locations of the table. 
> {noformat} > 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): > Removed all cached region locations that map to > root1-thinkpad-t440p,35811,1536857446588 > 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] > client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for > root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] > ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" > request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 executing as root1 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 param: region= > testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., > row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { > associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 > totalTime: 0 > 2018-09-13 22:21:15,516 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, > count=0, allocations=1 > 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, > callTime: 2ms > 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan > table=hbase:meta, > startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99 > 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): > Advancing internal small scanner to startKey at > 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99' > 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up > 
meta region location in ZK, > connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f > {noformat} > From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see > that the string "Removed all cached region locations that map" and "Looking > up meta region location in ZK" are present for every put. > *Analysis:* > The problem occurs because the {{cleanServerCache}} method always clears > the server cache when tablename is null and the exception is null. See > [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918] > {code:java} > private void cleanServerCache(ServerName server, Throwable regionException) { > if (tableName == null && > ClientExceptionsUtil.isMetaClearingException(regionException)) { > // For multi-actions, we don't have a table name, but we want to make > sure to clear the > // cache in case there were location-related exceptions. We don't want to > clear the cache > // for every possible exception that comes
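The analysis above boils down to the cache being cleared even when no exception occurred at all, because a null throwable slips past the meta-clearing check. A minimal standalone sketch of the guarded behavior the fix aims for (hypothetical simplified types, not the real AsyncRequestFutureImpl/MetaCache classes):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetaCacheSketch {
    // Hypothetical stand-in for the per-server region location cache.
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Simplified isMetaClearingException: the essential guard is that a
    // null throwable (i.e. a successful call) must NOT count as meta-clearing.
    static boolean isMetaClearingException(Throwable t) {
        return t != null;
    }

    static void cleanServerCache(String server, Throwable regionException) {
        // Clear only when a location-related exception actually occurred,
        // instead of unconditionally on every multi-action response.
        if (isMetaClearingException(regionException)) {
            cache.remove(server);
        }
    }
}
```

With this guard, a successful HTableMultiplexer put (null exception) leaves the cached region locations intact, avoiding the ZK-plus-meta lookup before every subsequent operation.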
[jira] [Updated] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21196: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.3 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Committed to master and all of the branch-2s > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). This may impact the > performance of the system severely as all new ops directed to that server > will have to go to zk first to get the meta table address and then get the > location of the table region as it will become empty after every > htablemultiplexer put. > From the logs below, one can see after every other put the cached region > locations are cleared. As a side effect of this, before every put the server > needs to contact zk and get meta table location and read meta to get region > locations of the table. 
> {noformat} > 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): > Removed all cached region locations that map to > root1-thinkpad-t440p,35811,1536857446588 > 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] > client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for > root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] > ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" > request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 executing as root1 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 param: region= > testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., > row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { > associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 > totalTime: 0 > 2018-09-13 22:21:15,516 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, > count=0, allocations=1 > 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, > callTime: 2ms > 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan > table=hbase:meta, > startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99 > 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): > Advancing internal small scanner to startKey at > 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99' > 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up > 
meta region location in ZK, > connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f > {noformat} > From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see > that the string "Removed all cached region locations that map" and "Looking > up meta region location in ZK" are present for every put. > *Analysis:* > The problem occurs because the {{cleanServerCache}} method always clears > the server cache when tablename is null and the exception is null. See > [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918] > {code:java} > private void cleanServerCache(ServerName server, Throwable regionException) { > if (tableName == null && > ClientExceptionsUtil.isMetaClearingException(regionException)) { > // For multi-actions, we don't have a table name, but we want to make >
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632658#comment-16632658 ] Ted Yu commented on HBASE-21196: Thanks for the commit, Andrew. > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). This may impact the > performance of the system severely as all new ops directed to that server > will have to go to zk first to get the meta table address and then get the > location of the table region as it will become empty after every > htablemultiplexer put. > From the logs below, one can see after every other put the cached region > locations are cleared. As a side effect of this, before every put the server > needs to contact zk and get meta table location and read meta to get region > locations of the table. 
> {noformat} > 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): > Removed all cached region locations that map to > root1-thinkpad-t440p,35811,1536857446588 > 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] > client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for > root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] > ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" > request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 executing as root1 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 param: region= > testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., > row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { > associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 > totalTime: 0 > 2018-09-13 22:21:15,516 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, > count=0, allocations=1 > 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, > callTime: 2ms > 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan > table=hbase:meta, > startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99 > 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): > Advancing internal small scanner to startKey at > 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99' > 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up > 
meta region location in ZK, > connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f > {noformat} > From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see > that the string "Removed all cached region locations that map" and "Looking > up meta region location in ZK" are present for every put. > *Analysis:* > The problem occurs because the {{cleanServerCache}} method always clears > the server cache when tablename is null and the exception is null. See > [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918] > {code:java} > private void cleanServerCache(ServerName server, Throwable regionException) { > if (tableName == null && > ClientExceptionsUtil.isMetaClearingException(regionException)) { > // For multi-actions, we don't have a table name, but we want to make > sure to clear the > // cache in case there were location-related exceptions. We don't want to > clear the cache > // for every possible exception that comes through, however. >
[jira] [Commented] (HBASE-21249) Add jitter for ProcedureUtil.getBackoffTimeMs
[ https://issues.apache.org/jira/browse/HBASE-21249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632650#comment-16632650 ] Hudson commented on HBASE-21249: Results for branch branch-2.0 [build #878 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/878/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/878//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/878//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/878//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Add jitter for ProcedureUtil.getBackoffTimeMs > - > > Key: HBASE-21249 > URL: https://issues.apache.org/jira/browse/HBASE-21249 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Duo Zhang >Assignee: Yi Mei >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21249.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
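One common way to add jitter to an exponential backoff like {{ProcedureUtil.getBackoffTimeMs}} is to add a small random fraction on top of the capped exponential term, so many procedures retrying after the same failure do not all wake at the same instant. An illustrative sketch only; the base, cap, and 1% jitter fraction are assumptions, not the values from the patch:

```java
import java.util.concurrent.ThreadLocalRandom;

public class BackoffSketch {
    // Illustrative exponential backoff with jitter; all constants are
    // assumed for the sketch, not taken from ProcedureUtil.
    static long getBackoffTimeMs(int attempts) {
        long base = 1000L;            // 1s starting backoff (assumed)
        long cap = 10 * 60 * 1000L;   // 10 min ceiling (assumed)
        // Cap the shift so the exponential term cannot overflow a long.
        long backoff = Math.min(cap, base * (1L << Math.min(attempts, 30)));
        // Up to 1% random jitter spreads out simultaneous retries.
        long jitter = (long) (backoff * 0.01 * ThreadLocalRandom.current().nextDouble());
        return backoff + jitter;
    }
}
```

The jitter term is what the issue title asks for: without it, every procedure that failed at the same moment computes an identical backoff and retries in lockstep.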
[jira] [Commented] (HBASE-21196) HTableMultiplexer clears the meta cache after every put operation
[ https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632647#comment-16632647 ] Andrew Purtell commented on HBASE-21196: Hmm, this looks like it was dropped on the floor. Let me try to commit. > HTableMultiplexer clears the meta cache after every put operation > - > > Key: HBASE-21196 > URL: https://issues.apache.org/jira/browse/HBASE-21196 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 3.0.0, 1.3.3, 2.2.0 >Reporter: Nihal Jain >Assignee: Nihal Jain >Priority: Critical > Fix For: 3.0.0 > > Attachments: HBASE-21196.master.001.patch, > HBASE-21196.master.001.patch, HBASE-21196.master.002.patch, > HTableMultiplexer1000Puts.UT.txt > > > *Problem:* Operations which use > {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction, ServerName, > MultiResponse, int)}} API with tablename set to null reset the meta cache of > the corresponding server after each call. One such operation is put operation > of HTableMultiplexer (Might not be the only one). This may impact the > performance of the system severely as all new ops directed to that server > will have to go to zk first to get the meta table address and then get the > location of the table region as it will become empty after every > htablemultiplexer put. > From the logs below, one can see after every other put the cached region > locations are cleared. As a side effect of this, before every put the server > needs to contact zk and get meta table location and read meta to get region > locations of the table. 
> {noformat} > 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): > Removed all cached region locations that map to > root1-thinkpad-t440p,35811,1536857446588 > 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5] > client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for > root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811] > ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get" > request_param: true priority: 0 timeout: 6 totalRequestSize: 137 bytes > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 executing as root1 > 2018-09-13 22:21:15,515 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size: > 137 connection: 127.0.0.1:42338 param: region= > testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f., > row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result { > associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0 > totalTime: 0 > 2018-09-13 22:21:15,516 TRACE > [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] > io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, > count=0, allocations=1 > 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get, > callTime: 2ms > 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan > table=hbase:meta, > startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99 > 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179): > Advancing internal small scanner to startKey at > 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99' > 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up > 
meta region location in ZK, > connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f > {noformat} > From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see > that the string "Removed all cached region locations that map" and "Looking > up meta region location in ZK" are present for every put. > *Analysis:* > The problem occurs because the {{cleanServerCache}} method always clears > the server cache when the table name is null and the exception is null. See > [AsyncRequestFutureImpl.java#L918|https://github.com/apache/hbase/blob/5d14c1af65c02f4e87059337c35e4431505de91c/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L918] > {code:java} > private void cleanServerCache(ServerName server, Throwable regionException) { > if (tableName == null && > ClientExceptionsUtil.isMetaClearingException(regionException)) { > // For multi-actions, we don't have a table name, but we want to make > sure to clear the > // cache in case there were location-related exceptions. We don't want to > clear the cache > // for every
[jira] [Commented] (HBASE-21033) Separate the StoreHeap from StoreFileHeap
[ https://issues.apache.org/jira/browse/HBASE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632645#comment-16632645 ] Andrew Purtell commented on HBASE-21033: What do the master / branch-2 changes look like? The same thing, basically? > Separate the StoreHeap from StoreFileHeap > - > > Key: HBASE-21033 > URL: https://issues.apache.org/jira/browse/HBASE-21033 > Project: HBase > Issue Type: Improvement >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Minor > Attachments: 21033-branch-1-minimal.txt, 21033-branch-1-v0.txt > > > Currently KeyValueHeap is used both for heaps of StoreScanners at the Region > level and for heaps of StoreFileScanners (and a MemstoreScanner) at the > Store level. > This causes various problems: > # Some incorrect method usage can only be detected at runtime via a runtime > exception. > # In profiling sessions it's hard to distinguish the two. > # It's just not clean :) > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
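The separation described above can be sketched as follows. These are simplified stand-in types, not the actual KeyValueHeap/StoreScanner classes, but they show how giving each use site its own heap type turns the misuse in point 1 into a compile-time error and addresses point 2 by making the two heaps distinguishable by class name in profiler output.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class HeapSplitSketch {
    // Stand-ins for the two scanner kinds that today share one KeyValueHeap.
    interface Scanner { long peekSequenceId(); }
    interface StoreScanner extends Scanner {}      // region-level component
    interface StoreFileScanner extends Scanner {}  // store-level component

    // Shared heap machinery stays in one generic base class...
    static class ScannerHeap<S extends Scanner> {
        private final PriorityQueue<S> heap =
            new PriorityQueue<>(Comparator.<Scanner>comparingLong(Scanner::peekSequenceId));
        void add(S scanner) { heap.add(scanner); }
        S current() { return heap.peek(); }
    }

    // ...while each use site gets its own concrete, non-interchangeable type.
    // Passing a StoreFileScanner to a RegionHeap no longer compiles.
    static class RegionHeap extends ScannerHeap<StoreScanner> {}
    static class StoreHeap extends ScannerHeap<StoreFileScanner> {}

    public static void main(String[] args) {
        StoreHeap storeHeap = new StoreHeap();
        storeHeap.add(() -> 42L);  // a StoreFileScanner; a StoreScanner would be rejected
        System.out.println(storeHeap.current().peekSequenceId()); // prints 42
    }
}
```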
[jira] [Updated] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20857: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.1.1 1.4.8 2.2.0 1.5.0 3.0.0 Status: Resolved (was: Patch Available) Pushed to master, branch-2, branch-2.1, branch-1, and branch-1.4. Thanks for the contribution [~kiran.maturi] > JMX - add Balancer status = enabled / disabled > -- > > Key: HBASE-20857 > URL: https://issues.apache.org/jira/browse/HBASE-20857 > Project: HBase > Issue Type: Improvement > Components: API, master, metrics, REST, tooling, Usability >Reporter: Hari Sekhon >Assignee: Kiran Kumar Maturi >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-20857.branch-1.4.001.patch, > HBASE-20857.branch-1.4.002.patch > > > Add HBase Balancer enabled/disabled status to JMX API on HMaster. > Right now the HMaster will give a warning near the top of the HMaster UI if > the balancer is disabled, but scraping this for monitoring integration is not > nice; it should be available in the JMX API, as there is already a > Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
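The shape of the change requested here can be sketched with a plain standard MBean. The bean and attribute names below are illustrative stand-ins, not the actual Master,sub=Balancer bean the patch extends.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class BalancerStatusSketch {
    // Standard-MBean naming convention: interface name = class name + "MBean".
    public interface BalancerStatusMBean {
        boolean isBalancerEnabled();
    }

    public static class BalancerStatus implements BalancerStatusMBean {
        private volatile boolean enabled = true;
        public boolean isBalancerEnabled() { return enabled; }
        public void setEnabled(boolean e) { this.enabled = e; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical ObjectName for illustration only.
        ObjectName name = new ObjectName("Hypothetical:type=Master,sub=Balancer");
        server.registerMBean(new BalancerStatus(), name);

        // A monitoring agent can now read the attribute over JMX instead of
        // scraping the HMaster UI for the warning text.
        Object value = server.getAttribute(name, "BalancerEnabled");
        System.out.println("BalancerEnabled=" + value);
    }
}
```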
[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632633#comment-16632633 ] Hadoop QA commented on HBASE-21242: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 14s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 15s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 8s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} branch-2.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 30s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 31s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 55s{color} | {color:green} hbase-client in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 2s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 4s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}181m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 | | JIRA Issue | HBASE-21242 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941729/HBASE-21242.branch-2.1.002.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 9dc1403a259f 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build
[jira] [Commented] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632628#comment-16632628 ] Andrew Purtell commented on HBASE-20857: Hmm, looks easy, proceeding > JMX - add Balancer status = enabled / disabled > -- > > Key: HBASE-20857 > URL: https://issues.apache.org/jira/browse/HBASE-20857 > Project: HBase > Issue Type: Improvement > Components: API, master, metrics, REST, tooling, Usability >Reporter: Hari Sekhon >Assignee: Kiran Kumar Maturi >Priority: Major > Attachments: HBASE-20857.branch-1.4.001.patch, > HBASE-20857.branch-1.4.002.patch > > > Add HBase Balancer enabled/disabled status to JMX API on HMaster. > Right now the HMaster will give a warning near the top of the HMaster UI if > the balancer is disabled, but scraping this for monitoring integration is not > nice; it should be available in the JMX API, as there is already a > Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632626#comment-16632626 ] Andrew Purtell commented on HBASE-20857: This looks fine to me, and will apply to branch-1.4 and branch-1, but what about more recent versions? Let me see how hard it is to apply to branch-2 and up. > JMX - add Balancer status = enabled / disabled > -- > > Key: HBASE-20857 > URL: https://issues.apache.org/jira/browse/HBASE-20857 > Project: HBase > Issue Type: Improvement > Components: API, master, metrics, REST, tooling, Usability >Reporter: Hari Sekhon >Assignee: Kiran Kumar Maturi >Priority: Major > Attachments: HBASE-20857.branch-1.4.001.patch, > HBASE-20857.branch-1.4.002.patch > > > Add HBase Balancer enabled/disabled status to JMX API on HMaster. > Right now the HMaster will give a warning near the top of the HMaster UI if > the balancer is disabled, but scraping this for monitoring integration is not > nice; it should be available in the JMX API, as there is already a > Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21249) Add jitter for ProcedureUtil.getBackoffTimeMs
[ https://issues.apache.org/jira/browse/HBASE-21249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632622#comment-16632622 ] Hudson commented on HBASE-21249: Results for branch branch-2.1 [build #391 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/391/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/391//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/391//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/391//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Add jitter for ProcedureUtil.getBackoffTimeMs > - > > Key: HBASE-21249 > URL: https://issues.apache.org/jira/browse/HBASE-21249 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Duo Zhang >Assignee: Yi Mei >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21249.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
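The jitter idea in this sub-task can be sketched as follows. This is a hand-rolled stand-in, not the actual ProcedureUtil.getBackoffTimeMs implementation: a small random component is added on top of the exponential backoff so that procedures which fail together do not all retry at the same instant.

```java
import java.util.Random;

public class BackoffSketch {
    // Exponential backoff capped at maxMs, plus up to 10% random jitter.
    // The jitter spreads retries out, avoiding a thundering herd of
    // procedures all waking at the same moment.
    static long getBackoffTimeMs(int attempt, long baseMs, long maxMs, Random rng) {
        long backoff = Math.min(maxMs, baseMs << Math.min(attempt, 30));
        long jitter = (long) (backoff * 0.1 * rng.nextDouble());
        return backoff + jitter;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println("attempt " + attempt + " -> "
                + getBackoffTimeMs(attempt, 1000, 60_000, rng) + " ms");
        }
    }
}
```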
[jira] [Commented] (HBASE-21244) Skip persistence when retrying for assignment related procedures
[ https://issues.apache.org/jira/browse/HBASE-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632612#comment-16632612 ] Hudson commented on HBASE-21244: Results for branch branch-2 [build #1316 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Skip persistence when retrying for assignment related procedures > > > Key: HBASE-21244 > URL: https://issues.apache.org/jira/browse/HBASE-21244 > Project: HBase > Issue Type: Sub-task > Components: amv2, Performance, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21244.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21249) Add jitter for ProcedureUtil.getBackoffTimeMs
[ https://issues.apache.org/jira/browse/HBASE-21249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632613#comment-16632613 ] Hudson commented on HBASE-21249: Results for branch branch-2 [build #1316 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1316//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Add jitter for ProcedureUtil.getBackoffTimeMs > - > > Key: HBASE-21249 > URL: https://issues.apache.org/jira/browse/HBASE-21249 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Duo Zhang >Assignee: Yi Mei >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21249.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21207: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.8 2.2.0 1.5.0 3.0.0 Status: Resolved (was: Patch Available) Pushed to master, branch-2, branch-2.1, branch-1, and branch-1.4 > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8 > > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, > HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, > HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, > edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In Master UI, we can see region server details like requests per seconds and > number of regions etc. Similarly, for tables also we can see online regions , > offline regions. > It will help ops people in determining hot spotting if we can provide sort > functionality in the UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21117) Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)
[ https://issues.apache.org/jira/browse/HBASE-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632602#comment-16632602 ] Andrew Purtell commented on HBASE-21117: Applying this to branch-1 and testing afterward with {{mvn clean install -Dtest=TestRSGroups}}, I get the below failure. If I go back one revision, to the current head of the branch, and test again, all units pass. [ERROR] Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 227.777 s <<< FAILURE! - in org.apache.hadoop.hbase.rsgroup.TestRSGroups [ERROR] testNamespaceConstraint(org.apache.hadoop.hbase.rsgroup.TestRSGroups) Time elapsed: 2.404 s <<< FAILURE! java.lang.AssertionError at org.apache.hadoop.hbase.rsgroup.TestRSGroups.testNamespaceConstraint(TestRSGroups.java:254) Same with branch-1.4 > Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing > table locking issue.) > -- > > Key: HBASE-21117 > URL: https://issues.apache.org/jira/browse/HBASE-21117 > Project: HBase > Issue Type: Bug > Components: backport, rsgroup, shell >Affects Versions: 1.3.2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Labels: backport > Attachments: HBASE-21117-branch-1.001.patch, > HBASE-21117-branch-1.002.patch > > > When working on HBASE-20666, I found out HBASE-18350 did not get ported to > branch-1, which sometimes causes a procedure to hang when #moveTables is called. > After looking into the 18350 patch, it seems important since it fixes 4 > issues. This Jira is an attempt to backport it to branch-1. > > > Edited: Aug 26. > After reviewing the HBASE-18350 patch, I decided to only port part 2 of the > patch, because part 1 and part 3 are AMv2-related. I won't touch those since AMv2 is only > in branch-2 > > {quote} > Subject: [PATCH] HBASE-18350 RSGroups are broken under AMv2 > - Table moving to RSG was buggy, because it left the table unassigned. > Now it is fixed we immediately assign to an appropriate RS > (MoveRegionProcedure).
> *- Table was locked while moving, but unassign operation hung, because* > *locked table queues are not scheduled while locked. Fixed. port > this one.* > - ProcedureSyncWait was buggy, because it searched the procId in > executor, but executor does not store the return values of internal > operations (they are stored, but immediately removed by the cleaner). > - list_rsgroups in the shell show also the assigned tables and servers. > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21117) Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)
[ https://issues.apache.org/jira/browse/HBASE-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21117: --- Status: Open (was: Patch Available) > Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing > table locking issue.) > -- > > Key: HBASE-21117 > URL: https://issues.apache.org/jira/browse/HBASE-21117 > Project: HBase > Issue Type: Bug > Components: backport, rsgroup, shell >Affects Versions: 1.3.2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Labels: backport > Attachments: HBASE-21117-branch-1.001.patch, > HBASE-21117-branch-1.002.patch > > > When working on HBASE-20666, I found out HBASE-18350 did not get ported to > branch-1, which sometimes causes a procedure to hang when #moveTables is called. > After looking into the 18350 patch, it seems important since it fixes 4 > issues. This Jira is an attempt to backport it to branch-1. > > > Edited: Aug 26. > After reviewing the HBASE-18350 patch, I decided to only port part 2 of the > patch, because part 1 and part 3 are AMv2-related. I won't touch those since AMv2 is only > in branch-2 > > {quote} > Subject: [PATCH] HBASE-18350 RSGroups are broken under AMv2 > - Table moving to RSG was buggy, because it left the table unassigned. > Now it is fixed we immediately assign to an appropriate RS > (MoveRegionProcedure). > *- Table was locked while moving, but unassign operation hung, because* > *locked table queues are not scheduled while locked. Fixed. port > this one.* > - ProcedureSyncWait was buggy, because it searched the procId in > executor, but executor does not store the return values of internal > operations (they are stored, but immediately removed by the cleaner). > - list_rsgroups in the shell show also the assigned tables and servers. > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632600#comment-16632600 ] Hudson commented on HBASE-19418: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #486 (See [https://builds.apache.org/job/HBase-1.3-IT/486/]) HBASE-19418 configurable range of delay in PeriodicMemstoreFlusher (apurtell: rev f91a912474551d136cfe8328b572a55b9fcdba3b) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
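The configurability change described here can be sketched as below. The property key is a hypothetical stand-in, not necessarily the name the patch uses; the point is that the delay range is read from configuration instead of remaining a hard-coded constant.

```java
import java.util.Properties;
import java.util.Random;

public class PeriodicFlushDelaySketch {
    // Hypothetical property key for illustration only; the real key is
    // whatever HBASE-19418 defines in HRegionServer.
    static final String RANGE_OF_DELAY_KEY =
        "hbase.regionserver.periodicflush.range.ms";
    // The old hard-coded range: 5 minutes.
    static final long DEFAULT_RANGE_OF_DELAY_MS = 5 * 60 * 1000L;

    // Pick a random delay in [0, range) so that periodic flush requests for
    // many regions spread out instead of landing at once. Operators with a
    // LOT of regions and CFs can raise the range via configuration.
    static long randomFlushDelayMs(Properties conf, Random rng) {
        long range = Long.parseLong(conf.getProperty(
            RANGE_OF_DELAY_KEY, Long.toString(DEFAULT_RANGE_OF_DELAY_MS)));
        return (long) (rng.nextDouble() * range);
    }
}
```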
[jira] [Updated] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-19418: --- Priority: Minor (was: Major) > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Minor > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21213: -- Attachment: HBASE-21213.branch-2.1.009.patch > [hbck2] bypass leaves behind state in RegionStates when assign/unassign > --- > > Key: HBASE-21213 > URL: https://issues.apache.org/jira/browse/HBASE-21213 > Project: HBase > Issue Type: Bug > Components: amv2, hbck2 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.1.1 > > Attachments: HBASE-21213.branch-2.1.001.patch, > HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, > HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, > HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, > HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch, > HBASE-21213.branch-2.1.009.patch > > > This is a follow-on from HBASE-21083 which added the 'bypass' functionality. > On bypass, there is more state to be cleared if we are allow new Procedures > to be scheduled. > For example, here is a bypass: > {code} > 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: > pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, > bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null > to finish it > 2018-09-20 05:45:44,022 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, > state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, > region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec > {code} > ... 
but then when I try to assign the bypassed region later, I get this: > {code} > 2018-09-20 05:46:31,435 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is > already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, > location=ve1233.halxg.cloudera.com,22101,1537397961664 > 2018-09-20 05:46:31,510 INFO > org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, > state=ROLLEDBACK, > exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via > AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: > There is already another procedure running on this region this=pid=100450, > state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, > server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure > table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 > exec-time=473msec > {code} > ... which is a long-winded way of saying the UnassignProcedure still exists > in the RegionStateNodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
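The extra cleanup this issue asks for can be sketched as follows. These are simplified stand-in types, not the actual RegionStates/RegionStateNode code: when a procedure is bypassed, the per-region bookkeeping that records which procedure owns the region must be cleared too, otherwise a later AssignProcedure is rejected with "another procedure running on this region".

```java
import java.util.concurrent.atomic.AtomicLong;

public class BypassCleanupSketch {
    // Stand-in for the per-region state that remembers the owning procedure.
    static class RegionStateNode {
        static final long NO_PROC = -1L;
        private final AtomicLong owningProcId = new AtomicLong(NO_PROC);

        // A procedure claims the region; fails if another one already owns it.
        boolean setProcedure(long procId) {
            return owningProcId.compareAndSet(NO_PROC, procId);
        }

        // The step the bypass path was missing: release ownership so a new
        // assign/unassign procedure can be scheduled on this region.
        void unsetProcedure(long procId) {
            owningProcId.compareAndSet(procId, NO_PROC);
        }
    }

    public static void main(String[] args) {
        RegionStateNode node = new RegionStateNode();
        node.setProcedure(100449L);   // UnassignProcedure takes the region
        node.unsetProcedure(100449L); // bypass must also clear the owner...
        // ...so a later AssignProcedure can claim the region.
        System.out.println(node.setProcedure(100450L)); // prints true
    }
}
```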
[jira] [Resolved] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-19418. Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.1.1 1.4.8 2.2.0 1.3.3 1.5.0 3.0.0 Pushed up, thanks for the contribution [~ramatronics] > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.8, 2.1.1 > > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632574#comment-16632574 ] Hadoop QA commented on HBASE-21213: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} branch-2.1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 5s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 59s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 7s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 33s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 29s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 50s{color} | {color:green} branch-2.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} branch-2.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 17m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 19s{color} | {color:red} hbase-server: The patch generated 1 new + 326 unchanged - 1 fixed = 327 total (was 327) {color} | | {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 7s{color} | {color:red} The patch generated 3 new + 20 unchanged - 2 fixed = 23 total (was 22) {color} | | {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange} 0m 2s{color} | {color:orange} The patch generated 1 new + 41 unchanged - 0 fixed = 42 total (was 41) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 39s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 2m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 1s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 46s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 58s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632532#comment-16632532 ] Hadoop QA commented on HBASE-18451: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 28s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 24s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 52s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 32s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-18451 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941715/HBASE-18451.master.004.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 272ce9242570 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 6bc7089f9e | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14531/testReport/ | | Max. process+thread count | 5240 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14531/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. >
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18451: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.1.1, 1.4.8, 2.2.0, 1.5.0, 3.0.0 Status: Resolved (was: Patch Available)
> PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0-alpha-1
> Reporter: Jean-Marc Spaggiari
> Assignee: Xu Cang
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8, 2.1.1
>
> Attachments: HBASE-18451.branch-1.001.patch, HBASE-18451.branch-1.002.patch, HBASE-18451.branch-1.002.patch, HBASE-18451.master.002.patch, HBASE-18451.master.003.patch, HBASE-18451.master.004.patch, HBASE-18451.master.004.patch, HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), at the end all the regions might have some data to be flushed, and we want, after one hour, to trigger a periodic flush. That's totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush; that way we spread them out. RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.
> However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes within the next 5 minutes, you end up getting them all much more quickly, like within the first minute. This not only feeds the queue with too many flush requests, but also defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after
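The fix direction discussed in this issue — inspect the queue before adding another delayed flush request — can be sketched as follows. This is a standalone, hypothetical illustration (the names `PendingFlushTracker` and `maybeSchedule` are invented here, not the committed patch): the tracker remembers the deadline of an already-queued request per region and declines to schedule a second one, so the randomly chosen delay from the first pass survives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of deduplicating delayed flush requests. Constants
// mirror the quoted snippet; the class itself is illustrative only.
public class PendingFlushTracker {
  static final int MIN_DELAY_TIME = 3000;          // ms, as in the quoted code
  static final int RANGE_OF_DELAY = 5 * 60 * 1000; // 5 minutes, as in the quoted code

  // region name -> absolute deadline (ms) of the already-queued flush request
  private final Map<String, Long> pending = new ConcurrentHashMap<>();

  /** Returns the chosen delay if a new request should be queued, or -1 if one is pending. */
  public long maybeSchedule(String regionName, long nowMs) {
    Long deadline = pending.get(regionName);
    if (deadline != null && deadline > nowMs) {
      return -1; // a delayed flush is already queued; keep its original random delay
    }
    long delay = ThreadLocalRandom.current().nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    pending.put(regionName, nowMs + delay);
    return delay;
  }

  /** Called when the queued flush actually runs, so a later chore pass may reschedule. */
  public void onFlushed(String regionName) {
    pending.remove(regionName);
  }
}
```

With this shape, the chore can call `maybeSchedule` each pass and only issue `requestDelayedFlush` when it returns a non-negative delay, preserving the spread over the full 5-minute window.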
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632526#comment-16632526 ] Andrew Purtell commented on HBASE-18451: Done, pushed to master, branch-2, branch-2.1, branch-1, branch-1.4.
[jira] [Commented] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
[ https://issues.apache.org/jira/browse/HBASE-21216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632510#comment-16632510 ] Ted Yu commented on HBASE-21216: Looped the test 30 times locally with the patch; all runs passed.
> TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
> --
>
> Key: HBASE-21216
> URL: https://issues.apache.org/jira/browse/HBASE-21216
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Attachments: 21216.v1.txt
>
>
> From https://builds.apache.org/job/HBase-Flaky-Tests/job/branch-2/794/testReport/junit/org.apache.hadoop.hbase.master.cleaner/TestSnapshotFromMaster/testSnapshotHFileArchiving/ :
> {code}
> java.lang.AssertionError: Archived hfiles [] and table hfiles [9ca09392705f425f9c916beedc10d63c] is missing snapshot file:6739a09747e54189a4112a6d8f37e894
> at org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:370)
> {code}
> The file appeared in the archive dir before the hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,187 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/
> 2018-09-20 10:38:53,188 DEBUG [Time-limited test] util.CommonFSUtils(771): |data/
> 2018-09-20 10:38:53,189 DEBUG [Time-limited test] util.CommonFSUtils(771): |---default/
> 2018-09-20 10:38:53,190 DEBUG [Time-limited test] util.CommonFSUtils(771): |--test/
> 2018-09-20 10:38:53,191 DEBUG [Time-limited test] util.CommonFSUtils(771): |-1237d57b63a7bdf067a930441a02514a/
> 2018-09-20 10:38:53,192 DEBUG [Time-limited test] util.CommonFSUtils(771): |recovered.edits/
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(774): |---4.seqid
> 2018-09-20 10:38:53,193 DEBUG [Time-limited test] util.CommonFSUtils(771): |-29e1700e09b51223ad2f5811105a4d51/
> 2018-09-20 10:38:53,194 DEBUG [Time-limited test] util.CommonFSUtils(771): |fam/
> 2018-09-20 10:38:53,195 DEBUG [Time-limited test] util.CommonFSUtils(774): |---2c66a18f6c1a4074b84ffbb3245268c4
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---45bb396c6a5e49629e45a4d56f1e9b14
> 2018-09-20 10:38:53,196 DEBUG [Time-limited test] util.CommonFSUtils(774): |---6739a09747e54189a4112a6d8f37e894
> {code}
> However, the archive dir became empty after the hfile cleaners were run:
> {code}
> 2018-09-20 10:38:53,312 DEBUG [Time-limited test] util.CommonFSUtils(771): |-archive/
> 2018-09-20 10:38:53,313 DEBUG [Time-limited test] util.CommonFSUtils(771): |-corrupt/
> {code}
> Leading to the assertion failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
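The flakiness above is a timing race: the assertion samples the archive directory at one instant while archiving and cleaning may still be in flight. HBase's test utilities provide wait/retry helpers for exactly this; as a generic, standalone illustration of the pattern (a hypothetical helper, not the attached patch), the check can poll until the expected state appears or a timeout expires:

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper: retries a condition until it holds or the
// timeout elapses, instead of asserting on a single racy snapshot.
public class Wait {
  public static boolean until(BooleanSupplier condition, long timeoutMs, long intervalMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return condition.getAsBoolean(); // one last check at the deadline
  }
}
```

A test would then wrap the "archived hfiles contain the snapshot file" check in `Wait.until(...)` rather than asserting immediately after listing the directory once.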
[jira] [Assigned] (HBASE-21216) TestSnapshotFromMaster#testSnapshotHFileArchiving is flaky
[ https://issues.apache.org/jira/browse/HBASE-21216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-21216: -- Assignee: Ted Yu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21242: -- Attachment: HBASE-21242.branch-2.1.002.patch
> [amv2] Miscellaneous minor log and assign procedure create improvements
> ---
>
> Key: HBASE-21242
> URL: https://issues.apache.org/jira/browse/HBASE-21242
> Project: HBase
> Issue Type: Bug
> Components: amv2, Operability
> Reporter: stack
> Assignee: stack
> Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.002.patch
>
>
> Some minor fixups:
> {code}
> For RIT Duration, do better than print ms/seconds. Remove redundant UI column dedicated to duration when we log it in the status field too.
> Make bypass log at INFO level -- when DEBUG we can miss important fixup detail like why we failed.
> Make it so on complete of subprocedure, we note count of outstanding siblings so we have a clue how much further the parent has to go before it is done (helpful when hundreds of servers doing SCP).
> Have the SCP run the AP preflight check before creating an AP; saves creation of hundreds of thousands of APs during fixup of this big cluster of mine.
> Don't log tablename three times when reporting remote call failed.
> If lock is held already, note who has it. Also log after we get lock or if we have to wait rather than log on entrance though we may later have to wait (or we may have just picked up the lock).
> {code}
> Posting patch in a sec but let me try it on cluster too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21242: -- Attachment: HBASE-21242.branch-2.1.001.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632260#comment-16632260 ] stack commented on HBASE-21242: --- On test failures, they seem unrelated and pass locally. Will retry. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632258#comment-16632258 ] stack commented on HBASE-21242: --- Thanks for taking a look [~mdrob]. bq. Do we want to include the proc id in any of the new logging? It is there, no? A proc toString includes procId. Or perhaps you have a particular line in mind. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21247) Allow WAL Provider to be specified by configuration without explicit enum in Providers
[ https://issues.apache.org/jira/browse/HBASE-21247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-21247: --- Attachment: 21247.v4.txt
> Allow WAL Provider to be specified by configuration without explicit enum in Providers
> --
>
> Key: HBASE-21247
> URL: https://issues.apache.org/jira/browse/HBASE-21247
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21247.v1.txt, 21247.v2.txt, 21247.v3.txt, 21247.v4.txt
>
>
> Currently all the WAL Providers acceptable to HBase are specified in the Providers enum of WALFactory.
> This restricts the ability to supply additional WAL Providers by class name.
> This issue introduces additional config which allows a new WAL Provider to be specified through its class name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
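The improvement described above — naming a WAL Provider by fully-qualified class rather than only by an enum constant — follows a common alias-or-classname resolution pattern. Below is a hedged, standalone sketch of that pattern; the names (`ProviderResolver`, `WalProvider`, the `defaultProvider` alias) are invented for illustration and are not the actual WALFactory code:

```java
import java.util.Map;

// Hypothetical sketch: resolve a provider from a config value by first
// checking known aliases, then falling back to reflection on a class name.
public class ProviderResolver {
  public interface WalProvider {}
  public static class DefaultProvider implements WalProvider {}

  // Known short aliases (in HBase these correspond to the Providers enum).
  static final Map<String, Class<? extends WalProvider>> ALIASES =
      Map.of("defaultProvider", DefaultProvider.class);

  public static WalProvider resolve(String configValue) {
    try {
      Class<? extends WalProvider> clazz = ALIASES.get(configValue);
      if (clazz == null) {
        // Not a known alias: interpret the value as a fully-qualified class name.
        clazz = Class.forName(configValue).asSubclass(WalProvider.class);
      }
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Cannot instantiate WAL provider: " + configValue, e);
    }
  }
}
```

The design point is that the enum stays as the fast path for built-in providers, while the reflection fallback lets downstream code plug in a provider without patching the enum.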
[jira] [Commented] (HBASE-19418) RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable.
[ https://issues.apache.org/jira/browse/HBASE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632256#comment-16632256 ] Andrew Purtell commented on HBASE-19418: lgtm That test failure is not related. Let me see about committing this today. > RANGE_OF_DELAY in PeriodicMemstoreFlusher should be configurable. > - > > Key: HBASE-19418 > URL: https://issues.apache.org/jira/browse/HBASE-19418 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-alpha-4 >Reporter: Jean-Marc Spaggiari >Assignee: Ramie Raufdeen >Priority: Major > Attachments: HBASE-19418.master.000.patch > > > When RSs have a LOT of regions and CFs, flushing everything within 5 minutes > is not always doable. It might be interesting to be able to increase the > RANGE_OF_DELAY. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
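Making RANGE_OF_DELAY configurable, as this issue proposes, generally means reading the spread window from configuration with the current hard-coded 5 minutes as the default. A minimal sketch, using `java.util.Properties` as a stand-in for HBase's `Configuration`; the key name below is hypothetical, not necessarily the one that shipped:

```java
import java.util.Properties;

// Hypothetical sketch of a configurable flush-spread window.
public class FlushDelayConfig {
  static final String RANGE_OF_DELAY_KEY =
      "hbase.regionserver.periodicflush.rangeofdelay.ms"; // hypothetical key name
  static final int DEFAULT_RANGE_OF_DELAY = 5 * 60 * 1000; // current default: 5 minutes

  public static int rangeOfDelay(Properties conf) {
    String v = conf.getProperty(RANGE_OF_DELAY_KEY);
    if (v == null) {
      return DEFAULT_RANGE_OF_DELAY;
    }
    int range = Integer.parseInt(v.trim());
    if (range <= 0) {
      throw new IllegalArgumentException(RANGE_OF_DELAY_KEY + " must be positive: " + range);
    }
    return range;
  }
}
```

Operators with many regions and column families per RS could then widen the window (say, to 15 minutes) so periodic flushes spread out further instead of piling up.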
[jira] [Commented] (HBASE-21117) Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)
[ https://issues.apache.org/jira/browse/HBASE-21117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632254#comment-16632254 ] Andrew Purtell commented on HBASE-21117: lgtm Let me try to commit this today.
> Backport HBASE-18350 (fix RSGroups) to branch-1 (Only port the part fixing table locking issue.)
> --
>
> Key: HBASE-21117
> URL: https://issues.apache.org/jira/browse/HBASE-21117
> Project: HBase
> Issue Type: Bug
> Components: backport, rsgroup, shell
> Affects Versions: 1.3.2
> Reporter: Xu Cang
> Assignee: Xu Cang
> Priority: Major
> Labels: backport
> Attachments: HBASE-21117-branch-1.001.patch, HBASE-21117-branch-1.002.patch
>
> When working on HBASE-20666, I found out HBASE-18350 did not get ported to branch-1, which sometimes causes the procedure to hang when #moveTables is called. After looking into the HBASE-18350 patch, it seems important since it fixes 4 issues. This Jira is an attempt to backport it to branch-1.
>
> Edited: Aug 26.
> After reviewing the HBASE-18350 patch, I decided to only port part 2 of the patch, because parts 1 and 3 are AMv2-related. I won't touch them since AMv2 is only in branch-2.
>
> {quote}
> Subject: [PATCH] HBASE-18350 RSGroups are broken under AMv2
> - Table moving to RSG was buggy, because it left the table unassigned. Now it is fixed we immediately assign to an appropriate RS (MoveRegionProcedure).
> *- Table was locked while moving, but unassign operation hung, because locked table queues are not scheduled while locked. Fixed. Port this one.*
> - ProcedureSyncWait was buggy, because it searched the procId in executor, but executor does not store the return values of internal operations (they are stored, but immediately removed by the cleaner).
> - list_rsgroups in the shell also shows the assigned tables and servers.
> {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632249#comment-16632249 ] Andrew Purtell commented on HBASE-18451: Doing a few local checks now > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.branch-1.002.patch, HBASE-18451.branch-1.002.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.003.patch, > HBASE-18451.master.004.patch, HBASE-18451.master.004.patch, > HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), at the end all the regions might have some data to be > flushed, and we want, after one hour, to trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush; that way we spread them out. > RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds later, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediately. > As a result, instead of spreading all the flushes within the next 5 minutes, > you end up getting them all much more quickly, like within the first minute. > This not only floods the queue with too many flush requests, but also defeats the > purpose of the randomDelay. 
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of
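The fix direction named in the issue title — inspect the queue before adding another delayed flush request — can be sketched as follows. This is an illustrative sketch only, not the committed patch: `DedupFlushQueue`, `requestDelayedFlush`, and `onFlushExecuted` are hypothetical names standing in for the real FlushRequester/MemStoreFlusher plumbing. The point is the invariant: a region that already has a pending delayed flush keeps its original random delay instead of being re-enqueued every 10-second chore pass.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: deduplicate delayed flush requests by region so that
// repeated chore() passes cannot replace an existing request (and its random
// delay) with a fresh, possibly much smaller one.
public class DedupFlushQueue {
    // region name -> absolute deadline (ms) of the already-queued flush
    private final Map<String, Long> pending = new HashMap<>();

    /**
     * Returns true if a new delayed request was queued, false if the region
     * already had one pending (the earlier random delay wins).
     */
    public boolean requestDelayedFlush(String region, long delayMs, long nowMs) {
        if (pending.containsKey(region)) {
            return false; // inspect the queue first: skip duplicate requests
        }
        pending.put(region, nowMs + delayMs);
        return true;
    }

    /** Called once the flush actually runs, freeing the slot for that region. */
    public void onFlushExecuted(String region) {
        pending.remove(region);
    }

    public int size() {
        return pending.size();
    }
}
```

With this guard, the log excerpt above would show a single "requesting flush" line per region per period, and the flushes would stay spread across the full RANGE_OF_DELAY window.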
[jira] [Commented] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632240#comment-16632240 ] Andrew Purtell commented on HBASE-21207: Thanks [~archana.katiyar] > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, > HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, > HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, > edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In Master UI, we can see region server details like requests per seconds and > number of regions etc. Similarly, for tables also we can see online regions , > offline regions. > It will help ops people in determining hot spotting if we can provide sort > functionality in the UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632239#comment-16632239 ] Andrew Purtell commented on HBASE-18451: Ok, I'll do the commit now unless someone beats me to it.
[jira] [Updated] (HBASE-21231) Add documentation for MajorCompactor
[ https://issues.apache.org/jira/browse/HBASE-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-21231: -- Status: Open (was: Patch Available) moving out of patch available pending feedback on RB > Add documentation for MajorCompactor > > > Key: HBASE-21231 > URL: https://issues.apache.org/jira/browse/HBASE-21231 > Project: HBase > Issue Type: Task > Components: documentation >Affects Versions: 3.0.0 >Reporter: Balazs Meszaros >Assignee: Balazs Meszaros >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-21231.master.001.patch > > > HBASE-19528 added a new MajorCompactor tool, but it lacks of documentation. > Let's document it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements
[ https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632232#comment-16632232 ] Mike Drob commented on HBASE-21242: --- Do we want to include the proc id in any of the new logging? > [amv2] Miscellaneous minor log and assign procedure create improvements > --- > > Key: HBASE-21242 > URL: https://issues.apache.org/jira/browse/HBASE-21242 > Project: HBase > Issue Type: Bug > Components: amv2, Operability >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21242.branch-2.1.001.patch, > HBASE-21242.branch-2.1.001.patch > > > Some minor fixups: > {code} > For RIT Duration, do better than print ms/seconds. Remove redundant UI > column dedicated to duration when we log it in the status field too. > Make bypass log at INFO level -- when DEBUG we can miss important > fixup detail like why we failed. > Make it so on complete of subprocedure, we note count of outstanding > siblings so we have a clue how much further the parent has to go before > it is done (Helpful when hundreds of servers doing SCP). > Have the SCP run the AP preflight check before creating an AP; saves > creation of hundreds of thousands of APs during fixup of this big cluster > of mine. > Don't log tablename three times when reporting remote call failed. > If lock is held already, note who has it. Also log after we get lock > or if we have to wait rather than log on entrance though we may > later have to wait (or we may have just picked up the lock). > {code} > Posting patch in a sec but let me try it on cluster too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632225#comment-16632225 ] stack commented on HBASE-20952: --- I find that I am responding to Ted Yu up on the doc. I've stopped commenting. I do not, by choice, want to work with him. Generally, I only review his work when it looks like something of his is about to be committed. IMO, he does more damage than good in the project and so for the good of the project, I review his work when I can. Thanks. > Re-visit the WAL API > > > Key: HBASE-20952 > URL: https://issues.apache.org/jira/browse/HBASE-20952 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Josh Elser >Priority: Major > Attachments: 20952.v1.txt > > > Take a step back from the current WAL implementations and think about what an > HBase WAL API should look like. What are the primitive calls that we require > to guarantee durability of writes with a high degree of performance? > The API needs to take the current implementations into consideration. We > should also have a mind for what is happening in the Ratis LogService (but > the LogService should not dictate what HBase's WAL API looks like RATIS-272). > Other "systems" inside of HBase that use WALs are replication and > backup. Replication has the use-case for "tail"'ing the WAL, which we > should provide via our new API. B doesn't do anything fancy (IIRC). We > should make sure all consumers are generally going to be OK with the API we > create. > The API may be "OK" (or OK in part). We need to also consider other methods > which were "bolted" on such as {{AbstractFSWAL}} and > {{WALFileLengthProvider}}. Other corners of "WAL use" (like the > {{WALSplitter}}) should also be looked at to use WAL APIs only. > We also need to make sure that adequate interface audience and stability > annotations are chosen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'
[ https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sakthi updated HBASE-21225: --- Description: A part of HBASE-20705 is still unresolved. In that Jira it was assumed that the problem was: when a table having both rpc & space quotas is dropped (with hbase.quota.remove.on.table.delete set as true), the rpc quota is not set to be dropped along with the table, and the space quota could not be removed completely because of the "EMPTY" row that the rpc quota left even after removal. The proposed solution was to make sure the rpc quota didn't leave empty rows after quota removal, and to set up automatic removal of the rpc quota on table drops. That made sure that space quotas can be recreated/removed. But all this was under the assumption that hbase.quota.remove.on.table.delete is set as true. When it is set as false, the same issue can be reproduced. Also, the steps shown below can be used to reproduce the issue without table drops. 
{noformat}
hbase(main):005:0> create 't2','cf'
Created table t2
Took 0.7619 seconds
=> Hbase::Table - t2
hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec'
Took 0.0514 seconds
hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES
Took 0.0162 seconds
hbase(main):008:0> list_quotas
OWNER QUOTAS
TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY => NO_WRITES
2 row(s)
Took 0.0716 seconds
hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE
Took 0.0082 seconds
hbase(main):010:0> list_quotas
OWNER QUOTAS
TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true
2 row(s)
Took 0.0254 seconds
hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES
Took 0.0082 seconds
hbase(main):012:0> list_quotas
OWNER QUOTAS
TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true
2 row(s)
Took 0.0411 seconds
{noformat}
was: A part of HBASE-20705 is still unresolved. In that Jira it was assumed that problem is: when table having both rpc & space quotas is dropped (with hbase.delete.on.table.remove set as true), the quotas are still present in the quotas table. 
{noformat} hbase(main):005:0> create 't2','cf' Created table t2 Took 0.7619 seconds => Hbase::Table - t2 hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec' Took 0.0514 seconds hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 0.0162 seconds hbase(main):008:0> list_quotas OWNER QUOTAS TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY => NO_WRIT ES 2 row(s) Took 0.0716 seconds hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE Took 0.0082 seconds hbase(main):010:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0254 seconds hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 0.0082 seconds hbase(main):012:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0411 seconds {noformat} > Having RPC & Space quota on a table/Namespace doesn't allow space quota to be > removed using 'NONE' > -- > > Key: HBASE-21225 > URL: https://issues.apache.org/jira/browse/HBASE-21225 > Project: HBase > Issue Type: Bug >Reporter: Sakthi >Assignee: Sakthi >Priority: Major > Attachments: hbase-21225.master.001.patch > > > A part of HBASE-20705 is still unresolved. In that Jira it was assumed that > problem is: when table having both rpc & space quotas is dropped (with >
[jira] [Updated] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'
[ https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sakthi updated HBASE-21225: --- Description: A part of HBASE-20705 is still unresolved. In that Jira it was assumed that problem is: when table having both rpc & space quotas is dropped (with hbase.delete.on.table.remove set as true), the quotas are still present in the quotas table. {noformat} hbase(main):005:0> create 't2','cf' Created table t2 Took 0.7619 seconds => Hbase::Table - t2 hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec' Took 0.0514 seconds hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 0.0162 seconds hbase(main):008:0> list_quotas OWNER QUOTAS TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY => NO_WRIT ES 2 row(s) Took 0.0716 seconds hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE Took 0.0082 seconds hbase(main):010:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0254 seconds hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 0.0082 seconds hbase(main):012:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0411 seconds {noformat} was: A part of HBASE-20705 is still unresolved {noformat} hbase(main):005:0> create 't2','cf' Created table t2 Took 0.7619 seconds => Hbase::Table - t2 hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => '10M/sec' Took 0.0514 seconds hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 
0.0162 seconds hbase(main):008:0> list_quotas OWNER QUOTAS TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, VIOLATION_POLICY => NO_WRIT ES 2 row(s) Took 0.0716 seconds hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE Took 0.0082 seconds hbase(main):010:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0254 seconds hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', POLICY => NO_WRITES Took 0.0082 seconds hbase(main):012:0> list_quotas OWNER QUOTAS TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true 2 row(s) Took 0.0411 seconds {noformat} > Having RPC & Space quota on a table/Namespace doesn't allow space quota to be > removed using 'NONE' > -- > > Key: HBASE-21225 > URL: https://issues.apache.org/jira/browse/HBASE-21225 > Project: HBase > Issue Type: Bug >Reporter: Sakthi >Assignee: Sakthi >Priority: Major > Attachments: hbase-21225.master.001.patch > > > A part of HBASE-20705 is still unresolved. In that Jira it was assumed that > problem is: when table having both rpc & space quotas is dropped (with > hbase.delete.on.table.remove set as true), the quotas are still present in > the quotas table. 
> {noformat} > hbase(main):005:0> create 't2','cf' > Created table t2 > Took 0.7619 seconds > => Hbase::Table - t2 > hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => > '10M/sec' > Took 0.0514 seconds > hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', > POLICY => NO_WRITES > Took 0.0162 seconds > hbase(main):008:0> list_quotas > OWNER QUOTAS > TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, > LIMIT => 10M/sec, SCOPE => >MACHINE > TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, > VIOLATION_POLICY => NO_WRIT >ES > 2 row(s) > Took 0.0716 seconds >
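The transcript shows the core invariant the bug violates: setting the space quota to NONE leaves a `REMOVE => true` marker cell behind (because the throttle quota keeps the row alive), and that stale marker then blocks re-creating a space quota. A minimal sketch of the invariant, using a hypothetical `QuotaRow` model rather than HBase's actual quota-table schema: removing one quota type should fully clear that type's state, and the row as a whole should only be deletable once no quota types remain.

```java
import java.util.EnumSet;

// Hypothetical model (not HBase's quota schema): a quota row tracks which
// quota types are currently set for a table. Removing SPACE must actually
// clear it, not leave a removal marker, so SPACE can be set again later.
public class QuotaRow {
    public enum Type { THROTTLE, SPACE }

    private final EnumSet<Type> present = EnumSet.noneOf(Type.class);

    public void set(Type t)    { present.add(t); }    // set_quota ... LIMIT => '1G'
    public void remove(Type t) { present.remove(t); } // set_quota ... LIMIT => NONE

    /** The backing row should only disappear once every quota type is gone. */
    public boolean rowDeletable() { return present.isEmpty(); }

    public boolean has(Type t) { return present.contains(t); }
}
```

Under this model, removing SPACE while THROTTLE remains leaves the row in place but with no space-quota residue, so a subsequent `set_quota TYPE => SPACE` works, which is the behavior the transcript shows failing.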
[jira] [Commented] (HBASE-21248) Implement exponential backoff when retrying for ModifyPeerProcedure
[ https://issues.apache.org/jira/browse/HBASE-21248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632197#comment-16632197 ] Hadoop QA commented on HBASE-21248: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 22s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 50s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 3s{color} | {color:red} hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 57s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 57s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}287m 4s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}329m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestMobRestoreSnapshotFromClient | | | hadoop.hbase.client.TestAsyncTableAdminApi | | | hadoop.hbase.util.TestFromClientSide3WoUnsafe | | | hadoop.hbase.replication.TestReplicationSmallTests | | | hadoop.hbase.client.TestFromClientSideWithCoprocessor | | | hadoop.hbase.namespace.TestNamespaceAuditor | | | hadoop.hbase.regionserver.TestRegionReplicaFailover | | | hadoop.hbase.client.TestFromClientSide3 | | | hadoop.hbase.replication.TestReplicationKillSlaveRS | | | hadoop.hbase.replication.TestReplicationSmallTestsSync | | | hadoop.hbase.replication.TestMasterReplication | | | hadoop.hbase.client.replication.TestReplicationAdminWithClusters | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21248 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941680/HBASE-21248-v1.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux aa1d382729e6 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018
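The technique named in HBASE-21248's title, exponential backoff on retry, is commonly computed as a base delay doubled per attempt and capped at a maximum. The sketch below is illustrative only and is not taken from the patch under review; `backoffMillis`, the base, and the cap are assumed names and values.

```java
// Illustrative capped exponential backoff for procedure retries
// (not the HBASE-21248 implementation).
public class RetryBackoff {
    /**
     * Returns base * 2^(attempt-1) milliseconds, capped at maxMs.
     * attempt is 1-based; the shift is clamped to avoid long overflow.
     */
    public static long backoffMillis(int attempt, long baseMs, long maxMs) {
        if (attempt < 1) {
            attempt = 1;
        }
        long backoff = baseMs << Math.min(attempt - 1, 30); // clamp the exponent
        return Math.min(backoff, maxMs);
    }
}
```

In practice a random jitter is usually added on top of the computed delay so that many procedures retrying together do not synchronize, much like the randomDelay discussed for HBASE-18451 above.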
[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632195#comment-16632195 ] Mike Drob commented on HBASE-18451: --- For branch-1 patch, the javac warnings are showing up because the indent level changed so the line number filtering doesn't work as expected. Fine to ignore that for now as a false positive. +1 for both > PeriodicMemstoreFlusher should inspect the queue before adding a delayed > flush request > -- > > Key: HBASE-18451 > URL: https://issues.apache.org/jira/browse/HBASE-18451 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-1 >Reporter: Jean-Marc Spaggiari >Assignee: Xu Cang >Priority: Major > Attachments: HBASE-18451.branch-1.001.patch, > HBASE-18451.branch-1.002.patch, HBASE-18451.branch-1.002.patch, > HBASE-18451.master.002.patch, HBASE-18451.master.003.patch, > HBASE-18451.master.004.patch, HBASE-18451.master.004.patch, > HBASE-18451.master.patch > > > If you run a big job every 4 hours, impacting many tables (they have 150 > regions per server), ad the end all the regions might have some data to be > flushed, and we want, after one hour, trigger a periodic flush. That's > totally fine. > Now, to avoid a flush storm, when we detect a region to be flushed, we add a > "randomDelay" to the delayed flush, that way we spread them away. > RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, > which is very good. > However, because we don't check if there is already a request in the queue, > 10 seconds after, we create a new request, with a new randomDelay. > If you generate a randomDelay every 10 seconds, at some point, you will end > up having a small one, and the flush will be triggered almost immediatly. > As a result, instead of spreading all the flush within the next 5 minutes, > you end-up getting them all way more quickly. Like within the first minute. 
> This not only floods the queue with too many flush requests, but also defeats
> the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion) r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>             r.getRegionInfo().getRegionNameAsString() + " because " + whyFlush.toString() +
>             " after random delay " + randomDelay + "ms");
>         // Throttle the flushes by putting a delay. If we don't throttle, and there
>         // is a balanced write-load on the regions in a table, we might end up
>         // overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO
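The fix the issue title asks for can be sketched as follows. This is a hypothetical simplification (class and method names here are invented, not HBase's actual API): remember which regions already have a delayed flush queued, so repeated chore runs cannot re-roll the random delay for a region that is already scheduled.

```java
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: dedupe delayed flush requests so a 10-second chore
// period cannot keep replacing a region's random delay with a smaller one.
class DelayedFlushQueueSketch {
    static final int RANGE_OF_DELAY = 5 * 60 * 1000; // 5 minutes, as in the report
    static final int MIN_DELAY_TIME = 0;

    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final Random random = new Random();

    /** Returns the chosen delay in ms, or -1 if a flush is already queued for this region. */
    long requestDelayedFlush(String regionName) {
        // Set.add is atomic here: only the first chore run for a region schedules a flush.
        if (!pending.add(regionName)) {
            return -1;
        }
        return random.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
    }

    /** Invoked once the delayed flush actually executes, so a later chore may schedule again. */
    void onFlushDone(String regionName) {
        pending.remove(regionName);
    }
}
```

With a guard like this, the first chore run fixes a region's delay for the whole window; later runs are no-ops, so the flushes stay spread over the full RANGE_OF_DELAY instead of collapsing into the first minute.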
[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632190#comment-16632190 ] stack commented on HBASE-21213:
---
.008 Fix checkstyle and findbugs.
> [hbck2] bypass leaves behind state in RegionStates when assign/unassign
> ---
>
> Key: HBASE-21213
> URL: https://issues.apache.org/jira/browse/HBASE-21213
> Project: HBase
> Issue Type: Bug
> Components: amv2, hbck2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21213.branch-2.1.001.patch,
> HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch,
> HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch,
> HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch,
> HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch
>
>
> This is a follow-on from HBASE-21083, which added the 'bypass' functionality.
> On bypass, there is more state to be cleared if we are to allow new Procedures
> to be scheduled.
> For example, here is a bypass:
> {code}
> 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure:
> pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true,
> bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace,
> region=37cc206fe9c4bc1c0a46a34c5f523d16,
> server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null
> to finish it
> 2018-09-20 05:45:44,022 INFO
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449,
> state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace,
> region=37cc206fe9c4bc1c0a46a34c5f523d16,
> server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
> {code}
> ...
but then when I try to assign the bypassed region later, I get this:
> {code}
> 2018-09-20 05:46:31,435 WARN
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is
> already another procedure running on this region this=pid=100450,
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16,
> server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450,
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING,
> location=ve1233.halxg.cloudera.com,22101,1537397961664
> 2018-09-20 05:46:31,510 INFO
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450,
> state=ROLLEDBACK,
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via
> AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException:
> There is already another procedure running on this region this=pid=100450,
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16,
> server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16
> exec-time=473msec
> {code}
> ... which is a long-winded way of saying the UnassignProcedure still exists
> in RegionStateNodes.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
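The cleanup the issue asks for can be illustrated with a toy model (all names here are invented for illustration, not the real amv2 classes): the region's state node holds a reference to the in-flight transition procedure, and bypassing that procedure without clearing the reference blocks any later Assign.

```java
// Toy model of the reported bug: a bypassed procedure must release the
// ownership it holds in the region's state node, the way a normal finish
// does, or later procedures see "another procedure running" and roll back.
class RegionStateNodeSketch {
    private Long owningProcId; // pid of the procedure "running on this region", or null

    /** Attempt to start a transition procedure on this region. */
    boolean tryStartProcedure(long pid) {
        if (owningProcId != null) {
            return false; // mirrors the WARN in the log above
        }
        owningProcId = pid;
        return true;
    }

    /** The proposed fix: bypass also clears the owner reference. */
    void bypass(long pid) {
        if (owningProcId != null && owningProcId == pid) {
            owningProcId = null;
        }
    }
}
```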
[jira] [Updated] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
[ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21213:
--
Attachment: HBASE-21213.branch-2.1.008.patch
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
[ https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang updated HBASE-18451:
Attachment: HBASE-18451.master.004.patch
[jira] [Updated] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21220: --- Status: Patch Available (was: Open) Resubmit patch > Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and > ROWPREFIX_DELIMITED) to branch-1 > -- > > Key: HBASE-21220 > URL: https://issues.apache.org/jira/browse/HBASE-21220 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0 > > Attachments: HBASE-21220-branch-1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21220) Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and ROWPREFIX_DELIMITED) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21220: --- Status: Open (was: Patch Available) > Port HBASE-20636 (Introduce two bloom filter type : ROWPREFIX and > ROWPREFIX_DELIMITED) to branch-1 > -- > > Key: HBASE-21220 > URL: https://issues.apache.org/jira/browse/HBASE-21220 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Major > Fix For: 1.5.0 > > Attachments: HBASE-21220-branch-1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632147#comment-16632147 ] Josh Elser commented on HBASE-20952:
{quote}This aspect is thin to absent (IMO). {quote}
That's fair. It's hard to get this all correct and down on paper without sending a (literal) book out for review. I see Ted is already making some modifications, and I'll make some passes over the comments too. Appreciate your input, boss.
I'll be focusing on your review points around "how things work now" to polish that up. That will make it easier to then shift into beefing up the parts on "what to change".
> Re-visit the WAL API
>
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
> Issue Type: Improvement
> Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an
> HBase WAL API should look like. What are the primitive calls that we require
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We
> should also have a mind for what is happening in the Ratis LogService (but
> the LogService should not dictate what HBase's WAL API looks like, RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and
> backup. Replication has the use-case for "tail"ing the WAL, which we
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We
> should make sure all consumers are generally going to be OK with the API we
> create.
> The API may be "OK" (or OK in part). We need to also consider other methods
> which were "bolted" on, such as {{AbstractFSWAL}} and
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like {{WALSplitter}})
> should also be looked at to use WAL APIs only.
> We also need to make sure that adequate interface audience and stability > annotations are chosen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
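As a thought experiment only (this interface is invented for illustration, not a proposal from the ticket), the "primitive calls" the description asks about might reduce to something like the following, with a trivial in-memory stand-in to show how the calls compose:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal WAL surface: append edits, make them durable, and
// tail them for consumers such as replication. Names are illustrative only;
// this is not HBase's actual WAL interface.
interface MinimalWal {
    long append(byte[] edit) throws IOException;            // returns a monotonically increasing sequence id
    void sync(long seqId) throws IOException;               // block until durable up to seqId
    List<byte[]> tail(long fromSeqId) throws IOException;   // read edits at or after seqId
}

// In-memory stand-in, purely to exercise the three primitives.
class InMemoryWal implements MinimalWal {
    private final List<byte[]> log = new ArrayList<>();

    public long append(byte[] edit) {
        log.add(edit);
        return log.size() - 1; // sequence id is the index of the edit
    }

    public void sync(long seqId) {
        // nothing to do in memory; a real WAL would flush/hflush here
    }

    public List<byte[]> tail(long fromSeqId) {
        return new ArrayList<>(log.subList((int) fromSeqId, log.size()));
    }
}
```

The interesting design question the ticket raises is which of these (and what else, e.g. length providers or splitting hooks) truly belong on the core API versus being bolted on.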
[jira] [Commented] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632118#comment-16632118 ] Hadoop QA commented on HBASE-21207:
---
| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 9s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 11s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 34s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 5s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 46s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 2s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}169m 20s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}249m 11s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21207 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941681/HBASE-21207-branch-2.v1.patch |
| Optional Tests | dupname asflicense javac javadoc unit shadedjars hadoopcheck xml compile |
| uname | Linux 6ebc08b8115a 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 0ef57439cc |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14529/testReport/ |
| Max. process+thread count | 4934 (vs. ulimit of 1) |
| modules | C: hbase-server . U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14529/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated.
> Add client side sorting functionality in master web UI for table and region
> server details.
>
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632108#comment-16632108 ] stack commented on HBASE-20952:
---
bq. From the chatter above on this issue, the request was that a document first be presented which outlines how HBase uses WALs and idiosyncrasies/gotchas around that usage.
Makes sense. Can't change something if you don't know what it entails. I'd think any "revisit" would have such a preface.
bq. and then made a proposal if there was something obvious that should be done instead.
This aspect is thin to absent (IMO).
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632097#comment-16632097 ] Josh Elser commented on HBASE-20952:
{quote}Yeah, sorry, I'm lost on what the doc is supposed to be doing. {quote}
From the chatter above on this issue, the request was that a document first be presented which outlines how HBase uses WALs and idiosyncrasies/gotchas around that usage. The premise being: if we don't understand how we use WALs, we can't make a proposal around what "ideal" is.
The structure of the document was that, for each section, we covered how it works now and then made a proposal if there was something obvious that should be done instead. Obviously, emphasis is more on the former than the latter (given the acknowledgements immediately above).
Thanks for reviewing.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21233) Allow the procedure implementation to skip persistence of the state after a execution
[ https://issues.apache.org/jira/browse/HBASE-21233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632093#comment-16632093 ] stack commented on HBASE-21233:
---
+1
> Allow the procedure implementation to skip persistence of the state after a
> execution
> -
>
> Key: HBASE-21233
> URL: https://issues.apache.org/jira/browse/HBASE-21233
> Project: HBase
> Issue Type: Sub-task
> Components: Performance, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21233.patch, HBASE-21233.patch
>
>
> Discussed with [~stack] and [~allan163] on HBASE-21035: when retrying, we
> do not need to persist the procedure state every time, as the retry timeout
> is not critical. It is OK if we lose this information and start
> from 0 after restarting.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632087#comment-16632087 ] stack commented on HBASE-20952:
---
bq. The point of this doc was to make sure we had a clear picture of all "WAL-related" things in HBase and what their roles & responsibilities were.
Sorry. I did not get that that was the point reading the doc. The preamble drops off just when it is about to make the point of what the doc is for. It is followed by three sections: "Components of WAL System", "Evolving individual WAL system components", and "Limitation". The first seems to be for the 'WAL-related' listing you suggest. The second section implies a listing of how hbase will be changed ('Evolving') but it seems to be just a continuation of the listing ... Yeah, sorry, I'm lost on what the doc is supposed to be doing.
I also made notes on imprecision and stuff I thought incorrect.
Thanks.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632075#comment-16632075 ] Josh Elser commented on HBASE-20952:
{quote}IMO, it gives little to no inkling as to how hbase will be changed. I left some comments. Thanks. {quote}
Yes, that was intentional. The point of this doc was to make sure we had a clear picture of all "WAL-related" things in HBase and what their roles & responsibilities were.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21186) Document hbase.regionserver.executor.openregion.threads in MTTR section
[ https://issues.apache.org/jira/browse/HBASE-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632020#comment-16632020 ] Josh Elser commented on HBASE-21186: How about adding that property to the existing site.xml configuration block under "Set the following in the RegionServer." instead of adding it where you have? Further, what should a user be setting this property to if the default of "3" is bad? 10 to 20? > Document hbase.regionserver.executor.openregion.threads in MTTR section > --- > > Key: HBASE-21186 > URL: https://issues.apache.org/jira/browse/HBASE-21186 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Sahil Aggarwal >Assignee: Sahil Aggarwal >Priority: Minor > Attachments: HBASE-21186.master.001.patch, > HBASE-21186.master.002.patch > > > hbase.regionserver.executor.openregion.threads helps in improving MTTR by > increasing assign rpc processing rate at RS from HMaster but is not > documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'
[ https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632004#comment-16632004 ] Josh Elser commented on HBASE-21225: [~jatsakthi], please update the description with an explanation of the problem, not just the reproduction steps. In HBASE-20705, you seemed to have some notion of why this still doesn't work in all cases. A couple of comments: 1. Why all of the modification/removal of existing tests? {code:java} if (quotaToMerge.getRemove()) { - // Update the builder to propagate the removal - spaceBuilder.setRemove(true).clearSoftLimit().clearViolationPolicy(); + // To prevent the "REMOVE => true" row in QuotaTableUtil.QUOTA_TABLE_NAME + spaceBuilder = null;{code} 2. The purpose of the {{REMOVE => true}} entry is to denote that HBase should remove the SpaceQuota when it writes that out. With this change, it seems like you're leaving the {{REMOVE}} attribute in the protobuf message, but completely ignoring it, which is confusing. IMO, the bug is that GlobalQuotaSettingsImpl is not correctly removing the SpaceQuota when REMOVE is set to true. Seems the logic I had initially written around an RPC and Space quota on the same table/namespace is lacking. Does that make sense? 
> Having RPC & Space quota on a table/Namespace doesn't allow space quota to be > removed using 'NONE' > -- > > Key: HBASE-21225 > URL: https://issues.apache.org/jira/browse/HBASE-21225 > Project: HBase > Issue Type: Bug >Reporter: Sakthi >Assignee: Sakthi >Priority: Major > Attachments: hbase-21225.master.001.patch > > > A part of HBASE-20705 is still unresolved > {noformat} > hbase(main):005:0> create 't2','cf' > Created table t2 > Took 0.7619 seconds > => Hbase::Table - t2 > hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => > '10M/sec' > Took 0.0514 seconds > hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', > POLICY => NO_WRITES > Took 0.0162 seconds > hbase(main):008:0> list_quotas > OWNER QUOTAS > TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, > LIMIT => 10M/sec, SCOPE => MACHINE > TABLE => t2 TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, > VIOLATION_POLICY => NO_WRITES > 2 row(s) > Took 0.0716 seconds > hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE > Took 0.0082 seconds > hbase(main):010:0> list_quotas > OWNER QUOTAS > TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => > REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE > TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true > 2 row(s) > Took 0.0254 seconds > hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', > POLICY => NO_WRITES > Took 0.0082 seconds > hbase(main):012:0> list_quotas > OWNER QUOTAS > TABLE => t2 TYPE => THROTTLE, THROTTLE_TYPE => > REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE > TABLE => t2 TYPE => SPACE, TABLE => t2, REMOVE => true > 2 row(s) > Took 0.0411 seconds > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
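The behavior asked for above (a {{REMOVE => true}} update should delete the space quota outright while leaving the RPC throttle untouched) can be illustrated with a toy model; the class and field names below are invented for illustration and are not HBase's actual quota classes:

```java
/** Toy model of merging a space-quota update into existing quota settings;
 *  names are illustrative, not HBase's actual quota implementation. */
class SpaceQuota {
  final long limit;
  final boolean remove;
  SpaceQuota(long limit, boolean remove) { this.limit = limit; this.remove = remove; }
}

class QuotaState {
  Long throttleLimit;   // RPC throttle, independent of the space quota
  SpaceQuota spaceQuota;

  /** Apply an update: REMOVE => true deletes the space quota outright
   *  instead of persisting a lingering "REMOVE => true" row. */
  void mergeSpaceQuota(SpaceQuota update) {
    if (update.remove) {
      spaceQuota = null;  // drop it; the throttle quota is untouched
    } else {
      spaceQuota = update;
    }
  }
}
```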
[jira] [Created] (HBASE-21253) Backport HBASE-21244 Skip persistence when retrying for assignment related procedures to branch-2.0 and branch-2.1
Allan Yang created HBASE-21253: -- Summary: Backport HBASE-21244 Skip persistence when retrying for assignment related procedures to branch-2.0 and branch-2.1 Key: HBASE-21253 URL: https://issues.apache.org/jira/browse/HBASE-21253 Project: HBase Issue Type: Bug Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang See HBASE-21244 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21244) Skip persistence when retrying for assignment related procedures
[ https://issues.apache.org/jira/browse/HBASE-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631945#comment-16631945 ] Allan Yang commented on HBASE-21244: {quote} Allan Yang Mind opening an issue for branch-2.1 and branch-2.0? The assignment related procedures are different for these two branches. {quote} Sure, with pleasure. > Skip persistence when retrying for assignment related procedures > > > Key: HBASE-21244 > URL: https://issues.apache.org/jira/browse/HBASE-21244 > Project: HBase > Issue Type: Sub-task > Components: amv2, Performance, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21244.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21187) The HBase UTs are extremely slow on some jenkins node
[ https://issues.apache.org/jira/browse/HBASE-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631867#comment-16631867 ] Duo Zhang commented on HBASE-21187: --- This is build 993 {noformat} 07:41:03 up 81 days, 20 min, 0 users, load average: 0.92, 0.51, 0.66 {noformat} We passed. 994 {noformat} 10:09:37 up 81 days, 2:28, 0 users, load average: 14.13, 12.51, 14.56 {noformat} Lots of tests failed. 995 {noformat} 10:51:42 up 81 days, 3:15, 0 users, load average: 3.84, 4.95, 7.31 {noformat} Only TestRSGroups failed. 996 {noformat} 11:16:46 up 36 days, 13:35, 0 users, load average: 13.16, 11.57, 11.42 {noformat} Lots of tests failed. 997 {noformat} 11:46:21 up 81 days, 4:25, 0 users, load average: 2.34, 3.17, 6.23 {noformat} Only TestCompactingToCellFlatMapMemStore failed. And it is not because of a timeout, just an assertion error, so this one is truly flaky... So I think the problem is that, for TRSP, we have one more procedure, as we use a sub procedure to schedule the remote procedure to simplify the logic. On an already loaded machine, if there are lots of regions to assign/unassign, it will be slower because of the extra context switches, leading to the timeout... > The HBase UTs are extremely slow on some jenkins node > - > > Key: HBASE-21187 > URL: https://issues.apache.org/jira/browse/HBASE-21187 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Duo Zhang >Priority: Major > > Looking at the flaky dashboard for master branch, the top several UTs are > likely to fail at the same time. One of the common things for the failed > flaky test jobs is that the execution time is more than one hour, while the > successful executions are usually only about half an hour. > And I have compared the output for > TestRestoreSnapshotFromClientWithRegionReplicas: for a successful run, the > DisableTableProcedure can finish within one second, and for the failed run, > it can take even more than half a minute. 
> Not sure what the real problem is, but it seems that for the failed runs, > there are likely time holes in the output, i.e., there is no log output for > several seconds. Like this: > {noformat} > 2018-09-11 21:08:08,152 INFO [PEWorker-4] > procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, > hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in > 12.9380sec > 2018-09-11 21:08:15,590 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=33663] > master.MasterRpcServices(1174): Checking to see if procedure is done pid=490 > {noformat} > No log output for about 7 seconds. > And for a successful run, at the same place: > {noformat} > 2018-09-12 07:47:32,488 INFO [PEWorker-7] > procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, > hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in > 1.2220sec > 2018-09-12 07:47:32,881 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=59079] > master.MasterRpcServices(1174): Checking to see if procedure is done pid=490 > {noformat} > There is no such hole. > Maybe there is a big GC? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21252) Collect loadavg when gathering machine environment
[ https://issues.apache.org/jira/browse/HBASE-21252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang resolved HBASE-21252. --- Resolution: Not A Problem OK, uptime already contains the load information. Let me check. > Collect loadavg when gathering machine environment > -- > > Key: HBASE-21252 > URL: https://issues.apache.org/jira/browse/HBASE-21252 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > > Skimmed several flaky test reports; it seems that, for a successful run, the > SystemLoadAverage is usually less than 500, and for a failed run, the > SystemLoadAverage is usually greater than 1000. > Let's collect the loadavg in the script to see if the machine itself is > already overloaded before we actually execute the tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
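As a sketch of what collecting this information could look like, the snippet below pulls the 1-minute load average out of an `uptime` line like the ones quoted in HBASE-21187; the 10.0 threshold is an arbitrary illustration, not a value from the issue:

```java
/** Sketch: extract the 1-minute load average from an `uptime` line and
 *  flag an overloaded machine before starting the tests. */
class LoadCheck {
  static double oneMinuteLoad(String uptimeLine) {
    String marker = "load average:";
    int i = uptimeLine.indexOf(marker);
    if (i < 0) throw new IllegalArgumentException("no load average in: " + uptimeLine);
    // everything after the marker is "1min, 5min, 15min"; take the first value
    String[] parts = uptimeLine.substring(i + marker.length()).split(",");
    return Double.parseDouble(parts[0].trim());
  }

  static boolean overloaded(String uptimeLine, double threshold) {
    return oneMinuteLoad(uptimeLine) > threshold;
  }
}
```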
[jira] [Updated] (HBASE-21249) Add jitter for ProcedureUtil.getBackoffTimeMs
[ https://issues.apache.org/jira/browse/HBASE-21249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21249: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to branch-2.0+. Thanks [~Yi Mei] for contributing. > Add jitter for ProcedureUtil.getBackoffTimeMs > - > > Key: HBASE-21249 > URL: https://issues.apache.org/jira/browse/HBASE-21249 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Reporter: Duo Zhang >Assignee: Yi Mei >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21249.master.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
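The general idea of adding jitter to an exponential backoff can be sketched as below; the base delay, cap, and 1% jitter fraction are illustrative assumptions, not the actual constants used in ProcedureUtil.getBackoffTimeMs:

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of exponential backoff with jitter; constants are illustrative. */
class Backoff {
  static final long BASE_MS = 1000;
  static final long MAX_MS = 10 * 60 * 1000; // cap at 10 minutes

  static long backoffTimeMs(int attempt) {
    // exponential growth, capped so a long retry chain doesn't overflow
    long backoff = Math.min(BASE_MS * (1L << Math.min(attempt, 30)), MAX_MS);
    // add up to 1% random jitter so retries from many procedures spread out
    long jitter = (long) (ThreadLocalRandom.current().nextFloat() * backoff * 0.01f);
    return backoff + jitter;
  }
}
```

Without the jitter, many procedures that failed at the same moment would all retry at exactly the same time; the random component breaks up those synchronized retry waves.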
[jira] [Updated] (HBASE-21244) Skip persistence when retrying for assignment related procedures
[ https://issues.apache.org/jira/browse/HBASE-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21244: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to master and branch-2. Thanks [~zghaobac] for reviewing. [~allan163] Mind opening an issue for branch-2.1 and branch-2.0? The assignment related procedures are different for these two branches. Thanks. > Skip persistence when retrying for assignment related procedures > > > Key: HBASE-21244 > URL: https://issues.apache.org/jira/browse/HBASE-21244 > Project: HBase > Issue Type: Sub-task > Components: amv2, Performance, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21244.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21249) Add jitter for ProcedureUtil.getBackoffTimeMs
[ https://issues.apache.org/jira/browse/HBASE-21249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631824#comment-16631824 ] Hadoop QA commented on HBASE-21249: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 21s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 20s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 49s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 46s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 42m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21249 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941660/HBASE-21249.master.001.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 49f3101ea292 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 22ac655704 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14528/testReport/ | | Max. process+thread count | 282 (vs. ulimit of 1) | | modules | C: hbase-procedure U: hbase-procedure | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14528/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Add jitter
[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'
[ https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631821#comment-16631821 ] Hadoop QA commented on HBASE-21225: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 53s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 33s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 55s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 53s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 36s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}182m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21225 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941663/hbase-21225.master.001.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7a4fc891e108 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 22ac655704 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/14527/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14527/testReport/ | | Max. process+thread count | 4598 (vs. ulimit of 1) | | modules | C: hbase-server U:
[jira] [Created] (HBASE-21252) Collect loadavg when gathering machine environment
Duo Zhang created HBASE-21252: - Summary: Collect loadavg when gathering machine environment Key: HBASE-21252 URL: https://issues.apache.org/jira/browse/HBASE-21252 Project: HBase Issue Type: Sub-task Reporter: Duo Zhang Skimmed several flaky test reports; it seems that, for a successful run, the SystemLoadAverage is usually less than 500, and for a failed run, the SystemLoadAverage is usually greater than 1000. Let's collect the loadavg in the script to see if the machine itself is already overloaded before we actually execute the tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21233) Allow the procedure implementation to skip persistence of the state after an execution
[ https://issues.apache.org/jira/browse/HBASE-21233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631812#comment-16631812 ] Hudson commented on HBASE-21233: Results for branch master [build #515 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/515/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/515//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/515//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/515//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Allow the procedure implementation to skip persistence of the state after an > execution > - > > Key: HBASE-21233 > URL: https://issues.apache.org/jira/browse/HBASE-21233 > Project: HBase > Issue Type: Sub-task > Components: Performance, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21233.patch, HBASE-21233.patch > > > Discussed with [~stack] and [~allan163] on HBASE-21035, that when retrying we > do not need to persist the procedure state every time, as the retry timeout > is not critical. It is OK that we lose this information and start > from 0 after restarting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
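The mechanism described here (a procedure opting out of persistence for a retry-only state change, so a restart merely resets the retry counter) can be sketched as below; the class and method names are illustrative, not the actual Procedure API:

```java
/** Sketch of a procedure that can tell its executor a state transition
 *  need not be persisted (e.g. bumping a retry counter). Names are
 *  illustrative, not HBase's proc-v2 classes. */
abstract class Proc {
  private boolean skipPersistence = false;
  protected void skipPersistence() { this.skipPersistence = true; }
  boolean needPersistence() { return !skipPersistence; }
  void resetPersistence() { skipPersistence = false; }
  abstract void execute();
}

class Executor {
  int storeWrites = 0; // stands in for writes to the procedure store/WAL
  void runOnce(Proc p) {
    p.resetPersistence();
    p.execute();
    if (p.needPersistence()) {
      storeWrites++; // only hit the store when the procedure asked for it
    }
  }
}
```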
[jira] [Updated] (HBASE-21188) Print heap and gc information in our junit ResourceChecker
[ https://issues.apache.org/jira/browse/HBASE-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21188: -- Resolution: Later Status: Resolved (was: Patch Available) Reverted for now as it is not the cause of the failing UTs, and gc time and count are not 'resources'. > Print heap and gc information in our junit ResourceChecker > --- > > Key: HBASE-21188 > URL: https://issues.apache.org/jira/browse/HBASE-21188 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21188.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21244) Skip persistence when retrying for assignment related procedures
[ https://issues.apache.org/jira/browse/HBASE-21244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631797#comment-16631797 ] Guanghao Zhang commented on HBASE-21244: +1 > Skip persistence when retrying for assignment related procedures > > > Key: HBASE-21244 > URL: https://issues.apache.org/jira/browse/HBASE-21244 > Project: HBase > Issue Type: Sub-task > Components: amv2, Performance, proc-v2 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21244.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631772#comment-16631772 ] Archana Katiyar commented on HBASE-21207: - Thanks [~apurtell]; I have uploaded the patch for branch-2 as well (the patch for branch-1 was already uploaded). > Add client side sorting functionality in master web UI for table and region > server details. > --- > > Key: HBASE-21207 > URL: https://issues.apache.org/jira/browse/HBASE-21207 > Project: HBase > Issue Type: Improvement > Components: master, monitoring, UI, Usability >Reporter: Archana Katiyar >Assignee: Archana Katiyar >Priority: Minor > Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, > 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, > HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, > HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, > edc5c812-b928-11e8-87e2-ce6396629bbc.png > > > In Master UI, we can see region server details like requests per second and > number of regions etc. Similarly, for tables, we can see online regions and > offline regions. > It will help ops people in determining hot spotting if we can provide sort > functionality in the UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)