[jira] [Created] (HBASE-24028) MapReduce on snapshot restores and opens all regions in each mapper
Xu Cang created HBASE-24028: --- Summary: MapReduce on snapshot restores and opens all regions in each mapper Key: HBASE-24028 URL: https://issues.apache.org/jira/browse/HBASE-24028 Project: HBase Issue Type: Bug Affects Versions: 1.6.0, 2.3.0 Reporter: Xu Cang Given this scenario: one MR job scans a table with many regions. It will use 'RestoreSnapshotHelper' to restore the snapshot for all regions in each mapper. In the code [https://github.com/apache/hbase/blob/branch-2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L183] there seems to be no way to restore only the relevant regions from the snapshot. This leads to extreme slowness and wasted resources. Please correct me if I am wrong or missed anything, thanks. A quick example I can show is below: in my test there are 2 regions in a testing table, and each mapper opens and iterates both regions.

2020-03-19 18:58:15,225 INFO [main] mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2020-03-19 18:58:15,285 INFO [main] snapshot.RestoreSnapshotHelper - region to add: *d7f85b4a9d3fa22a5e7b88bda39f6d50*
2020-03-19 18:58:15,285 INFO [main] snapshot.RestoreSnapshotHelper - region to add: *69dd3fdba3698f827f8883ed911161ef*
2020-03-19 18:58:15,286 INFO [main] snapshot.RestoreSnapshotHelper - clone region=d7f85b4a9d3fa22a5e7b88bda39f6d50 as d7f85b4a9d3fa22a5e7b88bda39f6d50

So if I misunderstood anything, can anyone point me to where in this class different mappers can be told which regions to go through? By the way, the original implementation of MR on snapshots is in HBASE-8369, and there have not been many big changes since then. -- This message was sent by Atlassian Jira (v8.3.4#803005)
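For illustration only: if per-mapper restore were supported, the selection each mapper needs is roughly the filtering sketched below. The class and method names here are hypothetical, not HBase API; the sketch only shows narrowing the snapshot's region list down to the region backing a given mapper's input split.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionFilter {
    // Keep only the snapshot regions that a given mapper actually scans,
    // instead of cloning every region in the snapshot manifest per mapper.
    public static List<String> regionsForMapper(List<String> snapshotRegions,
                                                String splitRegion) {
        List<String> relevant = new ArrayList<>();
        for (String region : snapshotRegions) {
            if (region.equals(splitRegion)) {
                relevant.add(region);
            }
        }
        return relevant;
    }

    public static void main(String[] args) {
        // The two region names from the log excerpt above.
        List<String> all = List.of("d7f85b4a9d3fa22a5e7b88bda39f6d50",
                                   "69dd3fdba3698f827f8883ed911161ef");
        // A mapper whose split covers the first region would restore only it.
        System.out.println(regionsForMapper(all, "d7f85b4a9d3fa22a5e7b88bda39f6d50"));
    }
}
```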
[jira] [Created] (HBASE-23143) Region Server Crash due to 2 cells out of order (between 2 DELETEs)
Xu Cang created HBASE-23143: --- Summary: Region Server Crash due to 2 cells out of order (between 2 DELETEs) Key: HBASE-23143 URL: https://issues.apache.org/jira/browse/HBASE-23143 Project: HBase Issue Type: Bug Affects Versions: 1.3.2 Reporter: Xu Cang Region Server crash due to 2 cells out of order (between 2 DELETEs):

Caused by: java.io.IOException: Added a key not lexically larger than previous. Current cell = 00D7F00xxQ10D52v8UY6yV0057F00bPaGT\x00057F00bPaG/0:TABLE1_ID/*1570095189597*/DeleteColumn/vlen=0/seqid=*2128373*, lastCell = 00D7F00xxQ10D52v8UY6yV0057F00bPaGT\x00057F00bPaG/0:TABLE1_ID/*1570095165147*/DeleteColumn/vlen=0/seqid=*2128378*

I am aware of https://issues.apache.org/jira/browse/HBASE-22862, but this is slightly different: that issue is caused by one Delete and one Put, while the issue I am seeing is caused by 2 Deletes. Has anyone seen this issue? -- This message was sent by Atlassian Jira (v8.3.4#803005)
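For context on why the exception fires: within one row/family/qualifier, HBase sorts cells by timestamp descending and then by sequence id descending, so the "Current cell" above, with the newer timestamp 1570095189597, should have sorted before lastCell (timestamp 1570095165147). A minimal stand-in comparator (not HBase's real CellComparator) demonstrating the ordering rule:

```java
import java.util.Comparator;

public class CellOrder {
    // Simplified stand-in for an HBase cell; same row/family/qualifier assumed.
    public static class Cell {
        public final long timestamp;
        public final long seqId;
        public Cell(long timestamp, long seqId) { this.timestamp = timestamp; this.seqId = seqId; }
    }

    // Newer timestamps sort first, then higher sequence ids first.
    public static final Comparator<Cell> COMPARATOR =
        Comparator.comparingLong((Cell c) -> c.timestamp).reversed()
                  .thenComparing(Comparator.comparingLong((Cell c) -> c.seqId).reversed());

    public static void main(String[] args) {
        Cell current = new Cell(1570095189597L, 2128373);
        Cell last = new Cell(1570095165147L, 2128378);
        // current sorts BEFORE last, so appending current after last is out of order.
        System.out.println(COMPARATOR.compare(current, last) < 0); // true
    }
}
```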
[jira] [Resolved] (HBASE-22804) Provide an API to get list of successful regions and total expected regions in Canary
[ https://issues.apache.org/jira/browse/HBASE-22804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-22804. - Fix Version/s: 1.4.12 2.2.2 2.1.7 1.3.6 2.3.1 3.0.0 Resolution: Fixed > Provide an API to get list of successful regions and total expected regions > in Canary > - > > Key: HBASE-22804 > URL: https://issues.apache.org/jira/browse/HBASE-22804 > Project: HBase > Issue Type: Improvement > Components: canary >Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0, 2.1.5, 2.2.1 >Reporter: Caroline >Assignee: Caroline >Priority: Minor > Labels: Canary > Fix For: 3.0.0, 1.5.0, 2.3.1, 1.3.6, 2.1.7, 2.2.2, 1.4.12 > > Attachments: HBASE-22804.branch-1.001.patch, > HBASE-22804.branch-1.002.patch, HBASE-22804.branch-1.003.patch, > HBASE-22804.branch-1.004.patch, HBASE-22804.branch-1.005.patch, > HBASE-22804.branch-1.006.patch, HBASE-22804.branch-1.007.patch, > HBASE-22804.branch-1.008.patch, HBASE-22804.branch-1.009.patch, > HBASE-22804.branch-1.009.patch, HBASE-22804.branch-1.010.patch, > HBASE-22804.branch-2.001.patch, HBASE-22804.branch-2.002.patch, > HBASE-22804.branch-2.003.patch, HBASE-22804.branch-2.004.patch, > HBASE-22804.branch-2.005.patch, HBASE-22804.branch-2.006.patch, > HBASE-22804.master.001.patch, HBASE-22804.master.002.patch, > HBASE-22804.master.003.patch, HBASE-22804.master.004.patch, > HBASE-22804.master.005.patch, HBASE-22804.master.006.patch > > > At present HBase Canary tool only prints the successes as part of logs. > Providing an API to get the list of successes, as well as total number of > expected regions, will make it easier to get a more accurate availability > estimate. > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (HBASE-22775) Enhance logging for peer related operations
Xu Cang created HBASE-22775: --- Summary: Enhance logging for peer related operations Key: HBASE-22775 URL: https://issues.apache.org/jira/browse/HBASE-22775 Project: HBase Issue Type: Improvement Reporter: Xu Cang Now we don't have good logging regarding peer operations, for example addPeer does not log itself: [https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationPeerStorage.java#L102] This Jira is aiming to enhancing this area -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (HBASE-22391) Fix flaky tests from TestFromClientSide
Xu Cang created HBASE-22391: --- Summary: Fix flaky tests from TestFromClientSide Key: HBASE-22391 URL: https://issues.apache.org/jira/browse/HBASE-22391 Project: HBase Issue Type: New Feature Components: test Affects Versions: 2.0.5, 3.0.0, 1.5.1 Reporter: Xu Cang Tests in TestFromClientSide.java are generally flaky because, after createTable, they do not wait for the table to be ready before adding data to it. Found this issue while working on HBASE-22274. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-22215) Backport MultiRowRangeFilter does not work with reverse scans
[ https://issues.apache.org/jira/browse/HBASE-22215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-22215. - Resolution: Fixed > Backport MultiRowRangeFilter does not work with reverse scans > - > > Key: HBASE-22215 > URL: https://issues.apache.org/jira/browse/HBASE-22215 > Project: HBase > Issue Type: Sub-task > Components: Filters >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Fix For: 1.5.0, 1.4.10 > > Attachments: HBASE-22215.001.branch-1.patch, HBASE-22215.001.patch > > > See parent. Modify and apply to 1.x lines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-22215) Backport MultiRowRangeFilter does not work with reverse scans
[ https://issues.apache.org/jira/browse/HBASE-22215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang reopened HBASE-22215: - > Backport MultiRowRangeFilter does not work with reverse scans > - > > Key: HBASE-22215 > URL: https://issues.apache.org/jira/browse/HBASE-22215 > Project: HBase > Issue Type: Sub-task > Components: Filters >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Fix For: 1.5.0, 1.4.10 > > Attachments: HBASE-22215.001.branch-1.patch, HBASE-22215.001.patch > > > See parent. Modify and apply to 1.x lines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22274) Cell size limit check on append should consider cell's previous size.
Xu Cang created HBASE-22274: --- Summary: Cell size limit check on append should consider cell's previous size. Key: HBASE-22274 URL: https://issues.apache.org/jira/browse/HBASE-22274 Project: HBase Issue Type: New Feature Reporter: Xu Cang We now have a cell size limit check based on the parameter *hbase.server.keyvalue.maxsize*. One case is missing: appending to a cell only takes the append op's cell size into account for this limit check. We should check against the potential final cell size after the append. It's easy to reproduce: apply this diff

{code:java}
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
index 5a285ef6ba..8633177ebe 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
@@ -6455,7 +6455,7 @@ public class TestFromClientSide {
       // expected
     }
     try {
-      t.append(new Append(ROW).addColumn(FAMILY, QUALIFIER, new byte[10 * 1024]));
+      t.append(new Append(ROW).addColumn(FAMILY, QUALIFIER, new byte[2 * 1024]));
       fail("Oversize cell failed to trigger exception");
     } catch (IOException e) {
       // expected
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
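The suggested fix, in spirit: apply the limit to the existing cell size plus the appended size, not to the appended delta alone. A minimal sketch, with a hypothetical helper name rather than the actual HRegion code:

```java
public class AppendSizeCheck {
    // Reject an append when the *resulting* cell would exceed the limit,
    // mirroring what hbase.server.keyvalue.maxsize is meant to enforce.
    public static boolean withinLimit(long existingCellSize, long appendSize, long maxSize) {
        return existingCellSize + appendSize <= maxSize;
    }

    public static void main(String[] args) {
        long max = 10 * 1024;
        // A 2 KB append to an already 9 KB cell should fail even though
        // 2 KB alone is under the limit: the missed case in the report.
        System.out.println(withinLimit(9 * 1024, 2 * 1024, max)); // false
        System.out.println(withinLimit(1024, 2 * 1024, max));     // true
    }
}
```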
[jira] [Created] (HBASE-22217) HBase shell command proposal : "rit assign all"
Xu Cang created HBASE-22217: --- Summary: HBase shell command proposal : "rit assign all" Key: HBASE-22217 URL: https://issues.apache.org/jira/browse/HBASE-22217 Project: HBase Issue Type: New Feature Reporter: Xu Cang HBase shell command proposal: "rit assign all". Currently we have the shell command "rit" to list all RITs. It would be handy to have a command "rit assign all" to assign all RITs. This is equivalent to getting the list of RITs from the 'rit' command and running "assign " on each of them one by one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22216) "Waiting on master failover to complete" shows 30 to 40 times per millisecond
Xu Cang created HBASE-22216: --- Summary: "Waiting on master failover to complete" shows 30 to 40 times per millisecond Key: HBASE-22216 URL: https://issues.apache.org/jira/browse/HBASE-22216 Project: HBase Issue Type: Bug Components: proc-v2 Affects Versions: 1.3.0 Reporter: Xu Cang "Waiting on master failover to complete" shows 30 to 40 times per millisecond from one host while the master is initializing. This message is too noisy; we need to fix this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
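One common way to tame such a message is to rate-limit it. This is an illustrative sketch under assumed names (RateLimitedLog is hypothetical, not the fix actually applied in HBase):

```java
public class RateLimitedLog {
    private long lastLogMillis;
    private final long intervalMillis;

    public RateLimitedLog(long intervalMillis) {
        this.intervalMillis = intervalMillis;
        // Start far enough in the past that the very first call logs.
        this.lastLogMillis = -intervalMillis;
    }

    // Returns true (and records the time) only if at least intervalMillis
    // have passed since the last emitted message; callers skip logging otherwise.
    public boolean shouldLog(long nowMillis) {
        if (nowMillis - lastLogMillis >= intervalMillis) {
            lastLogMillis = nowMillis;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RateLimitedLog limiter = new RateLimitedLog(1000);
        System.out.println(limiter.shouldLog(0));    // true
        System.out.println(limiter.shouldLog(5));    // false: within the same second
        System.out.println(limiter.shouldLog(1000)); // true
    }
}
```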
[jira] [Resolved] (HBASE-21752) Backport getProcedures() to branch-1 from branch-2 in HMaster class
[ https://issues.apache.org/jira/browse/HBASE-21752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-21752. - Resolution: Won't Fix > Backport getProcedures() to branch-1 from branch-2 in HMaster class > --- > > Key: HBASE-21752 > URL: https://issues.apache.org/jira/browse/HBASE-21752 > Project: HBase > Issue Type: Improvement > Environment: Backport getProcedures() to branch-1 from branch-2 in > HMaster class >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Minor > Fix For: 1.5.1 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21846) Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt
[ https://issues.apache.org/jira/browse/HBASE-21846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-21846. - Resolution: Resolved Release Note: test is not flaky anymore after the revert

> Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt
> ---------------------------------------------------------------
>
> Key: HBASE-21846
> URL: https://issues.apache.org/jira/browse/HBASE-21846
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.5.0
> Reporter: Xu Cang
> Assignee: Xu Cang
> Priority: Trivial
>
> Flaky test:
> [ERROR] TestFilterListOrOperatorWithBlkCnt.testMultiRowRangeWithFilterListOrOperatorWithBlkCnt:127 expected:<4> but was:<5>
> Added some debugging logs; test results below:
> 1028 2019-02-05 01:14:13,525 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(118): 0. blocksStart: 0
> 1029 2019-02-05 01:14:13,572 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(121): found 20 results
> 1030 2019-02-05 01:14:13,572 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(124): 1. Diff in number of blocks 3 blocksEnd is: 3 blocksStart: 0
> 1031 2019-02-05 01:14:13,573 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(129): 2. Diff in number of blocks 4 blocksEnd is: 4 blocksStart: 0
> 1032 2019-02-05 01:14:13,576 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(136): 3. Diff in number of blocks 5 blocksEnd is: 5 blocksStart: 0
> Basically, in my testing environment the scan with the filterList read 3 blocks, and the latter 2 scans read 1 block each.
> According to this flaky tests list: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html this test is always failing.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-22009) Improve RSGroupInfoManagerImpl#getDefaultServers()
[ https://issues.apache.org/jira/browse/HBASE-22009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang reopened HBASE-22009: - Re-opening for pending branch-1 fix.

> Improve RSGroupInfoManagerImpl#getDefaultServers()
> --------------------------------------------------
>
> Key: HBASE-22009
> URL: https://issues.apache.org/jira/browse/HBASE-22009
> Project: HBase
> Issue Type: Improvement
> Components: rsgroup
> Reporter: Xiang Li
> Assignee: Xiang Li
> Priority: Minor
> Fix For: 3.0.0, 2.2.0, 1.5.1, 2.2.1
>
> Attachments: HBASE-22009.master.000.patch, call_stack_getDefaultServers.png
>
> {code:title=RSGroupInfoManagerImpl.java|borderStyle=solid}
> private SortedSet<Address> getDefaultServers() throws IOException {
>   SortedSet<Address> defaultServers = Sets.newTreeSet();
>   for (ServerName serverName : getOnlineRS()) {
>     Address server = Address.fromParts(serverName.getHostname(), serverName.getPort());
>     boolean found = false;
>     for (RSGroupInfo rsgi : listRSGroups()) {
>       if (!RSGroupInfo.DEFAULT_GROUP.equals(rsgi.getName()) && rsgi.containsServer(server)) {
>         found = true;
>         break;
>       }
>     }
>     if (!found) {
>       defaultServers.add(server);
>     }
>   }
>   return defaultServers;
> }
> {code}
> That is two nested loops, and for each server, listRSGroups() allocates a new LinkedList and calls Map#values(), both of which are very heavy operations.
> Maybe the inner loop could be moved out, that is:
> # Build a list of servers belonging to groups other than the default group
> # Iterate over each online server and check whether it is in the list above; if it is not, it belongs to the default group.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
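The two-step improvement proposed in the issue above can be sketched as follows. Plain strings stand in for HBase's Address/ServerName types, and the class name is hypothetical; the point is building the non-default server set once instead of calling listRSGroups() inside the loop:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DefaultServers {
    // Step 1: collect servers owned by non-default groups once.
    // Step 2: a single pass over the online servers; anything not in
    // that set belongs to the default group.
    public static Set<String> getDefaultServers(List<String> onlineServers,
                                                Map<String, Set<String>> groupToServers) {
        Set<String> nonDefault = new HashSet<>();
        for (Map.Entry<String, Set<String>> e : groupToServers.entrySet()) {
            if (!"default".equals(e.getKey())) {
                nonDefault.addAll(e.getValue());
            }
        }
        Set<String> defaultServers = new TreeSet<>();
        for (String server : onlineServers) {
            if (!nonDefault.contains(server)) {
                defaultServers.add(server);
            }
        }
        return defaultServers;
    }

    public static void main(String[] args) {
        // rs2 belongs to group_a, so only rs1 and rs3 fall into the default group.
        System.out.println(getDefaultServers(
            List.of("rs1:16020", "rs2:16020", "rs3:16020"),
            Map.of("group_a", Set.of("rs2:16020"))));
    }
}
```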
[jira] [Created] (HBASE-22067) Fix log line in StochasticLoadBalancer when balancer is an ill-fit for cluster size
Xu Cang created HBASE-22067: --- Summary: Fix log line in StochasticLoadBalancer when balancer is an ill-fit for cluster size Key: HBASE-22067 URL: https://issues.apache.org/jira/browse/HBASE-22067 Project: HBase Issue Type: Bug Reporter: Xu Cang HBASE-21338 added log lines with load balancer warnings. There is a bug in one log line that uses the wrong parameter: 'maxRunningTime' is used where it should be 'maxSteps'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21952) Test Failure: TestClientOperationInterrupt.testInterrupt50Percent
Xu Cang created HBASE-21952: --- Summary: Test Failure: TestClientOperationInterrupt.testInterrupt50Percent Key: HBASE-21952 URL: https://issues.apache.org/jira/browse/HBASE-21952 Project: HBase Issue Type: Improvement Reporter: Xu Cang Fix For: 1.5.0

---
Test set: org.apache.hadoop.hbase.client.TestClientOperationInterrupt
---
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 51.861 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestClientOperationInterrupt
testInterrupt50Percent(org.apache.hadoop.hbase.client.TestClientOperationInterrupt) Time elapsed: 50.108 s <<< FAILURE!
java.lang.AssertionError: noEx: 53, badEx=0, noInt=0
at org.apache.hadoop.hbase.client.TestClientOperationInterrupt.testInterrupt50Percent(TestClientOperationInterrupt.java:149)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21848) Fix tests in TestRegionLocationCaching
[ https://issues.apache.org/jira/browse/HBASE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-21848. - Resolution: Fixed

> Fix tests in TestRegionLocationCaching
> --------------------------------------
>
> Key: HBASE-21848
> URL: https://issues.apache.org/jira/browse/HBASE-21848
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0
> Reporter: Xu Cang
> Assignee: Xu Cang
> Priority: Minor
>
> There are 4 flaky tests in TestRegionLocationCaching. They are in the flaky tests list too: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
>
> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiPut:133->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiplexerMultiPut:95->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiplexerSinglePut:73->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> [ERROR] TestRegionLocationCaching.testCachingForHTableSinglePut:116->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21848) Fix tests in TestRegionLocationCaching
Xu Cang created HBASE-21848: --- Summary: Fix tests in TestRegionLocationCaching Key: HBASE-21848 URL: https://issues.apache.org/jira/browse/HBASE-21848 Project: HBase Issue Type: Bug Affects Versions: 1.3.0 Reporter: Xu Cang There are 4 flaky tests in TestRegionLocationCaching. They are in the flaky tests list too: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21847) Fix test TestRegionServerMetrics#testRequestCount
Xu Cang created HBASE-21847: --- Summary: Fix test TestRegionServerMetrics#testRequestCount Key: HBASE-21847 URL: https://issues.apache.org/jira/browse/HBASE-21847 Project: HBase Issue Type: Bug Affects Versions: 1.3.0 Reporter: Xu Cang This test is also in the flaky test list: [ERROR] TestRegionServerMetrics.testRequestCount:137 Metrics Counters should be equal expected:<59> but was:<89> The failure is consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21846) Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt
Xu Cang created HBASE-21846: --- Summary: Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt Key: HBASE-21846 URL: https://issues.apache.org/jira/browse/HBASE-21846 Project: HBase Issue Type: Bug Affects Versions: 1.5.0, 1.3.0 Reporter: Xu Cang Flaky test:

[ERROR] TestFilterListOrOperatorWithBlkCnt.testMultiRowRangeWithFilterListOrOperatorWithBlkCnt:127 expected:<4> but was:<5>

Added some debugging logs; test results below:

1028 2019-02-05 01:14:13,525 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(118): 0. blocksStart: 0
1029 2019-02-05 01:14:13,572 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(121): found 20 results
1030 2019-02-05 01:14:13,572 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(124): 1. Diff in number of blocks 3 blocksEnd is: 3 blocksStart: 0
1031 2019-02-05 01:14:13,573 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(129): 2. Diff in number of blocks 4 blocksEnd is: 4 blocksStart: 0
1032 2019-02-05 01:14:13,576 INFO [main] filter.TestFilterListOrOperatorWithBlkCnt(136): 3. Diff in number of blocks 5 blocksEnd is: 5 blocksStart: 0

Basically, in my testing environment the scan with the filterList read 3 blocks, and the latter 2 scans read 1 block each. According to this flaky tests list: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html this test is always failing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21752) Backport getProcedures() to branch-1 from branch-2 in HMaster class
Xu Cang created HBASE-21752: --- Summary: Backport getProcedures() to branch-1 from branch-2 in HMaster class Key: HBASE-21752 URL: https://issues.apache.org/jira/browse/HBASE-21752 Project: HBase Issue Type: Improvement Environment: Backport getProcedures() to branch-1 from branch-2 in HMaster class Reporter: Xu Cang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21553) schedLock not released in MasterProcedureScheduler
Xu Cang created HBASE-21553: --- Summary: schedLock not released in MasterProcedureScheduler Key: HBASE-21553 URL: https://issues.apache.org/jira/browse/HBASE-21553 Project: HBase Issue Type: Improvement Reporter: Xu Cang https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21552) Backport HBASE-16735 (Procedure v2 - Fix yield while holding locks) to branch-1
Xu Cang created HBASE-21552: --- Summary: Backport HBASE-16735 (Procedure v2 - Fix yield while holding locks) to branch-1 Key: HBASE-21552 URL: https://issues.apache.org/jira/browse/HBASE-21552 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: Xu Cang Assignee: Xu Cang Attachments: Screen Shot 2018-12-05 at 4.34.05 PM.png Please see the screenshot for the stack trace. We met this issue in production: many createNamespaceProcedures could not proceed. After some debugging and JIRA digging, I think HBASE-16735 addressed this issue. It fixed the issue that a WAITING procedure fails to be added back to the runQueue, but that change wasn't ported to branch-1. I am creating this JIRA to backport it to branch-1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21224) Handle compaction queue duplication
Xu Cang created HBASE-21224: --- Summary: Handle compaction queue duplication Key: HBASE-21224 URL: https://issues.apache.org/jira/browse/HBASE-21224 Project: HBase Issue Type: Improvement Components: Compaction Reporter: Xu Cang [~allan163] mentioned in https://issues.apache.org/jira/browse/HBASE-18451 that we may want to handle compaction queue duplication in a separate Jira. Creating this item for further assessment and discussion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21117) Backport HBASE-18350 (RSGroups are broken under AMv2) to branch-1
Xu Cang created HBASE-21117: --- Summary: Backport HBASE-18350 (RSGroups are broken under AMv2) to branch-1 Key: HBASE-21117 URL: https://issues.apache.org/jira/browse/HBASE-21117 Project: HBase Issue Type: Bug Components: backport, rsgroup, shell Affects Versions: 1.3.2 Reporter: Xu Cang Assignee: Xu Cang While working on HBASE-20666, I found out HBASE-18350 did not get ported to branch-1, which sometimes causes procedures to hang when #moveTables is called. After looking into the 18350 patch, it seems important since it fixes 4 issues. This Jira is an attempt to backport it to branch-1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21066) Improve isTableState() method to ensure caller gets correct info
[ https://issues.apache.org/jira/browse/HBASE-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang resolved HBASE-21066. - Resolution: Won't Fix

> Improve isTableState() method to ensure caller gets correct info
> ----------------------------------------------------------------
>
> Key: HBASE-21066
> URL: https://issues.apache.org/jira/browse/HBASE-21066
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 1.3.0, 2.0.0
> Reporter: Xu Cang
> Priority: Minor
> Attachments: HBASE-21066.master.001.patch, HBASE-21066.master.002.patch
>
> {code:java}
> public boolean isTableState(TableName tableName, TableState.State... states) {
>   try {
>     TableState tableState = getTableState(tableName);
>     return tableState.isInStates(states);
>   } catch (IOException e) {
>     LOG.error("Unable to get table " + tableName + " state", e);
>     // XXX: is it safe to just return false here?
>     return false;
>   }
> }
> {code}
>
> When we cannot get the table state, returning false is not always safe or correct.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21066) Improve isTableState() method to ensure caller gets correct info
Xu Cang created HBASE-21066: --- Summary: Improve isTableState() method to ensure caller gets correct info Key: HBASE-21066 URL: https://issues.apache.org/jira/browse/HBASE-21066 Project: HBase Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Xu Cang

{code:java}
public boolean isTableState(TableName tableName, TableState.State... states) {
  try {
    TableState tableState = getTableState(tableName);
    return tableState.isInStates(states);
  } catch (IOException e) {
    LOG.error("Unable to get table " + tableName + " state", e);
    // XXX: is it safe to just return false here?
    return false;
  }
}
{code}

When we cannot get the table state, returning false is not always safe or correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
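One alternative to returning false on IOException is to make the unknown case explicit, e.g. with an Optional, so callers cannot confuse "state lookup failed" with "table is not in that state". This is a hypothetical sketch with simplified types, not HBase's actual API:

```java
import java.io.IOException;
import java.util.Optional;

public class TableStateCheck {
    public enum State { ENABLED, DISABLED }

    public interface StateSource {
        State getTableState(String tableName) throws IOException;
    }

    // Instead of mapping an IOException to "false" (indistinguishable from a
    // real negative answer), return Optional.empty() so the caller must
    // handle the unknown case explicitly.
    public static Optional<Boolean> isTableState(StateSource source, String table, State expected) {
        try {
            return Optional.of(source.getTableState(table) == expected);
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        StateSource ok = t -> State.ENABLED;
        StateSource failing = t -> { throw new IOException("meta unreachable"); };
        System.out.println(isTableState(ok, "t1", State.ENABLED));      // Optional[true]
        System.out.println(isTableState(failing, "t1", State.ENABLED)); // Optional.empty
    }
}
```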
[jira] [Reopened] (HBASE-20928) Rewrite calculation of midpoint in binarySearch functions to prevent overflow
[ https://issues.apache.org/jira/browse/HBASE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xu Cang reopened HBASE-20928: - > Rewrite calculation of midpoint in binarySearch functions to prevent overflow > - > > Key: HBASE-20928 > URL: https://issues.apache.org/jira/browse/HBASE-20928 > Project: HBase > Issue Type: Bug > Components: io >Reporter: saurabh singh >Assignee: saurabh singh >Priority: Minor > Fix For: 2.2.0 > > Attachments: HBASE-20928-addendum.patch, > HBASE-20928-fix-binarySearch-v5.patch, HBASE-20928-fix-binarySearch-v5.patch > > > There are couple of issues in the function: > * {{>>>}} operator would mess the values if {{low}} + {{high}} end up being > negative. This shouldn't happen but I don't see anything to prevent this from > happening. > * The code fails around boundary values of {{low}} and {{high}}. This is a > well known binary search catch. > [https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html] > > Most of the code should already be covered by tests. I would have liked to > add a test that actually fails without the fix but given these are private > methods I am not sure on the best place to add the test. Suggestions? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
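The boundary bug described in this issue is the classic midpoint overflow: for large indices, low + high can exceed Integer.MAX_VALUE and wrap negative. A self-contained demonstration of the failure mode and the standard fix of computing the midpoint from the difference (illustrative of the problem class; the class name is hypothetical, not the patched HBase code):

```java
public class MidpointDemo {
    // Overflow-prone: low + high can exceed Integer.MAX_VALUE and wrap negative.
    public static int midNaive(int low, int high) {
        return (low + high) / 2;
    }

    // Safe: high - low never overflows when 0 <= low <= high.
    public static int midSafe(int low, int high) {
        return low + ((high - low) >>> 1);
    }

    public static void main(String[] args) {
        int low = Integer.MAX_VALUE - 1, high = Integer.MAX_VALUE;
        System.out.println(midNaive(low, high)); // negative: the sum overflowed
        System.out.println(midSafe(low, high));  // 2147483646
    }
}
```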
[jira] [Created] (HBASE-20925) Canary test to expose results per table
Xu Cang created HBASE-20925: --- Summary: Canary test to expose results per table Key: HBASE-20925 URL: https://issues.apache.org/jira/browse/HBASE-20925 Project: HBase Issue Type: Improvement Components: canary Reporter: Xu Cang Canary test to expose results per table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20858) port HBASE-20695 to branch-1
Xu Cang created HBASE-20858: --- Summary: port HBASE-20695 to branch-1 Key: HBASE-20858 URL: https://issues.apache.org/jira/browse/HBASE-20858 Project: HBase Issue Type: Improvement Reporter: Xu Cang port HBASE-20695 to branch-1 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20695) Implement table level RegionServer replication metrics
Xu Cang created HBASE-20695: --- Summary: Implement table level RegionServer replication metrics Key: HBASE-20695 URL: https://issues.apache.org/jira/browse/HBASE-20695 Project: HBase Issue Type: Improvement Components: metrics Reporter: Xu Cang Assignee: Xu Cang Region server replication metrics are now mainly global. It would be nice to have table-level metrics, such as a table-level source.AgeOfLastShippedOp, to indicate to operators which table's replication is lagging behind. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
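A minimal sketch of what tracking a per-table age-of-last-shipped-op could look like. All names here (TableReplicationMetrics, onEntryShipped) are hypothetical illustrations, not the HBase metrics API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TableReplicationMetrics {
    // table name -> write time of the newest WAL entry shipped for that table.
    private final Map<String, Long> lastShippedTs = new ConcurrentHashMap<>();

    // Called by the replication source whenever an entry for `table` ships.
    public void onEntryShipped(String table, long entryWriteTimeMillis) {
        lastShippedTs.merge(table, entryWriteTimeMillis, Math::max);
    }

    // Per-table ageOfLastShippedOp: how far behind replication is for one table.
    public long ageOfLastShippedOp(String table, long nowMillis) {
        Long ts = lastShippedTs.get(table);
        return ts == null ? 0 : nowMillis - ts;
    }

    public static void main(String[] args) {
        TableReplicationMetrics m = new TableReplicationMetrics();
        m.onEntryShipped("t1", 1_000);
        m.onEntryShipped("t2", 9_000);
        // t1's replication is 9 s behind; t2's is only 1 s behind.
        System.out.println(m.ageOfLastShippedOp("t1", 10_000)); // 9000
        System.out.println(m.ageOfLastShippedOp("t2", 10_000)); // 1000
    }
}
```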