[jira] [Created] (HBASE-24028) MapReduce on snapshot restores and opens all regions in each mapper

2020-03-20 Thread Xu Cang (Jira)
Xu Cang created HBASE-24028:
---

 Summary: MapReduce on snapshot restores and opens all regions in 
each mapper
 Key: HBASE-24028
 URL: https://issues.apache.org/jira/browse/HBASE-24028
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.6.0, 2.3.0
Reporter: Xu Cang


Consider this scenario: one MR job scans a table (with many regions). Each 
mapper uses 'RestoreSnapshotHelper' to restore the snapshot for all regions. 

In the code 
[https://github.com/apache/hbase/blob/branch-2.0/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L183]

there seems to be no way to restore only the regions relevant to a given mapper.

This leads to extreme slowness and wasted resources. 

Please correct me if I am wrong or missing anything. Thanks.

 

One quick example is shown below: in my test there are 2 regions in a 
test table, and each mapper opens and iterates over both regions. 

2020-03-19 18:58:15,225 INFO [main] mapred.MapTask - Map output collector class 
= org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2020-03-19 18:58:15,285 INFO [main] snapshot.RestoreSnapshotHelper - region to 
add: *d7f85b4a9d3fa22a5e7b88bda39f6d50*
2020-03-19 18:58:15,285 INFO [main] snapshot.RestoreSnapshotHelper - region to 
add: *69dd3fdba3698f827f8883ed911161ef*
2020-03-19 18:58:15,286 INFO [main] snapshot.RestoreSnapshotHelper - clone 
region=d7f85b4a9d3fa22a5e7b88bda39f6d50 as d7f85b4a9d3fa22a5e7b88bda39f6d50

 

If I have misunderstood anything, can anyone point me to where in this class 
the regions to process are distinguished for different mappers? 
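
One possible direction, sketched under stated assumptions: before restoring, keep only the regions whose key range overlaps the mapper's input split. `SplitRegionFilter`, `RegionRange`, `overlaps` and `regionsForSplit` are hypothetical names and are not part of `RestoreSnapshotHelper`. Keys are modelled as plain strings, and an empty string stands for an unbounded start or end key.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitRegionFilter {

  /** Hypothetical stand-in for a region and its [startKey, endKey) range. */
  static final class RegionRange {
    final String name;
    final String startKey; // "" means no lower bound
    final String endKey;   // "" means no upper bound

    RegionRange(String name, String startKey, String endKey) {
      this.name = name;
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** True if the half-open ranges [aStart, aEnd) and [bStart, bEnd) intersect. */
  static boolean overlaps(String aStart, String aEnd, String bStart, String bEnd) {
    boolean aStartsBeforeBEnds = bEnd.isEmpty() || aStart.compareTo(bEnd) < 0;
    boolean bStartsBeforeAEnds = aEnd.isEmpty() || bStart.compareTo(aEnd) < 0;
    return aStartsBeforeBEnds && bStartsBeforeAEnds;
  }

  /** Keeps only the regions that a split covering [splitStart, splitEnd) needs. */
  static List<RegionRange> regionsForSplit(
      List<RegionRange> allRegions, String splitStart, String splitEnd) {
    List<RegionRange> relevant = new ArrayList<>();
    for (RegionRange region : allRegions) {
      if (overlaps(region.startKey, region.endKey, splitStart, splitEnd)) {
        relevant.add(region);
      }
    }
    return relevant;
  }
}
```

With two regions split at key "m", a mapper whose split covers ["a", "f") would only need the first region under this model.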

 

By the way, the original implementation of MR on snapshots is HBASE-8369; 
there have not been many big changes since then. 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23143) Region Server Crash due to 2 cells out of order ( between 2 DELETEs)

2019-10-09 Thread Xu Cang (Jira)
Xu Cang created HBASE-23143:
---

 Summary: Region Server Crash due to 2 cells out of order ( between 
2 DELETEs)
 Key: HBASE-23143
 URL: https://issues.apache.org/jira/browse/HBASE-23143
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.2
Reporter: Xu Cang


Region Server Crash due to 2 cells out of order ( between 2 DELETEs)

 

Caused by: java.io.IOException: Added a key not lexically larger than previous.
 Current cell = 
00D7F00xxQ10D52v8UY6yV0057F00bPaGT\x00057F00bPaG/0:TABLE1_ID/*1570095189597*/DeleteColumn/vlen=0/seqid=*2128373*,
 
 lastCell = 
00D7F00xxQ10D52v8UY6yV0057F00bPaGT\x00057F00bPaG/0:TABLE1_ID/*1570095165147*/DeleteColumn/vlen=0/seqid=*2128378*
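
To illustrate why these two cells trip the ordering check: HBase sorts cells of the same row and column by descending timestamp, so the cell with the newer timestamp must come first. The sketch below is a simplified model that compares timestamps only (it ignores type and sequence id); `CellOrderCheck` and its methods are hypothetical names, not HBase API.

```java
final class CellOrderCheck {

  /**
   * Simplified ordering for two cells with identical row, family, qualifier
   * and type: the newer timestamp sorts first (descending timestamp order).
   */
  static int compareSameColumn(long timestampA, long timestampB) {
    return Long.compare(timestampB, timestampA);
  }

  /** True if writing currentTs after lastTs violates the sort order. */
  static boolean isOutOfOrder(long lastTs, long currentTs) {
    return compareSameColumn(currentTs, lastTs) < 0;
  }
}
```

With the timestamps from the exception above, `isOutOfOrder(1570095165147L, 1570095189597L)` is true: the current cell is newer than lastCell, so under this model it should have been written before it.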

 

 

I am aware of https://issues.apache.org/jira/browse/HBASE-22862,

but this is slightly different: this issue is not caused by one Delete and one Put.

The issue I am seeing is caused by 2 Deletes.

 

Has anyone seen this issue? 





[jira] [Resolved] (HBASE-22804) Provide an API to get list of successful regions and total expected regions in Canary

2019-09-16 Thread Xu Cang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-22804.
-
Fix Version/s: 1.4.12
   2.2.2
   2.1.7
   1.3.6
   2.3.1
   3.0.0
   Resolution: Fixed

> Provide an API to get list of successful regions and total expected regions 
> in Canary
> -
>
> Key: HBASE-22804
> URL: https://issues.apache.org/jira/browse/HBASE-22804
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0, 2.1.5, 2.2.1
>Reporter: Caroline
>Assignee: Caroline
>Priority: Minor
>  Labels: Canary
> Fix For: 3.0.0, 1.5.0, 2.3.1, 1.3.6, 2.1.7, 2.2.2, 1.4.12
>
> Attachments: HBASE-22804.branch-1.001.patch, 
> HBASE-22804.branch-1.002.patch, HBASE-22804.branch-1.003.patch, 
> HBASE-22804.branch-1.004.patch, HBASE-22804.branch-1.005.patch, 
> HBASE-22804.branch-1.006.patch, HBASE-22804.branch-1.007.patch, 
> HBASE-22804.branch-1.008.patch, HBASE-22804.branch-1.009.patch, 
> HBASE-22804.branch-1.009.patch, HBASE-22804.branch-1.010.patch, 
> HBASE-22804.branch-2.001.patch, HBASE-22804.branch-2.002.patch, 
> HBASE-22804.branch-2.003.patch, HBASE-22804.branch-2.004.patch, 
> HBASE-22804.branch-2.005.patch, HBASE-22804.branch-2.006.patch, 
> HBASE-22804.master.001.patch, HBASE-22804.master.002.patch, 
> HBASE-22804.master.003.patch, HBASE-22804.master.004.patch, 
> HBASE-22804.master.005.patch, HBASE-22804.master.006.patch
>
>
> At present HBase Canary tool only prints the successes as part of logs. 
> Providing an API to get the list of successes, as well as total number of 
> expected regions, will make it easier to get a more accurate availability 
> estimate.
>   





[jira] [Created] (HBASE-22775) Enhance logging for peer related operations

2019-07-31 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22775:
---

 Summary: Enhance logging for peer related operations
 Key: HBASE-22775
 URL: https://issues.apache.org/jira/browse/HBASE-22775
 Project: HBase
  Issue Type: Improvement
Reporter: Xu Cang


Currently we don't have good logging for peer operations. For example, addPeer 
does not log itself:

[https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationPeerStorage.java#L102]

This Jira aims to improve logging in this area.





[jira] [Created] (HBASE-22391) Fix flaky tests from TestFromClientSide

2019-05-09 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22391:
---

 Summary: Fix flaky tests from TestFromClientSide
 Key: HBASE-22391
 URL: https://issues.apache.org/jira/browse/HBASE-22391
 Project: HBase
  Issue Type: New Feature
  Components: test
Affects Versions: 2.0.5, 3.0.0, 1.5.1
Reporter: Xu Cang


Tests in TestFromClientSide.java are generally flaky because, after 
createTable, they do not wait for the table to be ready before adding data 
to it.
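
A minimal sketch of the kind of wait that is missing, assuming a generic polling helper rather than any specific HBase test utility. `WaitFor.waitFor` and the condition passed to it are hypothetical; in a real test the condition would be whatever probe reports that the table is available.

```java
import java.util.function.BooleanSupplier;

final class WaitFor {

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true in time, false otherwise.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

A test would call something like `waitFor(30000, 100, tableIsReady)` right after createTable and fail fast if it returns false, instead of writing to a table that may not be online yet.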

Found this issue when working on HBASE-22274

 





[jira] [Resolved] (HBASE-22215) Backport MultiRowRangeFilter does not work with reverse scans

2019-04-24 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-22215.
-
Resolution: Fixed

> Backport MultiRowRangeFilter does not work with reverse scans
> -
>
> Key: HBASE-22215
> URL: https://issues.apache.org/jira/browse/HBASE-22215
> Project: HBase
>  Issue Type: Sub-task
>  Components: Filters
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-22215.001.branch-1.patch, HBASE-22215.001.patch
>
>
> See parent. Modify and apply to 1.x lines.





[jira] [Reopened] (HBASE-22215) Backport MultiRowRangeFilter does not work with reverse scans

2019-04-24 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang reopened HBASE-22215:
-

> Backport MultiRowRangeFilter does not work with reverse scans
> -
>
> Key: HBASE-22215
> URL: https://issues.apache.org/jira/browse/HBASE-22215
> Project: HBase
>  Issue Type: Sub-task
>  Components: Filters
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-22215.001.branch-1.patch, HBASE-22215.001.patch
>
>
> See parent. Modify and apply to 1.x lines.





[jira] [Created] (HBASE-22274) Cell size limit check on append should consider cell's previous size.

2019-04-19 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22274:
---

 Summary: Cell size limit check on append should consider cell's 
previous size.
 Key: HBASE-22274
 URL: https://issues.apache.org/jira/browse/HBASE-22274
 Project: HBase
  Issue Type: New Feature
Reporter: Xu Cang


We now have a cell size limit check based on the parameter 
*hbase.server.keyvalue.maxsize*.

One case is missing: when appending to a cell, only the append op's cell size 
is checked against this limit. We should check the potential final 
cell size after the append.

This is easy to reproduce:

 

Apply this diff

 
{code:java}
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
index 5a285ef6ba..8633177ebe 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
@@ -6455,7 +6455,7 @@ public class TestFromClientSide {
       // expected
     }
     try {
-      t.append(new Append(ROW).addColumn(FAMILY, QUALIFIER, new byte[10 * 1024]));
+      t.append(new Append(ROW).addColumn(FAMILY, QUALIFIER, new byte[2 * 1024]));
       fail("Oversize cell failed to trigger exception");
     } catch (IOException e) {
       // expected
{code}
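
The proposed check can be sketched as follows. This is a simplified illustration, not the server code: it counts value bytes only, `AppendSizeCheck.exceedsLimit` is a hypothetical helper, and treating a non-positive limit as "check disabled" is an assumption of this sketch.

```java
final class AppendSizeCheck {

  /**
   * Returns true if the cell would exceed maxCellSize after the append,
   * taking the bytes already stored in the cell into account.
   * A non-positive maxCellSize means the check is disabled (assumption).
   */
  static boolean exceedsLimit(long existingValueLen, long appendValueLen, long maxCellSize) {
    if (maxCellSize <= 0) {
      return false;
    }
    return existingValueLen + appendValueLen > maxCellSize;
  }
}
```

For example, with a 10 KB limit, appending 2 KB to a cell that already holds 9 KB passes a check that looks at the append alone, but `exceedsLimit(9 * 1024, 2 * 1024, 10 * 1024)` is true.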
 





[jira] [Created] (HBASE-22217) HBase shell command proposal : "rit assign all"

2019-04-11 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22217:
---

 Summary: HBase shell command proposal : "rit assign all" 
 Key: HBASE-22217
 URL: https://issues.apache.org/jira/browse/HBASE-22217
 Project: HBase
  Issue Type: New Feature
Reporter: Xu Cang


HBase shell command proposal : "rit assign all" 

 

Currently we have the shell command "rit" to list all RITs.

It would be handy to have a command "rit assign all" to assign all RITs.

This is equivalent to getting the list of RITs from the 'rit' command and 
running "assign" on each of them one by one.

 





[jira] [Created] (HBASE-22216) "Waiting on master failover to complete" shows 30 to 40 time per millisecond

2019-04-11 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22216:
---

 Summary: "Waiting on master failover to complete" shows 30 to 40 
time per millisecond 
 Key: HBASE-22216
 URL: https://issues.apache.org/jira/browse/HBASE-22216
 Project: HBase
  Issue Type: Bug
  Components: proc-v2
Affects Versions: 1.3.0
Reporter: Xu Cang


"Waiting on master failover to complete" is logged 30 to 40 times per millisecond 
from one host while the master is initializing. 

This message is too noisy and needs to be fixed. 
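
One common way to tame a message like this is to rate-limit it. The sketch below is an illustration under assumptions, not the fix that was applied: `ThrottledLogger` is a hypothetical helper, and the caller passes in the current time (for example from `System.nanoTime()`).

```java
final class ThrottledLogger {
  private final long minIntervalNanos;
  private long lastLogNanos;
  private boolean loggedOnce;

  ThrottledLogger(long minIntervalMs) {
    this.minIntervalNanos = minIntervalMs * 1_000_000L;
  }

  /**
   * Returns true if the caller should emit the message now, i.e. if it has
   * never been logged or the minimum interval has passed since the last time.
   */
  synchronized boolean shouldLog(long nowNanos) {
    if (!loggedOnce || nowNanos - lastLogNanos >= minIntervalNanos) {
      loggedOnce = true;
      lastLogNanos = nowNanos;
      return true;
    }
    return false;
  }
}
```

The logging call site would then be guarded with `if (throttle.shouldLog(System.nanoTime())) { ... }`, so a tight retry loop emits the message at most once per interval.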





[jira] [Resolved] (HBASE-21752) Backport getProcedures() to branch-1 from branch-2 in HMaster class

2019-03-26 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-21752.
-
Resolution: Won't Fix

> Backport getProcedures() to branch-1 from branch-2 in HMaster class
> ---
>
> Key: HBASE-21752
> URL: https://issues.apache.org/jira/browse/HBASE-21752
> Project: HBase
>  Issue Type: Improvement
> Environment: Backport getProcedures() to branch-1 from branch-2 in 
> HMaster class
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Minor
> Fix For: 1.5.1
>
>






[jira] [Resolved] (HBASE-21846) Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt

2019-03-26 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-21846.
-
  Resolution: Resolved
Release Note: test is not flaky anymore after the revert

> Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt
> ---
>
> Key: HBASE-21846
> URL: https://issues.apache.org/jira/browse/HBASE-21846
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.5.0
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Trivial
>
> Flaky test:
> [ERROR]   
> TestFilterListOrOperatorWithBlkCnt.testMultiRowRangeWithFilterListOrOperatorWithBlkCnt:127
>  expected:<4> but was:<5>
> Added some debugging logs and test result as below:
> 1028 2019-02-05 01:14:13,525 INFO  [main] 
> filter.TestFilterListOrOperatorWithBlkCnt(118): 0. blocksStart: 0
> 1029 2019-02-05 01:14:13,572 INFO  [main] 
> filter.TestFilterListOrOperatorWithBlkCnt(121): found 20 results
> 1030 2019-02-05 01:14:13,572 INFO  [main] 
> filter.TestFilterListOrOperatorWithBlkCnt(124): 1. Diff in number of blocks 3 
> blocksEnd is: 3 blocksStart: 0
> 1031 2019-02-05 01:14:13,573 INFO  [main] 
> filter.TestFilterListOrOperatorWithBlkCnt(129): 2. Diff in number of blocks 4 
> blocksEnd is: 4 blocksStart: 0
> 1032 2019-02-05 01:14:13,576 INFO  [main] 
> filter.TestFilterListOrOperatorWithBlkCnt(136): 3. Diff in number of blocks 5 
> blocksEnd is: 5 blocksStart: 0
> Basically, in my testing environment the scan with filterList read 3 blocks. 
> The latter 2 scans read 1 block each. 
> According to this flaky tests list: 
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
> this test is always failing.





[jira] [Reopened] (HBASE-22009) Improve RSGroupInfoManagerImpl#getDefaultServers()

2019-03-19 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang reopened HBASE-22009:
-

Re-opening for pending branch-1 fix.

> Improve RSGroupInfoManagerImpl#getDefaultServers()
> --
>
> Key: HBASE-22009
> URL: https://issues.apache.org/jira/browse/HBASE-22009
> Project: HBase
>  Issue Type: Improvement
>  Components: rsgroup
>Reporter: Xiang Li
>Assignee: Xiang Li
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 1.5.1, 2.2.1
>
> Attachments: HBASE-22009.master.000.patch, 
> call_stack_getDefaultServers.png
>
>
> {code:title=RSGroupInfoManagerImpl.java|borderStyle=solid}
> private SortedSet<Address> getDefaultServers() throws IOException {
>   SortedSet<Address> defaultServers = Sets.newTreeSet();
>   for (ServerName serverName : getOnlineRS()) {
>     Address server = Address.fromParts(serverName.getHostname(), serverName.getPort());
>     boolean found = false;
>     for (RSGroupInfo rsgi : listRSGroups()) {
>       if (!RSGroupInfo.DEFAULT_GROUP.equals(rsgi.getName()) && rsgi.containsServer(server)) {
>         found = true;
>         break;
>       }
>     }
>     if (!found) {
>       defaultServers.add(server);
>     }
>   }
>   return defaultServers;
> }
> {code}
> This logic has 2 nested loops. For each server, listRSGroups() 
> allocates a new LinkedList and calls Map#values(), both of which are very 
> heavy operations.
> Maybe the inner loop could be moved out, that is:
> # Build a list of the servers that belong to groups other than the default group
> # Iterate over each online server and check whether it is in the list above. If it is 
> not, it belongs to the default group.
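
The two steps proposed above can be sketched as follows. This is a self-contained illustration, not the committed patch: servers are modelled as plain "host:port" strings, the groups as a map from group name to member servers, and `DefaultServersSketch` is a hypothetical class. A hash set is used for step 1 so that the membership test in step 2 is constant time.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

final class DefaultServersSketch {
  // Assumed name of the default group for this sketch.
  static final String DEFAULT_GROUP = "default";

  static SortedSet<String> defaultServers(
      Collection<String> onlineServers, Map<String, Set<String>> groups) {
    // Step 1: collect, once, every server that belongs to a non-default group.
    Set<String> serversInOtherGroups = new HashSet<>();
    for (Map.Entry<String, Set<String>> group : groups.entrySet()) {
      if (!DEFAULT_GROUP.equals(group.getKey())) {
        serversInOtherGroups.addAll(group.getValue());
      }
    }
    // Step 2: an online server not claimed by another group is a default server.
    SortedSet<String> result = new TreeSet<>();
    for (String server : onlineServers) {
      if (!serversInOtherGroups.contains(server)) {
        result.add(server);
      }
    }
    return result;
  }
}
```

Compared with the nested loops, the group list is walked once rather than once per online server.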





[jira] [Created] (HBASE-22067) Fix log line in StochasticLoadBalancer when balancer is an ill-fit for cluster size

2019-03-19 Thread Xu Cang (JIRA)
Xu Cang created HBASE-22067:
---

 Summary: Fix log line in StochasticLoadBalancer when balancer is 
an ill-fit for cluster size
 Key: HBASE-22067
 URL: https://issues.apache.org/jira/browse/HBASE-22067
 Project: HBase
  Issue Type: Bug
Reporter: Xu Cang


HBASE-21338 added log lines for load balancer warnings. There is a bug in 
one log line: it uses the wrong parameter.
'maxRunningTime' is used where maxSteps should be.






[jira] [Created] (HBASE-21952) Test Failure: TestClientOperationInterrupt.testInterrupt50Percent

2019-02-25 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21952:
---

 Summary: Test Failure: 
TestClientOperationInterrupt.testInterrupt50Percent
 Key: HBASE-21952
 URL: https://issues.apache.org/jira/browse/HBASE-21952
 Project: HBase
  Issue Type: Improvement
Reporter: Xu Cang
 Fix For: 1.5.0


---
Test set: org.apache.hadoop.hbase.client.TestClientOperationInterrupt
---
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 51.861 s <<< 
FAILURE! - in org.apache.hadoop.hbase.client.TestClientOperationInterrupt
testInterrupt50Percent(org.apache.hadoop.hbase.client.TestClientOperationInterrupt)
  Time elapsed: 50.108 s  <<< FAILURE!
java.lang.AssertionError:  noEx: 53, badEx=0, noInt=0
at 
org.apache.hadoop.hbase.client.TestClientOperationInterrupt.testInterrupt50Percent(TestClientOperationInterrupt.java:149)





[jira] [Resolved] (HBASE-21848) Fix tests in TestRegionLocationCaching

2019-02-05 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-21848.
-
Resolution: Fixed

> Fix tests in TestRegionLocationCaching 
> ---
>
> Key: HBASE-21848
> URL: https://issues.apache.org/jira/browse/HBASE-21848
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Minor
>
> There are 4 flaky tests in TestRegionLocationCaching.
> They are in flaky tests list too: 
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
>  
> [ERROR]   
> TestRegionLocationCaching.testCachingForHTableMultiPut:133->checkRegionLocationIsCached:148
>  Expected non-zero number of cached region locations. Actual: 0
> [ERROR]   
> TestRegionLocationCaching.testCachingForHTableMultiplexerMultiPut:95->checkRegionLocationIsCached:148
>  Expected non-zero number of cached region locations. Actual: 0
> [ERROR]   
> TestRegionLocationCaching.testCachingForHTableMultiplexerSinglePut:73->checkRegionLocationIsCached:148
>  Expected non-zero number of cached region locations. Actual: 0
> [ERROR]   
> TestRegionLocationCaching.testCachingForHTableSinglePut:116->checkRegionLocationIsCached:148
>  Expected non-zero number of cached region locations. Actual: 0





[jira] [Created] (HBASE-21848) Fix tests in TestRegionLocationCaching

2019-02-05 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21848:
---

 Summary: Fix tests in TestRegionLocationCaching 
 Key: HBASE-21848
 URL: https://issues.apache.org/jira/browse/HBASE-21848
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Xu Cang


There are 4 flaky tests in TestRegionLocationCaching.
They are in flaky tests list too: 
https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
 






[jira] [Created] (HBASE-21847) Fix test TestRegionServerMetrics#testRequestCount

2019-02-05 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21847:
---

 Summary: Fix test  TestRegionServerMetrics#testRequestCount
 Key: HBASE-21847
 URL: https://issues.apache.org/jira/browse/HBASE-21847
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Xu Cang


This test is also in flaky test list:
[ERROR]   TestRegionServerMetrics.testRequestCount:137 Metrics Counters should 
be equal expected:<59> but was:<89>
The failure is consistent.





[jira] [Created] (HBASE-21846) Flaky Test: testMultiRowRangeWithFilterListOrOperatorWithBlkCnt

2019-02-05 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21846:
---

 Summary: Flaky Test: 
testMultiRowRangeWithFilterListOrOperatorWithBlkCnt
 Key: HBASE-21846
 URL: https://issues.apache.org/jira/browse/HBASE-21846
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.5.0, 1.3.0
Reporter: Xu Cang


Flaky test:
[ERROR]   
TestFilterListOrOperatorWithBlkCnt.testMultiRowRangeWithFilterListOrOperatorWithBlkCnt:127
 expected:<4> but was:<5>


Added some debugging logs and test result as below:
1028 2019-02-05 01:14:13,525 INFO  [main] 
filter.TestFilterListOrOperatorWithBlkCnt(118): 0. blocksStart: 0
1029 2019-02-05 01:14:13,572 INFO  [main] 
filter.TestFilterListOrOperatorWithBlkCnt(121): found 20 results
1030 2019-02-05 01:14:13,572 INFO  [main] 
filter.TestFilterListOrOperatorWithBlkCnt(124): 1. Diff in number of blocks 3 
blocksEnd is: 3 blocksStart: 0
1031 2019-02-05 01:14:13,573 INFO  [main] 
filter.TestFilterListOrOperatorWithBlkCnt(129): 2. Diff in number of blocks 4 
blocksEnd is: 4 blocksStart: 0
1032 2019-02-05 01:14:13,576 INFO  [main] 
filter.TestFilterListOrOperatorWithBlkCnt(136): 3. Diff in number of blocks 5 
blocksEnd is: 5 blocksStart: 0

Basically, in my testing environment the scan with filterList read 3 blocks. 
The latter 2 scans read 1 block each. 

According to this flaky tests list: 
https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
this test is always failing.







[jira] [Created] (HBASE-21752) Backport getProcedures() to branch-1 from branch-2 in HMaster class

2019-01-21 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21752:
---

 Summary: Backport getProcedures() to branch-1 from branch-2 in 
HMaster class
 Key: HBASE-21752
 URL: https://issues.apache.org/jira/browse/HBASE-21752
 Project: HBase
  Issue Type: Improvement
 Environment: Backport getProcedures() to branch-1 from branch-2 in 
HMaster class
Reporter: Xu Cang








[jira] [Created] (HBASE-21553) schedLock not released ni MasterProcedureScheduler

2018-12-05 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21553:
---

 Summary: schedLock not released ni MasterProcedureScheduler
 Key: HBASE-21553
 URL: https://issues.apache.org/jira/browse/HBASE-21553
 Project: HBase
  Issue Type: Improvement
Reporter: Xu Cang


https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Created] (HBASE-21552) backport HBASE-16735(Procedure v2 - Fix yield while holding locks) to branch-1 .

2018-12-05 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21552:
---

 Summary: backport  HBASE-16735(Procedure v2 - Fix yield while 
holding locks)  to branch-1 . 
 Key: HBASE-21552
 URL: https://issues.apache.org/jira/browse/HBASE-21552
 Project: HBase
  Issue Type: Improvement
  Components: proc-v2
Reporter: Xu Cang
Assignee: Xu Cang
 Attachments: Screen Shot 2018-12-05 at 4.34.05 PM.png

Please see the screenshot for the stack trace. 
We hit this issue in production: many createNamespaceProcedures cannot proceed.
After some debugging and JIRA digging, I think HBASE-16735 addressed this 
issue. It fixed the problem of a WAITING procedure failing to be added back to 
the runQueue. 
But that change wasn't ported to branch-1. I am creating this JIRA to 
backport it to branch-1.





[jira] [Created] (HBASE-21224) Handle compaction queue duplication

2018-09-25 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21224:
---

 Summary: Handle compaction queue duplication
 Key: HBASE-21224
 URL: https://issues.apache.org/jira/browse/HBASE-21224
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Xu Cang


[~allan163] mentioned in https://issues.apache.org/jira/browse/HBASE-18451 
that we may want to handle compaction queue duplication. 

Creating this item for further assessment and discussion.





[jira] [Created] (HBASE-21117) Backport HBASE-18350 (RSGroups are broken underAMv2) to branch-1 :

2018-08-24 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21117:
---

 Summary: Backport HBASE-18350  (RSGroups are broken underAMv2)  to 
branch-1 :
 Key: HBASE-21117
 URL: https://issues.apache.org/jira/browse/HBASE-21117
 Project: HBase
  Issue Type: Bug
  Components: backport, rsgroup, shell
Affects Versions: 1.3.2
Reporter: Xu Cang
Assignee: Xu Cang


While working on HBASE-20666, I found that HBASE-18350 was not ported to 
branch-1, which sometimes causes a procedure to hang when #moveTables is called. 

After looking into the HBASE-18350 patch, it seems important since it fixes 4 
issues. This Jira is an attempt to backport it to branch-1.





[jira] [Resolved] (HBASE-21066) Improve isTableState() method to ensure caller gets correct info

2018-08-17 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang resolved HBASE-21066.
-
Resolution: Won't Fix

> Improve isTableState() method to ensure caller gets correct info
> 
>
> Key: HBASE-21066
> URL: https://issues.apache.org/jira/browse/HBASE-21066
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.3.0, 2.0.0
>Reporter: Xu Cang
>Priority: Minor
> Attachments: HBASE-21066.master.001.patch, 
> HBASE-21066.master.002.patch
>
>
>  
> {code:java}
> public boolean isTableState(TableName tableName, TableState.State... states) {
>  try {
>  TableState tableState = getTableState(tableName);
>  return tableState.isInStates(states);
>  } catch (IOException e) {
>  LOG.error("Unable to get table " + tableName + " state", e);
>  // XXX: is it safe to just return false here?
>  return false;
>  }
>  }
>  
> {code}
>  
> When the table state cannot be retrieved, returning false is not always safe or correct.





[jira] [Created] (HBASE-21066) Improve isTableState() method to ensure caller gets correct info

2018-08-17 Thread Xu Cang (JIRA)
Xu Cang created HBASE-21066:
---

 Summary: Improve isTableState() method to ensure caller gets 
correct info
 Key: HBASE-21066
 URL: https://issues.apache.org/jira/browse/HBASE-21066
 Project: HBase
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Xu Cang


{code:java}
public boolean isTableState(TableName tableName, TableState.State... states) {
  try {
    TableState tableState = getTableState(tableName);
    return tableState.isInStates(states);
  } catch (IOException e) {
    LOG.error("Unable to get table " + tableName + " state", e);
    // XXX: is it safe to just return false here?
    return false;
  }
}
{code}

 

When the table state cannot be retrieved, returning false is not always safe or correct.
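
One way to make the ambiguity visible to callers, sketched under assumptions and not taken from any HBase patch, is to return an empty `Optional` when the state cannot be read, so that "unknown" is distinguishable from "not in these states". `TableStateSketch`, `State` and `StateSource` are hypothetical stand-ins for the real types.

```java
import java.io.IOException;
import java.util.Optional;

final class TableStateSketch {
  enum State { ENABLED, DISABLED }

  /** Hypothetical lookup that may fail, like reading the state from meta. */
  interface StateSource {
    State get(String tableName) throws IOException;
  }

  /**
   * Optional.of(true/false) when the state was read successfully;
   * Optional.empty() when it could not be determined.
   */
  static Optional<Boolean> isTableState(StateSource source, String tableName, State... states) {
    try {
      State current = source.get(tableName);
      for (State candidate : states) {
        if (candidate == current) {
          return Optional.of(true);
        }
      }
      return Optional.of(false);
    } catch (IOException e) {
      // Unknown state: let the caller decide instead of silently answering "false".
      return Optional.empty();
    }
  }
}
```

A caller that must not act on stale information can then treat the empty result as a retry or an error rather than as a definite "no".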





[jira] [Reopened] (HBASE-20928) Rewrite calculation of midpoint in binarySearch functions to prevent overflow

2018-07-24 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang reopened HBASE-20928:
-

> Rewrite calculation of midpoint in binarySearch functions to prevent overflow
> -
>
> Key: HBASE-20928
> URL: https://issues.apache.org/jira/browse/HBASE-20928
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Reporter: saurabh singh
>Assignee: saurabh singh
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HBASE-20928-addendum.patch, 
> HBASE-20928-fix-binarySearch-v5.patch, HBASE-20928-fix-binarySearch-v5.patch
>
>
> There are a couple of issues in the function:
>  * The {{>>>}} operator would corrupt the result if {{low}} + {{high}} ends up being 
> negative. This shouldn't happen, but I don't see anything that prevents it.
>  * The code fails around boundary values of {{low}} and {{high}}. This is a 
> well-known binary search pitfall: 
> [https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html]
>  
> Most of the code should already be covered by tests. I would have liked to 
> add a test that actually fails without the fix, but given that these are private 
> methods I am not sure of the best place to add the test. Suggestions?
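
The boundary problem can be shown with a small, self-contained example. `Midpoint` is a hypothetical class for illustration, and `low + ((high - low) >>> 1)` is one standard overflow-free form for 0 <= low <= high; it is not necessarily the exact expression used in the patch.

```java
final class Midpoint {

  /** Naive midpoint: low + high can overflow int for large indices. */
  static int naive(int low, int high) {
    return (low + high) / 2;
  }

  /** Overflow-free midpoint for 0 <= low <= high: the difference always fits in an int. */
  static int safe(int low, int high) {
    return low + ((high - low) >>> 1);
  }
}
```

For `low = Integer.MAX_VALUE - 1` and `high = Integer.MAX_VALUE`, the naive sum wraps around and `naive` returns a negative index, while `safe` returns `Integer.MAX_VALUE - 1`.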





[jira] [Created] (HBASE-20925) Canary test to expose results per table

2018-07-23 Thread Xu Cang (JIRA)
Xu Cang created HBASE-20925:
---

 Summary: Canary test to expose results per table
 Key: HBASE-20925
 URL: https://issues.apache.org/jira/browse/HBASE-20925
 Project: HBase
  Issue Type: Improvement
  Components: canary
Reporter: Xu Cang


Canary test to expose results per table.





[jira] [Created] (HBASE-20858) port HBASE-20695 to branch-1

2018-07-06 Thread Xu Cang (JIRA)
Xu Cang created HBASE-20858:
---

 Summary: port HBASE-20695 to branch-1
 Key: HBASE-20858
 URL: https://issues.apache.org/jira/browse/HBASE-20858
 Project: HBase
  Issue Type: Improvement
Reporter: Xu Cang


port HBASE-20695 to branch-1





[jira] [Created] (HBASE-20695) Implement table level RegionServer replication metrics

2018-06-06 Thread Xu Cang (JIRA)
Xu Cang created HBASE-20695:
---

 Summary: Implement table level RegionServer replication metrics 
 Key: HBASE-20695
 URL: https://issues.apache.org/jira/browse/HBASE-20695
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Reporter: Xu Cang
Assignee: Xu Cang


Region server metrics are currently mainly global. It would be nice to have 
table-level metrics, such as a table-level source.AgeOfLastShippedOp, to show 
operators which table's replication is lagging behind.

 


