[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527217#comment-16527217
 ] 

Pankaj Kumar commented on HBASE-20357:
--

Please commit to branch-2 as well; v3 can be applied there too.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> The client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column family and column qualifier for a specific user. 
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>       - Namespace/Table (already available)
>       - Namespace/Table + UserName
>       - Table + CF
>       - Table + CF + UserName
>       - Table + CF + CQ
>       - Table + CF + CQ + UserName
>     Scope of retrieving permissions will be as follows:
>       - Same as existing
>  2. To validate whether a user is allowed to perform the specified operations 
> on a particular table. This will be useful to check a user's privilege 
> directly instead of fetching the ACL during a client operation.
>     User validation can be performed based on the following inputs:
>       - Table + CF + CQ + UserName + Actions
>     Scope of validating user privilege:
>       A user can perform a self check without any special privilege, but 
> ADMIN privilege will be required to perform the check for other users.
>       For example, suppose there are two users, "userA" and "userB"; then 
> there can be the scenarios below:
>       - When userA wants to check whether userA has the privilege to perform 
> the mentioned actions, userA doesn't need ADMIN privilege, as it's a self 
> query.
>       - When userA wants to check whether userB has the privilege to perform 
> the mentioned actions, userA must have ADMIN or superuser privilege, as it's 
> querying for another user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20716) Unsafe access cleanup

2018-06-28 Thread Sahil Aggarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527215#comment-16527215
 ] 

Sahil Aggarwal commented on HBASE-20716:


Benchmark                               Mode  Cnt          Score         Error  Units
HbaseBytes.testToLongCheckAndDispatch  thrpt    6  352925235.054 ± 7423950.697  ops/s
HbaseBytes.testToLongStaticInvoke      thrpt    6  353652817.747 ± 6536569.256  ops/s

 

org.apache.hadoop.hbase.util.Bytes::toLongCheckAndDispatch (73 bytes) - current impl

org.apache.hadoop.hbase.util.Bytes::toLong (33 bytes) - impl using BEST_CONVERTER

Reduced the size of the method by 40 bytes.
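For context, a self-contained sketch of the startup-time static-dispatch pattern 
the issue below describes: the capability check runs once when the class 
initializes, so the hot path is a single call through a pre-selected converter 
instead of re-checking flags on every invocation. Class and method names here 
are illustrative, not HBase's actual internals (only the BEST_CONVERTER name 
comes from the comment above).

{code:java}
import java.nio.ByteBuffer;

public class StaticDispatchSketch {
  /** Converter chosen once; the hot path never re-checks capability flags. */
  interface Converter {
    long toLong(byte[] bytes, int offset);
  }

  /** Always-safe fallback: plain byte-by-byte, big-endian conversion. */
  static final class PureJavaConverter implements Converter {
    @Override
    public long toLong(byte[] bytes, int offset) {
      long v = 0;
      for (int i = 0; i < Long.BYTES; i++) {
        v = (v << 8) | (bytes[offset + i] & 0xffL);
      }
      return v;
    }
  }

  /** Stand-in for the faster (e.g. Unsafe-backed) path. */
  static final class ByteBufferConverter implements Converter {
    @Override
    public long toLong(byte[] bytes, int offset) {
      return ByteBuffer.wrap(bytes, offset, Long.BYTES).getLong();
    }
  }

  // The platform check runs exactly once, at class-initialization time.
  static final Converter BEST_CONVERTER =
      platformSupportsFastPath() ? new ByteBufferConverter() : new PureJavaConverter();

  static boolean platformSupportsFastPath() {
    // Placeholder for the real alignment/endianness/Unsafe-availability checks.
    return true;
  }

  /** Hot-path method stays tiny and monomorphic, which helps JIT inlining. */
  public static long toLong(byte[] bytes, int offset) {
    return BEST_CONVERTER.toLong(bytes, offset);
  }

  public static void main(String[] args) {
    byte[] b = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();
    System.out.println(toLong(b, 0)); // prints 42
  }
}
{code}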

> Unsafe access cleanup
> -
>
> Key: HBASE-20716
> URL: https://issues.apache.org/jira/browse/HBASE-20716
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Reporter: stack
>Assignee: Sahil Aggarwal
>Priority: Critical
>  Labels: beginner
> Attachments: HBASE-20716.master.001.patch, Screen Shot 2018-06-26 at 
> 11.37.49 AM.png
>
>
> We have two means of getting at Unsafe: UnsafeAccess, and then internal to the 
> Bytes class. They are effectively doing the same thing. We should have one 
> avenue to Unsafe only.
> Many of our paths to Unsafe via UnsafeAccess traverse flags to check if 
> access is available, if it is aligned, and the order in which words are 
> written on the machine. Each check costs -- especially if done millions of 
> times a second -- and on occasion adds bloat in hot code paths. The unsafe 
> access inside Bytes checks on startup what the machine is capable of and 
> then does a static assign of the appropriate class-to-use from there on out. 
> UnsafeAccess does not do this; it runs the checks every time. Would be good to 
> have the Bytes behavior pervasive.
> The benefit of only one access path to Unsafe is plain. The benefits we gain 
> removing checks will be harder to measure, though they should be plain when 
> you disassemble a hot path; in a (very) rare case, the saved bytecodes could 
> be the difference between inlining or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20357:
---
Fix Version/s: 3.0.0

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> The client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column family and column qualifier for a specific user. 
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>       - Namespace/Table (already available)
>       - Namespace/Table + UserName
>       - Table + CF
>       - Table + CF + UserName
>       - Table + CF + CQ
>       - Table + CF + CQ + UserName
>     Scope of retrieving permissions will be as follows:
>       - Same as existing
>  2. To validate whether a user is allowed to perform the specified operations 
> on a particular table. This will be useful to check a user's privilege 
> directly instead of fetching the ACL during a client operation.
>     User validation can be performed based on the following inputs:
>       - Table + CF + CQ + UserName + Actions
>     Scope of validating user privilege:
>       A user can perform a self check without any special privilege, but 
> ADMIN privilege will be required to perform the check for other users.
>       For example, suppose there are two users, "userA" and "userB"; then 
> there can be the scenarios below:
>       - When userA wants to check whether userA has the privilege to perform 
> the mentioned actions, userA doesn't need ADMIN privilege, as it's a self 
> query.
>       - When userA wants to check whether userB has the privilege to perform 
> the mentioned actions, userA must have ADMIN or superuser privilege, as it's 
> querying for another user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527213#comment-16527213
 ] 

Hadoop QA commented on HBASE-20789:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} hbase-server: The patch generated 0 new + 94 
unchanged - 1 fixed = 94 total (was 95) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}125m  
1s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20789 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12929665/HBASE-20789.v3.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 42a12ab790b2 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 78e7dd6537 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13448/testReport/ |
| Max. process+thread count | 4904 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13448/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527214#comment-16527214
 ] 

Ted Yu commented on HBASE-20357:


Just committed patch v3.

Please attach addendum.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> The client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column family and column qualifier for a specific user. 
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>       - Namespace/Table (already available)
>       - Namespace/Table + UserName
>       - Table + CF
>       - Table + CF + UserName
>       - Table + CF + CQ
>       - Table + CF + CQ + UserName
>     Scope of retrieving permissions will be as follows:
>       - Same as existing
>  2. To validate whether a user is allowed to perform the specified operations 
> on a particular table. This will be useful to check a user's privilege 
> directly instead of fetching the ACL during a client operation.
>     User validation can be performed based on the following inputs:
>       - Table + CF + CQ + UserName + Actions
>     Scope of validating user privilege:
>       A user can perform a self check without any special privilege, but 
> ADMIN privilege will be required to perform the check for other users.
>       For example, suppose there are two users, "userA" and "userB"; then 
> there can be the scenarios below:
>       - When userA wants to check whether userA has the privilege to perform 
> the mentioned actions, userA doesn't need ADMIN privilege, as it's a self 
> query.
>       - When userA wants to check whether userB has the privilege to perform 
> the mentioned actions, userA must have ADMIN or superuser privilege, as it's 
> querying for another user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527211#comment-16527211
 ] 

Pankaj Kumar commented on HBASE-20357:
--

Please review the release notes.

I will update the acl matrix for hasPermission and attach the v4 patch.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> The client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column family and column qualifier for a specific user. 
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>       - Namespace/Table (already available)
>       - Namespace/Table + UserName
>       - Table + CF
>       - Table + CF + UserName
>       - Table + CF + CQ
>       - Table + CF + CQ + UserName
>     Scope of retrieving permissions will be as follows:
>       - Same as existing
>  2. To validate whether a user is allowed to perform the specified operations 
> on a particular table. This will be useful to check a user's privilege 
> directly instead of fetching the ACL during a client operation.
>     User validation can be performed based on the following inputs:
>       - Table + CF + CQ + UserName + Actions
>     Scope of validating user privilege:
>       A user can perform a self check without any special privilege, but 
> ADMIN privilege will be required to perform the check for other users.
>       For example, suppose there are two users, "userA" and "userB"; then 
> there can be the scenarios below:
>       - When userA wants to check whether userA has the privilege to perform 
> the mentioned actions, userA doesn't need ADMIN privilege, as it's a self 
> query.
>       - When userA wants to check whether userB has the privilege to perform 
> the mentioned actions, userA must have ADMIN or superuser privilege, as it's 
> querying for another user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Pankaj Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-20357:
-
Release Note: 
This enhances the AccessControlClient APIs to retrieve permissions based on 
namespace, table name, column family and column qualifier for a specific user. 
AccessControlClient can also validate whether a user is allowed to perform 
specified operations on a particular table.
The following APIs have been added:
1) getUserPermissions(Connection connection, String tableRegex, byte[] 
columnFamily, byte[] columnQualifier, String userName)
   The scope of retrieving permissions is the same as the existing API.
2) hasPermission(Connection connection, String tableName, byte[] columnFamily, 
byte[] columnQualifier, String userName, Permission.Action... actions)
   Scope of validating user privilege:
   A user can perform a self check without any special privilege, but ADMIN 
privilege is required to perform the check for other users.
   For example, suppose there are two users, "userA" and "userB"; then there 
can be the scenarios below:
   a. When userA wants to check whether userA has the privilege to perform the 
mentioned actions, userA doesn't need ADMIN privilege, as it's a self query.
   b. When userA wants to check whether userB has the privilege to perform the 
mentioned actions, userA must have ADMIN or superuser privilege, as it's 
querying for another user.
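For illustration, a minimal usage sketch against the two signatures listed in 
the release note. The return types (List<UserPermission> and boolean), the 
throws Throwable clause, and the table/family/qualifier/user values are 
assumptions made for this example, not part of the release note itself.

{code:java}
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.security.access.AccessControlClient;
import org.apache.hadoop.hbase.security.access.Permission;
import org.apache.hadoop.hbase.security.access.UserPermission;
import org.apache.hadoop.hbase.util.Bytes;

public class AccessControlClientExample {
  public static void main(String[] args) throws Throwable {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
      // 1) Permissions of "userA" on table "t1", family "f1", qualifier "q1".
      List<UserPermission> perms = AccessControlClient.getUserPermissions(
          connection, "t1", Bytes.toBytes("f1"), Bytes.toBytes("q1"), "userA");
      perms.forEach(System.out::println);

      // 2) Self check: is "userA" allowed to READ t1:f1:q1?
      // No ADMIN privilege is needed because the caller asks about itself.
      boolean canRead = AccessControlClient.hasPermission(
          connection, "t1", Bytes.toBytes("f1"), Bytes.toBytes("q1"),
          "userA", Permission.Action.READ);
      System.out.println("userA can read t1:f1:q1 -> " + canRead);
    }
  }
}
{code}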

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> The client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column family and column qualifier for a specific user. 
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>       - Namespace/Table (already available)
>       - Namespace/Table + UserName
>       - Table + CF
>       - Table + CF + UserName
>       - Table + CF + CQ
>       - Table + CF + CQ + UserName
>     Scope of retrieving permissions will be as follows:
>       - Same as existing
>  2. To validate whether a user is allowed to perform the specified operations 
> on a particular table. This will be useful to check a user's privilege 
> directly instead of fetching the ACL during a client operation.
>     User validation can be performed based on the following inputs:
>       - Table + CF + CQ + UserName + Actions
>     Scope of validating user privilege:
>       A user can perform a self check without any special privilege, but 
> ADMIN privilege will be required to perform the check for other users.
>       For example, suppose there are two users, "userA" and "userB"; then 
> there can be the scenarios below:
>       - When userA wants to check whether userA has the privilege to perform 
> the mentioned actions, userA doesn't need ADMIN privilege, as it's a self 
> query.
>       - When userA wants to check whether userB has the privilege to perform 
> the mentioned actions, userA must have ADMIN or superuser privilege, as it's 
> querying for another user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20769) getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527199#comment-16527199
 ] 

Hadoop QA commented on HBASE-20769:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
51s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
42s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
41s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 35s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 59s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:1f3957d |
| JIRA Issue | HBASE-20769 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12929661/HBASE-20769.branch-1.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 917c1712dd8d 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-20819) Use TableDescriptor to replace HTableDescriptor in hbase-shell module

2018-06-28 Thread Xiaolin Ha (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaolin Ha updated HBASE-20819:
---
Summary: Use TableDescriptor to replace HTableDescriptor in hbase-shell 
module  (was: Use TableDescriptor to replace HTableDescriptor in admin.rb)

> Use TableDescriptor to replace HTableDescriptor in hbase-shell module
> -
>
> Key: HBASE-20819
> URL: https://issues.apache.org/jira/browse/HBASE-20819
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Minor
>
> HTableDescriptor is deprecated as of release 2.0.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527184#comment-16527184
 ] 

Reid Chan commented on HBASE-18201:
---

Not sure if it matters.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the implementation of the cell 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20814) fix error prone assertion failure ignored warnings

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527183#comment-16527183
 ] 

Hadoop QA commented on HBASE-20814:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 14 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 1s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hbase-build-configuration {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m  
2s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m  
8s{color} | {color:green} hbase-build-configuration in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} hbase-common generated 0 new + 40 unchanged - 2 
fixed = 40 total (was 42) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} hbase-procedure generated 0 new + 7 unchanged - 2 
fixed = 7 total (was 9) {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m  2s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} hbase-rsgroup generated 0 new + 105 unchanged - 1 
fixed = 105 total (was 106) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} hbase-endpoint generated 0 new + 123 unchanged - 2 
fixed = 123 total (was 125) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} hbase-it generated 0 new + 49 unchanged - 2 fixed = 
49 total (was 51) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch hbase-build-configuration passed 
checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} The patch hbase-common passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
28s{color} | {color:red} hbase-client: The patch generated 1 new + 49 unchanged 
- 0 fixed = 50 total (was 49) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | 

[jira] [Commented] (HBASE-19997) [rolling upgrade] 1.x => 2.x

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527179#comment-16527179
 ] 

stack commented on HBASE-19997:
---

+1

The resolution here is rough around the edges but way above what I thought we 
would achieve.

> [rolling upgrade] 1.x => 2.x
> 
>
> Key: HBASE-19997
> URL: https://issues.apache.org/jira/browse/HBASE-19997
> Project: HBase
>  Issue Type: Umbrella
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Blocker
> Fix For: 2.1.0
>
> Attachments: Screenshot from 2018-05-03 14-43-46.png
>
>
> An umbrella issue of issues needed so folks can do a rolling upgrade from 
> hbase-1.x to hbase-2.x.
> (Recent) Notables:
>  * hbase-1.x can't read hbase-2.x WALs -- hbase-1.x doesn't know the 
> AsyncProtobufLogWriter class used to write the WAL -- see 
> https://issues.apache.org/jira/browse/HBASE-19166?focusedCommentId=16362897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16362897
>  for exception.
>  ** Might be ok... means WAL split fails on an hbase1 RS... must wait till an 
> hbase-2.x RS picks up the WAL for it to be split.
>  * hbase-1 can't open regions from tables created by hbase-2; it can't find 
> the Table descriptor. See 
> https://issues.apache.org/jira/browse/HBASE-19116?focusedCommentId=16363276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16363276
>  ** This might be ok if the tables we are doing rolling upgrade over were 
> written with hbase-1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527174#comment-16527174
 ] 

Reid Chan edited comment on HBASE-18201 at 6/29/18 4:36 AM:


{code:title=DataBlockEncodingTool#checkStatistics}
rawKVs = uncompressedOutputStream.toByteArray();
{code}
I doubt it is a real rawKVs, since I see nothing about writing tags (if a KV has any).


was (Author: reidchan):
{code:title=DataBlockEncodingTool#checkStatistics}
rawKVs = uncompressedOutputStream.toByteArray();
{code}
I doubt it is a real rawKVs, since I see nothing about writing tags (if a KV has any).

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the implementation of the cell 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527174#comment-16527174
 ] 

Reid Chan commented on HBASE-18201:
---

{code:title=DataBlockEncodingTool#checkStatistics}
rawKVs = uncompressedOutputStream.toByteArray();
{code}
I doubt it is a real rawKVs, since I see nothing about writing tags (if a KV has any).

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the implementation of the cell 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-19997) [rolling upgrade] 1.x => 2.x

2018-06-28 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-19997:
-

Assignee: Duo Zhang

> [rolling upgrade] 1.x => 2.x
> 
>
> Key: HBASE-19997
> URL: https://issues.apache.org/jira/browse/HBASE-19997
> Project: HBase
>  Issue Type: Umbrella
>Reporter: stack
>Assignee: Duo Zhang
>Priority: Blocker
> Fix For: 2.1.0
>
> Attachments: Screenshot from 2018-05-03 14-43-46.png
>
>
> An umbrella issue of issues needed so folks can do a rolling upgrade from 
> hbase-1.x to hbase-2.x.
> (Recent) Notables:
>  * hbase-1.x can't read hbase-2.x WALs -- hbase-1.x doesn't know the 
> AsyncProtobufLogWriter class used to write the WAL -- see 
> https://issues.apache.org/jira/browse/HBASE-19166?focusedCommentId=16362897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16362897
>  for exception.
>  ** Might be ok... means WAL split fails on an hbase1 RS... must wait till an 
> hbase-2.x RS picks up the WAL for it to be split.
>  * hbase-1 can't open regions from tables created by hbase-2; it can't find 
> the Table descriptor. See 
> https://issues.apache.org/jira/browse/HBASE-19116?focusedCommentId=16363276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16363276
>  ** This might be ok if the tables we are doing rolling upgrade over were 
> written with hbase-1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527156#comment-16527156
 ] 

Reid Chan edited comment on HBASE-18201 at 6/29/18 4:10 AM:


{quote}
Encoder ROW_INDEX_V1 throws an error; things go wrong in class EncodedDataBlock at
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
The problem is that ROW_INDEX_V1 writes onDiskDataSize to out (a DataOutputStream), 
while the others write onDiskDataSize to baosBytes (a byte array) directly.
Since onDiskDataSize is necessary in the next steps, we need to flush out 
again after endBlockEncoding to write onDiskDataSize.
{quote}

I think adjusting the call order as follows should work. No need to add 
another if branch; that's kind of confusing.
{code}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
baos.flush();
baosBytes = baos.toByteArray();
{code}

bq. boolean useTag = (prevKV.getTagsLength() > 0);
Could we set {{useTag = currentKV.getTagsLength() > 0}} in the while loop above? 
Once it is set to true, the rest don't need to be checked.

{code}
HStoreFile hsf = new HStoreFile(fs, path, conf, cacheConf, BloomType.NONE, 
true);
StoreFileReader reader = hsf.getReader();
boolean useTag = reader.getHFileReader().getFileContext().isIncludesTags();
{code}
Kind of heavy to create an HStoreFile instance just to use its 
{{isIncludesTags}} method.

A few style problems: blanks needed around '=', '{', '}'.


was (Author: reidchan):
{quote}
Encoder ROW_INDEX_V1 throws an error; things go wrong in class EncodedDataBlock at
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
The problem is that ROW_INDEX_V1 writes onDiskDataSize to out (a DataOutputStream), 
while the others write onDiskDataSize to baosBytes (a byte array) directly.
Since onDiskDataSize is necessary in the next steps, we need to flush out 
again after endBlockEncoding to write onDiskDataSize.
{quote}

I think adjusting the call order as follows should work. No need to add 
another if branch; that's kind of confusing.
{code}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
baos.flush();
baosBytes = baos.toByteArray();
{code}

bq. boolean useTag = (prevKV.getTagsLength() > 0);
Could we set {{useTag = currentKV.getTagsLength() > 0}} in the while loop above? 
Once it is set to true, the rest don't need to be checked.

{code}
HStoreFile hsf = new HStoreFile(fs, path, conf, cacheConf, BloomType.NONE, 
true);
StoreFileReader reader = hsf.getReader();
boolean useTag = reader.getHFileReader().getFileContext().isIncludesTags();
{code}
Kind of heavy to create an HStoreFile instance just to use its 
{{isIncludesTags}} method.

A few style problems: blanks needed around '=', '{', '}', '(', ')'.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the implementation of the cell 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527156#comment-16527156
 ] 

Reid Chan commented on HBASE-18201:
---

{quote}
Encoder ROW_INDEX_V1 throws an error; things go wrong in class EncodedDataBlock at
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
The problem is that ROW_INDEX_V1 writes onDiskDataSize to out (a DataOutputStream), 
while the others write onDiskDataSize to baosBytes (a byte array) directly.
Since onDiskDataSize is necessary in the next steps, we need to flush out 
again after endBlockEncoding to write onDiskDataSize.
{quote}

I think adjusting the call order as follows should work. No need to add 
another if branch; that's kind of confusing.
{code}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
baos.flush();
baosBytes = baos.toByteArray();
{code}

bq. boolean useTag = (prevKV.getTagsLength() > 0);
Could we set {{useTag = currentKV.getTagsLength() > 0}} in the while loop above? 
Once it is set to true, the rest don't need to be checked.
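
A tiny self-contained sketch of that accumulate-once idea (the loop and the 
integer tag lengths stand in for the tool's actual KV scan; none of these names 
come from the patch):
{code:java}
public class UseTagSketch {
  public static void main(String[] args) {
    // Tag lengths of successive KVs, standing in for currentKV.getTagsLength().
    int[] tagsLengths = {0, 0, 3, 0};
    boolean useTag = false;
    for (int tagsLength : tagsLengths) {
      if (!useTag) {
        // Once any KV carries tags, remember it; later iterations skip the check.
        useTag = tagsLength > 0;
      }
      // ... per-KV processing would go here ...
    }
    System.out.println("useTag = " + useTag); // prints: useTag = true
  }
}
{code}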

{code}
HStoreFile hsf = new HStoreFile(fs, path, conf, cacheConf, BloomType.NONE, 
true);
StoreFileReader reader = hsf.getReader();
boolean useTag = reader.getHFileReader().getFileContext().isIncludesTags();
{code}
Kind of heavy to create an HStoreFile instance just to use its 
{{isIncludesTags}} method.

A few style problems: blanks needed around '=', '{', '}', '(', ')'.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should 
> make it friendly if any use case exists. Otherwise, we should just get rid of 
> it, because DataBlockEncodingTool presumes that the implementation of the cell 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527147#comment-16527147
 ] 

Hudson commented on HBASE-19722:


Results for branch branch-2
[build #920 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/920/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/920//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/920//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/920//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding the section below to hbase-site.xml:
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20810) Include the procedure id in the exception message in HBaseAdmin for better debugging

2018-06-28 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20810:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+.

Thanks [~stack] for reviewing.

> Include the procedure id in the exception message in HBaseAdmin for better 
> debugging
> 
>
> Key: HBASE-20810
> URL: https://issues.apache.org/jira/browse/HBASE-20810
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20810.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19997) [rolling upgrade] 1.x => 2.x

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527135#comment-16527135
 ] 

Duo Zhang commented on HBASE-19997:
---

So can we close this issue now? Or is it at least no longer a blocker, and can 
it be moved out of 2.1.0?

> [rolling upgrade] 1.x => 2.x
> 
>
> Key: HBASE-19997
> URL: https://issues.apache.org/jira/browse/HBASE-19997
> Project: HBase
>  Issue Type: Umbrella
>Reporter: stack
>Priority: Blocker
> Fix For: 2.1.0
>
> Attachments: Screenshot from 2018-05-03 14-43-46.png
>
>
> An umbrella issue of issues needed so folks can do a rolling upgrade from 
> hbase-1.x to hbase-2.x.
> (Recent) Notables:
>  * hbase-1.x can't read hbase-2.x WALs -- hbase-1.x doesn't know the 
> AsyncProtobufLogWriter class used to write the WAL -- see 
> https://issues.apache.org/jira/browse/HBASE-19166?focusedCommentId=16362897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16362897
>  for exception.
>  ** Might be ok... means WAL split fails on an hbase1 RS... must wait till an 
> hbase-2.x RS picks up the WAL for it to be split.
>  * hbase-1 can't open regions from tables created by hbase-2; it can't find 
> the Table descriptor. See 
> https://issues.apache.org/jira/browse/HBASE-19116?focusedCommentId=16363276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16363276
>  ** This might be ok if the tables we are doing rolling upgrade over were 
> written with hbase-1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20810) Include the procedure id in the exception message in HBaseAdmin for better debugging

2018-06-28 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20810:
--
Fix Version/s: 2.2.0
   2.0.2
   2.1.0
   3.0.0

> Include the procedure id in the exception message in HBaseAdmin for better 
> debugging
> 
>
> Key: HBASE-20810
> URL: https://issues.apache.org/jira/browse/HBASE-20810
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20810.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20792) info:servername and info:sn inconsistent for OPEN region

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527131#comment-16527131
 ] 

Duo Zhang commented on HBASE-20792:
---

Pushed to branch-2.1+. Leave it open as we haven't decided whether to port it 
to branch-2.0.

> info:servername and info:sn inconsistent for OPEN region
> 
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-04.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 
> 0297f680df6dc0166a44f9536346268e, NAME => 
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
>  => '', ENDKEY => 
> ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:seqnumDuringOpen, timestamp=1529967103390, 
> value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:server, timestamp=1529967103390, 
> value=ctr-e138-1518143905142-378097-02-12.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   column=info:sn, 
> timestamp=1529967096482, 
> value=ctr-e138-1518143905142-378097-02-06.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc, should not be possible for an 
> {{OPEN}} region).
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20193) Basic Replication Web UI - Regionserver

2018-06-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527121#comment-16527121
 ] 

Jingyun Tian edited comment on HBASE-20193 at 6/29/18 3:31 AM:
---

[~Apache9] new patch uploaded based on the discussion of review board.
screen shots as follows:
!replication_rs_1.jpg!
!replication_rs_2.jpg!
please check it out.


was (Author: tianjingyun):
[~Apache9] new patch uploaded based on the discussion of review board.
screen shots as follows:
!replication_rs_1.jpg!
!replication_rs_2.jpg!

> Basic Replication Web UI - Regionserver 
> 
>
> Key: HBASE-20193
> URL: https://issues.apache.org/jira/browse/HBASE-20193
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication, Usability
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HBASE-20193.master.001.patch, 
> HBASE-20193.master.002.patch, HBASE-20193.master.003.patch, 
> HBASE-20193.master.004.patch, HBASE-20193.master.004.patch, 
> HBASE-20193.master.005.patch, HBASE-20193.master.006.patch, 
> HBASE-20193.master.006.patch, HBASE-20193.master.007.patch, 
> HBASE-20193.master.008.patch, HBASE-20193.master.009.patch, 
> HBASE-20193.master.010.patch, HBASE-20193.master.011.patch, 
> replication_rs_1.jpg, replication_rs_2.jpg
>
>
> subtask of HBASE-15809. Implementation of replication UI on Regionserver web 
> page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20193) Basic Replication Web UI - Regionserver

2018-06-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527121#comment-16527121
 ] 

Jingyun Tian commented on HBASE-20193:
--

[~Apache9] new patch uploaded based on the discussion of review board.
screen shots as follows:
!replication_rs_1.jpg!
!replication_rs_2.jpg!

> Basic Replication Web UI - Regionserver 
> 
>
> Key: HBASE-20193
> URL: https://issues.apache.org/jira/browse/HBASE-20193
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication, Usability
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HBASE-20193.master.001.patch, 
> HBASE-20193.master.002.patch, HBASE-20193.master.003.patch, 
> HBASE-20193.master.004.patch, HBASE-20193.master.004.patch, 
> HBASE-20193.master.005.patch, HBASE-20193.master.006.patch, 
> HBASE-20193.master.006.patch, HBASE-20193.master.007.patch, 
> HBASE-20193.master.008.patch, HBASE-20193.master.009.patch, 
> HBASE-20193.master.010.patch, HBASE-20193.master.011.patch, 
> replication_rs_1.jpg, replication_rs_2.jpg
>
>
> subtask of HBASE-15809. Implementation of replication UI on Regionserver web 
> page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20819) Use TableDescriptor to replace HTableDescriptor in admin.rb

2018-06-28 Thread Xiaolin Ha (JIRA)
Xiaolin Ha created HBASE-20819:
--

 Summary: Use TableDescriptor to replace HTableDescriptor in 
admin.rb
 Key: HBASE-20819
 URL: https://issues.apache.org/jira/browse/HBASE-20819
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 2.0.0
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha


HTableDescriptor is deprecated as of release 2.0.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20818) Use TableDescriptor to replace HTableDescriptor in admin.rb

2018-06-28 Thread Xiaolin Ha (JIRA)
Xiaolin Ha created HBASE-20818:
--

 Summary: Use TableDescriptor to replace HTableDescriptor in 
admin.rb
 Key: HBASE-20818
 URL: https://issues.apache.org/jira/browse/HBASE-20818
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 2.0.0
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha


HTableDescriptor is deprecated as of release 2.0.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20193) Basic Replication Web UI - Regionserver

2018-06-28 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-20193:
-
Attachment: replication_rs_1.jpg
replication_rs_2.jpg

> Basic Replication Web UI - Regionserver 
> 
>
> Key: HBASE-20193
> URL: https://issues.apache.org/jira/browse/HBASE-20193
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication, Usability
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HBASE-20193.master.001.patch, 
> HBASE-20193.master.002.patch, HBASE-20193.master.003.patch, 
> HBASE-20193.master.004.patch, HBASE-20193.master.004.patch, 
> HBASE-20193.master.005.patch, HBASE-20193.master.006.patch, 
> HBASE-20193.master.006.patch, HBASE-20193.master.007.patch, 
> HBASE-20193.master.008.patch, HBASE-20193.master.009.patch, 
> HBASE-20193.master.010.patch, HBASE-20193.master.011.patch, 
> replication_rs_1.jpg, replication_rs_2.jpg
>
>
> subtask of HBASE-15809. Implementation of replication UI on Regionserver web 
> page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Kuan-Po Tseng (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527115#comment-16527115
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---

[~chia7712] OK, will submit next patch soon, thanks for your patience.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20816) Run ITBLL for branch-2.1

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527112#comment-16527112
 ] 

Duo Zhang commented on HBASE-20816:
---

Set up a 5-node cluster and ran this command:

{noformat}
./bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList -m 
org.apache.hadoop.hbase.chaos.factories.SlowDeterministicMonkeyFactory Loop 10 
10 1 itbll-output 100 > itbll.log 2>&1 &
{noformat}

The first loop passed. In the second loop, the reducers of the verify stage 
failed with an OOME...

Let me increase the number of mappers and reducers and reduce the rows per node 
to see if it helps.

> Run ITBLL for branch-2.1
> 
>
> Key: HBASE-20816
> URL: https://issues.apache.org/jira/browse/HBASE-20816
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Chia-Ping Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527110#comment-16527110
 ] 

Chia-Ping Tsai commented on HBASE-18201:


We will get the useTag from the hfile rather than iterating over all cells to 
check it? That is OK with me, since the useTag is not the point of 
#checkStatistics.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-28 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-20789:
-
Attachment: HBASE-20789.v3.patch

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: 
> 0001-HBASE-20789-TestBucketCache-testCacheBlockNextBlockM.patch, 
> HBASE-20789.v1.patch, HBASE-20789.v2.patch, HBASE-20789.v3.patch, 
> bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527103#comment-16527103
 ] 

Hadoop QA commented on HBASE-19722:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
41s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} hbase-common: The patch generated 1 new + 9 unchanged 
- 1 fixed = 10 total (was 10) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
50s{color} | {color:red} hbase-server: The patch generated 2 new + 0 unchanged 
- 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
52s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
2m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 11s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License 

[jira] [Commented] (HBASE-20769) getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl

2018-06-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527098#comment-16527098
 ] 

Jingyun Tian commented on HBASE-20769:
--

[~apurtell] Sorry for the delay. Patch uploaded.

> getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.branch-1.001.patch, 
> HBASE-20769.master.001.patch, HBASE-20769.master.002.patch, 
> HBASE-20769.master.003.patch, HBASE-20769.master.004.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, 
> true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), 
> scan.getStopRow(), sp[i],
>   sp[i + 1])) {
> List hosts =
> calculateLocationsForInputSplit(conf, htd, hri, tableDir, 
> localityEnabled);
> Scan boundedScan = new Scan(scan);
> boundedScan.setStartRow(sp[i]);
> boundedScan.setStopRow(sp[i + 1]);
> splits.add(new InputSplit(htd, hri, hosts, boundedScan, 
> restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of regions, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the created bounded scan may cover a range 
> that goes beyond the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>  Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
>  boundedScan.setStopRow(
>  Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i 
> + 1]);
> {code}
> I will also try to add UTs to help discover this problem
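For illustration only, here is a self-contained sketch of the clamping idea above. The class, helper names, and sample keys are made up; the real change would live in TableSnapshotInputFormatImpl and would also need the usual handling for empty start/stop rows.
{code:java}
import org.apache.hadoop.hbase.util.Bytes;

public final class BoundedScanSketch {
  // Clamp a region split range [splitStart, splitStop) to the user scan's range,
  // so the bounded scan never extends past what the user asked for.
  static byte[] clampStart(byte[] scanStart, byte[] splitStart) {
    return Bytes.compareTo(scanStart, splitStart) > 0 ? scanStart : splitStart;
  }

  static byte[] clampStop(byte[] scanStop, byte[] splitStop) {
    return Bytes.compareTo(scanStop, splitStop) < 0 ? scanStop : splitStop;
  }

  public static void main(String[] args) {
    byte[] scanStart = Bytes.toBytes("row-20");
    byte[] scanStop = Bytes.toBytes("row-80");
    byte[] sp0 = Bytes.toBytes("row-10"); // split point below the scan's start row
    byte[] sp1 = Bytes.toBytes("row-50");
    // Prints row-20 (not row-10): the split start is clamped to the user scan.
    System.out.println(Bytes.toString(clampStart(scanStart, sp0)));
    // Prints row-50: the split end is already within the user scan.
    System.out.println(Bytes.toString(clampStop(scanStop, sp1)));
  }
}
{code}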



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Kuan-Po Tseng (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527099#comment-16527099
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---

Yes, I think checking whether the hfile uses tags by calling 
HFileContext#isIncludesTags() is a simpler way. But I am not sure whether doing 
this would have any bad consequences.
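As a rough sketch of that approach (it assumes the already-open HFile.Reader exposes its HFileContext via getFileContext(); the helper class and method below are made up for illustration, not the patch itself):
{code:java}
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;

public final class TagFlagSketch {
  // Instead of iterating every cell, ask the file's own context whether tags
  // were written. Assumes the reader exposes getFileContext().
  static boolean hfileUsesTags(HFile.Reader reader) {
    HFileContext context = reader.getFileContext();
    return context.isIncludesTags();
  }
}
{code}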

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20769) getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl

2018-06-28 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-20769:
-
Attachment: HBASE-20769.branch-1.001.patch

> getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.branch-1.001.patch, 
> HBASE-20769.master.001.patch, HBASE-20769.master.002.patch, 
> HBASE-20769.master.003.patch, HBASE-20769.master.004.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, 
> true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), 
> scan.getStopRow(), sp[i],
>   sp[i + 1])) {
> List hosts =
> calculateLocationsForInputSplit(conf, htd, hri, tableDir, 
> localityEnabled);
> Scan boundedScan = new Scan(scan);
> boundedScan.setStartRow(sp[i]);
> boundedScan.setStopRow(sp[i + 1]);
> splits.add(new InputSplit(htd, hri, hosts, boundedScan, 
> restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of regions, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the created bounded scan may cover a range 
> that goes beyond the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>  Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
>  boundedScan.setStopRow(
>  Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i 
> + 1]);
> {code}
> I will also try to add UTs to help discover this problem



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522157#comment-16522157
 ] 

Ted Yu edited comment on HBASE-20357 at 6/29/18 2:50 AM:
-

[~pankaj2461]
Please fill out release note.


was (Author: yuzhih...@gmail.com):
Please fill out release note.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
>  Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for specific 
> user.
> Client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column name and column qualifier for specific user.
>  HBase should enhance AccessControlClient APIs to simplyfy this.
> *AccessControlClient API should be extended with following APIs,*    
>  # To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for specific user.
>   Permissions can be retrieved based on the following inputs,
>        - Namespace/Table (already available)
>        - Namespace/Table + UserName
>        - Table + CF
>        - Table + CF + UserName
>        - Table + CF + CQ
>        - Table + CF + CQ + UserName
>           Scope of retrieving permission will be as follows,
>                  - Same as existing
>        2. To validate whether a user is allowed to perform specified 
> operations on a particular table, will be useful to check user privilege 
> instead of getting ACD during client                                    
> operation.
>              User validation can be performed based on following inputs, 
>                   - Table + CF + CQ + UserName + Actions
>             Scope of validating user privilege,
>                     User can perform self check without any special privilege 
> but ADMIN privilege will be required to perform check for other users.
>                     For example, suppose there are two users "userA" & 
> "userB" then there can be below scenarios,
>                         - when userA want to check whether userA have 
> privilege to perform mentioned actions
>                                 > userA don't need ADMIN privilege, as it's a 
> self query.
>                        - when userA want to check whether userB have 
> privilege to perform mentioned actions,
>                                 > userA must have ADMIN or superuser 
> privilege, as it's trying to query for other user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20769) getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl

2018-06-28 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527096#comment-16527096
 ] 

Zheng Hu commented on HBASE-20769:
--

bq. Issues left open in half resolved/committed state are hard for a RM to deal 
with. 
Yeah, it's true. [~tianjingyun] will prepare the patch today... Thanks. 

> getSplits() has an out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.master.001.patch, 
> HBASE-20769.master.002.patch, HBASE-20769.master.003.patch, 
> HBASE-20769.master.004.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, 
> true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), 
> scan.getStopRow(), sp[i],
>   sp[i + 1])) {
> List hosts =
> calculateLocationsForInputSplit(conf, htd, hri, tableDir, 
> localityEnabled);
> Scan boundedScan = new Scan(scan);
> boundedScan.setStartRow(sp[i]);
> boundedScan.setStopRow(sp[i + 1]);
> splits.add(new InputSplit(htd, hri, hosts, boundedScan, 
> restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of regions, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the created bounded scan may cover a range 
> that goes beyond the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>  Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
>  boundedScan.setStopRow(
>  Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i 
> + 1]);
> {code}
> I will also try to add UTs to help discover this problem



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-28 Thread Chia-Ping Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527095#comment-16527095
 ] 

Chia-Ping Tsai commented on HBASE-18201:


{quote}Could we replace the way checking *_useTag_* with this code ?
{quote}
Pardon me, I don't quite catch the point. Is the useTag you mentioned in 
DataBlockEncodingTool#checkStatistics?

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell implementation 
> returned from DataBlockEncoder is KeyValue. That presumption may obstruct the 
> cleanup of KeyValue references in the read/write path of the code base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18974) Document "Becoming a Committer"

2018-06-28 Thread Chia-Ping Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527088#comment-16527088
 ] 

Chia-Ping Tsai commented on HBASE-18974:


Oh, we shouldn't let the great docs sink into oblivion. [~mdrob] Could I give 
you a hand if you have no free cycles?

> Document "Becoming a Committer"
> ---
>
> Key: HBASE-18974
> URL: https://issues.apache.org/jira/browse/HBASE-18974
> Project: HBase
>  Issue Type: Bug
>  Components: community, documentation
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-18974-copyedit-addendum.patch, HBASE-18974.patch, 
> HBASE-18974.v2.patch, HBASE-18974.v3.patch
>
>
> Based on the mailing list discussion at 
> https://lists.apache.org/thread.html/81c633cbe1f6f78421cbdad5b9549643c67803a723a9d86a513264c0@%3Cdev.hbase.apache.org%3E
>  it sounds like we should record some of the thoughts for future contributors 
> to refer to.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19753) Miscellany of fixes for hbase-zookeeper tests to make them more robust

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527082#comment-16527082
 ] 

Hadoop QA commented on HBASE-19753:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-19753 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19753 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905728/0001-HBASE-19753-Miscellany-of-fixes-for-hbase-zookeeper-.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13446/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Miscellany of fixes for hbase-zookeeper tests to make them more robust
> --
>
> Key: HBASE-19753
> URL: https://issues.apache.org/jira/browse/HBASE-19753
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 
> 0001-HBASE-19753-Miscellany-of-fixes-for-hbase-zookeeper-.patch, 
> HBASE-19753.branch-2.001.patch, HBASE-19753.branch-2.002.patch, 
> HBASE-19753.branch-2.003.patch, HBASE-19753.branch-2.004.patch, 
> HBASE-19753.branch-2.005.patch, HBASE-19753.branch-2.006.patch, 
> HBASE-19753.branch-2.007.patch, HBASE-19753.branch-2.008.patch, keepalive.diff
>
>
> On my cluster, where zk is slow, the hbase-zookeeper tests rarely all pass.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18974) Document "Becoming a Committer"

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18974:
--
Status: Open  (was: Patch Available)

> Document "Becoming a Committer"
> ---
>
> Key: HBASE-18974
> URL: https://issues.apache.org/jira/browse/HBASE-18974
> Project: HBase
>  Issue Type: Bug
>  Components: community, documentation
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-18974-copyedit-addendum.patch, HBASE-18974.patch, 
> HBASE-18974.v2.patch, HBASE-18974.v3.patch
>
>
> Based on the mailing list discussion at 
> https://lists.apache.org/thread.html/81c633cbe1f6f78421cbdad5b9549643c67803a723a9d86a513264c0@%3Cdev.hbase.apache.org%3E
>  it sounds like we should record some of the thoughts for future contributors 
> to refer to.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19753) Miscellany of fixes for hbase-zookeeper tests to make them more robust

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527081#comment-16527081
 ] 

Mike Drob commented on HBASE-19753:
---

[~stack] - can we resolve this and open a new issue for the keepalive work?

> Miscellany of fixes for hbase-zookeeper tests to make them more robust
> --
>
> Key: HBASE-19753
> URL: https://issues.apache.org/jira/browse/HBASE-19753
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 
> 0001-HBASE-19753-Miscellany-of-fixes-for-hbase-zookeeper-.patch, 
> HBASE-19753.branch-2.001.patch, HBASE-19753.branch-2.002.patch, 
> HBASE-19753.branch-2.003.patch, HBASE-19753.branch-2.004.patch, 
> HBASE-19753.branch-2.005.patch, HBASE-19753.branch-2.006.patch, 
> HBASE-19753.branch-2.007.patch, HBASE-19753.branch-2.008.patch, keepalive.diff
>
>
> On my cluster, where zk is slow, the hbase-zookeeper tests rarely all pass.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-17528) Remove the Scan.ReadType flag introduced in HBASE-17045

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-17528:
--
Status: Open  (was: Patch Available)

> Remove the Scan.ReadType flag introduced in HBASE-17045
> ---
>
> Key: HBASE-17528
> URL: https://issues.apache.org/jira/browse/HBASE-17528
> Project: HBase
>  Issue Type: Task
>  Components: scan
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: stack
>Priority: Major
> Attachments: HBASE-17528.branch-2.001.patch
>
>
> It is used to keep the old behavior of small scans. We should have a mechanism 
> that selects pread or stream automatically on the server side. And in general, 
> we have discussed many times whether it is OK to always use pread for a user 
> scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20604) ProtobufLogReader#readNext can incorrectly loop to the same position in the stream until the WAL is rolled

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527079#comment-16527079
 ] 

Mike Drob commented on HBASE-20604:
---

Was this related to HBASE-20403?

> ProtobufLogReader#readNext can incorrectly loop to the same position in the 
> stream until the WAL is rolled
> --
>
> Key: HBASE-20604
> URL: https://issues.apache.org/jira/browse/HBASE-20604
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, wal
>Affects Versions: 3.0.0
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Critical
> Attachments: HBASE-20604.002.patch, HBASE-20604.patch
>
>
> Every time we call {{ProtobufLogReader#readNext}} we consume the input stream 
> associated with the {{FSDataInputStream}} from the WAL that we are reading. 
> Under certain conditions, e.g. when using encryption at rest 
> ({{CryptoInputStream}}), the stream can return partial data, which can cause a 
> premature EOF that causes {{inputStream.getPos()}} to return to the same 
> original position, causing {{ProtobufLogReader#readNext}} to retry the read 
> until the WAL is rolled.
> The side effect of this issue is that {{ReplicationSource}} can get stuck 
> until the WAL is rolled, causing replication delays of up to an hour in some 
> cases.
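The hazard described above comes down to the general InputStream contract that a single read() may return fewer bytes than requested. A minimal, generic sketch of the defensive loop (an illustration of that contract, not the actual ProtobufLogReader fix):
{code:java}
import java.io.IOException;
import java.io.InputStream;

public final class ReadFullySketch {
  // Keep reading until the buffer is full or a real end-of-stream (-1) is seen.
  // A single read() may legitimately return fewer bytes than requested, e.g. when
  // a wrapping stream only has part of its internal buffer available, so a short
  // read must not be treated as EOF.
  static int readFully(InputStream in, byte[] buf) throws IOException {
    int off = 0;
    while (off < buf.length) {
      int n = in.read(buf, off, buf.length - off);
      if (n < 0) {
        return off; // true end of stream: fewer bytes than asked for
      }
      off += n;
    }
    return off;
  }
}
{code}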



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20489) Update Reference Guide that CLUSTER_KEY value is present on the Master UI info page.

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20489:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolving, since commits have landed in multiple branches. If we need 
additional backports, please open a new issue.

> Update Reference Guide that CLUSTER_KEY value is present on the Master UI 
> info page.
> 
>
> Key: HBASE-20489
> URL: https://issues.apache.org/jira/browse/HBASE-20489
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
> Attachments: hbase-20489.master.001.patch
>
>
> CLUSTER_KEY info is now present on the Master UI info page. The reference 
> guide's section that defines the CLUSTER_KEY must note that it 
> is now present in the UI as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20489) Update Reference Guide that CLUSTER_KEY value is present on the Master UI info page.

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20489:
--
Fix Version/s: 2.1.0
   3.0.0

> Update Reference Guide that CLUSTER_KEY value is present on the Master UI 
> info page.
> 
>
> Key: HBASE-20489
> URL: https://issues.apache.org/jira/browse/HBASE-20489
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
> Fix For: 3.0.0, 2.1.0
>
> Attachments: hbase-20489.master.001.patch
>
>
> CLUSTER_KEY info is now present on the Master UI info page. The reference 
> guide's section that defines the CLUSTER_KEY must note that it 
> is now present in the UI as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20817) Infinite loop when executing ReopenTableRegionsProcedure

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527077#comment-16527077
 ] 

Duo Zhang commented on HBASE-20817:
---

[~stack] [~elserj]. FYI.

> Infinite loop when executing ReopenTableRegionsProcedure 
> -
>
> Key: HBASE-20817
> URL: https://issues.apache.org/jira/browse/HBASE-20817
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Duo Zhang
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
>
> As discussed in HBASE-20792, it seems that a region's openSeqNum could remain 
> the same after a successful reopen, which causes the RTRP to loop infinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20040) Master UI should include "Cluster Key" needed to use the cluster as a replication sink

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20040:
--
   Resolution: Fixed
Fix Version/s: (was: 2.2.0)
   2.1.0
   Status: Resolved  (was: Patch Available)

Resolving, since commits have landed in multiple branches. If we need 
additional backports, please open a new issue.

> Master UI should include "Cluster Key" needed to use the cluster as a 
> replication sink
> --
>
> Key: HBASE-20040
> URL: https://issues.apache.org/jira/browse/HBASE-20040
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication, Usability
>Reporter: Sean Busbey
>Assignee: Sakthi
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0, 2.1.0
>
> Attachments: hbase-20040.branch-1.001.patch, 
> hbase-20040.master.001.patch, hbase-20040.master.002.patch
>
>
> The ref guide defines a "Cluster Key" needed to add an hbase cluster as a 
> replication peer
> {quote}
> CLUSTER_KEY: composed using the following template, with appropriate 
> place-holders: 
> {code}hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent{code}
> {quote}
> The Master UI currently displays all of the pieces, but it should 
> include a single field that operators can copy/paste.
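As a concrete example of the single copy/paste value being asked for, a cluster whose ZooKeeper quorum is three hosts (hostnames below are hypothetical) using the default client port and znode parent would have a cluster key like:
{noformat}
zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase
{noformat}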



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20270) Turn off command help that follows all errors in shell

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20270:
--
   Resolution: Fixed
Fix Version/s: (was: 2.2.0)
   2.1.0
   Status: Resolved  (was: Patch Available)

Resolving this because it has been committed to multiple branches. If folks 
want further backports, please open new issues.

> Turn off command help that follows all errors in shell
> --
>
> Key: HBASE-20270
> URL: https://issues.apache.org/jira/browse/HBASE-20270
> Project: HBase
>  Issue Type: Task
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sakthi
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: hbase-20270.master.001.patch, 
> hbase-20270.master.002.patch, hbase-20270.master.003.patch, 
> hbase-20270.master.004.patch, hbase-20270.master.005.patch, 
> hbase-20270.master.006.patch
>
>
> Right now, if a shell command gives an error, any error, it then echoes the 
> command help. That makes it harder to see the actual error text and is annoying.
> example:
> {code}
>   
>   
>
> hbase(main):007:0> create 'test:a_table', 'family', { NUMREGIONS => 20, 
> SPLITALGO => 'HexStringSplit'}
> ERROR: Unknown namespace test!
> Creates a table. Pass a table name, and a set of column family
> specifications (at least one), and, optionally, table configuration.
> Column specification can be a simple string (name), or a dictionary
> (dictionaries are described below in main help output), necessarily
> including NAME attribute.
> Examples:
> Create a table with namespace=ns1 and table qualifier=t1
>   hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
> Create a table with namespace=default and table qualifier=t1
>   hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
>   hbase> # The above in shorthand would be the following:
>   hbase> create 't1', 'f1', 'f2', 'f3'
>   hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, 
> BLOCKCACHE => true}
>   hbase> create 't1', {NAME => 'f1', CONFIGURATION => 
> {'hbase.hstore.blockingStoreFiles' => '10'}}
>   hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 
> 100, MOB_COMPACT_PARTITION_POLICY => 'weekly'}
> Table configuration options can be put at the end.
> Examples:
>   hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
>   hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
>   hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
>   hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 
> 'myvalue' }
>   hbase> # Optionally pre-split the table into NUMREGIONS, using
>   hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
>   hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
>   hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', 
> REGION_REPLICATION => 2, CONFIGURATION => 
> {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
>   hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}
> You can also keep around a reference to the created table:
>   hbase> t1 = create 't1', 'f1'
> Which gives you a reference to the table named 't1', on which you can then
> call methods.
> Took 0.0221 seconds   
>   
> 
> hbase(main):008:0> create_namespace 'test'
> Took 0.2554 seconds   
>   
> 
> hbase(main):009:0> create 'test:a_table', 'family', { NUMREGIONS => 20, 
> SPLITALGO => 'HexStringSplit'}
> Created table test:a_table
> Took 1.2264 seconds 
> {code}
> I was trying to make a table in the test namespace before making the 
> namespace. Much faster to recognize and move on when the error text isn't 
> followed by 80x the text.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20674) clean up short circuit read logic and docs

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527072#comment-16527072
 ] 

Mike Drob commented on HBASE-20674:
---

[~ram_krish] - did I answer your question? Do you think this is good to commit?

> clean up short circuit read logic and docs
> --
>
> Key: HBASE-20674
> URL: https://issues.apache.org/jira/browse/HBASE-20674
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20674.patch, HBASE-20674.v2.patch, 
> HBASE-20674.v3.patch, HBASE-20674.v4.patch, HBASE-20674.v5.patch
>
>
> Mailing list discussion at 
> https://lists.apache.org/thread.html/f6f73df0ceae29f762f9b9088e3ffd0bf8f109d3dd692df100bf4fd6@%3Cdev.hbase.apache.org%3E
> There are several inconsistencies between how our docs claim we do things and 
> how we actually do things.
> There are two docs sections that attempt to address how SCR should work.
> dfs.client.read.shortcircuit.skip.checksum is advised to be set to true, but our 
> code in separate places ignores it and then later sets it to true anyway.
> CommonFSUtils and FSUtils duplicate code related to SCR setup.
> There is a workaround in HFileSystem for a bug that's been fixed in all 
> versions of hadoop that we support. (HADOOP-9307)
> We suggest setting dfs.client.read.shortcircuit.buffer.size to a value that 
> is very close to what we'd set it to anyway, without clearly explaining why 
> this is important.
> There are other properties that we claim are important, but we don't offer 
> any suggestions or explanations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20764) build broken when latest commit is gpg signed

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527069#comment-16527069
 ] 

Mike Drob commented on HBASE-20764:
---

anybody able to do reviews here?

> build broken when latest commit is gpg signed
> -
>
> Key: HBASE-20764
> URL: https://issues.apache.org/jira/browse/HBASE-20764
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-20764.master.001.patch, 
> HBASE-20764.master.002.patch, HBASE-20764.patch
>
>
> I broke the build by digitally signing a commit:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile 
> (default-compile) on project hbase-common: Compilation failure: Compilation 
> failure:
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[11,41]
>  unclosed string literal
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[12,4]
>   expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[12,30]
>  ';' expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[12,35]
>  malformed floating point literal
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[13,4]
>  ';' expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[13,20]
>  ';' expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[13,25]
>   expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[13,76]
>  illegal start of type
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[13,85]
>  ';' expected
> [ERROR] 
> /Users/mdrob/IdeaProjects/hbase/hbase-common/target/generated-sources/java/org/apache/hadoop/hbase/Version.java:[14,41]
>  unclosed string literal
> {noformat}
> Which complains because:
> {code}
>   public static final String revision = "gpg: Signature made Wed Jun 20 
> 09:42:38 2018 PDT
> gpg:using RSA key 86EDB9C33B8517228E88A8F93E48C0C6EF362B9E
> gpg: Good signature from "Mike Drob (CODE SIGNING KEY) " 
> [ultimate]
> d1cad1a25432ffcd75cd654e9bf68233ca7e1957";
> {code}
> And this comes from {{src/saveVersion.sh}} where it does:
> {noformat}
>   revision=`git log -1 --pretty=format:"%H"`
> {noformat}
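One possible direction, sketched here as an assumption rather than the committed fix, is to ask git for the bare hash, which is unaffected by signature display settings such as log.showSignature:
{noformat}
  revision=`git rev-parse HEAD`
{noformat}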



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20817) Infinite loop when executing ReopenTableRegionsProcedure

2018-06-28 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-20817:
-

 Summary: Infinite loop when executing ReopenTableRegionsProcedure 
 Key: HBASE-20817
 URL: https://issues.apache.org/jira/browse/HBASE-20817
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Duo Zhang
 Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0


As discussed in HBASE-20792, it seems that a region's openSeqNum could remain 
the same after a successful reopen, which causes the RTRP to loop infinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20792) info:servername and info:sn inconsistent for OPEN region

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527043#comment-16527043
 ] 

Duo Zhang commented on HBASE-20792:
---

Anyway, let me commit the patch here to branch-2.1+ first. The discussion is 
off topic for RTRP now. I think we can open a new issue for it...

> info:servername and info:sn inconsistent for OPEN region
> 
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-04.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests use internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 
> 0297f680df6dc0166a44f9536346268e, NAME => 
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
>  => '', ENDKEY => 
> ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:seqnumDuringOpen, timestamp=1529967103390, 
> value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:server, timestamp=1529967103390, 
> value=ctr-e138-1518143905142-378097-02-12.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   column=info:sn, 
> timestamp=1529967096482, 
> value=ctr-e138-1518143905142-378097-02-06.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc, should not be possible for an 
> {{OPEN}} region).
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20816) Run ITBLL for branch-2.1

2018-06-28 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-20816:
-

 Summary: Run ITBLL for branch-2.1
 Key: HBASE-20816
 URL: https://issues.apache.org/jira/browse/HBASE-20816
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang
Assignee: Duo Zhang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527040#comment-16527040
 ] 

Mike Drob commented on HBASE-20812:
---

bq. UOE sends wrong message. NIE seemed more apt
fair
bq. I thought commons3. Plain commons bad
you're right, i was confused
bq. On close, let me remove the default and leave it as a must implement. 
That's a better message
I like this.

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 
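A minimal sketch of the shape being discussed: an illustrative interface, not the real org.apache.hadoop.hbase.client.Table. It assumes org.apache.commons.lang3.NotImplementedException, per the commons3 comment above, and leaves close() without a default as suggested.
{code:java}
import org.apache.commons.lang3.NotImplementedException;

// Illustrative only: not the real org.apache.hadoop.hbase.client.Table interface.
interface ExampleTable extends AutoCloseable {
  // Default stubs let implementors override only the methods they care about.
  default boolean exists(byte[] row) {
    throw new NotImplementedException("exists");
  }

  default void put(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
    throw new NotImplementedException("put");
  }

  // No default for close(): every implementation must decide how to release
  // its own resources.
  @Override
  void close() throws Exception;
}

// A customization (e.g. an indexer-style wrapper) now only overrides what it needs.
class CountingTable implements ExampleTable {
  private long puts;

  @Override
  public void put(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
    puts++; // pretend to forward the mutation to an index here
  }

  @Override
  public void close() {
    System.out.println("puts seen: " + puts);
  }
}
{code}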



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527037#comment-16527037
 ] 

Duo Zhang commented on HBASE-20796:
---

+1 on the patch. Just fixed an existing problem. We can continue improving it 
later in other issues.

> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeIoException: 
> syscall:read(..) failed: Connection reset by peer (Notice how state for 
> region is OPEN and 'SUCCESS').
>  * Then says 2018-06-26 12:33:06,380 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Retry=1 of max=10; 
> pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.X.Y.Z,22101,1529611443424
>  * And finally...  2018-06-26 12:34:10,727 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OFFLINE, location=null, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2
> Restart of Master got rid of the STUCK complaints.
> This is interesting because the stuck rpc and the successful reassign are all 
> riding on the same pid.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527031#comment-16527031
 ] 

stack commented on HBASE-20812:
---

Thanks for the review. UOE sends the wrong message; NIE seemed more apt. I thought 
it came from commons-lang3 -- plain commons is the bad one. Let me poke around. 
On close, I should probably add a throw; actually, let me remove the default and 
leave it as a must-implement. That's a better message. Thanks again for the review.

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527025#comment-16527025
 ] 

Duo Zhang commented on HBASE-20802:
---

OK, the description suggests to me that you want to add a method to 
RemoteProcedure. Let's see the patch then.

Thanks.

> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPCs to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3 minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527019#comment-16527019
 ] 

Mike Drob commented on HBASE-20812:
---

Why NotImplementedException? Why not UnsupportedOperationException, which comes 
standard with the JDK? I thought lang3 was one of the libs we're not supposed 
to use? Maybe it's OK, there are so many I can't keep track anymore.

Why does close() not need to throw NotImplementedException like the rest?

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527006#comment-16527006
 ] 

stack edited comment on HBASE-20802 at 6/29/18 1:24 AM:


The advantage would be that we'd promptly shut down the hung RPC.

I want to interrupt it in remoteCallFailed when it is the SCP that is running... 
that's why I want to add an interrupt call to the dispatch API.


was (Author: stack):
The advantage would be that we'd promptly shut down the hung RPC.

I want to interrupt it in remoteCallFailed... that's why I want to add this API.

> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPCs to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3 minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527006#comment-16527006
 ] 

stack commented on HBASE-20802:
---

The advantage would be that we'd promptly shut down the hung RPC.

I want to interrupt it in remoteCallFailed... that's why I want to add this API.

> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPCs to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3 minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20814) fix error prone assertion failure ignored warnings

2018-06-28 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527005#comment-16527005
 ] 

Mike Drob commented on HBASE-20814:
---

v2: addresses checkstyle, and a new ignored assertion that crept in.

> fix error prone assertion failure ignored warnings
> --
>
> Key: HBASE-20814
> URL: https://issues.apache.org/jira/browse/HBASE-20814
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20814.master.001.patch, 
> HBASE-20814.master.002.patch
>
>
> when we have assertion failures ignored, that likely means we're missing a 
> test case, let's make sure our tests are actually running and covering what 
> we think they are.
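
For context, a hedged illustration of the kind of pattern an "assertion failure ignored" warning points at (class and helper names are made up):

{code:java}
import static org.junit.Assert.fail;

import org.junit.Test;

public class AssertionIgnoredSketch {

  // fail() throws an AssertionError, but the catch below swallows it, so this
  // test passes even when mightThrow() does not throw. That silently missing
  // coverage is what the warning is about.
  @Test
  public void flaggedPattern() {
    try {
      mightThrow();
      fail("expected an IllegalStateException");
    } catch (Throwable expected) {
      // the AssertionError from fail() lands here too
    }
  }

  private void mightThrow() {
    throw new IllegalStateException("boom");
  }
}
{code}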



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20814) fix error prone assertion failure ignored warnings

2018-06-28 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20814:
--
Attachment: HBASE-20814.master.002.patch

> fix error prone assertion failure ignored warnings
> --
>
> Key: HBASE-20814
> URL: https://issues.apache.org/jira/browse/HBASE-20814
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20814.master.001.patch, 
> HBASE-20814.master.002.patch
>
>
> when we have assertion failures ignored, that likely means we're missing a 
> test case, let's make sure our tests are actually running and covering what 
> we think they are.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527004#comment-16527004
 ] 

stack commented on HBASE-20796:
---

I'm wondering where the SCP would get the epoch to volunteer to the running AP 
for comparison. Where would it get it from? We might be able to pass the 
epoch when we dispatch the RPC, for it to send back into the Procedure on 
failure, but SCP has no relation to outstanding APs.

Otherwise, yeah, the idea of handing off a sequenceid or epoch to be passed back in 
callbacks would help make the system more resilient. It would be good to bake this 
into the Procedure framework in the same way nonces are integral, rather than do 
it as a one-off.

The patch also fixes the fact that the AM undo of OPENING was not triggering because 
the offlining was happening before we made the AM call. Offlining was also being 
done outside of a synchronize on state -- no harm having it under synchronize 
so the state change is done cleanly.

> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeIoException: 
> syscall:read(..) failed: Connection reset by peer (Notice how state for 
> region is OPEN and 'SUCCESS').
>  * Then says 2018-06-26 12:33:06,380 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Retry=1 of max=10; 
> pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.X.Y.Z,22101,1529611443424
>  * And 

[jira] [Commented] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526998#comment-16526998
 ] 

Duo Zhang commented on HBASE-20802:
---

{quote}
Indeed, interrupt won't always work.
{quote}

I think the only way to implement this is to set a flag and then try to 
interrupt it. When the call comes back, it checks the flag first and decides 
whether to call the remoteCallFailed method. That has no big advantage compared to 
adding a check in the remoteCallFailed method itself, does it? And if we really 
want to interrupt the call, we could just interrupt it inside remoteCallFailed.
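
A minimal sketch of that flag-then-interrupt idea, assuming a hypothetical cancel flag and thread handle (illustrative only, not HBase code):

{code:java}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// The crash-cleanup path raises the cancel flag and interrupts the thread
// stuck in the sync RPC; when the call unwinds, the failure callback is
// skipped if the flag was already raised.
class RemoteCallSketch {
  private final AtomicBoolean cancelled = new AtomicBoolean(false);
  private volatile Thread rpcThread;

  void call() {
    rpcThread = Thread.currentThread();
    try {
      doSyncRpc();                     // stand-in for the blocking RPC
      onCompleted();
    } catch (IOException | RuntimeException e) {
      if (!cancelled.get()) {          // the guard: skip callback if cancelled
        onFailed(e);
      }
    }
  }

  void interruptCall() {               // what SCP-style cleanup would invoke
    cancelled.set(true);
    Thread t = rpcThread;
    if (t != null) {
      t.interrupt();                   // may or may not unblock the sync RPC
    }
  }

  private void doSyncRpc() throws IOException { /* blocking call elided */ }
  private void onCompleted() { }
  private void onFailed(Exception e) { }
}
{code}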

> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPCs to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3 minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526996#comment-16526996
 ] 

stack commented on HBASE-19722:
---

Thanks for the backport [~apurtell]

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding below section to hbase-site.xml
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526991#comment-16526991
 ] 

stack edited comment on HBASE-19722 at 6/29/18 12:56 AM:
-

[~xucang] Can we have a picture or output of what the new CP shows in the RN, or 
is it the JMX you posted earlier in the issue? It would be good to hoist that up 
into the RN.

Thanks.

It looks safe to pull into 2.0.x. It is only on if you enable it. Operators 
will love this.


was (Author: stack):
[~xucang] Can we have a picture or output of what the new CP shows in the RN?

Thanks.

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding below section to hbase-site.xml
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526995#comment-16526995
 ] 

Duo Zhang commented on HBASE-20796:
---

{quote}
Would Dispatch then need to know about Procedures? They've been intentionally 
done as distinct systems up to this.
{quote}

I mean adding something in the Assign/Unassign procedures to prevent redundant 
calls. For example, unassign is a one-time deal, so we just need to add a 
simple flag: if it is finished by an SCP then no one can finish it again. And 
for assign, maybe we could introduce something like an epoch that is 
increased each time we restart the assignment of the region from the beginning. 
When a remoteCallXXX comes in, we first check whether the epoch matches; if not, we 
just ignore it.
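
A rough sketch of that flag-plus-epoch idea (names are made up, not the real AssignProcedure code):

{code:java}
// Each fresh assignment attempt bumps the epoch; remote callbacks carry the
// epoch they were dispatched with, and stale callbacks from an earlier
// attempt are ignored. The one-time "finished" flag covers the unassign case.
class AssignGuardSketch {
  private int epoch;
  private boolean finished;

  synchronized int startNewAttempt() {
    return ++epoch;                 // callbacks from older attempts go stale
  }

  synchronized void remoteCallFailed(int callbackEpoch, Exception e) {
    if (finished || callbackEpoch != epoch) {
      return;                       // stale or already finished: ignore
    }
    // ...otherwise handle the failure (retry, pick a new server, etc.)...
  }

  synchronized void remoteCallCompleted(int callbackEpoch) {
    if (callbackEpoch == epoch) {
      finished = true;
    }
  }
}
{code}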

> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeIoException: 
> syscall:read(..) failed: Connection reset by peer (Notice how state for 
> region is OPEN and 'SUCCESS').
>  * Then says 2018-06-26 12:33:06,380 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Retry=1 of max=10; 
> pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.X.Y.Z,22101,1529611443424
>  * And finally...  2018-06-26 12:34:10,727 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OFFLINE, location=null, 
> 

[jira] [Commented] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526991#comment-16526991
 ] 

stack commented on HBASE-19722:
---

[~xucang] Can we have a picture or output of what the new CP shows in the RN?

Thanks.

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding below section to hbase-site.xml
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526992#comment-16526992
 ] 

Andrew Purtell commented on HBASE-20557:


bq. I'm leaning towards not including this on branch-1.4 since this by default 
creates two threads for deletion (one for large and one for small HFiles) so 
there is no way to only have a single thread deleting anymore. What are your 
thoughts?

Close enough.

bq. Also regarding your comment on api checker is this a tool I can run? I'm 
not familiar with it.

See ./dev-support/checkcompatibility.py. There is a usage example in the file.

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526989#comment-16526989
 ] 

stack commented on HBASE-20802:
---

bq.  and we also need to take care of the timing on when to interrupt the 
call during the server crash processing

This is done already (we call cleanup AFTER log replay) I believe.

bq. Add a guard in the remoteCallFailed implementation itself will be more 
suitable here I think.

Yeah, that's the parent issue.

bq.  I think it is a bit hard to implement as we use sync rpc calls

What are you thinking?

Indeed, interrupt won't always work. Then we defer to the timeout. It would be 
good to make it work though, rather than having stuck RPCs hanging out.




> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPCs to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3 minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20792) info:servername and info:sn inconsistent for OPEN region

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526990#comment-16526990
 ] 

Duo Zhang commented on HBASE-20792:
---

Looking at the log above, I think there may be a corner case that we haven't 
handled correctly. When closing a region we do a flush, which also increases 
the sequence id since we need to write a flush marker, and this 
value is persisted to the store file. And when closing, we also write a 
close marker and update the max sequence id file.

Here, in [~elserj]'s scenario, we do not write anything into the region, 
so there is no flush and we cannot persist the sequence id to a store file. 
Maybe the checks around updating the max sequence id file also prevent the 
update, and since this is a normal close/reopen we will not replay the WALs, 
so the close region marker will not be considered either, and the openSeqNum 
will remain the same.

Let me try it locally to see if I can find the problem.

> info:servername and info:sn inconsistent for OPEN region
> 
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-04.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 
> 0297f680df6dc0166a44f9536346268e, NAME => 
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
>  => '', ENDKEY => 
> ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:seqnumDuringOpen, timestamp=1529967103390, 
> value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:server, timestamp=1529967103390, 
> value=ctr-e138-1518143905142-378097-02-12.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   column=info:sn, 
> timestamp=1529967096482, 
> value=ctr-e138-1518143905142-378097-02-06.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc should not be possible for an 
> {{OPEN}} region).{{}}
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19722:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.2
   1.4.6
   1.5.0
   2.1.0
   3.0.0
   Status: Resolved  (was: Patch Available)

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding below section to hbase-site.xml
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20814) fix error prone assertion failure ignored warnings

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526985#comment-16526985
 ] 

Hadoop QA commented on HBASE-20814:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 13 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
49s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hbase-build-configuration {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m  
0s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m  
9s{color} | {color:green} hbase-build-configuration in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} hbase-common generated 0 new + 40 unchanged - 2 
fixed = 40 total (was 42) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} hbase-procedure generated 0 new + 7 unchanged - 2 
fixed = 7 total (was 9) {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m  0s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} hbase-mapreduce in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} hbase-rsgroup generated 0 new + 105 unchanged - 1 
fixed = 105 total (was 106) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} hbase-endpoint generated 0 new + 123 unchanged - 2 
fixed = 123 total (was 125) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} hbase-it generated 0 new + 49 unchanged - 2 fixed = 
49 total (was 51) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
28s{color} | {color:red} hbase-client: The patch generated 1 new + 49 unchanged 
- 0 fixed = 50 total (was 49) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 4 
unchanged - 0 fixed = 5 total (was 4) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
58s{color} | {color:red} hbase-server: The patch generated 6 new + 118 
unchanged - 1 fixed = 124 total (was 119) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} hbase-mapreduce: The patch generated 1 new + 3 
unchanged - 0 

[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526983#comment-16526983
 ] 

Zach York commented on HBASE-20557:
---

Ah you probably mean this: 
[https://github.com/apache/hbase/blob/master/dev-support/checkcompatibility.py] 
I'll try that.

 

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20792) info:servername and info:sn inconsistent for OPEN region

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526979#comment-16526979
 ] 

Duo Zhang commented on HBASE-20792:
---

Oh, your region is always reopened at the same RS...

> info:servername and info:sn inconsistent for OPEN region
> 
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-04.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 
> 0297f680df6dc0166a44f9536346268e, NAME => 
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
>  => '', ENDKEY => 
> ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:seqnumDuringOpen, timestamp=1529967103390, 
> value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:server, timestamp=1529967103390, 
> value=ctr-e138-1518143905142-378097-02-12.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   column=info:sn, 
> timestamp=1529967096482, 
> value=ctr-e138-1518143905142-378097-02-06.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc should not be possible for an 
> {{OPEN}} region).{{}}
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526980#comment-16526980
 ] 

stack commented on HBASE-20796:
---

bq. Yes, that's what I said above, guard.

Are you referring to the sequenceid comment or ignoring the handleFailure call 
if context is not what was expected?

bq. On the patch, I'm a little nervous that we seem to put the guard deeply 
into the AM, this makes things complicated and we need more comments to say 
what is going on.

That's one way to look at it.

I was thinking it actually cleans up the flow: it has the cancel of OPENING in one 
location rather than spread between AM and AP as it was previously. We are also 
making AP match what UP is already doing.

What was there in AP before would unconditionally 'wake' up the Procedure, no matter 
who called handleFailure or what its situation was.

bq. Maybe we could do something at the remote procedure layer first to filter 
out the redundant calls? Not sure, need to read the code more carefully...

Would Dispatch then need to know about Procedures? They've been intentionally 
kept as distinct systems up to now.

I was thinking that we do the sub-issue and add interrupt support. Then the 
cancelled RPC handling and cleanup could be covered by the SCP cleanup. We'd 
still need this checking for context as the handleFailure can still have two 
sources, the dispatcher or SCP.

I started up a bigger ITBLL job... will report back.



> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> 

[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526976#comment-16526976
 ] 

Zach York commented on HBASE-20557:
---

[~apurtell] I'm leaning towards not including this on branch-1.4 since this by 
default creates two threads for deletion (one for large and one for small 
HFiles) so there is no way to only have a single thread deleting anymore. What 
are your thoughts?

Also, regarding your comment on the API checker: is this a tool I can run? I'm not 
familiar with it (I looked through the plugins but didn't see one that jumped 
out immediately).

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20792) info:servername and info:sn inconsistent for OPEN region

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526975#comment-16526975
 ] 

Duo Zhang commented on HBASE-20792:
---

{quote}
So, what I'm gathering is that we pull out a seqId=1 on open which makes our 
openSeqNum=2 when we actually complete the OPEN. However, when we close the 
region down, we don't update that new seqNum anywhere; thus, the next time we 
open the region, we get 2 again. It's not an issue from a data correctness (I 
think..), but it seems to cause this inf+loop in RTRP as we see.
{quote}

Which branch [~elserj]? 2.1 or 2.0?

There is an issue for the openSeqNum bumping, 
https://issues.apache.org/jira/browse/HBASE-20242, but it is only committed to 
branch-2.1+ I believe. And a successful reopen should bump the openSeqNum, as we 
write an open region event to the WAL and the sequence number 
will be increased when replaying?

Anyway, one way to ride over this is that, even if the region was in 
OPEN state before, besides comparing the openSeqNum we can also check the 
location; if the location has changed then we can also be sure that the region 
has been reopened.

But I still think there may be other problems, as the openSeqNum should be bumped 
during reopening...
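
A tiny sketch of that extra guard (made-up method, not the real AM code): consider the region reopened if either the openSeqNum advanced or the hosting server changed.

{code:java}
final class ReopenCheckSketch {
  // The location check covers the corner case where a reopen did not bump
  // the openSeqNum.
  static boolean regionReopened(long previousOpenSeqNum, long currentOpenSeqNum,
                                String previousServer, String currentServer) {
    return currentOpenSeqNum > previousOpenSeqNum
        || !currentServer.equals(previousServer);
  }
}
{code}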

> info:servername and info:sn inconsistent for OPEN region
> 
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-04.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 
> 0297f680df6dc0166a44f9536346268e, NAME => 
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
>  => '', ENDKEY => 
> ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:seqnumDuringOpen, timestamp=1529967103390, 
> value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:server, timestamp=1529967103390, 
> value=ctr-e138-1518143905142-378097-02-12.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   column=info:sn, 
> timestamp=1529967096482, 
> value=ctr-e138-1518143905142-378097-02-06.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.   
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc should not be possible for an 
> {{OPEN}} region).{{}}
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20802) Add an interruptCall to RemoteProcedureDispatch

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526966#comment-16526966
 ] 

Duo Zhang commented on HBASE-20802:
---

We can change the timeout value, but I think the interrupt call is a bit hard 
to implement as we use sync rpc calls, and we also need to take care of the 
timing of when to interrupt the call during server crash processing. Adding a 
guard in the remoteCallFailed implementation itself would be more suitable 
here, I think.
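
To make the "guard" idea concrete, here is a minimal, self-contained sketch (the 
class, field and method names are assumptions for illustration, not the real 
procedure classes): the failure callback simply ignores reports that no longer 
match the server the procedure is currently dispatched to, or that arrive after 
the procedure has finished.

{code:java}
import java.io.IOException;

// Hypothetical stand-in for an assign-style procedure; not the real HBase class.
class GuardedAssignProcedure {
  private volatile String dispatchedServer;    // server we last sent the RPC to
  private volatile boolean finished;

  void remoteCallFailed(String reportingServer, IOException e) {
    // Guard: drop failure reports that arrive after the procedure has already
    // finished, or that come from a server we are no longer dispatched to.
    if (finished || dispatchedServer == null
        || !dispatchedServer.equals(reportingServer)) {
      return;                                  // stale report, ignore it
    }
    // ... otherwise handle the failure (retry the assign, expire the server, etc.)
  }

  void dispatchTo(String server) { this.dispatchedServer = server; }
  void markFinished()            { this.finished = true; }
}
{code}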

> Add an interruptCall to RemoteProcedureDispatch
> ---
>
> Key: HBASE-20802
> URL: https://issues.apache.org/jira/browse/HBASE-20802
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> Follow-on from the parent. In summary, RPC's to zombie servers can get 
> stuck/hang. We'll notice the server has gone non-responsive after a while and 
> will effect repair but the RPCs will remain up until they go to their timeout 
> (default 3minutes).
> This issue is about adding a means of interrupting an ongoing RPC. 
> ServerCrashProcedure does cleanup of any ongoing, unsatisfied 
> assigns/unassigns. As part of this cleanup, it could interrupt any 
> outstanding RPCs.
> We'd add an interruptCall to the below interface in RemoteProcedureDispatch
> {code}
>   public interface RemoteProcedure {
> RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
> void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation 
> response);
> void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
>   }
> {code}
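
For illustration, here is a self-contained restatement of the interface above 
with the proposed interruptCall added. The exact signature and the 
RemoteOperation stub are assumptions; the issue only states that an 
interruptCall method would be added.

{code:java}
import java.io.IOException;

class RemoteOperation {}  // stub so the sketch stands alone

interface RemoteProcedure<TEnv, TRemote> {
  RemoteOperation remoteCallBuild(TEnv env, TRemote remote);
  void remoteCallCompleted(TEnv env, TRemote remote, RemoteOperation response);
  void remoteCallFailed(TEnv env, TRemote remote, IOException exception);
  // Proposed addition (signature assumed): would let e.g. ServerCrashProcedure
  // cut short an in-flight RPC to a dead server instead of waiting out the
  // full rpc timeout.
  void interruptCall(TEnv env, TRemote remote);
}
{code}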



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526964#comment-16526964
 ] 

Duo Zhang commented on HBASE-20796:
---

{quote}
Whats added here is ignoring remoteCallFailed calls that arrive into a context 
that doesn't match their expectation (Been thinking we need sequenceids on 
procedure steps).
{quote}

Yes, that's what I said above: a guard.

On the patch, I'm a little nervous that we seem to put the guard deep inside 
the AM; this makes things complicated and we need more comments to explain what 
is going on. Maybe we could do something at the remote procedure layer first to 
filter out the redundant calls? Not sure, I need to read the code more 
carefully...
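
As a rough illustration of filtering at the remote procedure layer rather than 
inside the AM, here is a small self-contained sketch; every name in it is 
hypothetical and it is not the actual dispatcher code.

{code:java}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical dispatcher-side filter: a failure is only delivered to a
// procedure if that procedure is still registered against the node the
// failure came from, so redundant/stale reports never reach the AM internals.
class RemoteCallFilter {
  interface Remote { void remoteCallFailed(String node, IOException e); }

  private final Map<Long, String> procToNode = new ConcurrentHashMap<>();
  private final Map<Long, Remote> procs = new ConcurrentHashMap<>();

  void register(long procId, String node, Remote proc) {
    procToNode.put(procId, node);
    procs.put(procId, proc);
  }

  void deregister(long procId) {
    procToNode.remove(procId);
    procs.remove(procId);
  }

  void onNodeFailure(long procId, String node, IOException e) {
    Remote proc = procs.get(procId);
    if (proc != null && node.equals(procToNode.get(procId))) {
      proc.remoteCallFailed(node, e);          // still relevant, deliver it
    }
    // otherwise the report is redundant (procedure finished or moved on): drop it
  }
}
{code}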

> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeIoException: 
> syscall:read(..) failed: Connection reset by peer (Notice how state for 
> region is OPEN and 'SUCCESS').
>  * Then says 2018-06-26 12:33:06,380 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Retry=1 of max=10; 
> pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.X.Y.Z,22101,1529611443424
>  * And finally...  2018-06-26 12:34:10,727 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OFFLINE, location=null, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> 

[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526956#comment-16526956
 ] 

Andrew Purtell edited comment on HBASE-20704 at 6/29/18 12:22 AM:
--

bq. I think the better solution is to keep the list of active (and possibly 
compacted) storefiles in meta

The incomplete "new FS layout" patch for branch-2 and up has this as a feature 
IIRC. Perhaps we can break it out?

See also the StoreCommitTransaction proposal on HBASE-20431. We could manage a 
list of storefiles in meta indicating the state of the respective files as part 
of that.


was (Author: apurtell):
bq. I think the better solution is to keep the list of active (and possibly 
compacted) storefiles in meta

The incomplete "new FS layout" patch for branch-2 and up has this as a feature 
IIRC. Perhaps we can break it out?

See also the StoreCommitTransaction proposal on HBASE-20431

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526956#comment-16526956
 ] 

Andrew Purtell edited comment on HBASE-20704 at 6/29/18 12:20 AM:
--

bq. I think the better solution is to keep the list of active (and possibly 
compacted) storefiles in meta

The incomplete "new FS layout" patch for branch-2 and up has this as a feature 
IIRC. Perhaps we can break it out?

See also the StoreCommitTransaction proposal on HBASE-20431


was (Author: apurtell):
bq. I think the better solution is to keep the list of active (and possibly 
compacted) storefiles in meta

The incomplete "new FS layout" patch for branch-2 and up has this as a feature 
IIRC. Perhaps we can break it out?

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526956#comment-16526956
 ] 

Andrew Purtell commented on HBASE-20704:


bq. I think the better solution is to keep the list of active (and possibly 
compacted) storefiles in meta

The incomplete "new FS layout" patch for branch-2 and up has this as a feature 
IIRC. Perhaps we can break it out?

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526955#comment-16526955
 ] 

Andrew Purtell commented on HBASE-19722:


Thanks for the updated branch-1 patch. Testing again. If good will commit 
everywhere

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding the below section to hbase-site.xml:
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-19722:

Description: 
Implement a meta query statistics metrics source, created whenever a 
regionserver starts hosting meta, removed when meta hosting moves. Provide 
views on top tables by request counts, top meta rowkeys by request count, top 
clients making requests by their hostname.

Can be implemented as a coprocessor.

 

 

 

 

===

*Release Note* (WIP)

*1. Usage:*

Use this coprocessor by adding the below section to hbase-site.xml:

{{<property>}}
{{    <name>hbase.coprocessor.region.classes</name>}}
{{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
{{</property>}}

  was:
Implement a meta query statistics metrics source, created whenever a 
regionserver starts hosting meta, removed when meta hosting moves. Provide 
views on top tables by request counts, top meta rowkeys by request count, top 
clients making requests by their hostname. 

Can be implemented as a coprocessor.


> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname.
> Can be implemented as a coprocessor.
>  
>  
>  
>  
> ===
> *Release Note* (WIP)
> *1. Usage:*
> Use this coprocessor by adding the below section to hbase-site.xml:
> {{<property>}}
> {{    <name>hbase.coprocessor.region.classes</name>}}
> {{    <value>org.apache.hadoop.hbase.coprocessor.MetaTableMetrics</value>}}
> {{</property>}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526933#comment-16526933
 ] 

Francis Liu edited comment on HBASE-20704 at 6/29/18 12:05 AM:
---

Hmm, actually even if we add the list of parent storefiles, there's still a 
corner case during regionserver failover that we won't cover (i.e. a dead RS 
commits a compacted storefile before aborting and the region has already been 
opened elsewhere). It seems the more straightforward and intuitive way to solve 
this is the currently proposed way of closing and cleaning up the compacted 
storefiles on close, and making sure the still-relevant compaction markers are 
replayed on the region to address HBASE-20724. Let me go down this route and 
see how that goes. 

Long term though, I think the better solution is to keep the list of active 
(and possibly compacted) storefiles in meta. That way we can update the changes 
atomically. Though that will probably only be a viable option once splitting 
meta is available.


was (Author: toffer):
Hmm actually even if at add the list of parent storefiles in there's still a 
corner case during regionserver failover that we won't cover (ie a a dead RS 
commits a compacted storefile before aborting and the region has already been 
opened elsewhere). It seems the more straightforward and intuitive way to solve 
this is the current proposed way of closing and cleaning up the compacted 
storefiles on close. And make sure the still relevant compaction markers are 
replayed on region for HBASE-20724. Let me go down this route and see how that 
goes. 

Long term tho I think the better solution is to keep the list of active (and 
possibly compacted) storefiles in meta. That way we can update the changes 
atomically. Tho that will probably only be a viable option once splitting meta 
is available.

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19722) Meta query statistics metrics source

2018-06-28 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-19722:

Attachment: HBASE-19722.branch-1.v002.patch

> Meta query statistics metrics source
> 
>
> Key: HBASE-19722
> URL: https://issues.apache.org/jira/browse/HBASE-19722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-19722.branch-1.v001.patch, 
> HBASE-19722.branch-1.v002.patch, HBASE-19722.master.010.patch, 
> HBASE-19722.master.011.patch, HBASE-19722.master.012.patch, 
> HBASE-19722.master.013.patch, HBASE-19722.master.014.patch, 
> HBASE-19722.master.015.patch, HBASE-19722.master.016.patch
>
>
> Implement a meta query statistics metrics source, created whenever a 
> regionserver starts hosting meta, removed when meta hosting moves. Provide 
> views on top tables by request counts, top meta rowkeys by request count, top 
> clients making requests by their hostname. 
> Can be implemented as a coprocessor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526933#comment-16526933
 ] 

Francis Liu commented on HBASE-20704:
-

Hmm, actually even if we add the list of parent storefiles, there's still a 
corner case during regionserver failover that we won't cover (i.e. a dead RS 
commits a compacted storefile before aborting and the region has already been 
opened elsewhere). It seems the more straightforward and intuitive way to solve 
this is the currently proposed way of closing and cleaning up the compacted 
storefiles on close, and making sure the still-relevant compaction markers are 
replayed on the region for HBASE-20724. Let me go down this route and see how 
that goes. 

Long term though, I think the better solution is to keep the list of active 
(and possibly compacted) storefiles in meta. That way we can update the changes 
atomically. Though that will probably only be a viable option once splitting 
meta is available.
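
A minimal sketch of the close-time behaviour proposed in this issue (the types 
and names below are assumptions for illustration, not the HStore/HRegion code): 
on normal discharger runs, files with open readers are skipped; on region 
close, they are archived regardless.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the archiving decision discussed above.
class CompactedFileArchiver {
  static final class CompactedFile {
    final String name;
    final int readers;        // open scanners still referencing the file
    CompactedFile(String name, int readers) {
      this.name = name;
      this.readers = readers;
    }
  }

  /**
   * @param regionClosing when true, ignore outstanding readers: on close,
   *        consistency matters more, so every compacted-away file is archived.
   */
  static List<CompactedFile> selectForArchiving(List<CompactedFile> compacted,
      boolean regionClosing) {
    List<CompactedFile> toArchive = new ArrayList<>();
    for (CompactedFile f : compacted) {
      if (regionClosing || f.readers == 0) {
        toArchive.add(f);
      }
    }
    return toArchive;
  }
}
{code}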

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to insure data consistency. ie 
> a storefile containing delete tombstones can be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (ie open scanners). This behavior is correct for when the 
> discharger chore runs but on region close consistency is of course more 
> important so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> Attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526897#comment-16526897
 ] 

stack commented on HBASE-20812:
---

Review please. This will make it so patches like the one in HBASE-15320, the 
kafka proxy, can be smaller, and will ease creating implementations.
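
A tiny sketch of the idea (the mini-interface and method names are hypothetical, 
not the actual Table API, and it throws the JDK's UnsupportedOperationException 
rather than commons-lang's NotImplementedException so it stays dependency-free): 
with JDK 8 default methods, an implementor only overrides the one or two 
methods it actually needs.

{code:java}
// Hypothetical mini-interface mirroring the proposal for Table.
interface MiniTable {
  default byte[] get(byte[] row) {
    throw new UnsupportedOperationException("get not implemented");
  }
  default void put(byte[] row, byte[] value) {
    throw new UnsupportedOperationException("put not implemented");
  }
}

// e.g. a kafka-proxy-style implementor that only cares about writes:
class PutOnlyTable implements MiniTable {
  @Override
  public void put(byte[] row, byte[] value) {
    // forward to kafka, an indexer, etc. -- no need to stub out get()
  }
}
{code}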

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20806) Split style journal for flushes and compactions

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526896#comment-16526896
 ] 

stack commented on HBASE-20806:
---

Sounds great. The task manager has seen no love; maybe here is where that 
changes. BTW, hbase2 flushes are now at least on par with hbase1 after some 
work. What will the log look like?

> Split style journal for flushes and compactions
> ---
>
> Key: HBASE-20806
> URL: https://issues.apache.org/jira/browse/HBASE-20806
> Project: HBase
>  Issue Type: Improvement
>Reporter: Abhishek Singh Chouhan
>Assignee: Abhishek Singh Chouhan
>Priority: Minor
>
> In 1.x we have a split transaction journal that gives a clear picture of when 
> various stages of splits took place. We should have a similar thing for 
> flushes and compactions so as to have insight into time spent in various 
> stages, which we can use to identify regressions that might creep in.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526892#comment-16526892
 ] 

Hadoop QA commented on HBASE-20812:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for 
instructions. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
42s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
31s{color} | {color:red} hbase-client: The patch generated 17 new + 17 
unchanged - 0 fixed = 34 total (was 17) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
57s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20812 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12929634/20812.txt |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c91ecad357df 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6198e1fc7d |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| 

[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526889#comment-16526889
 ] 

Zach York commented on HBASE-20557:
---

+1, reviewed on PR.

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20796) STUCK RIT though region successfully assigned

2018-06-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526866#comment-16526866
 ] 

stack commented on HBASE-20796:
---

Ran a 1B ITBLL with chaos and all passed. Will run some others, but I think 
this is safe to backport.

> STUCK RIT though region successfully assigned
> -
>
> Key: HBASE-20796
> URL: https://issues.apache.org/jira/browse/HBASE-20796
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2
>
> Attachments: HBASE-20796.branch-2.0.001.patch
>
>
> This is a good one. We keep logging messages like this:
> {code}
> 2018-06-26 12:32:24,859 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, 
> location=vd0410.X.Y.com,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180525080406, 
> region=e10b35d49528e2453a04c7038e3393d7
> {code}
> ...though the region is successfully assigned.
> Story:
>  * Dispatch an assign 2018-06-26 12:31:27,390 INFO 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch 
> pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * It gets stuck 2018-06-26 12:32:29,860 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OPENING, location=vd0410.X.Y.Z,22101,1529611445046, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 (Because the server was killed)
>  * We stay STUCK for a while.
>  * The Master notices the server as crashed and starts a SCP.
>  * SCP kills ongoing assign: 2018-06-26 12:32:54,809 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: pid=371105 
> found RIT pid=370829, ppid=370391, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> AssignProcedure table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046
>  * The kill brings on a retry ... 2018-06-26 12:32:54,810 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPENING, 
> location=vd0410.X.Y.Z,22101,1529611445046; exception=ServerCrashProcedure 
> pid=371105, server=vd0410.X.Y.Z,22101,1529611445046
>  * Which eventually succeeds. Successfully deployed to new server 
> 2018-06-26 12:32:55,429 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=370829, 
> ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2 in 1mins, 35.379sec
>  * But then, it looks like the RPC was ongoing and it broke in the following way 
> 2018-06-26 12:33:06,378 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Remote 
> call failed pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.halxg.cloudera.com,22101,1529611443424; exception=Call to 
> vd0410.X.Y.Z/10.10.10.10:22101 failed on local exception: 
> org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeIoException: 
> syscall:read(..) failed: Connection reset by peer (Notice how state for 
> region is OPEN and 'SUCCESS').
>  * Then says 2018-06-26 12:33:06,380 INFO 
> org.apache.hadoop.hbase.master.assignment.AssignProcedure: Retry=1 of max=10; 
> pid=370829, ppid=370391, state=SUCCESS; AssignProcedure 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2; rit=OPEN, 
> location=vc0614.X.Y.Z,22101,1529611443424
>  * And finally...  2018-06-26 12:34:10,727 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK 
> Region-In-Transition rit=OFFLINE, location=null, 
> table=IntegrationTestBigLinkedList_20180612114844, 
> region=f69ccf7d9178ce166b515e0e2ef019d2
> Restart of Master got rid of the STUCK complaints.
> This is interesting because the stuck rpc and the successful reassign are all 
> riding on the same pid.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20814) fix error prone assertion failure ignored warnings

2018-06-28 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526865#comment-16526865
 ] 

Umesh Agashe commented on HBASE-20814:
--

+1, lgtm

> fix error prone assertion failure ignored warnings
> --
>
> Key: HBASE-20814
> URL: https://issues.apache.org/jira/browse/HBASE-20814
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Attachments: HBASE-20814.master.001.patch
>
>
> When we have assertion failures ignored, that likely means we're missing a 
> test case. Let's make sure our tests are actually running and covering what 
> we think they are.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20812:
--
Assignee: stack
  Status: Patch Available  (was: Open)

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20812) Add defaults to Table Interface so implementors don't have to

2018-06-28 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20812:
--
Attachment: 20812.txt

> Add defaults to Table Interface so implementors don't have to
> -
>
> Key: HBASE-20812
> URL: https://issues.apache.org/jira/browse/HBASE-20812
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Major
> Attachments: 20812.txt
>
>
> Let's add default implementations -- even if they just throw 
> NotImplementedException -- to our Table Interface now that we are up on jdk8. 
> Table implementations are how the likes of hbase-indexer modify hbase -- via 
> a publicly supported API -- and I notice that the kafka proxy now goes the 
> same route. Typically, these customizations are only interested in one or two 
> methods of Table, adding in their own implementations, but they have to supply 
> implementations for all Table methods in their override. Let's help them out 
> by adding defaults (I had a patch but lost it...). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526844#comment-16526844
 ] 

Zach York commented on HBASE-20789:
---

[~openinx]
{quote}bq. If the existingBlock has nextBlockOnDiskSize set , while cachedItem 
has nextBlockOnDiskSize(default = -1) unset, the comparison should be positive 
number ? 
 So there is a typo ?
{quote}
No, cachedItem will be smaller in that case, and so the comparison will be -1. 
I think this is why you were having difficulty getting the tests to pass. 
Please flip the '>' back to a '<'.
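
To make the arithmetic concrete, here is a small self-contained sketch of the 
comparison being discussed (the names and the bare int comparison are 
assumptions for illustration, not the exact BucketCache code):

{code:java}
// Hypothetical comparison on the nextBlockOnDiskSize metadata only.
final class BlockMetaCompare {
  static int compareNextBlockOnDiskSize(int cachedItem, int existingBlock) {
    return Integer.compare(cachedItem, existingBlock);
  }

  public static void main(String[] args) {
    int cachedItem = -1;      // metadata missing (the default value)
    int existingBlock = 33;   // metadata present on the already-cached block
    // cachedItem is smaller, so the result is negative: the caller should
    // therefore test 'comparison < 0' to detect the missing-metadata case.
    System.out.println(compareNextBlockOnDiskSize(cachedItem, existingBlock));
  }
}
{code}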

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: 
> 0001-HBASE-20789-TestBucketCache-testCacheBlockNextBlockM.patch, 
> HBASE-20789.v1.patch, HBASE-20789.v2.patch, bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-16549) Procedure v2 - Add new AM metrics

2018-06-28 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526833#comment-16526833
 ] 

Umesh Agashe commented on HBASE-16549:
--

Done, HBASE-20815. Thanks [~mdrob]!

> Procedure v2 - Add new AM metrics
> -
>
> Key: HBASE-16549
> URL: https://issues.apache.org/jira/browse/HBASE-16549
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.0.0
>Reporter: Matteo Bertozzi
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-16549-hbase-14614.v1.patch, 
> HBASE-16549-hbase-14614.v2-v3.patch, HBASE-16549-hbase-14614.v2.patch, 
> HBASE-16549-hbase-14614.v3.patch, HBASE-16549-hbase-14614.v3.patch, 
> HBASE-16549.master.v4.patch, HBASE-16549.master.v4.patch, 
> HBASE-16549.master.v4.patch, HBASE-16549.master.v5.patch
>
>
> With the new AM we can add a bunch of metrics
>  - assign/unassign time
>  - server crash time
>  - grouping related metrics? (how many batches we do, and similar?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-28 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526835#comment-16526835
 ] 

Andrew Purtell commented on HBASE-20704:


bq. I think this would cause 1-2 additional NN list operations per file in a 
region? When opening a large table with many regions, I worry about swamping 
the NN more than we already do. Maybe it's ok though.

Open work is bounded by configuration parameters so we can control how much 
open load is thrown at HDFS all at once. I don't see a concern.

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e. a storefile containing delete tombstones could be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

