[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546018#comment-16546018
 ] 

stack commented on HBASE-20875:
---

bq. In the jitviewer also noticed that in the hot path (in read path in this 
case) there are instances where it says 'callee too big' because of which 
inlining does not happen.

Yeah. If we can make it inline by making stuff smaller and/or dumbing down the 
options/types, it usually goes faster.
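
To make the inlining point concrete, here is a minimal Java sketch (not taken from the HBase patch) of the usual trick: keep the hot path in a tiny method and move the rare, bulky case into a separate method, so the JIT's size limits are less likely to reject the hot one. The class and method names are hypothetical. On HotSpot, `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining` prints the inlining decisions; the thresholds (for example `-XX:MaxInlineSize` and `-XX:FreqInlineSize`) vary by JVM version and platform.

```java
import java.nio.ByteBuffer;

public class InlineSketch {
  // Hot path: a tiny body, so it stays well under typical JIT inlining size limits.
  static int readLength(ByteBuffer buf, int offset) {
    if (offset >= 0 && offset + Integer.BYTES <= buf.limit()) {
      return buf.getInt(offset);
    }
    return readLengthSlow(buf, offset); // rare case, kept out of line
  }

  // Cold path: the bulky error handling lives here instead of in the hot method.
  static int readLengthSlow(ByteBuffer buf, int offset) {
    throw new IndexOutOfBoundsException("offset=" + offset + ", limit=" + buf.limit());
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    buf.putInt(4, 100);
    long sum = 0;
    for (int i = 0; i < 1_000_000; i++) {
      sum += readLength(buf, 4);
    }
    System.out.println(sum);
  }
}
```

Whether a given method actually gets inlined still has to be checked with the JIT logs; the split only improves the odds.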

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -
>
> Key: HBASE-20875
> URL: https://issues.apache.org/jira/browse/HBASE-20875
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.1
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: 
> 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 
> 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, 
> HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 
> 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
> ./hbase/bin/hbase --config ~/conf_hbase \
>   org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40 \
>   --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map  [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
> 10.42%  libjvm.so       [.] ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*, oopDesc*, unsigned long, markOopDesc*)
>  6.78%  perf-91935.map  [.] Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> {code}
> These are top CPU consumers using perf-map-agent ./bin/perf-java-top... 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546016#comment-16546016
 ] 

ramkrishna.s.vasudevan commented on HBASE-20875:


In the jitviewer also noticed that in the hot path (in read path in this case) 
there are instances where it says 'callee too big' because of which inlining 
does not happen. 



[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546012#comment-16546012
 ] 

ramkrishna.s.vasudevan commented on HBASE-20875:


Nice one. +1 on the patch. 



[jira] [Comment Edited] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545988#comment-16545988
 ] 

Ted Yu edited comment on HBASE-20901 at 7/17/18 3:27 AM:
-

{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be package private, right ?


was (Author: yuzhih...@gmail.com):
{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be private, right (only accessed in MetaTableAccessor) ?

> Reducing region replica has no effect
> -
>
> Key: HBASE-20901
> URL: https://issues.apache.org/jira/browse/HBASE-20901
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
>  Labels: replica
> Attachments: HBASE-20901.patch
>
>
> While reducing the region replica count, the server name (sn) and state 
> columns of the replica are not deleted, so the assignment manager thinks 
> that these regions are CLOSED and assigns them again.
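
As a rough illustration of the cleanup the description calls for, the Java sketch below lists the per-replica `hbase:meta` qualifiers that would need deleting when the replica count drops. The qualifier strings (`sn`, `state`), the `_` delimiter, and the `%04X` replica-id suffix are assumptions about the meta layout, and `replicaQualifier` and `columnsToDelete` are hypothetical helpers, not code from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicaColumnSketch {
  // Assumed layout: replica 0 uses the bare qualifier, others get a "_XXXX" hex suffix.
  static String replicaQualifier(String base, int replicaId) {
    return replicaId == 0 ? base : base + "_" + String.format("%04X", replicaId);
  }

  // Qualifiers left behind for replica ids in [newCount, oldCount) after a reduction.
  static List<String> columnsToDelete(int oldCount, int newCount) {
    List<String> cols = new ArrayList<>();
    for (int id = newCount; id < oldCount; id++) {
      cols.add(replicaQualifier("sn", id));
      cols.add(replicaQualifier("state", id));
    }
    return cols;
  }

  public static void main(String[] args) {
    // Going from 3 replicas to 1 leaves replicas 1 and 2 to clean up.
    System.out.println(columnsToDelete(3, 1)); // [sn_0001, state_0001, sn_0002, state_0002]
  }
}
```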





[jira] [Commented] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545988#comment-16545988
 ] 

Ted Yu commented on HBASE-20901:


{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be private, right (only accessed in MetaTableAccessor) ?



[jira] [Updated] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-07-16 Thread Kuan-Po Tseng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuan-Po Tseng updated HBASE-18201:
--
Attachment: HBASE-18201.master.005.patch

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, 
> HBASE-18201.master.005.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it user-friendly if any use case exists. Otherwise, we should just 
> get rid of it, because DataBlockEncodingTool presumes that the implementation 
> of the cell returned from DataBlockEncoder is KeyValue. That presumption may 
> obstruct the cleanup of KeyValue references in the read/write path code base.





[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545985#comment-16545985
 ] 

Hadoop QA commented on HBASE-20704:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 44s{color} | {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 12s{color} | {color:red} hbase-server: The patch generated 9 new + 53 unchanged - 0 fixed = 62 total (was 53) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 146m 50s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 189m 29s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20704 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931863/HBASE-20704.004.draft.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a48e258cbb76 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2997b6d071 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/13643/artifact/patchprocess/diff-compile-javac-hbase-server.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/13643/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
|  Test Results | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545969#comment-16545969
 ] 

Duo Zhang commented on HBASE-20846:
---

Thanks [~stack]. I checked the code: for a procedure in the ROLLEDBACK state, 
we call store.delete to remove it, so we should not update the lock operation 
any more.

And we are getting closer. Let me check the failed UTs.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one while investigating a ModifyTableProcedure that got stuck while 
> a MoveRegionProcedure was in progress after a master restart.
> That issue can be solved by HBASE-20752, but I discovered something else.
> Before a MoveRegionProcedure can execute, it takes the table's shared lock. 
> So when an UnassignProcedure is spawned, it does not check the table's shared 
> lock, since it is sure that its parent (MoveRegionProcedure) has acquired the 
> table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need to take
> // shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But this is not the case when the master has been restarted. The child 
> procedure (UnassignProcedure) is executed first after the restart. Although it 
> has a parent (MoveRegionProcedure), the parent apparently does not hold the 
> table's lock.
> So, since it began executing without holding the table's shared lock, a 
> ModifyTableProcedure can acquire the table's exclusive lock and execute at the 
> same time, which is not possible if the master was not restarted.
> This would cause a hang before HBASE-20752. Since HBASE-20752 has been fixed, 
> I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.
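
The proposal in the last paragraph can be illustrated with a toy model. The Java sketch below is not the HBase procedure scheduler; `TableLockSketch` and its methods are hypothetical. It only shows that if a child procedure always takes the shared lock itself, rather than trusting that its parent holds it, an exclusive lock request cannot succeed while the child runs, even when the parent's lock was lost across a restart.

```java
public class TableLockSketch {
  private int sharedHolders = 0;
  private boolean exclusiveHeld = false;

  // Shared lock: any number of holders, as long as nobody holds the exclusive lock.
  synchronized boolean tryShared() {
    if (exclusiveHeld) {
      return false;
    }
    sharedHolders++;
    return true;
  }

  synchronized void releaseShared() {
    if (sharedHolders > 0) {
      sharedHolders--;
    }
  }

  // Exclusive lock: granted only when there are no shared or exclusive holders.
  synchronized boolean tryExclusive() {
    if (exclusiveHeld || sharedHolders > 0) {
      return false;
    }
    exclusiveHeld = true;
    return true;
  }

  public static void main(String[] args) {
    TableLockSketch table = new TableLockSketch();
    // After a "restart" the parent's lock is gone; the child takes the shared lock itself.
    boolean child = table.tryShared();
    boolean modify = table.tryExclusive();      // refused while the child holds the shared lock
    table.releaseShared();
    boolean modifyLater = table.tryExclusive(); // granted once the child is done
    System.out.println(child + " " + modify + " " + modifyLater); // true false true
  }
}
```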





[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-16 Thread Chia-Ping Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545966#comment-16545966
 ] 

Chia-Ping Tsai commented on HBASE-20873:


Nice docs. +1

I’m on vacation so the patch will be committed later. 

> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HBASE-20873.master.001.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.





[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20893:
---
Attachment: HBASE-20893.branch-2.0.002.patch

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20893.branch-2.0.001.patch, 
> HBASE-20893.branch-2.0.002.patch
>
>
> Similar case as HBASE-20878.





[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545949#comment-16545949
 ] 

Hadoop QA commented on HBASE-20875:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 18s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 19s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 44s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 156m 51s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 200m 15s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20875 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931851/HBASE-20875.master.002.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 10b47f4d2675 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2997b6d071 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13642/testReport/ |
| Max. process+thread count | 5200 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13642/console |
| Powered by | 

[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-16 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545941#comment-16545941
 ] 

Wei-Chiu Chuang commented on HBASE-20873:
-

[~chia7712] mind taking a look? Thanks.



[jira] [Updated] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20901:
--
Status: Patch Available  (was: Open)



[jira] [Work started] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-20901 started by Ankit Singhal.
-


[jira] [Work stopped] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-20901 stopped by Ankit Singhal.
-


[jira] [Updated] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20901:
--
Description: 
While reducing the region replica count, the server name (sn) and state 
columns of the replica are not deleted, so the assignment manager thinks that 
these regions are CLOSED and assigns them again.




[jira] [Updated] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20901:
--
Attachment: HBASE-20901.patch



[jira] [Updated] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20901:
--
Environment: (was: While reducing the region replica count, the server name 
(sn) and state columns of the replica are not deleted, so the assignment 
manager thinks that these regions are CLOSED and assigns them again.
)

> Reducing region replica has no effect
> -
>
> Key: HBASE-20901
> URL: https://issues.apache.org/jira/browse/HBASE-20901
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
>  Labels: replica
>






[jira] [Created] (HBASE-20901) Reducing region replica has no effect

2018-07-16 Thread Ankit Singhal (JIRA)
Ankit Singhal created HBASE-20901:
-

 Summary: Reducing region replica has no effect
 Key: HBASE-20901
 URL: https://issues.apache.org/jira/browse/HBASE-20901
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
 Environment: While reducing the region replica count, the server name (sn) 
and state columns of the replica are not deleted, so the assignment manager 
thinks that these regions are CLOSED and assigns them again.

Reporter: Ankit Singhal
Assignee: Ankit Singhal








[jira] [Updated] (HBASE-20867) RS may get killed while master restarts

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20867:
---
Attachment: HBASE-20867.branch-2.0.004.patch

> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch
>
>
> If the master is dispatching an RPC call to an RS while aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should handle these kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
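
A minimal sketch of the suggested behaviour, in Java. This is not the RSProcedureDispatcher code; `RemoteCall`, `isRetryable`, and `callWithRetry` are hypothetical names, and matching on the "Connection closed" message is only the example given in the description, not a recommendation for how the real check should be written.

```java
import java.io.IOException;

public class RetrySketch {
  interface RemoteCall {
    String call() throws IOException;
  }

  // Assumption: a "Connection closed" IOException means the caller is going away,
  // not that the remote server is unhealthy.
  static boolean isRetryable(IOException e) {
    String msg = e.getMessage();
    return msg != null && msg.contains("Connection closed");
  }

  static String callWithRetry(RemoteCall rpc, int maxAttempts) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return rpc.call();
      } catch (IOException e) {
        if (!isRetryable(e)) {
          throw e; // a real failure: let the caller handle it
        }
        last = e;  // transient: try again
      }
    }
    throw last != null ? last : new IOException("no attempts made");
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    String result = callWithRetry(() -> {
      if (++calls[0] < 3) {
        throw new IOException("Connection closed");
      }
      return "ok";
    }, 5);
    System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
  }
}
```

A real dispatcher would presumably also want a backoff between attempts and a bound on total retry time; both are left out here to keep the sketch short.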





[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545914#comment-16545914
 ] 

Hudson commented on HBASE-18477:


Results for branch HBASE-18477
[build #266 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have made it possible for HBase to 
> run with a root directory external to the cluster (such as in Amazon S3). 
> This means that the data is stored outside of the cluster and remains 
> accessible after the cluster has been terminated. One use case that is often 
> asked about is pointing multiple clusters at one root directory (sharing the 
> data) to have read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires:
> * Making the read-replica cluster read-only (no metadata or data 
> operations).
> * Separating the hbase:meta table for each cluster (otherwise HBase gets 
> confused when multiple clusters try to update the meta table with their IP 
> addresses).
> * Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read-replica cluster.
> * Adding refresh functionality for HFiles for a given table to ensure new 
> data is picked up on the read-replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/





[jira] [Created] (HBASE-20900) Improve FsDelegationToken to support KMS delegation tokens

2018-07-16 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HBASE-20900:
---

 Summary: Improve FsDelegationToken to support KMS delegation tokens
 Key: HBASE-20900
 URL: https://issues.apache.org/jira/browse/HBASE-20900
 Project: HBase
  Issue Type: Sub-task
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


Currently FsDelegationToken acquires only an HDFS delegation token. Any tool 
that uses it to access encryption zone files could fail because it does not 
have a KMS delegation token. We should fix this.





[jira] [Created] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests

2018-07-16 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HBASE-20899:
---

 Summary: Add Hadoop KMS dependency and basic HDFS at-rest 
encryption tests
 Key: HBASE-20899
 URL: https://issues.apache.org/jira/browse/HBASE-20899
 Project: HBase
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 2.0.0
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


We should start by adding the hadoop-kms dependency in HBase test scope, and 
then add basic HDFS at-rest encryption tests that use it.





[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-16 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545899#comment-16545899
 ] 

huaxiang sun commented on HBASE-20697:
--

[~stack] The fix is generic: getAllRegionLocations does not cache all regions' 
locations; instead, it caches only the first entry. With the fix, the case of 
region replicas is also taken care of. I think we need to backport this to 1.2 
and 1.3 as well.

> Can't cache All region locations of the specify table by calling 
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.2.6, 2.0.1
>Reporter: zhaoyuan
>Assignee: zhaoyuan
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20697.branch-1.2.001.patch, 
> HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, 
> HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, 
> HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, 
> HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
>
> When we upgrade and restart a new version of an application that reads from 
> and writes to HBase, we get some operation timeouts. The timeouts are 
> expected, because when the application restarts it holds no cached region 
> locations and must communicate with ZooKeeper and the meta regionserver to 
> get region locations.
> We want to avoid these timeouts, so we do warmup work. As far as I 
> understood, the method table.getRegionLocator().getAllRegionLocations() 
> should fetch all region locations and cache them. However, it did not work 
> well: there were still a lot of timeouts, which confused me. 
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     // All regions are wrapped in a single RegionLocations object.
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache:
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   // Only the start key of the first non-null location becomes the cache key.
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
>   // ... (the rest of the method is not shown in this excerpt)
> }
> {code}
> It collects all regions into one RegionLocations object and caches it only 
> under the start key of the first non-null region location. Then, when we put 
> to or get from HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations =
>       getTableLocations(tableName);
>   // Find the cached entry with the greatest start key <= row.
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It picks that first location as possibleRegion, which may well not match 
> the requested row.
> So did I miss something, or am I wrong somewhere? If this is indeed a bug, 
> I think it would not be very hard to fix.
> I hope the committers and PMC can review this!
>  
>  
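
The lookup behaviour described in the report can be reproduced with a toy model. The sketch below is not HBase code: `RegionCacheSketch`, `Region`, `cacheFirstOnly`, and `cacheEach` are hypothetical names, and String keys stand in for byte[] row keys. It contrasts caching the whole list under the first region's start key with caching one entry per region, which is one plausible shape for a fix.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class RegionCacheSketch {

  // Hypothetical stand-in for a region covering [startKey, endKey);
  // an empty endKey marks the last region of the table.
  static final class Region {
    final String startKey;
    final String endKey;
    final String server;

    Region(String startKey, String endKey, String server) {
      this.startKey = startKey;
      this.endKey = endKey;
      this.server = server;
    }
  }

  private final ConcurrentSkipListMap<String, Region> cache = new ConcurrentSkipListMap<>();

  // Models the reported behaviour: only the first region gets a cache entry.
  void cacheFirstOnly(List<Region> regions) {
    if (!regions.isEmpty()) {
      Region first = regions.get(0);
      cache.putIfAbsent(first.startKey, first);
    }
  }

  // Models a possible fix: every region gets its own entry keyed by start key.
  void cacheEach(List<Region> regions) {
    for (Region r : regions) {
      cache.putIfAbsent(r.startKey, r);
    }
  }

  // Same shape as getCachedLocation(): floor lookup, then an end-key check.
  Region lookup(String row) {
    Map.Entry<String, Region> e = cache.floorEntry(row);
    if (e == null) {
      return null;
    }
    Region r = e.getValue();
    if (r.endKey.isEmpty() || r.endKey.compareTo(row) > 0) {
      return r;
    }
    return null; // the row lies beyond the only candidate: cache miss
  }

  public static void main(String[] args) {
    List<Region> regions = Arrays.asList(
        new Region("", "m", "rs1"), new Region("m", "", "rs2"));

    RegionCacheSketch firstOnly = new RegionCacheSketch();
    firstOnly.cacheFirstOnly(regions);
    System.out.println("first-only, row x: " + (firstOnly.lookup("x") == null ? "miss" : "hit"));

    RegionCacheSketch each = new RegionCacheSketch();
    each.cacheEach(regions);
    System.out.println("per-region, row x: " + (each.lookup("x") == null ? "miss" : "hit"));
  }
}
```

With only the first region cached, any row outside that region fails the end-key check and falls back to a meta lookup, which matches the timeouts seen after warmup.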





[jira] [Updated] (HBASE-20686) Asyncfs should retry upon RetryStartFileException

2018-07-16 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-20686:

Issue Type: Sub-task  (was: Bug)
Parent: HBASE-20898

> Asyncfs should retry upon RetryStartFileException
> -
>
> Key: HBASE-20686
> URL: https://issues.apache.org/jira/browse/HBASE-20686
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient
>Affects Versions: 2.0.0-beta-1
> Environment: HBase 2.0, Hadoop 3 with at-rest encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-20686.master.001.patch, 
> HBASE-20686.master.002.patch
>
>
> In Hadoop 2.6 and above, the HDFS client retries on RetryStartFileException 
> when the NameNode experiences an encryption-zone-related issue. The code is 
> in DFSOutputStream#newStreamForCreate() (HDFS-6970).
> In HBase 2's asyncfs implementation, 
> FanOutOneBlockAsyncDFSOutputHelper#createOutput() is to some extent an 
> imitation of HDFS's DFSOutputStream#newStreamForCreate(). However, it does 
> not retry upon RetryStartFileException, so it is less resilient to such 
> issues.
> Also, DFSOutputStream#newStreamForCreate() unwraps RemoteExceptions, but 
> asyncfs does not. Therefore, HBase gets different exceptions than before.
> Filing this JIRA to get this corrected.
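
The two behaviours asked for here (unwrap the remote exception, retry when the cause is a RetryStartFileException) can be sketched as below. This is an illustration under stated assumptions, not the asyncfs or HDFS code: the nested exception classes are simplified stand-ins that merely share names with the Hadoop ones, and `CreateCall` and `createWithRetry` are hypothetical.

```java
import java.io.IOException;

public class CreateOutputRetrySketch {

  /** Stand-in for the HDFS exception of the same name; not the real class. */
  static class RetryStartFileException extends IOException {
    RetryStartFileException() {
      super("retry start file");
    }
  }

  /** Stand-in for an exception wrapping a server-side failure. */
  static class RemoteException extends IOException {
    final IOException wrapped;

    RemoteException(IOException wrapped) {
      super(wrapped.getMessage());
      this.wrapped = wrapped;
    }

    IOException unwrap() {
      return wrapped;
    }
  }

  /** Hypothetical callback representing the create RPC. */
  interface CreateCall<T> {
    T create() throws IOException;
  }

  // Retries the create call while the unwrapped cause is RetryStartFileException,
  // up to maxRetries extra attempts; otherwise rethrows the unwrapped cause.
  static <T> T createWithRetry(CreateCall<T> call, int maxRetries) throws IOException {
    for (int retries = 0; ; retries++) {
      try {
        return call.create();
      } catch (RemoteException re) {
        IOException cause = re.unwrap();
        if (cause instanceof RetryStartFileException && retries < maxRetries) {
          continue; // transient encryption-zone issue: try again
        }
        throw cause; // callers see the unwrapped exception type
      }
    }
  }

  public static void main(String[] args) throws IOException {
    int[] attempts = {0};
    String out = createWithRetry(() -> {
      if (attempts[0]++ == 0) {
        throw new RemoteException(new RetryStartFileException());
      }
      return "stream";
    }, 3);
    System.out.println(out + " created after " + attempts[0] + " attempts");
  }
}
```

Rethrowing the unwrapped cause is what keeps the exception types seen by callers the same as with the blocking HDFS client.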





[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545896#comment-16545896
 ] 

stack commented on HBASE-20697:
---

So, this issue fixes caching of region replicas? We weren't doing it previously?






[jira] [Created] (HBASE-20898) Improve support for HDFS at-rest encryption

2018-07-16 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HBASE-20898:
---

 Summary: Improve support for HDFS at-rest encryption
 Key: HBASE-20898
 URL: https://issues.apache.org/jira/browse/HBASE-20898
 Project: HBase
  Issue Type: Umbrella
  Components: encryption
Affects Versions: 2.0.0
 Environment: HBase 2 on Hadoop 2.6.0+ (HDFS at-rest encryption)
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


*Note:* this has nothing to do with HBase's Transparent Encryption of Data At 
Rest.

HDFS's at-rest encryption is "transparent" in that encryption and decryption 
themselves don't require client-side changes. However, in practice, there are 
a few cases that need to be taken care of. For example, accessing KMS requires 
KMS delegation tokens. If HBase tools get only HDFS delegation tokens, they 
will fail to access files in an HDFS encryption zone. Cases such as 
HBASE-20403 suggest that HBase sometimes behaves differently in an 
HDFS-encrypted cluster.

I propose an umbrella JIRA to revisit the HDFS at-rest encryption support in 
various HBase subcomponents and tools, add additional tests, and enhance the 
tools as we visit them.





[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()

2018-07-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545890#comment-16545890
 ] 

stack commented on HBASE-20697:
---

[~zghaobac] Thanks for the backport.






[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-16 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545873#comment-16545873
 ] 

Francis Liu edited comment on HBASE-20704 at 7/17/18 12:14 AM:
---

{quote}expecting eventual GC to call a finalizer that cleans things up
{quote}
AFAIK it should get cleaned up either via another next() RPC (which fails 
because the region is closed) or by scanner lease expiration processing. The 
readers won't be garbage until the scanner state is cleaned up. In any case it 
would leave objects that give GC more work, though it doesn't sound like it's 
going to be significant, and it is generally just part of normal operation, 
i.e. scan leases expiring and pauses between next() RPC calls. 

The trade-off is that we now have to have concurrent threads access a map 
during storefilescanner creation and close for streaming scans. The overhead 
may be negligible, assuming streaming scans are meant for large scans. I've 
attached a rough patch showing how it would look. Let me know what you think. 


was (Author: toffer):
{quote}expecting eventual GC to call a finalizer that cleans things up
{quote}
AFAIK it should get cleaned up either via another next() RPC (which fails 
because the region is closed) or by scanner lease expiration processing. The 
readers won't be garbage until the scanner state is cleaned up. In any case it 
would leave objects that give GC more work.

The trade-off is that we now have to have concurrent threads access a map 
during storefilescanner creation and close for streaming scans. The overhead 
may be negligible, assuming streaming scans are meant for large scans. I've 
attached a rough patch showing how it would look. Let me know what you think. 

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, 
> HBASE-20704.003.patch, HBASE-20704.004.draft.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency. 
> Otherwise, a storefile containing delete tombstones can be archived while 
> older storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.
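
The special case proposed in the description can be sketched as a selection rule. The code below is a simplified illustration and not the attached patch: `CompactedFile`, `selectForArchive`, and the `regionClosing` flag are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArchiveOnCloseSketch {

  // Hypothetical stand-in for a compacted storefile and its open-scanner count.
  static final class CompactedFile {
    final String name;
    final int readRefs;

    CompactedFile(String name, int readRefs) {
      this.name = name;
      this.readRefs = readRefs;
    }
  }

  // The chore skips files that still have readers; region close archives
  // everything so that the compacted set is never left half-archived.
  static List<String> selectForArchive(List<CompactedFile> files, boolean regionClosing) {
    List<String> selected = new ArrayList<>();
    for (CompactedFile f : files) {
      if (regionClosing || f.readRefs == 0) {
        selected.add(f.name);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<CompactedFile> files = Arrays.asList(
        new CompactedFile("older-cells", 1),
        new CompactedFile("tombstones", 0));
    System.out.println("chore run:    " + selectForArchive(files, false));
    System.out.println("region close: " + selectForArchive(files, true));
  }
}
```

In the chore case only the unreferenced file is selected, which is exactly the partial archiving the issue warns about if it were to happen at close time.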





[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-16 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545873#comment-16545873
 ] 

Francis Liu commented on HBASE-20704:
-

{quote}expecting eventual GC to call a finalizer that cleans things up
{quote}
AFAIK it should get cleaned up either via another next() RPC (which fails 
because the region is closed) or by scanner lease expiration processing. The 
readers won't be garbage until the scanner state is cleaned up. In any case it 
would leave objects that give GC more work.

The trade-off is that we now have to have concurrent threads access a map 
during storefilescanner creation and close for streaming scans. The overhead 
may be negligible, assuming streaming scans are meant for large scans. I've 
attached a rough patch showing how it would look. Let me know what you think. 






[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545869#comment-16545869
 ] 

Hudson commented on HBASE-20884:


Results for branch branch-1.3
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?
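
For reference, the replacement API named in the description is used roughly as follows. This is a minimal sketch: the wrapper class and its `encode`/`decode` helpers are hypothetical, while `Base64.getEncoder()` and `Base64.getDecoder()` are the standard Java 8 entry points.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Sketch {

  // Encodes raw bytes with the basic (RFC 4648) alphabet, with padding.
  static String encode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  // Decodes a basic Base64 string; throws IllegalArgumentException on bad input.
  static byte[] decode(String encoded) {
    return Base64.getDecoder().decode(encoded);
  }

  public static void main(String[] args) {
    String encoded = encode("hbase".getBytes(StandardCharsets.UTF_8));
    String decoded = new String(decode(encoded), StandardCharsets.UTF_8);
    System.out.println(encoded + " -> " + decoded);
  }
}
```

One thing to check during such a migration is how each call site handles line breaks and URL-safe output, since java.util.Base64 exposes those as separate encoders (`getMimeEncoder()`, `getUrlEncoder()`).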





[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545870#comment-16545870
 ] 

Hudson commented on HBASE-20889:


Results for branch branch-1.3
[build #394 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used:
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > scan1
> {code}
> PE scan 1 is failing with a NullPointerException:
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-07-16 Thread Francis Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-20704:

Attachment: HBASE-20704.004.draft.patch

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, 
> HBASE-20704.003.patch, HBASE-20704.004.draft.patch
>
>
> During region close, compacted files that have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency. 
> Otherwise, a storefile containing delete tombstones can be archived while 
> older storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct when the 
> discharger chore runs, but on region close consistency is more important, 
> so we should add a special case that ignores any references on the 
> storefile and archives it anyway. 
> The attached patch contains a unit test that reproduces the problem, along 
> with the proposed fix.





[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545861#comment-16545861
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 22s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  8s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 38 unchanged - 14 fixed = 39 total (was 52) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} hbase-server: The patch generated 0 new + 316 unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  7s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 46s{color} | {color:red} hbase-procedure in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}197m 24s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  7s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}255m 33s{color} | 

[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20875:
--
Attachment: HBASE-20875.master.001.patch

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -
>
> Key: HBASE-20875
> URL: https://issues.apache.org/jira/browse/HBASE-20875
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.1
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: 
> 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 
> 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, 
> HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 
> 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
>  ./hbase/bin/hbase  --config ~/conf_hbase 
> org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40  
> --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map
> [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
>  10.42%  libjvm.so
>  [.] 
> ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*,
>  oopDesc*, unsigned long, markOopDesc*)
>   6.78%  perf-91935.map   
>  [.] 
> Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> 
> {code}
> These are top CPU consumers using perf-map-agent ./bin/perf-java-top... 





[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-16 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20875:
--
Attachment: HBASE-20875.master.002.patch

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -
>
> Key: HBASE-20875
> URL: https://issues.apache.org/jira/browse/HBASE-20875
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.1
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.2
>
> Attachments: 
> 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 
> 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, 
> HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 
> 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
>  ./hbase/bin/hbase  --config ~/conf_hbase 
> org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40  
> --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map
> [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
>  10.42%  libjvm.so
>  [.] 
> ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*,
>  oopDesc*, unsigned long, markOopDesc*)
>   6.78%  perf-91935.map   
>  [.] 
> Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> 
> {code}
> These are top CPU consumers using perf-map-agent ./bin/perf-java-top... 





[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545810#comment-16545810
 ] 

Hudson commented on HBASE-20889:


Results for branch branch-1.4
[build #387 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545809#comment-16545809
 ] 

Hudson commented on HBASE-20884:


Results for branch branch-1.4
[build #387 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?
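
For readers who have not used the JDK class, the replacement proposed above might look roughly like the following. This is only a minimal sketch: the class name Base64Sketch and its helper methods are hypothetical, and it does not show how any particular HBase call site would actually be migrated. It uses only java.util.Base64 calls that exist in Java 8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration of encoding and decoding with java.util.Base64.
public class Base64Sketch {

    // Encode raw bytes to a Base64 string using the basic (RFC 4648) alphabet.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode a Base64 string back into the original bytes.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] rowKey = "row-1".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(rowKey);
        String roundTripped = new String(decode(encoded), StandardCharsets.UTF_8);
        System.out.println(encoded + " -> " + roundTripped);
    }
}
```

One thing to check during such a migration is whether the existing callers rely on options of the bundled implementation (line breaks, URL-safe alphabet, and so on); java.util.Base64 exposes those as separate encoders (getMimeEncoder, getUrlEncoder) rather than as flags.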





[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545773#comment-16545773
 ] 

Vladimir Rodionov commented on HBASE-20894:
---

What are the criteria for "better", [~mdrob]? Protobuf is much heavier on CPU 
and memory than Java serialization. From a performance point of view, I do not 
think protobuf is faster, but I would gladly accept perf numbers. Protobuf also 
adds additional (useless) generated code to the HBase code base. What else? 
Yes, this BucketCache is a totally internal feature that is not supposed to be 
exposed publicly (I mean the serialized data).

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Commented] (HBASE-15654) Optimize client's MetaCache handling

2018-07-16 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545721#comment-16545721
 ] 

huaxiang sun commented on HBASE-15654:
--

link HBASE-20697 with this jira.

> Optimize client's MetaCache handling
> 
>
> Key: HBASE-15654
> URL: https://issues.apache.org/jira/browse/HBASE-15654
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
>
> This is an umbrella jira to track all individual issues, bugfixes and small 
> optimizations around MetaCache (region locations cache) in the client. 
> The motivation is that under load one could see spikes in the number of 
> requests going to meta, reaching tens of thousands of requests per second.
> That covers issues where we clear entries from the location cache 
> unnecessarily, as well as cases where we do more lookups than necessary when 
> entries are legitimately evicted.





[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545671#comment-16545671
 ] 

Mike Drob commented on HBASE-20894:
---

Want to check with folks to see if there is agreement that this is a reasonable 
approach to take. A few design questions:
* Should I be building a separate IO Engine implementation to do this instead 
of trying to handle it inline?
* Is it ok to put the messy PB logic in the persist/retrieve methods, or should 
that go to various classes with toPB/fromPB methods in those? I see some 
examples of both in our code.
* What is the difference for PB between writeTo and writeDelimitedTo (and the 
corresponding read methods)?
* Are my protobuf message definitions fine or do they need to be organized 
differently? I haven't spent too much thought on these.
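
On the writeTo versus writeDelimitedTo question: in the protobuf Java API, writeTo emits only the message bytes, while writeDelimitedTo first writes the message size as a varint so that several messages can be read back one at a time from the same stream. The sketch below imitates that framing with plain JDK streams so it runs without protobuf on the classpath. It is an illustration of the idea only, not the protobuf implementation, and the class and method names (DelimitedFramingSketch, writeDelimited, readDelimited, roundTrip) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of length-delimited framing (varint size, then payload).
public class DelimitedFramingSketch {

    // Write the payload length as a base-128 varint, followed by the payload itself.
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int len = payload.length;
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80); // low 7 bits, continuation bit set
            len >>>= 7;
        }
        out.write(len);
        out.write(payload);
    }

    // Read one framed payload; returns null when the stream ends cleanly between frames.
    static byte[] readDelimited(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null;
        }
        int len = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            b = in.read();
            if (b == -1) {
                throw new EOFException("truncated length prefix");
            }
            if (shift > 28) {
                throw new IOException("length prefix too long");
            }
            len |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (len < 0) {
            throw new IOException("negative length");
        }
        byte[] payload = new byte[len];
        int off = 0;
        while (off < len) {
            int n = in.read(payload, off, len - off);
            if (n == -1) {
                throw new EOFException("truncated payload");
            }
            off += n;
        }
        return payload;
    }

    // Write all messages back to back into one buffer, then read them out again.
    static List<String> roundTrip(List<String> messages) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (String m : messages) {
                writeDelimited(out, m.getBytes(StandardCharsets.UTF_8));
            }
            InputStream in = new ByteArrayInputStream(out.toByteArray());
            List<String> decoded = new ArrayList<>();
            byte[] payload;
            while ((payload = readDelimited(in)) != null) {
                decoded.add(new String(payload, StandardCharsets.UTF_8));
            }
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("entry-1", "", "entry-3")));
    }
}
```

Without the length prefix, a reader of a stream holding several messages has no way to tell where one message stops and the next begins, which is why the delimited variants matter when many entries are persisted to one file.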

Regarding my previous question, I think recording cache size and IO Engine 
class seems fine, but tracking the backing map class is probably not necessary.

Also, maybe we can simplify the logic and not worry about the old serialization 
types - it's "just" a cache hint anyway so nothing critical lost if it doesn't 
come up with the RS.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20894:
--
Attachment: HBASE-20894.WIP-2.patch

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-16 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545637#comment-16545637
 ] 

Andrew Purtell commented on HBASE-20866:


Oh, I see the commit was done. Updated this JIRA. I opened two subtasks for 
follow-up.

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.3.3
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of a migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries, varying 
> from a few tens of percent up to 200% depending on the query being executed. 
> We tried a simple native HBase scan and there too we saw up to 40% 
> degradation in performance when the number of column qualifiers is high 
> (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3 we carried out a lot of experiments with profiling and git bisect 
> iterations. However, we were not able to identify any particular source of 
> scan performance degradation, and it looked like an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified a few major enhancements, such as partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance and that together could be leading to a large overall 
> degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results received 
> from the server are cached on the client side by converting the result array 
> into an ArrayList. This function gets called in a loop, depending on the 
> number of rows in the scan result. For example, for tens of millions of rows 
> scanned, it can be called on the order of millions of times.
> In almost all cases, 99% of the time (that is, except when handling partial 
> results, etc.), we are just taking resultsFromServer, converting it into an 
> ArrayList resultsToAddToCache in addResultsToList(..), and then iterating 
> over the list again and adding it to the cache in loadCache(..), as shown in 
> the code path below.
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
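
To make the extra pass concrete, the code path quoted above can be reduced to a standalone sketch and compared with adding the array elements to the cache directly. This is a simplified illustration, not the ClientScanner code: String stands in for Result, an ArrayDeque stands in for the scanner cache, filterLoadedCell is left out, and the class and method names (ScanCacheSketch, viaIntermediateList, direct) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, simplified model of the two ways of filling the scanner cache.
public class ScanCacheSketch {

    // Path quoted above: copy the server array into a temporary list,
    // then iterate over that list a second time to fill the cache.
    static Queue<String> viaIntermediateList(String[] resultsFromServer) {
        List<String> resultsToAddToCache = new ArrayList<>();
        if (resultsFromServer != null) {
            for (int i = 0; i < resultsFromServer.length; i++) {
                resultsToAddToCache.add(resultsFromServer[i]);
            }
        }
        Queue<String> cache = new ArrayDeque<>();
        for (String rs : resultsToAddToCache) {
            cache.add(rs);
        }
        return cache;
    }

    // Alternative: one pass, adding the array elements straight to the cache.
    static Queue<String> direct(String[] values) {
        Queue<String> cache = new ArrayDeque<>();
        if (values != null) {
            for (int v = 0; v < values.length; v++) {
                cache.add(values[v]);
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        String[] values = {"row1", "row2", "row3"};
        // Both paths leave the same entries in the cache, in the same order.
        System.out.println(new ArrayList<>(viaIntermediateList(values)));
        System.out.println(new ArrayList<>(direct(values)));
    }
}
```

Both methods produce the same cache contents; the first allocates and fills one extra list per call, which adds up when the call is made millions of times during a large scan.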
>  
> It looks like we can avoid the result array to ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead directly take the values array returned 
> by the callable and add it to the cache without converting it into an 
> ArrayList.
> I have taken both flags, allowPartials and isBatchSet, out into loadCache(), 
> and I am directly adding values to the scanner cache if the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v=0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT 

[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20896:
---
Fix Version/s: 1.4.6
   1.5.0

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
>






[jira] [Updated] (HBASE-20897) Port HBASE-20866 to branch-2 and up

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20897:
---
Fix Version/s: 2.2.0
   2.0.2
   2.1.0
   3.0.0

> Port HBASE-20866 to branch-2 and up
> ---
>
> Key: HBASE-20897
> URL: https://issues.apache.org/jira/browse/HBASE-20897
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
>






[jira] [Created] (HBASE-20897) Port HBASE-20866 to branch-2 and up

2018-07-16 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-20897:
--

 Summary: Port HBASE-20866 to branch-2 and up
 Key: HBASE-20897
 URL: https://issues.apache.org/jira/browse/HBASE-20897
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Vikas Vishwakarma








[jira] [Created] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-16 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-20896:
--

 Summary: Port HBASE-20866 to branch-1 and branch-1.4 
 Key: HBASE-20896
 URL: https://issues.apache.org/jira/browse/HBASE-20896
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Vikas Vishwakarma








[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20866:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 1.4.6)
   (was: 1.2.7)
   (was: 1.5.0)
   Status: Resolved  (was: Patch Available)

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.3.3
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of a migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries, varying 
> from a few tens of percent up to 200% depending on the query being executed. 
> We tried a simple native HBase scan and there too we saw up to 40% 
> degradation in performance when the number of column qualifiers is high 
> (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3 we carried out a lot of experiments with profiling and git bisect 
> iterations. However, we were not able to identify any particular source of 
> scan performance degradation, and it looked like an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified a few major enhancements, such as partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance and that together could be leading to a large overall 
> degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results received 
> from the server are cached on the client side by converting the result array 
> into an ArrayList. This function gets called in a loop, depending on the 
> number of rows in the scan result. For example, for tens of millions of rows 
> scanned, it can be called on the order of millions of times.
> In almost all cases, 99% of the time (that is, except when handling partial 
> results, etc.), we are just taking resultsFromServer, converting it into an 
> ArrayList resultsToAddToCache in addResultsToList(..), and then iterating 
> over the list again and adding it to the cache in loadCache(..), as shown in 
> the code path below.
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead directly take the values array returned 
> by the callable and add it to the cache without converting it into an 
> ArrayList.
> I have taken both flags, allowPartials and isBatchSet, out into loadCache(), 
> and I am directly adding values to the scanner cache if the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v=0; v Result rs = 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-16 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545635#comment-16545635
 ] 

Andrew Purtell commented on HBASE-20866:


Ok. Then the same advice applies: after committing, set the fix version here to 
what was committed and open another JIRA for follow-on work (with fix versions 
set appropriately there). Thanks!

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.3.3
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of a migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries, varying 
> from a few tens of percent up to 200% depending on the query being executed. 
> We tried a simple native HBase scan and there also saw up to 40% degradation 
> in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 1.3 
> we carried out a lot of experiments with profiling and git bisect iterations; 
> however, we were not able to identify any particular source of scan 
> performance degradation, and it looked like an accumulated degradation 
> of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc., that could each have contributed a small degradation in 
> performance which, put together, could be leading to a large overall 
> degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, when tens of millions of rows are 
> scanned, it can be called on the order of millions of times.
> In almost all cases, about 99% of the time (the exceptions being 
> partial-result handling, etc.), we just take resultsFromServer, convert it 
> into an ArrayList resultsToAddToCache in addResultsToList(..), and then 
> iterate over the list again to add it to the cache in loadCache(..), as 
> shown in the code path below:
> ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
> getResultsToAddToCache(..) {
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }{code}
>  
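
For readers who want to see the two shapes side by side, here is a minimal stand-alone sketch of the code path quoted above next to the direct-add alternative the description proposes. It is an illustration only and not the HBase source: the class ScanCacheSketch and the methods loadViaList and loadDirect are hypothetical names, String stands in for Result, a plain Queue stands in for the scanner cache, and the filterLoadedCell step is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-alone sketch; String stands in for HBase's Result. */
final class ScanCacheSketch {

  /** Quoted shape: copy the array into a temporary list, then walk the list. */
  static int loadViaList(String[] values, Queue<String> cache) {
    List<String> resultsToAddToCache = new ArrayList<>();
    if (values != null) {
      for (String v : values) {
        resultsToAddToCache.add(v); // first pass: array -> list
      }
    }
    for (String rs : resultsToAddToCache) {
      cache.add(rs); // second pass: list -> cache
    }
    return resultsToAddToCache.size();
  }

  /** Proposed shape: walk the array once and add straight to the cache. */
  static int loadDirect(String[] values, Queue<String> cache) {
    if (values == null) {
      return 0;
    }
    for (String v : values) {
      cache.add(v); // single pass, no intermediate list allocated
    }
    return values.length;
  }
}
```

Both methods leave the cache with the same contents in the same order; the second skips one list allocation and one full iteration per call, which is the saving the description is after when the call happens millions of times.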
> It looks like we can avoid the array-to-ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead take the values array returned by the 
> callable and add it directly to the cache without converting it into an 
> ArrayList.
> I have moved both flags, allowPartials and isBatchSet, out into loadCache(), 
> and I add values directly to the scanner cache if the above condition 
> passes, instead of converting it into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int 

[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545614#comment-16545614
 ] 

Mike Drob commented on HBASE-20894:
---

Hmm, starting to try to do the actual read/write here and maybe we don't need 
the cache size or the class names recorded. Will leave them in for now, and 
then prune them later if we can get away with it.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20889:
---
Fix Version/s: 1.4.6
   1.5.0

> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Description: 
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.
{quote}count = channelRead(channel, data);
 1761 ---> if (count >= 0 && *data.remaining()* == 0)
 \{ process(); }{quote}
Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether concurrent access to the Connection is happening and intended. The above 
is just a theory. We should also look at other execution sequences that could 
lead to 'data' being null in this location. At a glance I didn't find one but 
the store to 'data' happens behind conditionals so it is possible. 

  was:
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.
{quote}count = channelRead(channel, data);
 1761 ---> if (count >= 0 && *data.remaining()* == 0)
Unknown macro: \{ process(); }{quote}
Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether concurrent access to the Connection is happening and intended. The above 
is just a theory. We should also look at other execution sequences that could 
lead to 'data' being null in this location. At a glance I didn't find one the 
store to 'data' happens behind conditionals so it is possible. 


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> 

[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Description: 
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.
{quote}count = channelRead(channel, data);
 1761 ---> if (count >= 0 && *data.remaining()* == 0)
Unknown macro: \{ process(); }{quote}
Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether concurrent access to the Connection is happening and intended. The above 
is just a theory. We should also look at other execution sequences that could 
lead to 'data' being null in this location. At a glance I didn't find one the 
store to 'data' happens behind conditionals so it is possible. 

  was:
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.
{quote}count = channelRead(channel, data);
 1761 ---> if (count >= 0 && *data.remaining()* == 0) { 
 process();
 }
{quote}
Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether concurrent access to the Connection is happening and intended. The above 
is just a theory. We should also look at other execution sequences that could 
lead to 'data' being null in this location. At a glance I didn't find one but 
'data' is allocated behind conditionals so it is possible. 


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> 

[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Description: 
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.
{quote}count = channelRead(channel, data);
 1761 ---> if (count >= 0 && *data.remaining()* == 0) { 
 process();
 }
{quote}
Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether concurrent access to the Connection is happening and intended. The above 
is just a theory. We should also look at other execution sequences that could 
lead to 'data' being null in this location. At a glance I didn't find one but 
'data' is allocated behind conditionals so it is possible. 

  was:
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.

{quote} 
count = channelRead(channel, data);
1761 --->   if (count >= 0 && *data.remaining()* == 0) \{ 
process();
   \}
{quote} 

Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether the concurrent access to the Connection is intended.


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> 

[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-16 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Description: 
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.

{quote} 
count = channelRead(channel, data);
1761 --->   if (count >= 0 && *data.remaining()* == 0) \{ 
process();
   \}
{quote} 

Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether the concurrent access to the Connection is intended.

  was:
{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.

{quote} 
count = channelRead(channel, data);
1761 --->   if (count >= 0 && *data.remaining()* == 0) { // count==0 if 
dataLength == 0
process();
   }
{quote} 

Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether the concurrent access to the Connection is intended.


> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we 

[jira] [Created] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-16 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-20895:
--

 Summary: NPE in RpcServer#readAndProcess
 Key: HBASE-20895
 URL: https://issues.apache.org/jira/browse/HBASE-20895
 Project: HBase
  Issue Type: Bug
  Components: rpc
Affects Versions: 1.3.2
Reporter: Andrew Purtell
Assignee: Monani Mihir
 Fix For: 1.5.0, 1.3.3, 1.4.6


{noformat}
2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
RpcServer.listener,port=60020: Caught exception while reading:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
at 
org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

This looks like it could be a use after close problem if there is concurrent 
access to a Connection.

In process() we might store a null back to the 'data' field.

Meanwhile in readAndProcess() we have a case where we might be blocked on a 
channel read and then after coming back from the read we go to use 'data' after 
a null has been written back, leading to a NPE.

{quote} 
            count = channelRead(channel, data);
1761 --->   if (count >= 0 && *data.remaining()* == 0) { // count==0 if dataLength == 0
              process();
            }
{quote} 

Whether a NPE happens or not is going to depend on the timing of the store back 
to 'data' in another thread and use of 'data' in this thread and whether or not 
the JVM has optimized away a reload of 'data' (it's not declared volatile)

We should do a null check here just to be defensive. We should also look at 
whether the concurrent access to the Connection is intended.
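
The defensive null check suggested above can be sketched as follows. This is a minimal stand-alone illustration and not the actual RpcServer.Connection code: the class ConnectionSketch and its methods clear and readyToProcess are hypothetical names, and only the 'data' field and the condition on line 1761 come from the description. The point it shows is reading the non-volatile field into a local once, so the null check and the remaining() call see the same reference.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-alone sketch; not the real RpcServer.Connection. */
final class ConnectionSketch {
  // Deliberately not volatile, mirroring the description; another
  // thread may store null here at any time.
  private ByteBuffer data;

  ConnectionSketch(ByteBuffer data) {
    this.data = data;
  }

  /** Simulates process() (or a close) storing null back into the field. */
  void clear() {
    data = null;
  }

  /**
   * Returns true when a complete payload has been read and can be processed.
   * The field is copied to a local exactly once, so a concurrent store of
   * null cannot slip in between the null check and the remaining() call.
   */
  boolean readyToProcess(int count) {
    ByteBuffer local = data;
    if (local == null) {
      return false; // state was reset underneath us; nothing to process
    }
    return count >= 0 && local.remaining() == 0;
  }
}
```

A guard like this only keeps this call site from throwing. It does not make concurrent use of the connection safe, so the second question raised above, whether the concurrent access is intended at all, still needs its own answer.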





[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545527#comment-16545527
 ] 

Hudson commented on HBASE-20889:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #435 (See 
[https://builds.apache.org/job/HBase-1.3-IT/435/])
HBASE-20889 PE scan is failing with NullPointerException (tedyu: rev 
08f9837795164e1603825a382d9bb1cab9c2cb3e)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java


> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.3.3
>
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
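
The trace above ends in `ScanTest.testTakedown`. One way a teardown can fail like this is by closing a resource that was never opened. The sketch below shows a teardown that tolerates that case. It is an assumption about the failure mode, not the committed patch, and `GuardedTest`, `scanner`, and `closed` are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a test that creates its resource lazily.
class GuardedTest {
  private Closeable scanner; // stays null if the test body never opened it
  int closed = 0;            // counts how many times the resource was closed

  void open() {
    scanner = () -> closed++;
  }

  // Closes the resource only if it exists, so a test that never opened
  // it does not throw a NullPointerException during teardown.
  void testTakedown() {
    if (scanner == null) {
      return;
    }
    try {
      scanner.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      scanner = null; // makes a second teardown call a no-op
    }
  }
}
```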





[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545516#comment-16545516
 ] 

Mike Drob commented on HBASE-20894:
---

Attaching WIP patch that includes the proposed new proto definitions. Would 
appreciate some review before I start trying to glue that into the code paths, 
since I don't have a ton of experience with protos in general.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20894:
--
Attachment: HBASE-20894.WIP.patch

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20894:
--
Description: 
We should use a better serialization format instead of Java Serialization for 
the BucketCache entry persistence.

Suggested by Chris McCown, who does not appear to have a JIRA account.

  was:We should use a better serialization format instead of Java Serialization 
for the BucketCache entry persistence.


> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.





[jira] [Created] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-16 Thread Mike Drob (JIRA)
Mike Drob created HBASE-20894:
-

 Summary: Move BucketCache from java serialization to protobuf
 Key: HBASE-20894
 URL: https://issues.apache.org/jira/browse/HBASE-20894
 Project: HBase
  Issue Type: Task
  Components: BucketCache
Affects Versions: 2.0.0
Reporter: Mike Drob
 Fix For: 3.0.0


We should use a better serialization format instead of Java Serialization for 
the BucketCache entry persistence.





[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20889:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.3.3
   Status: Resolved  (was: Patch Available)

Thanks for the review, Vikas

> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.3.3
>
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > scan1
> {code}
> PE scan 1 is failing with a NullPointerException
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException

2018-07-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20889:
---
Summary: PE scan is failing with NullPointerException  (was: PE scan is 
failing with NullPointer)

> PE scan is failing with NullPointerException
> 
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > scan1
> {code}
> PE scan 1 is failing with a NullPointerException
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec

2018-07-16 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545463#comment-16545463
 ] 

Andrew Purtell commented on HBASE-20883:


bq. , it doesn't seem like it hurts to expose that same information in JMX.

It doesn't, and I was talking about the UI, so let's be clear about that. 

> HMaster Read / Write Requests Per Sec across RegionServers, currently only 
> Total Requests Per Sec 
> --
>
> Key: HBASE-20883
> URL: https://issues.apache.org/jira/browse/HBASE-20883
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, master, metrics, monitoring, UI, Usability
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> HMaster currently shows Requests Per Second per RegionServer under HMaster 
> UI's /master-status page -> Region Servers -> Base Stats section in the Web 
> UI.
> Please add Reads Per Second and Writes Per Second per RegionServer alongside 
> this in the HMaster UI, and also expose the Read/Write/Total requests per sec 
> information in the HMaster JMX API.
> This will make it easier to find read or write hotspotting on HBase as a 
> combined total will minimize and mask differences between RegionServers. For 
> example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, 
> so write skew will be masked as it won't show enough significant difference 
> in the much larger combined Total Requests Per Second stat.
> For now I've written a Python tool to calculate this info from RegionServers 
> JMX read/write/total request counts but since HMaster is collecting this info 
> anyway it shouldn't be a big change to improve it to also show Reads / Writes 
> Per Sec as well as Total.
> Find my tools for more granular Read/Write Requests Per Sec Per Regionserver 
> and also Per Region at my [PyTools github 
> repo|https://github.com/harisekhon/pytools] along with a selection of other 
> HBase tools I've used for performance debugging over the years.
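
The masking effect described above can be made concrete with a small calculation. The sketch below is not the reporter's Python tool and does not use real JMX attribute names. `RequestRates` and its methods are hypothetical, and the second server's 1,800 writes/sec is an assumed figure chosen for illustration.

```java
// Hypothetical helper; names are illustrative, not JMX attribute names.
class RequestRates {
  // Per-second rate from two samples of a monotonically increasing counter.
  static double perSecond(long before, long after, double elapsedSeconds) {
    if (elapsedSeconds <= 0) {
      throw new IllegalArgumentException("elapsedSeconds must be positive");
    }
    return (after - before) / elapsedSeconds;
  }

  // Ratio of the larger rate to the smaller one; 1.0 means no skew.
  static double skew(double rateA, double rateB) {
    double hi = Math.max(rateA, rateB);
    double lo = Math.min(rateA, rateB);
    if (lo <= 0) {
      throw new IllegalArgumentException("rates must be positive");
    }
    return hi / lo;
  }
}
```

With 30,000 reads/sec on both servers and 900 versus 1,800 writes/sec, `skew(900, 1800)` is 2.0, while `skew(30900, 31800)` on the combined totals is under 1.03, which is why a total-only view hides the write hotspot.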





[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545465#comment-16545465
 ] 

Hadoop QA commented on HBASE-18201:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
22s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
11s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
40s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
22s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}202m 42s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}280m 53s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.snapshot.TestMobSecureExportSnapshot |
|   | hadoop.hbase.snapshot.TestExportSnapshot |
|   | hadoop.hbase.snapshot.TestMobExportSnapshot 

[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545441#comment-16545441
 ] 

Hadoop QA commented on HBASE-20893:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 1s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
12s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
48s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
45s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
35s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
25s{color} | {color:red} hbase-server generated 1 new + 1 unchanged - 0 fixed = 
2 total (was 1) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}123m 
38s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}186m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20893 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931791/HBASE-20893.branch-2.0.001.patch
 |
| Optional Tests |  asflicense  cc  unit  hbaseprotoc  javac  javadoc  findbugs 
 shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 98df0921cd94 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 

[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545367#comment-16545367
 ] 

Hudson commented on HBASE-20884:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #433 (See 
[https://builds.apache.org/job/HBase-1.3-IT/433/])
HBASE-20884 Reclassify Base64 as IA.Private (mdrob: rev 
830d105eade8b8549418a0bcd6a8915bdcef82f4)
* (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Base64.java


> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?
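
For reference, a minimal round trip with the JDK class named above looks like this. The wrapper class `Base64RoundTrip` is hypothetical, and the HBase call sites that would be migrated are not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical wrapper showing the java.util.Base64 calls a migration would use.
class Base64RoundTrip {
  // Encodes UTF-8 text with the basic (RFC 4648) alphabet, with padding.
  static String encode(String text) {
    return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
  }

  // Decodes basic-alphabet Base64 back to UTF-8 text.
  static String decode(String encoded) {
    return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
  }
}
```

`java.util.Base64` also provides `getUrlEncoder()` and `getMimeEncoder()`, so any call site that relied on URL-safe output or line breaks would need the matching variant.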





[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545348#comment-16545348
 ] 

Hudson commented on HBASE-20884:


SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1133 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1133/])
HBASE-20884 Reclassify Base64 as IA.Private (mdrob: rev 
fe7306ebc5bb5b8e0103c2db27961da63b6db8a1)
* (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Base64.java


> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?





[jira] [Commented] (HBASE-20867) RS may get killed while master restarts

2018-07-16 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545330#comment-16545330
 ] 

Allan Yang commented on HBASE-20867:


{code}
Will we be stuck there forever when the master shuts down? The reason we close 
the connection when shutting down the master is that we want operations against 
the connection to fail quickly and give up immediately.
{code}
It won't get stuck. Retrying happens in a thread pool in 
RemoteProcedureDispatcher, which we shut down when stopping.

> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch
>
>
> If the master is dispatching an RPC call to an RS while aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
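
The proposal above amounts to classifying connection-level failures as retryable instead of treating them as proof that the remote server is dead. A rough sketch of that idea follows. It is not the RSProcedureDispatcher code or the attached patch: `RetryPolicy`, `RemoteCall`, and the string match on "Connection closed" are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical classifier and retry loop; not HBase code.
class RetryPolicy {
  interface RemoteCall {
    void run() throws IOException;
  }

  // Connection-level failures may be caused by the caller shutting down,
  // so they are treated as retryable rather than as a dead remote server.
  static boolean isRetryable(IOException e) {
    if (e instanceof ConnectException) {
      return true;
    }
    String message = e.getMessage();
    return message != null && message.contains("Connection closed");
  }

  // Returns true if the call succeeded within maxAttempts; gives up at once
  // on a non-retryable failure.
  static boolean callWithRetry(RemoteCall call, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        call.run();
        return true;
      } catch (IOException e) {
        if (!isRetryable(e)) {
          return false;
        }
      }
    }
    return false;
  }
}
```

A real dispatcher would also need a backoff between attempts and a way to stop retrying when the process shuts down, which the comment above says is handled by shutting down the thread pool.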





[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545317#comment-16545317
 ] 

Hadoop QA commented on HBASE-20853:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 7s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
16s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 45m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20853 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12931798/HBASE-20853.master.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d0b4ca9a9bc3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 2997b6d071 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13640/testReport/ |
| Max. process+thread count | 270 (vs. ulimit of 1) |
| modules | C: hbase-client U: hbase-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13640/console |
| Powered by | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545309#comment-16545309
 ] 

stack commented on HBASE-20846:
---

bq. Now the biggest problem is that, the original code does not allow storing 
ROLLEDBACK procedure into the procedure store.

You can't store ROLLBACK steps as we do forward steps; the framework does not 
currently support this. Shout if you want more detail.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one while investigating a ModifyTableProcedure that got stuck while 
> a MoveRegionProcedure was in progress after a master restart.
> That issue can be solved by HBASE-20752, but I discovered something else.
> Before a MoveRegionProcedure can execute, it takes the table's shared lock. 
> So when an UnassignProcedure is spawned, it does not check the table's shared 
> lock, since it assumes that its parent (MoveRegionProcedure) has already 
> acquired the table's lock.
> {code:java}
>   // If there is parent procedure, it would have already taken xlock, so no
>   // need to take shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But this is not the case when the master has been restarted. The child 
> procedure (UnassignProcedure) is executed first after the restart. Though it 
> has a parent (MoveRegionProcedure), apparently the parent does not hold the 
> table's lock.
> So the child begins to execute without holding the table's shared lock, and a 
> ModifyTableProcedure can acquire the table's exclusive lock and execute at the 
> same time, which is not possible if the master was not restarted.
> Before HBASE-20752 this caused procedures to get stuck. Now that HBASE-20752 
> is fixed, I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.
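
A minimal sketch of the scenario described above, assuming a simple counter-based shared/exclusive lock. `TableLockSketch` and its methods are hypothetical names for illustration only; this is not HBase's procedure scheduler and not the attached patch. It only shows why a child that skips the shared lock leaves the exclusive lock free after a restart.

```java
// Hypothetical stand-in for a table lock. This is NOT HBase's scheduler code;
// it only models the shared/exclusive rule discussed in the description above.
class TableLockSketch {
  private int sharedHolders = 0;
  private boolean exclusiveHeld = false;

  // Shared lock: granted unless someone holds the exclusive lock.
  synchronized boolean tryShared() {
    if (exclusiveHeld) {
      return false;
    }
    sharedHolders++;
    return true;
  }

  synchronized void releaseShared() {
    if (sharedHolders > 0) {
      sharedHolders--;
    }
  }

  // Exclusive lock: granted only when nobody holds the lock in either mode.
  synchronized boolean tryExclusive() {
    if (exclusiveHeld || sharedHolders > 0) {
      return false;
    }
    exclusiveHeld = true;
    return true;
  }

  public static void main(String[] args) {
    // After a restart the parent's lock is not restored. If the child skips
    // the shared lock because it has a parent, nothing blocks a modify.
    TableLockSketch childSkipsLock = new TableLockSketch();
    System.out.println("child skips lock, exclusive granted: "
        + childSkipsLock.tryExclusive());

    // If the child always takes the shared lock, the modify has to wait.
    TableLockSketch childTakesLock = new TableLockSketch();
    childTakesLock.tryShared();
    System.out.println("child takes lock, exclusive granted: "
        + childTakesLock.tryExclusive());
  }
}
```

Taking the shared lock unconditionally costs little in this model, because any number of holders can share it; whether that holds for the real scheduler is for the patch review to confirm.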





[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545274#comment-16545274
 ] 

Mike Drob commented on HBASE-20870:
---

+1

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch, 
> HBASE-20870.branch-2.0.002.patch
>
>
> IntegrationTestBigLinkedList's Search tool always fails because it tries to 
> read WALs from the wrong HBase root dir. It turns out that when 
> IntegrationTestingUtility is initialized in IntegrationTestBigLinkedList, its 
> superclass HBaseTestingUtility changes hbase.rootdir to a random local dir. 
> That is not wrong in itself, since HBaseTestingUtility is mostly used by the 
> minicluster, but for IntegrationTest runs on distributed clusters we should 
> change it back.
> Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting hbase.rootdir to /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running command-line tool java.io.FileNotFoundException: File file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs does not exist
> at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}





[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545267#comment-16545267
 ] 

Hadoop QA commented on HBASE-20870:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 50s | branch-2.0 passed |
| +1 | compile | 2m 16s | branch-2.0 passed |
| +1 | checkstyle | 1m 37s | branch-2.0 passed |
| +1 | shadedjars | 4m 16s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 9s | branch-2.0 passed |
| +1 | javadoc | 0m 41s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 40s | the patch passed |
| +1 | compile | 2m 11s | the patch passed |
| +1 | javac | 2m 11s | the patch passed |
| +1 | checkstyle | 1m 36s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 13s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 58s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 12s | the patch passed |
| +1 | javadoc | 0m 39s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 116m 3s | hbase-server in the patch failed. |
| +1 | unit | 0m 57s | hbase-it in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 160m 20s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore |
|   | hadoop.hbase.regionserver.TestCompactingMemStore |
|   | hadoop.hbase.regionserver.TestDefaultMemStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20870 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931773/HBASE-20870.branch-2.0.002.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 08493ac87cb7 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-16 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-20884:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to branch-1 family, thanks for reviews Andrew and Ted.

> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?
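
For readers doing the replacement step, a minimal sketch of a round trip through the JDK's `java.util.Base64`. The class name `Base64RoundTrip` and the sample input are illustrative assumptions; the calls shown are the standard encoder and decoder that ship with Java 8 and later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative helper only; the class and method names are not from the patch.
class Base64RoundTrip {
  // Encode raw bytes with the basic (RFC 4648) alphabet, padding included.
  static String encode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  // Decode a basic Base64 string back into bytes.
  static byte[] decode(String text) {
    return Base64.getDecoder().decode(text);
  }

  public static void main(String[] args) {
    byte[] raw = "hbase".getBytes(StandardCharsets.UTF_8);
    String encoded = encode(raw);
    System.out.println(encoded);
    System.out.println(new String(decode(encoded), StandardCharsets.UTF_8));
  }
}
```

`Base64.getUrlEncoder()` and `Base64.getMimeEncoder()` cover the URL-safe and line-wrapped variants; which of them, if any, a given call site needs depends on what the old implementation was configured to emit there.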





[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20846:
--
Attachment: HBASE-20846-v2.patch



[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545260#comment-16545260
 ] 

Hadoop QA commented on HBASE-20878:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 419m 52s | Docker failed to build yetus/hbase:6f01af0. |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-20878 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931746/HBASE-20878.branch-2.0.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13633/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Data loss if merging regions while ServerCrashProcedure executing
> -
>
> Key: HBASE-20878
> URL: https://issues.apache.org/jira/browse/HBASE-20878
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20878.branch-2.0.001.patch, 
> HBASE-20878.branch-2.0.002.patch, HBASE-20878.branch-2.0.003.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using 
> UnassignProcedure. But if the RS hosting these regions crashes, a 
> ServerCrashProcedure will execute at the same time. The UnassignProcedures 
> will be blocked until all logs are split. But since these regions are closed 
> for merging, they won't be opened again, so the recovered.edits in the region 
> dir won't be replayed and data will be lost.
> I provided a test to reproduce this case. I strongly suspect the split region 
> procedure has the same kind of problem. I will check later.





[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-16 Thread Balazs Meszaros (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545238#comment-16545238
 ] 

Balazs Meszaros commented on HBASE-20853:
-

I added a default implementation for getTableDescriptor. I did not change 
getName because, unlike getDescriptor, it does not throw an exception.
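
A minimal sketch of the Java 8 default-method pattern this comment refers to. `TableLike`, its two methods, and the choice of `UnsupportedOperationException` are assumptions for illustration; this is not the actual HBase `Table` interface or the attached patch.

```java
class DefaultMethodSketch {
  // Hypothetical interface; it only mirrors the shape of the discussion.
  interface TableLike {
    // Abstract: every implementor must supply it.
    String getName();

    // Default: implementors may skip it and inherit this body.
    default String getDescriptor() {
      throw new UnsupportedOperationException("Add an implementation!");
    }
  }

  public static void main(String[] args) {
    // Only getName is implemented; getDescriptor falls back to the default.
    TableLike table = () -> "t1";
    System.out.println(table.getName());
    try {
      table.getDescriptor();
    } catch (UnsupportedOperationException e) {
      System.out.println("default threw: " + e.getMessage());
    }
  }
}
```

The point of the pattern is that adding a method with a default body does not break existing implementors at compile time; they only see the exception if they call the method without overriding it.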

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch, HBASE-20853.master.003.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.





[jira] [Updated] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-16 Thread Balazs Meszaros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balazs Meszaros updated HBASE-20853:

Attachment: HBASE-20853.master.003.patch



[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec

2018-07-16 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545231#comment-16545231
 ] 

Hari Sekhon commented on HBASE-20883:
-

[~andrewcheng] thanks for mentioning the other ticket, but it's not exactly the 
same issue.

That one asks for more accurate counting to account for multi requests.

I'm just asking that Read and Write Requests Per Sec be shown in the UI next to 
each RegionServer, which already shows Total Requests Per Sec, so that read or 
write skew is easier to detect.

> HMaster Read / Write Requests Per Sec across RegionServers, currently only 
> Total Requests Per Sec 
> --
>
> Key: HBASE-20883
> URL: https://issues.apache.org/jira/browse/HBASE-20883
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin, master, metrics, monitoring, UI, Usability
>Affects Versions: 1.1.2
>Reporter: Hari Sekhon
>Priority: Major
>
> HMaster currently shows Requests Per Second per RegionServer under HMaster 
> UI's /master-status page -> Region Servers -> Base Stats section in the Web 
> UI.
> Please add Reads Per Second and Writes Per Second per RegionServer alongside 
> this in the HMaster UI, and also expose the Read/Write/Total requests per sec 
> information in the HMaster JMX API.
> This will make it easier to find read or write hotspotting on HBase as a 
> combined total will minimize and mask differences between RegionServers. For 
> example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, 
> so write skew will be masked as it won't show enough significant difference 
> in the much larger combined Total Requests Per Second stat.
> For now I've written a Python tool to calculate this info from RegionServers 
> JMX read/write/total request counts but since HMaster is collecting this info 
> anyway it shouldn't be a big change to improve it to also show Reads / Writes 
> Per Sec as well as Total.
> Find my tools for more granular Read/Write Requests Per Sec Per Regionserver 
> and also Per Region at my [PyTools github 
> repo|https://github.com/harisekhon/pytools] along with a selection of other 
> HBase tools I've used for performance debugging over the years.
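
A minimal sketch of the differential calculation the description alludes to: a per-second rate derived from two samples of a monotonically increasing request counter. `RequestRateSketch` and `perSecond` are hypothetical names, and the sample values are chosen only to match the 30,000 reads/sec and 900 writes/sec figures quoted above; nothing here is HMaster or JMX code.

```java
// Hypothetical helper; it does not read JMX and is not part of HBase.
class RequestRateSketch {
  // Rate per second between two samples of an ever-increasing counter.
  static double perSecond(long previousCount, long currentCount, long elapsedMillis) {
    if (elapsedMillis <= 0) {
      throw new IllegalArgumentException("elapsedMillis must be positive");
    }
    return (currentCount - previousCount) * 1000.0 / elapsedMillis;
  }

  public static void main(String[] args) {
    long elapsedMillis = 10_000L;  // two samples taken 10 seconds apart

    // Assumed counter samples, picked to reproduce the figures in the text.
    double reads = perSecond(0L, 300_000L, elapsedMillis);
    double writes = perSecond(0L, 9_000L, elapsedMillis);

    System.out.println("reads/sec  = " + reads);
    System.out.println("writes/sec = " + writes);
    System.out.println("total/sec  = " + (reads + writes));
  }
}
```

With these numbers a RegionServer taking double the writes moves the combined total by only about 3%, which is the masking effect the description is concerned about.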





[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec

2018-07-16 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545229#comment-16545229
 ] 

Hari Sekhon commented on HBASE-20883:
-

{quote}This won't scale
{quote}
HMaster UI already shows Total Requests Per Sec next to each RegionServer, 
which I think is already calculated from readRequestCount + writeRequestCount 
or totalRequestCount differentials. It's just two more columns to expose that 
information in the existing table.

I already have OpenTSDB, but it's handy for some tools and scripts to be able to 
get this information from HBase directly. You may not want to set up OpenTSDB 
on HBase just to debug somebody's HBase installation, and since it appears that 
HMaster is already collecting and averaging the information, it doesn't seem to 
hurt to expose that same information in JMX.



[jira] [Updated] (HBASE-20867) RS may get killed while master restarts

2018-07-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20867:
---
Summary: RS may get killed while master restarts  (was: RS may got killed 
while master restarts)

> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch
>
>
> If the master is dispatching an RPC call to an RS while aborting, a connection 
> exception may be thrown by the RPC layer (an IOException with a "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard it as an 
> un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
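
A minimal sketch of the idea proposed above: classify a "Connection closed" IOException as retryable and retry a bounded number of times. Every name here (`RetryOnConnectionClosedSketch`, `RemoteCall`, `isRetryable`, `callWithRetries`) is hypothetical, and matching on the message text is an assumption made for illustration; this is not RSProcedureDispatcher code and not the attached patch.

```java
import java.io.IOException;

// Hypothetical sketch; not taken from RSProcedureDispatcher.
class RetryOnConnectionClosedSketch {
  // Stand-in for a remote call that may fail with an IOException.
  interface RemoteCall {
    void run() throws IOException;
  }

  // Assumption: a "Connection closed" message marks a local connection
  // problem rather than a dead RegionServer, so the call may be retried.
  static boolean isRetryable(IOException e) {
    String message = e.getMessage();
    return message != null && message.contains("Connection closed");
  }

  // Runs the call, retrying retryable failures up to maxAttempts in total.
  // Returns the number of attempts that were needed.
  static int callWithRetries(RemoteCall call, int maxAttempts) throws IOException {
    for (int attempt = 1; ; attempt++) {
      try {
        call.run();
        return attempt;
      } catch (IOException e) {
        if (!isRetryable(e) || attempt >= maxAttempts) {
          throw e;  // give up: not retryable, or out of attempts
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    // Fails twice with "Connection closed", then succeeds.
    int attempts = callWithRetries(() -> {
      if (calls[0]++ < 2) {
        throw new IOException("Connection closed");
      }
    }, 5);
    System.out.println("succeeded after attempts: " + attempts);
  }
}
```

The bound on attempts is a design choice in this sketch: it keeps a retry loop from spinning indefinitely while the master is shutting down.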





[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545192#comment-16545192
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 27s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 37s | master passed |
| +1 | compile | 2m 51s | master passed |
| +1 | checkstyle | 1m 21s | master passed |
| +1 | shadedjars | 4m 9s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 4m 36s | master passed |
| +1 | javadoc | 0m 47s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 30s | the patch passed |
| +1 | compile | 2m 52s | the patch passed |
| +1 | cc | 2m 52s | the patch passed |
| +1 | javac | 2m 52s | the patch passed |
| +1 | checkstyle | 0m 8s | The patch hbase-protocol-shaded passed checkstyle |
| -1 | checkstyle | 0m 12s | hbase-procedure: The patch generated 1 new + 38 unchanged - 14 fixed = 39 total (was 52) |
| +1 | checkstyle | 0m 59s | hbase-server: The patch generated 0 new + 316 unchanged - 7 fixed = 316 total (was 323) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 5s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 27s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | hbaseprotoc | 1m 14s | the patch passed |
| +1 | findbugs | 5m 3s | the patch passed |
| +1 | javadoc | 0m 49s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 29s | hbase-protocol-shaded in the patch passed. |
| -1 | unit | 2m 42s | hbase-procedure in the patch failed. |
| -1 | unit | 241m 18s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 294m 57s | 

[jira] [Commented] (HBASE-20867) RS may got killed while master restarts

2018-07-16 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545187#comment-16545187
 ] 

Duo Zhang commented on HBASE-20867:
---

Will we be stuck there forever when the master shuts down? The reason we close 
the connection when shutting down the master is that we want operations against 
the connection to fail quickly and give up immediately.

The patch LGTM. Above is the only concern for me.

Thanks.



[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20893:
---
Attachment: HBASE-20893.branch-2.0.001.patch

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20893.branch-2.0.001.patch
>
>
> Similar case as HBASE-20878.





[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20893:
---
Status: Patch Available  (was: Open)

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.1, 3.0.0, 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
>
> Similar case as HBASE-20878.





[jira] [Created] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-16 Thread Allan Yang (JIRA)
Allan Yang created HBASE-20893:
--

 Summary: Data loss if splitting region while ServerCrashProcedure 
executing
 Key: HBASE-20893
 URL: https://issues.apache.org/jira/browse/HBASE-20893
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.1, 3.0.0, 2.1.0
Reporter: Allan Yang
Assignee: Allan Yang


Similar case as HBASE-20878.





[jira] [Commented] (HBASE-20867) RS may got killed while master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545122#comment-16545122
 ] 

Hadoop QA commented on HBASE-20867:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 22s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 43s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s{color} | {color:red} hbase-client: The patch generated 1 new + 13 unchanged - 2 fixed = 14 total (was 15) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 44s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 24s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 57s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}183m 26s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}231m 5s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20867 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931749/HBASE-20867.branch-2.0.003.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c4f9dd9c6829 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.0 / 5594f0b9fd |
| maven | version: Apache Maven 3.5.4 

[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-07-16 Thread Kuan-Po Tseng (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545111#comment-16545111
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---

It seems the failing test isn't related to this patch; resubmitting the patch.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it user-friendly if any use case exists. Otherwise, we should 
> just get rid of it, because DataBlockEncodingTool presumes that the 
> implementation of the cell returned from DataBlockEncoder is KeyValue. That 
> presumption may obstruct the cleanup of KeyValue references in the code base 
> of the read/write path.





[jira] [Updated] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-07-16 Thread Kuan-Po Tseng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuan-Po Tseng updated HBASE-18201:
--
Attachment: HBASE-18201.master.005.patch

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, 
> HBASE-18201.master.005.patch, HBASE-18201.master.005.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it user-friendly if any use case exists. Otherwise, we should 
> just get rid of it, because DataBlockEncodingTool presumes that the 
> implementation of the cell returned from DataBlockEncoder is KeyValue. That 
> presumption may obstruct the cleanup of KeyValue references in the code base 
> of the read/write path.





[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20870:
---
Attachment: (was: HBASE-20870.branch-2.0.002.patch)

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch, 
> HBASE-20870.branch-2.0.002.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails because 
> it tries to read WALs from the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> super class HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong in itself, since HBaseTestingUtility is mostly used by 
> the minicluster. But when the integration test runs on a distributed cluster, 
> we should change it back.
>  Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}
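
The "change it back" step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached patch: java.util.Properties stands in for the Hadoop Configuration object, and RootDirRestoreSketch, initTestingUtility, and initForCluster are hypothetical names. Only the hbase.rootdir key and the local test-data path prefix come from the report.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical stand-in for the setup described in the report; not HBase code.
final class RootDirRestoreSketch {
    static final String ROOT_DIR_KEY = "hbase.rootdir";

    /** Mimics the testing utility, which points hbase.rootdir at a random local dir. */
    static void initTestingUtility(Properties conf) {
        conf.setProperty(ROOT_DIR_KEY, "/home/hadoop/target/test-data/" + UUID.randomUUID());
    }

    /** Runs the setup, then restores the original root dir when targeting a distributed cluster. */
    static void initForCluster(Properties conf, boolean distributed) {
        String original = conf.getProperty(ROOT_DIR_KEY); // remember the cluster's value
        initTestingUtility(conf);                         // this overwrites hbase.rootdir
        if (distributed && original != null) {
            conf.setProperty(ROOT_DIR_KEY, original);     // change it back
        }
    }
}
```

With distributed set to false the random local dir is kept, which matches the minicluster use that the description calls "not wrong".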





[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20870:
---
Attachment: HBASE-20870.branch-2.0.002.patch

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch, 
> HBASE-20870.branch-2.0.002.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails because 
> it tries to read WALs from the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> super class HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong in itself, since HBaseTestingUtility is mostly used by 
> the minicluster. But when the integration test runs on a distributed cluster, 
> we should change it back.
>  Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}





[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545068#comment-16545068
 ] 

Hadoop QA commented on HBASE-18201:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 17s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 44s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 6m 4s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 16s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 52s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}139m 45s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 6s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}222m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce 

[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545046#comment-16545046
 ] 

Hadoop QA commented on HBASE-20870:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 33s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 6s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 5s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 16s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 20s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 52s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestDefaultMemStore |
|   | hadoop.hbase.regionserver.TestCompactingMemStore |
|   | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20870 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931748/HBASE-20870.branch-2.0.002.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7a5153609589 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-16 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544979#comment-16544979
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

The code from branch-1.4 onwards is similar to the master branch, and 
implementing the above change in those branches will require considerable 
changes. Once that is done, it should be easy to apply the same change from 
branch-1.4 to the master branch. I will work on it and post an update. 
[~apurtell], for now I was able to commit the patch only to the 1.3 branch. 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of a migration from 0.98 to 1.3, we 
> observed a degradation in scan performance for Phoenix queries, ranging 
> from a few tens of percent up to 200% depending on the query being executed. 
> We tried a simple native HBase scan and there, too, we saw up to 40% 
> degradation in performance when the number of column qualifiers is high 
> (40-50+).
> To identify the root cause of the performance difference between 0.98 and 
> 1.3, we carried out a lot of experiments with profiling and git bisect 
> iterations. However, we were not able to identify any particular source of 
> the scan performance degradation; it looked like an accumulated degradation 
> of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, such as partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, and RPC 
> refactoring, that could each have contributed a small degradation in 
> performance and that, put together, could be leading to a large overall 
> degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which 
> implements partialResult handling. In ClientScanner.java the results received 
> from the server are cached on the client side by converting the result array 
> into an ArrayList. This function gets called in a loop, depending on the 
> number of rows in the scan result. For example, for tens of millions of rows 
> scanned, it can be called on the order of millions of times.
> In almost all cases, 99% of the time (except when handling partial results, 
> etc.), we are just taking resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the list 
> again and adding it to the cache in loadCache(..), as in the code path below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
> getResultsToAddToCache(..) {
>   ...
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ...
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }
> {code}
>  
> It looks like we can avoid the result array to ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead directly take the values array returned 
> by the callable and add it to the cache without converting it into an 
> ArrayList.
> I have moved both flags, allowPartials and isBatchSet, out into loadCache(), 
> and I am adding values directly to the scanner cache if the above condition 
> holds, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } 
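
(The quoted example above is cut off in this archive.) As a separate, self-contained illustration of the idea in the description, the sketch below contrasts copying a result array into a temporary list before caching with adding the array elements to the cache directly. It is not the HBASE-20866 patch: CacheLoadSketch, loadViaList, and loadDirect are hypothetical names, and String stands in for the HBase Result type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration; String stands in for org.apache.hadoop.hbase.client.Result.
final class CacheLoadSketch {

    /** Original shape: copy the array into a temporary list, then iterate the list. */
    static int loadViaList(String[] values, Queue<String> cache) {
        List<String> tmp = new ArrayList<>(values == null ? 0 : values.length);
        if (values != null) {
            for (String v : values) {
                tmp.add(v); // first pass: array -> list
            }
        }
        for (String v : tmp) {
            cache.add(v);   // second pass: list -> cache
        }
        return tmp.size();
    }

    /** Proposed shape: iterate the array once and add straight to the cache. */
    static int loadDirect(String[] values, Queue<String> cache) {
        if (values == null) {
            return 0;
        }
        for (String v : values) {
            cache.add(v);   // single pass, no intermediate list
        }
        return values.length;
    }
}
```

Both methods leave the cache with the same contents in the same order; the direct version saves one allocation and one pass per call, which is where a saving could come from when the method is called millions of times.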

[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20846:
--
Attachment: (was: HBASE-20846-v1.patch)

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> there was a MoveRegionProcedure going on after a master restart.
> This issue can be solved by HBASE-20752, but I discovered something else.
> Before a MoveRegionProcedure can execute, it takes the table's shared 
> lock. So, when an UnassignProcedure is spawned, it does not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) 
> has acquired the table's lock.
> {code:java}
>   // If there is parent procedure, it would have already taken xlock, so no
>   // need to take shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>       && waitTableQueueSharedLock(procedure, table) == null) {
>     return true;
>   }
> {code}
> But that is not the case when the master has been restarted. The child 
> procedure (UnassignProcedure) is executed first after the restart. Although 
> it has a parent (MoveRegionProcedure), the parent apparently does not hold 
> the table's lock.
> So, since it begins to execute without holding the table's shared lock, a 
> ModifyTableProcedure can acquire the table's exclusive lock and execute at 
> the same time, which is not possible if the master was not restarted.
> This caused a hang before HBASE-20752. Since HBASE-20752 has been fixed, 
> I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is 
> a shared lock, right? I think we can acquire it every time we need it.
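
The closing suggestion, that a child should take the shared lock itself instead of relying on its parent, can be illustrated with a plain read/write lock. This is only an analogy under stated assumptions: java.util.concurrent's ReentrantReadWriteLock is thread-owned and is not the HBase procedure scheduler's lock, and SharedLockSketch with its methods is a hypothetical name.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only: read lock = table shared lock, write lock = table exclusive lock.
final class SharedLockSketch {
    private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();

    /** The child procedure always takes the shared lock, whether or not it has a parent. */
    void childStart() {
        tableLock.readLock().lock();
    }

    /** The child releases the shared lock when it finishes. */
    void childFinish() {
        tableLock.readLock().unlock();
    }

    /** A table modification needs the exclusive lock; returns false if it is unavailable right now. */
    boolean tryModifyTable() {
        boolean acquired = tableLock.writeLock().tryLock();
        if (acquired) {
            tableLock.writeLock().unlock();
        }
        return acquired;
    }
}
```

While the child holds the shared lock, tryModifyTable() returns false; after childFinish() it returns true. Because a shared lock can be held by many holders at once, taking it again in the child costs little even when the parent already holds it.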





[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20846:
--
Attachment: HBASE-20846-v1.patch

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating a ModifyTableProcedure that got stuck while 
> there was a MoveRegionProcedure going on after a master restart.
> This issue can be solved by HBASE-20752, but I discovered something else.
> Before a MoveRegionProcedure can execute, it takes the table's shared 
> lock. So, when an UnassignProcedure is spawned, it does not check the 
> table's shared lock, since it is sure that its parent (MoveRegionProcedure) 
> has acquired the table's lock.
> {code:java}
>   // If there is parent procedure, it would have already taken xlock, so no
>   // need to take shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>       && waitTableQueueSharedLock(procedure, table) == null) {
>     return true;
>   }
> {code}
> But that is not the case when the master has been restarted. The child 
> procedure (UnassignProcedure) is executed first after the restart. Although 
> it has a parent (MoveRegionProcedure), the parent apparently does not hold 
> the table's lock.
> So, since it begins to execute without holding the table's shared lock, a 
> ModifyTableProcedure can acquire the table's exclusive lock and execute at 
> the same time, which is not possible if the master was not restarted.
> This caused a hang before HBASE-20752. Since HBASE-20752 has been fixed, 
> I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is 
> a shared lock, right? I think we can acquire it every time we need it.





[jira] [Updated] (HBASE-20867) RS may got killed while master restarts

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20867:
---
Issue Type: Sub-task  (was: Bug)
Parent: HBASE-20828

> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch
>
>
> If the master is dispatching an RPC call to an RS while it is aborting, a 
> connection exception may be thrown by the RPC layer (an IOException with a 
> "Connection closed" message in this case). The RSProcedureDispatcher will 
> regard it as an un-retryable exception and pass it to 
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.
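
A minimal sketch of the retry idea proposed above, assuming a connection failure can be recognized either by exception type or by the "Connection closed" message mentioned in the report. `ConnectionRetrySketch`, `isConnectionException`, `RemoteCall`, and `callWithRetry` are hypothetical names for illustration only; the attached patches may classify exceptions differently.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical illustration; not the RSProcedureDispatcher implementation. */
public class ConnectionRetrySketch {

  /** Returns true if the throwable or any of its causes looks like a connection failure. */
  public static boolean isConnectionException(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof ConnectException) {
        return true;
      }
      String msg = cur.getMessage();
      if (cur instanceof IOException && msg != null && msg.contains("Connection closed")) {
        return true;
      }
    }
    return false;
  }

  /** A remote call that may fail with an IOException. */
  public interface RemoteCall {
    void run() throws IOException;
  }

  /**
   * Retries the call on connection failures, up to maxAttempts times.
   * Other IOExceptions are rethrown at once. Returns false if every attempt
   * failed with a connection exception.
   */
  public static boolean callWithRetry(RemoteCall call, int maxAttempts) throws IOException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        call.run();
        return true;
      } catch (IOException e) {
        if (!isConnectionException(e)) {
          throw e; // not a connection problem: leave it to the normal failure path
        }
      }
    }
    return false;
  }
}
```

The point of the sketch is the ordering of decisions: the connection check happens before the failure is reported onward, so a healthy RS is not expired just because the caller's side of the connection went away.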





[jira] [Commented] (HBASE-20867) RS may got killed while master restarts

2018-07-16 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544935#comment-16544935
 ] 

Allan Yang commented on HBASE-20867:


[~Apache9], can you review this one? Thanks!

> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch
>
>
> If the master is dispatching an RPC call to an RS while it is aborting, a 
> connection exception may be thrown by the RPC layer (an IOException with a 
> "Connection closed" message in this case). The RSProcedureDispatcher will 
> regard it as an un-retryable exception and pass it to 
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.





[jira] [Updated] (HBASE-20867) RS may got killed while master restarts

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20867:
---
Attachment: HBASE-20867.branch-2.0.003.patch

> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch
>
>
> If the master is dispatching an RPC call to an RS while it is aborting, a 
> connection exception may be thrown by the RPC layer (an IOException with a 
> "Connection closed" message in this case). The RSProcedureDispatcher will 
> regard it as an un-retryable exception and pass it to 
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.





[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool

2018-07-16 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-20870:
---
Attachment: HBASE-20870.branch-2.0.002.patch

> Wrong HBase root dir in ITBLL's Search Tool
> ---
>
> Key: HBASE-20870
> URL: https://issues.apache.org/jira/browse/HBASE-20870
> Project: HBase
>  Issue Type: Bug
>  Components: integration tests
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Minor
> Attachments: HBASE-20870.branch-2.0.001.patch, 
> HBASE-20870.branch-2.0.002.patch
>
>
> When using IntegrationTestBigLinkedList's Search tool, it always fails because 
> it tries to read WALs from the wrong HBase root dir. It turned out that when 
> initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its 
> superclass HBaseTestingUtility changes hbase.rootdir to a random local 
> dir. That is not wrong, since HBaseTestingUtility is mostly used by the 
> minicluster. But for IntegrationTest runs on distributed clusters, we should 
> change it back.
> Here is the error info.
> {code:java}
> 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting 
> hbase.rootdir to 
> /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb
> 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running 
> command-line tool java.io.FileNotFoundException: File 
> file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs 
> does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
> {code}
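
The "change it back" step can be sketched as follows. This is a simplified model, not the patch itself: it uses `java.util.Properties` as a stand-in for the HBase configuration object, and `RootDirRestoreSketch`, `initTestingUtility`, and `initForCluster` are hypothetical names. Only the `hbase.rootdir` key comes from the report.

```java
import java.util.Properties;
import java.util.UUID;

/** Hypothetical model of saving and restoring hbase.rootdir around utility setup. */
public class RootDirRestoreSketch {
  public static final String ROOT_DIR_KEY = "hbase.rootdir";

  /** Stand-in for the testing utility setup, which points the root dir at a random local dir. */
  public static void initTestingUtility(Properties conf) {
    conf.setProperty(ROOT_DIR_KEY, "/tmp/test-data/" + UUID.randomUUID());
  }

  /** Runs the utility setup, then restores the cluster's root dir when running distributed. */
  public static void initForCluster(Properties conf, boolean distributed) {
    String original = conf.getProperty(ROOT_DIR_KEY); // remember the cluster's value
    initTestingUtility(conf);
    if (distributed && original != null) {
      conf.setProperty(ROOT_DIR_KEY, original);
    }
  }
}
```

With `distributed` set to true the tool would again look for WALs under the cluster's root dir rather than under a local directory that does not exist.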




