[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718548#comment-16718548
 ] 

Hadoop QA commented on HBASE-21514:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 35s | master passed |
| +1 | compile | 2m 14s | master passed |
| +1 | checkstyle | 1m 34s | master passed |
| +1 | shadedjars | 4m 34s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 36s | master passed |
| +1 | javadoc | 0m 38s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 54s | the patch passed |
| +1 | compile | 2m 20s | the patch passed |
| -1 | javac | 2m 20s | hbase-server generated 4 new + 184 unchanged - 4 fixed = 188 total (was 188) |
| +1 | checkstyle | 1m 32s | hbase-server: The patch generated 0 new + 868 unchanged - 58 fixed = 868 total (was 926) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 9s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 13s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 138m 26s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 181m 36s | |
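
The javac and checkstyle rows report warning counts in the form "N new + U unchanged - F fixed = T total (was W)". A minimal sketch of how those numbers relate, assuming T = N + U and W = U + F. That relation is inferred from the figures in this report rather than taken from the Yetus source, and the class and method names are hypothetical:

```java
// Hypothetical helper; the arithmetic is inferred from the figures in this report.
final class WarningDelta {
    final int added;      // warnings introduced by the patch
    final int unchanged;  // warnings present both before and after the patch
    final int fixed;      // warnings removed by the patch

    WarningDelta(int added, int unchanged, int fixed) {
        this.added = added;
        this.unchanged = unchanged;
        this.fixed = fixed;
    }

    // "T total": what remains once the patch is applied.
    int totalAfter() {
        return added + unchanged;
    }

    // "(was W)": what existed before the patch.
    int totalBefore() {
        return unchanged + fixed;
    }

    // Assumption: a row votes -1 as soon as the patch adds any warning.
    boolean failsCheck() {
        return added > 0;
    }

    public static void main(String[] args) {
        WarningDelta javac = new WarningDelta(4, 184, 4);
        System.out.println(javac.totalAfter() + " total (was " + javac.totalBefore() + ")");
    }
}
```

Under these assumptions the javac row can vote -1 even though the total stays at 188: four warnings were fixed, but four new ones were introduced.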
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRecoveredEdits |
|   | hadoop.hbase.regionserver.TestCompoundBloomFilter |
|   | hadoop.hbase.regionserver.TestMultiColumnScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21514 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951456/HBASE-21514.master.010.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9086fd7353cf 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 67d6d5084c |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | 

[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-11 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: HBASE-21565.master.004.patch

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch, 
> HBASE-21565.master.004.patch
>
>
> During a cluster restart, two kinds of SCP can be scheduled for the same
> server: one is triggered by the ZK session timeout, the other is triggered
> when the new server reports in, which causes the stale instance to fail over.
> The only barrier between these two kinds of SCP is a check of whether the
> server is already in the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> The problem is that when the master finishes initialization, it deletes all
> stale servers from the dead server list. So when the SCP for the ZK session
> timeout comes in, the barrier has already been removed.
> Here are the logs that show how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP has started.
> {code}
> 2018-12-07,11:43:08,038 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, state=SUCCESS, hasLock=false; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false in 30.5340sec
> {code}
> This leads to the problem that regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list
> before the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
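
The race described above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not the HBase ServerManager or DeadServer code and not necessarily what the attached patch does: the class `DeadServerBarrier` and all of its methods are hypothetical, and the "safer" cleanup shown is only one possible way to keep the barrier in place while an SCP is still running.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the barrier; not the HBase implementation.
final class DeadServerBarrier {
    private final Set<String> deadServers = new HashSet<>();
    private final Set<String> processing = new HashSet<>();  // servers with an SCP in flight
    private int scheduledScps = 0;

    // Mirrors the isDeadServer check: refuse a second SCP while the server is listed.
    synchronized boolean expireServer(String serverName) {
        if (deadServers.contains(serverName)) {
            return false;  // crash processing already in progress
        }
        deadServers.add(serverName);
        processing.add(serverName);
        scheduledScps++;
        return true;
    }

    // Called when the SCP for this server completes.
    synchronized void finish(String serverName) {
        processing.remove(serverName);
    }

    // Problematic cleanup: drops every entry, including servers whose SCP is still running.
    synchronized void removeAllStaleServers() {
        deadServers.clear();
    }

    // Safer cleanup (an assumption, one possible approach): keep entries with an SCP in flight.
    synchronized void removeFinishedServers() {
        deadServers.retainAll(processing);
    }

    synchronized int scheduledScps() {
        return scheduledScps;
    }

    public static void main(String[] args) {
        String server = "c4-hadoop-tst-st27.bj,29100,1544153846859";

        DeadServerBarrier early = new DeadServerBarrier();
        early.expireServer(server);      // first SCP is scheduled
        early.removeAllStaleServers();   // master finishes initialization
        early.expireServer(server);      // ZK session timeout: the barrier is gone
        System.out.println("early cleanup scheduled " + early.scheduledScps() + " SCPs");

        DeadServerBarrier safe = new DeadServerBarrier();
        safe.expireServer(server);
        safe.removeFinishedServers();    // entry survives, its SCP is still running
        safe.expireServer(server);       // rejected by the barrier
        System.out.println("safe cleanup scheduled " + safe.scheduledScps() + " SCPs");
    }
}
```

In this model the early cleanup lets two SCPs be scheduled for the same server, while the variant that keeps in-flight entries schedules only one.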



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21582) If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time

2018-12-11 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21582:
-
Attachment: HBASE-21582.v3.patch

> If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then 
> SnapshotHFileCleaner will skip to run every time
> --
>
> Key: HBASE-21582
> URL: https://issues.apache.org/jira/browse/HBASE-21582
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 2.1.2, 1.2.10, 1.4.10, 2.0.5
>
> Attachments: HBASE-21582.v1.patch, HBASE-21582.v2.patch, 
> HBASE-21582.v3.patch
>
>
> This is because we only remove the SnapshotSentinel from snapshotHandlers in
> SnapshotManager#cleanupSentinels, and cleanupSentinels is called in only the
> following 3 cases:
> 1. SnapshotManager#isSnapshotDone;
> 2. SnapshotManager#takeSnapshot;
> 3. SnapshotManager#restoreOrCloneSnapshot
> So if isSnapshotDone is never called and no further snapshot is taken,
> restored, or cloned, the SnapshotSentinel stays in snapshotHandlers forever.
> But after HBASE-21387, the SnapshotHFileCleaner checks and cleans the
> unreferenced files only when no snapshot is being taken.
> I found this bug because in our XiaoMi branch-2 we implemented a soft delete
> feature: if someone deletes a table, the master first creates a snapshot,
> and only after that does the table deletion begin. The implementation is
> quite simple: we use the snapshotManager to create a snapshot.
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> index 8f42e4a..6da6a64 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> @@ -2385,6 +2385,12 @@ public class HMaster extends HRegionServer implements MasterServices {
>        protected void run() throws IOException {
>          getMaster().getMasterCoprocessorHost().preDeleteTable(tableName);
>  
> +        if (snapshotBeforeDelete) {
> +          LOG.info("Take snapshot for " + tableName + " before deleting");
> +          snapshotManager
> +            .takeSnapshot(SnapshotDescriptionUtils.getSnapshotNameForDeletedTable(tableName));
> +        }
> +
>          LOG.info(getClientIdAuditPrefix() + " delete " + tableName);
>  
>          // TODO: We can handle/merge duplicate request
> {code}
> In the master, I found this log repeating endlessly after deleting a table:
> {code}
> org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache: Not checking unreferenced files since snapshot is running, it will skip to clean the HFiles this time
> {code}
> This is because snapshotHandlers is never cleaned up after calling
> snapshotManager#takeSnapshot. I think the asynchronous snapshot path
> (HBaseAdmin#snapshotAsync) may have the same problem.
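
The behaviour described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the SnapshotManager code: the class `SnapshotSentinelModel` is hypothetical, a sentinel is reduced to a finished flag per table, and the cleaner's guard is assumed to treat any remaining sentinel as a running snapshot.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; names loosely follow the description above.
final class SnapshotSentinelModel {
    // table name -> whether the snapshot handler has finished
    private final Map<String, Boolean> snapshotHandlers = new HashMap<>();

    void takeSnapshot(String table) {
        cleanupSentinels();                  // cleanup only happens as a side effect
        snapshotHandlers.put(table, false);
    }

    // The handler completes, but nothing removes its sentinel at this point.
    void markFinished(String table) {
        snapshotHandlers.replace(table, true);
    }

    boolean isSnapshotDone(String table) {
        Boolean done = snapshotHandlers.get(table);
        cleanupSentinels();
        return done == null || done;
    }

    // Removes finished sentinels; never runs on its own.
    private void cleanupSentinels() {
        snapshotHandlers.values().removeIf(done -> done);
    }

    // Assumed cleaner guard: any remaining sentinel counts as "snapshot is running".
    boolean cleanerShouldSkip() {
        return !snapshotHandlers.isEmpty();
    }

    public static void main(String[] args) {
        SnapshotSentinelModel manager = new SnapshotSentinelModel();
        manager.takeSnapshot("deleted_table");
        manager.markFinished("deleted_table");
        System.out.println("skip before isSnapshotDone: " + manager.cleanerShouldSkip());
        manager.isSnapshotDone("deleted_table");
        System.out.println("skip after isSnapshotDone: " + manager.cleanerShouldSkip());
    }
}
```

In this model the cleaner keeps skipping after the snapshot has finished, until some caller happens to trigger the cleanup, which matches the endless log line quoted above.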





[jira] [Commented] (HBASE-21582) If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718522#comment-16718522
 ] 

Hadoop QA commented on HBASE-21582:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 2s | master passed |
| +1 | compile | 2m 3s | master passed |
| +1 | checkstyle | 1m 16s | master passed |
| +1 | shadedjars | 4m 23s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 14s | master passed |
| +1 | javadoc | 0m 32s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 28s | the patch passed |
| +1 | compile | 2m 0s | the patch passed |
| +1 | javac | 2m 0s | the patch passed |
| -1 | checkstyle | 1m 18s | hbase-server: The patch generated 1 new + 16 unchanged - 2 fixed = 17 total (was 18) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 31s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 35s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 24s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 135m 17s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 176m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21582 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951454/HBASE-21582.v2.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux cf696a5d9662 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 67d6d5084c |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/15255/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15255/testReport/ |
| Max. process+thread count | 4747 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718479#comment-16718479
 ] 

Hudson commented on HBASE-21568:


Results for branch branch-2.1
[build #677 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/677/]: (/) *+1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/677//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/677//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) +1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/677//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.

(/) +1 client integration test


> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>
> [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow 
> callers to specify that they do not want to use a block cache when reading an 
> HFile.
> If the BucketCache is set up to use the FileSystem, we can have a situation 
> where the client tries to instantiate the BucketCache and is disallowed due 
> to filesystem permissions:
> {code:java}
> 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: Failed allocating cache on /mnt/hbase/cache.data
> java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
>   at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> LoadIncrementalHFiles should pass {{CacheConfig.DISABLED}}.
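
The failure mode and the proposed remedy can be sketched without any HBase dependency. This is a minimal illustration under stated assumptions: `CacheSettings` is a hypothetical stand-in and not the real `CacheConfig`; it only models a configuration object that builds its cache eagerly in the constructor, next to a shared disabled instance that never touches the filesystem.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical stand-in for a cache configuration; not the real HBase CacheConfig.
final class CacheSettings {
    // Shared instance that never tries to create a block cache.
    static final CacheSettings DISABLED = new CacheSettings();

    private final Object blockCache;

    private CacheSettings() {
        this.blockCache = null;
    }

    // Eagerly builds the cache, so a failing factory makes construction fail.
    CacheSettings(Supplier<Object> cacheFactory) {
        this.blockCache = cacheFactory.get();
    }

    boolean isBlockCacheEnabled() {
        return blockCache != null;
    }

    // Stand-in for a file-backed cache on a path the client may not open.
    static Object openFileBackedCache() {
        throw new UncheckedIOException(
            new FileNotFoundException("/mnt/hbase/cache.data (Permission denied)"));
    }

    // Stand-in for a client-side step that reads an HFile and needs no cache.
    static String inspectHFile(String path, CacheSettings settings) {
        return path + (settings.isBlockCacheEnabled() ? " read with cache" : " read without cache");
    }

    public static void main(String[] args) {
        try {
            new CacheSettings(CacheSettings::openFileBackedCache);
        } catch (UncheckedIOException e) {
            System.out.println("eager cache setup failed: " + e.getCause().getMessage());
        }
        System.out.println(inspectHFile("hfile-1", CacheSettings.DISABLED));
    }
}
```

The point of the sketch is that a client-side tool that only reads files has no reason to construct the cache at all, so handing it the disabled instance avoids the permission error entirely.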





[jira] [Commented] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718468#comment-16718468
 ] 

Hadoop QA commented on HBASE-21538:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HBASE-21512 Compile Tests ||
| 0 | mvndep | 1m 53s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 18s | HBASE-21512 passed |
| +1 | compile | 2m 40s | HBASE-21512 passed |
| +1 | checkstyle | 1m 51s | HBASE-21512 passed |
| +1 | shadedjars | 4m 13s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 3m 17s | HBASE-21512 passed |
| +1 | javadoc | 0m 56s | HBASE-21512 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 30s | the patch passed |
| +1 | compile | 2m 34s | the patch passed |
| +1 | javac | 2m 34s | the patch passed |
| +1 | checkstyle | 0m 34s | hbase-client: The patch generated 0 new + 82 unchanged - 5 fixed = 82 total (was 87) |
| +1 | checkstyle | 1m 17s | hbase-server: The patch generated 0 new + 74 unchanged - 3 fixed = 74 total (was 77) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 11s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 34s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 3m 19s | the patch passed |
| +1 | javadoc | 0m 50s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 3s | hbase-client in the patch passed. |
| -1 | unit | 134m 5s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 42s | The patch does not generate ASF License warnings. |
| | | 189m 18s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestMultiColumnScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21538 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951443/HBASE-21538-HBASE-21512-v2.patch |
| Optional Tests |  dupname  

[jira] [Updated] (HBASE-21514) Refactor CacheConfig

2018-12-11 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21514:
---
Attachment: HBASE-21514.master.010.patch

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.master.001.patch, 
> HBASE-21514.master.002.patch, HBASE-21514.master.003.patch, 
> HBASE-21514.master.004.patch, HBASE-21514.master.005.patch, 
> HBASE-21514.master.006.patch, HBASE-21514.master.007.patch, 
> HBASE-21514.master.008.patch, HBASE-21514.master.009.patch, 
> HBASE-21514.master.010.patch
>
>
> # Move the global cache instances from CacheConfig to BlockCacheFactory. Keep
> only configuration in CacheConfig.
> # Make the block cache a member variable of HRegionServer, so each region
> server has exactly one block cache.
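
The two points above can be sketched as an ownership change. This is a minimal illustration under stated assumptions, not the patch itself: all four classes below are hypothetical stand-ins with deliberately different names from the HBase classes, and the cache is reduced to a plain map.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a region server: it owns exactly one cache instance.
final class SimpleRegionServer {
    private final SimpleBlockCache blockCache = SimpleBlockCacheFactory.create();

    SimpleBlockCache getBlockCache() {
        return blockCache;
    }

    // Reads a block, consulting the server's cache according to the given policy.
    byte[] read(String key, CachePolicy policy) {
        byte[] cached = blockCache.get(key);
        if (cached != null) {
            return cached;
        }
        byte[] loaded = key.getBytes(StandardCharsets.UTF_8);  // stand-in for a disk read
        if (policy.cacheDataOnRead) {
            blockCache.put(key, loaded);
        }
        return loaded;
    }

    public static void main(String[] args) {
        SimpleRegionServer rs = new SimpleRegionServer();
        rs.read("block-1", new CachePolicy(true));
        rs.read("block-2", new CachePolicy(false));
        System.out.println("cached blocks: " + rs.getBlockCache().size());
    }
}

// Configuration only: no cache instance lives here.
final class CachePolicy {
    final boolean cacheDataOnRead;

    CachePolicy(boolean cacheDataOnRead) {
        this.cacheDataOnRead = cacheDataOnRead;
    }
}

// The only place where a cache instance is created.
final class SimpleBlockCacheFactory {
    static SimpleBlockCache create() {
        return new SimpleBlockCache();
    }
}

// Minimal in-memory cache used by the sketch.
final class SimpleBlockCache {
    private final Map<String, byte[]> blocks = new HashMap<>();

    void put(String key, byte[] block) {
        blocks.put(key, block);
    }

    byte[] get(String key) {
        return blocks.get(key);
    }

    int size() {
        return blocks.size();
    }
}
```

With this split, two server instances in the same JVM get separate caches, and the policy object can be created freely without instantiating a cache as a side effect.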





[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718437#comment-16718437
 ] 

Hudson commented on HBASE-21568:


Results for branch branch-2.0
[build #1157 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1157/]: (/) *+1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1157//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1157//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) +1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1157//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.


> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>





[jira] [Updated] (HBASE-21582) If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time

2018-12-11 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21582:
-
Attachment: HBASE-21582.v2.patch

> If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then 
> SnapshotHFileCleaner will skip to run every time
> --
>
> Key: HBASE-21582
> URL: https://issues.apache.org/jira/browse/HBASE-21582
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 2.1.2, 1.2.10, 1.4.10, 2.0.5
>
> Attachments: HBASE-21582.v1.patch, HBASE-21582.v2.patch
>
>
> This is because we remove the SnapshotSentinel  from snapshotHandlers in 
> SnapshotManager#cleanupSentinels.  Only when the following 3 case, the  
> cleanupSentinels will be called: 
> 1.  SnapshotManager#isSnapshotDone; 
> 2.  SnapshotManager#takeSnapshot; 
> 3. SnapshotManager#restoreOrCloneSnapshot
> So if no isSnapshotDone called, or no further snapshot taking, or snapshot 
> restore/clone.  the SnapshotSentinel will always be keep in snapshotHandlers. 
> But after HBASE-21387,  Only when no snapshot taking, the 
> SnapshotHFileCleaner will check the unref files and clean. 
> I found this bug, because in our XiaoMi branch-2,  we implement the soft 
> delete feature, which means if someone delete a table, then master will 
> create a snapshot firstly, after that, the table deletion begain.  the 
> implementation is quite simple, we use the snapshotManager to create a 
> snapshot. 
> {code}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> index 8f42e4a..6da6a64 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> @@ -2385,12 +2385,6 @@ public class HMaster extends HRegionServer implements 
> MasterServices {
>protected void run() throws IOException {
>  getMaster().getMasterCoprocessorHost().preDeleteTable(tableName);
>  
> +if (snapshotBeforeDelete) {
> +  LOG.info("Take snaposhot for " + tableName + " before deleting");
> +  snapshotManager
> +  
> .takeSnapshot(SnapshotDescriptionUtils.getSnapshotNameForDeletedTable(tableName));
> +}
> +
>  LOG.info(getClientIdAuditPrefix() + " delete " + tableName);
>  
>  // TODO: We can handle/merge duplicate request
> {code}
> In the master, I found the following log repeated endlessly after deleting a table: 
> {code}
> org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache: Not checking 
> unreferenced files since snapshot is running, it will skip to clean the 
> HFiles this time
> {code}
> This is because the snapshotHandlers are never cleaned up after calling 
> snapshotManager#takeSnapshot. I think the async snapshot path 
> (HBaseAdmin#snapshotAsync) may have the same problem. 
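
The leak described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class, its map of handlers, and every method body are hypothetical stand-ins, not the real SnapshotManager code. It shows that a finished handler stays registered until one of the listed calls triggers a cleanup, so a cleaner that asks whether any snapshot is running keeps getting "yes".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; names are hypothetical and do not mirror HBase's SnapshotManager. */
final class SentinelLeakSketch {
  // table name -> whether the handler tracking its snapshot has finished
  private final Map<String, Boolean> snapshotHandlers = new HashMap<>();

  /** Drops finished handlers; in this model it is reachable only through the calls below. */
  private void cleanupSentinels() {
    snapshotHandlers.values().removeIf(finished -> finished);
  }

  void takeSnapshot(String table) {
    cleanupSentinels();
    snapshotHandlers.put(table, false);
  }

  /** Simulates the handler completing its work in the background. */
  void finishSnapshot(String table) {
    snapshotHandlers.replace(table, true);
  }

  boolean isSnapshotDone(String table) {
    cleanupSentinels();
    return !snapshotHandlers.containsKey(table);
  }

  /** Assumed cleaner check: any registered handler counts as a running snapshot. */
  boolean isTakingAnySnapshot() {
    return !snapshotHandlers.isEmpty();
  }

  public static void main(String[] args) {
    SentinelLeakSketch manager = new SentinelLeakSketch();
    manager.takeSnapshot("deleted_table");
    manager.finishSnapshot("deleted_table");
    // Nobody calls isSnapshotDone, so the cleaner would keep skipping.
    System.out.println("cleaner skips: " + manager.isTakingAnySnapshot());
  }
}
```

In this model the only ways out are a later isSnapshotDone or takeSnapshot call, which matches the three cases listed in the description.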





[jira] [Updated] (HBASE-20404) Ugly cleanerchore complaint that dir is not empty

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20404:
---
Fix Version/s: 1.3.3

> Ugly cleanerchore complaint that dir is not empty
> -
>
> Key: HBASE-20404
> URL: https://issues.apache.org/jira/browse/HBASE-20404
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.4.4, 2.0.0
>Reporter: stack
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1
>
> Attachments: HBASE-20404.0.patch, HBASE-20404.1.patch, 
> HBASE-20404.2.patch
>
>
>  I see these big dirty exceptions in my master log during a long run. Let's 
> clean them up. (Are they exceptions I as an operator can actually do something 
> about? Are they 'problems'? Should they be LOG.warn?)
> {code}
> 2018-04-12 16:02:09,911 WARN  [ForkJoinPool-1-worker-15] 
> cleaner.CleanerChore: Could not delete dir under 
> hdfs://ve0524.halxg.cloudera.com:8020/hbase/archive/data/default/IntegrationTestBigLinkedList/1e24549061df3adc4858fbcaf1929553/meta;
>  {}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.PathIsNotEmptyDirectoryException):
>  
> `/hbase/archive/data/default/IntegrationTestBigLinkedList/1e24549061df3adc4858fbcaf1929553/meta
>  is non empty': Directory is not empty
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:115)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2848)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1048)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:847)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:790)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2486)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1435)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1345)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy26.delete(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:568)
>   at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
>   at com.sun.proxy.$Proxy27.delete(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
> ...
> {code}
> Looks like log format is off too...





[jira] [Updated] (HBASE-20352) [Chore] Backport HBASE-18309 to branch-1

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20352:
---
Fix Version/s: 1.3.3

> [Chore] Backport HBASE-18309 to branch-1
> 
>
> Key: HBASE-20352
> URL: https://issues.apache.org/jira/browse/HBASE-20352
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.4
>
> Attachments: HBASE-20352.branch-1.001.patch
>
>
> Using multiple threads to scan directory and to clean old WALs





[jira] [Updated] (HBASE-20554) "WALs outstanding" message from CleanerChore is noisy

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20554:
---
Fix Version/s: 1.3.3

> "WALs outstanding" message from CleanerChore is noisy
> -
>
> Key: HBASE-20554
> URL: https://issues.apache.org/jira/browse/HBASE-20554
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
> Attachments: HBASE-20554.patch
>
>
> The WARN level "WALs outstanding" messages from CleanerChore should be DEBUG, 
> and they are not always correct. 
> I left a cluster configured for ITBLL (retaining all WALs for post hoc 
> analysis) and in the morning found the master log full of "WALs outstanding" 
> warnings from CleanerChore. 
> Should this really be a warning?
> {quote}
> 2018-05-09 16:42:03,893 WARN  
> [node-1.cluster,16000,1525851521469_ChoreService_2] cleaner.CleanerChore: 
> WALs outstanding under hdfs://node-1.cluster/hbase/oldWALs
> {quote}
> If someone has configured really long WAL retention then having WALs in 
> oldWALs will be normal. 
> Also, it seems the warning is sometimes incorrect.
> {quote}
> 2018-05-09 16:42:24,751 WARN  
> [node-1.cluster,16000,1525851521469_ChoreService_1] cleaner.CleanerChore: 
> WALs outstanding under hdfs://node-1.cluster/hbase/archive
> {quote}
> There are no WALs under archive/. 
> Even at DEBUG level, if it is not correct, then it can lead an operator to be 
> concerned about nothing, so better to just remove it.





[jira] [Commented] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-11 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718421#comment-16718421
 ] 

Jingyun Tian commented on HBASE-21565:
--

[~Apache9] These 2 failed UTs are not related to my patch. Please check this out.

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch
>
>
> During a cluster restart, 2 kinds of SCP can be scheduled for the same 
> server: one is triggered by the ZK session timeout, the other is triggered 
> when a new server reports in, which makes the stale one fail over. The only 
> barrier between these 2 kinds of SCP is a check of whether the server is in 
> the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress",
>     serverName);
>   return false;
> }
> {code}
> But the problem is that when the master finishes initialization, it deletes all 
> stale servers from the dead server list. Thus when the SCP for the ZK session 
> timeout comes in, the barrier has already been removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to the problem that regions will be assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see the server is removed from the dead server list before the 
> second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
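
The sequence in the logs above can be replayed with a small model. This is a sketch under assumptions, not the real ServerManager or DeadServer code: server names are plain strings and all method names are hypothetical. It shows that the dead-server check blocks a duplicate crash procedure only while the entry is still in the list.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the barrier; names are hypothetical, not the real HBase classes. */
final class DeadServerBarrierSketch {
  private final Set<String> deadServers = new HashSet<>();
  private int scheduledProcedures = 0;

  /** Schedules a crash procedure unless the server is already in the dead server list. */
  boolean expireServer(String serverName) {
    if (!deadServers.add(serverName)) {
      return false; // crash processing already in progress
    }
    scheduledProcedures++;
    return true;
  }

  /** Models the master dropping stale entries once initialization finishes. */
  void removeFromDeadList(String serverName) {
    deadServers.remove(serverName);
  }

  int scheduledProcedures() {
    return scheduledProcedures;
  }

  public static void main(String[] args) {
    DeadServerBarrierSketch master = new DeadServerBarrierSketch();
    String server = "rs-1,29100,1544153846859"; // made-up server name
    master.expireServer(server);       // first SCP: a new server instance reported in
    master.removeFromDeadList(server); // entry removed while the first SCP is still running
    master.expireServer(server);       // second SCP: the ZK session timeout arrives
    System.out.println("SCPs scheduled: " + master.scheduledProcedures());
  }
}
```

Keeping the entry in the list until the first procedure completes would make the second call return false in this model.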





[jira] [Updated] (HBASE-19364) Truncate_preserve fails with table when replica region > 1

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19364:
---
Fix Version/s: 1.3.3

> Truncate_preserve fails with table when replica region > 1
> --
>
> Key: HBASE-19364
> URL: https://issues.apache.org/jira/browse/HBASE-19364
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.3
>
> Attachments: HBASE-19364-branch-1-v2.patch, 
> HBASE-19364-branch-1-v3.patch, HBASE-19364-branch-1.patch, 
> HBASE-19364-branch-1.patch
>
>
> Root cause is same as HBASE-17319, here we need to exclude secondary regions 
> while reading meta.





[jira] [Updated] (HBASE-20052) TestRegionOpen#testNonExistentRegionReplica fails due to NPE

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20052:
---
Fix Version/s: 1.3.3

> TestRegionOpen#testNonExistentRegionReplica fails due to NPE
> 
>
> Key: HBASE-20052
> URL: https://issues.apache.org/jira/browse/HBASE-20052
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.3, 2.0.0
>
> Attachments: 20052.v1.txt, 20052.v2.txt
>
>
> After HBASE-19391 was integrated, the following test failure can be observed:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.TestRegionOpen.testNonExistentRegionReplica(TestRegionOpen.java:122)
> {code}
> This was due to null being returned from 
> HRegionFileSystem#createRegionOnFileSystem().





[jira] [Updated] (HBASE-19553) Old replica regions should be cleared from AM memory after primary region split or merge

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19553:
---
Fix Version/s: 1.3.3

> Old replica regions should be cleared from AM memory after primary region 
> split or merge
> 
>
> Key: HBASE-19553
> URL: https://issues.apache.org/jira/browse/HBASE-19553
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: huaxiang sun
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 1.5.0, 1.3.3, 1.4.3
>
> Attachments: HBASE-19553-branch-1-v2.patch, 
> HBASE-19553-branch-1-v3.patch, HBASE-19553-branch-1-v4.patch, 
> HBASE-19553-branch-1-v4.patch, HBASE-19553-branch-1.patch
>
>
> Similar to HBASE-18025, the replica parent's info is not removed from master. 
> Actually I think it can be removed after replica region is split or merged, I 
> will check the logic and apply one patch.





[jira] [Updated] (HBASE-19391) Calling HRegion#initializeRegionInternals from a region replica can still re-create a region directory

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19391:
---
Fix Version/s: 1.3.3

> Calling HRegion#initializeRegionInternals from a region replica can still 
> re-create a region directory
> --
>
> Key: HBASE-19391
> URL: https://issues.apache.org/jira/browse/HBASE-19391
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-4, 1.4.2
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.3.3, 1.4.3, 2.0.0
>
> Attachments: 
> 0001-HBASE-19391-Calling-HRegion-initializeRegionInternal.patch, 
> HBASE-19391.master.v0.patch
>
>
> This is a follow up from HBASE-18024. There is still a chance that attempting 
> to open a region that is not the default region replica can re-create a 
> region directory that was GC'd by the CatalogJanitor, causing inconsistencies 
> with hbck.





[jira] [Updated] (HBASE-19905) ReplicationSyncUp tool will not exit if a peer replication is disabled

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19905:
---
Fix Version/s: 1.3.3

> ReplicationSyncUp tool will not exit if a peer replication is disabled
> --
>
> Key: HBASE-19905
> URL: https://issues.apache.org/jira/browse/HBASE-19905
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.3.3, 1.4.2, 2.0.0
>
> Attachments: HBASE-19905.patch
>
>
> In our test cluster we had two peer clusters, and replication was disabled 
> for one of them. When we used the ReplicationSyncUp tool to replicate 
> the data to the peer clusters, the tool replicated the data to the enabled peer 
> cluster but kept retrying to replicate the data to the disabled peer 
> cluster, and hence it never terminated. 





[jira] [Updated] (HBASE-19358) Improve the stability of splitting log when do fail over

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19358:
---
Fix Version/s: 1.3.3

> Improve the stability of splitting log when do fail over
> 
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR
>Affects Versions: 0.98.24
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 1.4.1, 1.5.0, 2.0.0-beta-1, 1.3.3, 2.0.0
>
> Attachments: HBASE-18619-branch-2-v2.patch, 
> HBASE-19358-branch-1-v2.patch, HBASE-19358-branch-1-v3.patch, 
> HBASE-19358-branch-1.patch, HBASE-19358-branch-2-v3.patch, 
> HBASE-19358-v1.patch, HBASE-19358-v4.patch, HBASE-19358-v5.patch, 
> HBASE-19358-v6.patch, HBASE-19358-v7.patch, HBASE-19358-v8.patch, 
> HBASE-19358.patch, split-1-log.png, split-logic-new.jpg, split-logic-old.jpg, 
> split-table.png, split_test_result.png
>
>
> The way we split logs now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12905027/split-logic-old.jpg!
> The problem is that the OutputSink writes the recovered edits while splitting 
> the log, which means it creates one WriterAndPath for each region and retains 
> it until the end. If the cluster is small and the number of regions per RS is 
> large, it creates too many HDFS streams at the same time. It is then 
> prone to failure since each datanode needs to handle too many streams.
> Thus I came up with a new way to split logs.  
> !https://issues.apache.org/jira/secure/attachment/12905028/split-logic-new.jpg!
> We try to cache all the recovered edits, but if the cache exceeds the MaxHeapUsage, 
> we pick the largest EntryBuffer and write it to a file (closing the writer 
> when finished). After we have read all entries into memory, we start a 
> writeAndCloseThreadPool, which starts a certain number of threads to write all 
> buffers to files. Thus it will not create more HDFS streams than the 
> *_hbase.regionserver.hlog.splitlog.writer.threads_* we set.
> The biggest benefit is that we can control the number of streams we create 
> while splitting logs: 
> it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.
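
The two bounds quoted above, and the "flush the largest buffer" rule, can be written out as a short sketch. Everything here is illustrative: the class, the method names, and the sample numbers are assumptions for the example and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; names and numbers are hypothetical, not taken from the patch. */
final class SplitStreamBoundSketch {
  /** Old scheme: each splitter keeps one open writer per region found in its WAL. */
  static long oldUpperBound(int maxSplitters, int regionsInWal) {
    return (long) maxSplitters * regionsInWal;
  }

  /** New scheme: each splitter writes through a fixed number of writer threads. */
  static long newUpperBound(int maxSplitters, int writerThreads) {
    return (long) maxSplitters * writerThreads;
  }

  /** Picks the region with the most buffered bytes, i.e. the buffer to flush first. */
  static String largestBuffer(Map<String, Long> bufferedBytesByRegion) {
    String largest = null;
    long max = -1;
    for (Map.Entry<String, Long> entry : bufferedBytesByRegion.entrySet()) {
      if (entry.getValue() > max) {
        max = entry.getValue();
        largest = entry.getKey();
      }
    }
    return largest;
  }

  public static void main(String[] args) {
    // Assumed sample values: 2 splitters, 500 regions per WAL, 3 writer threads.
    System.out.println("old bound: " + oldUpperBound(2, 500));
    System.out.println("new bound: " + newUpperBound(2, 3));
    Map<String, Long> buffers = new HashMap<>();
    buffers.put("region-a", 10L);
    buffers.put("region-b", 30L);
    System.out.println("flush first: " + largestBuffer(buffers));
  }
}
```

The point of the comparison is that the new bound depends only on configuration, while the old one grows with the number of regions in the WAL.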





[jira] [Updated] (HBASE-19163) "Maximum lock count exceeded" from region server's batch processing

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19163:
---
Fix Version/s: 1.3.3

> "Maximum lock count exceeded" from region server's batch processing
> ---
>
> Key: HBASE-19163
> URL: https://issues.apache.org/jira/browse/HBASE-19163
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 3.0.0, 2.0.0-alpha-3, 1.2.7
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Major
> Fix For: 1.4.1, 1.5.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-19163-branch-1-v001.patch, 
> HBASE-19163-branch-1-v001.patch, HBASE-19163-master-v001.patch, 
> HBASE-19163.master.001.patch, HBASE-19163.master.002.patch, 
> HBASE-19163.master.004.patch, HBASE-19163.master.005.patch, 
> HBASE-19163.master.006.patch, HBASE-19163.master.007.patch, 
> HBASE-19163.master.008.patch, HBASE-19163.master.009.patch, 
> HBASE-19163.master.009.patch, HBASE-19163.master.010.patch, unittest-case.diff
>
>
> In one of our use cases, we found the following exception, and replication is 
> stuck.
> {code}
> 2017-10-25 19:41:17,199 WARN  [hconnection-0x28db294f-shared--pool4-t936] 
> client.AsyncProcess: #3, table=foo, attempt=5/5 failed=262836ops, last 
> exception: java.io.IOException: java.io.IOException: Maximum lock count 
> exceeded
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
> Caused by: java.lang.Error: Maximum lock count exceeded
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1327)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5163)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3018)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2877)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2819)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:753)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:715)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2148)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> ... 3 more
> {code}
> While we are still examining the data pattern, it is certain that there are too 
> many mutations in the batch against the same row. This exceeds the maximum 
> 64k shared lock count, so an error is thrown and the whole batch fails.
> There are two approaches to solve this issue.
> 1). If there are multiple mutations against the same row in the batch, we just 
> need to acquire the lock once for that row instead of acquiring the lock for 
> each mutation.
> 2). We catch the error, process whatever has been locked so far, and loop back.
> With HBASE-17924, approach 1 seems easy to implement now. 
> I created this jira and will post updates/patches as the investigation moves forward.
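
A minimal sketch of the limit and of approach 1 follows. It is not HRegion's code: the class, the per-row lock map, and both methods are hypothetical, rows are plain strings, and lock release is left out for brevity. It relies only on the documented behavior of java.util.concurrent.locks.ReentrantReadWriteLock, which supports at most 65535 read holds and throws an Error beyond that.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch of approach 1; hypothetical names, not the HRegion implementation. */
final class RowLockOncePerRowSketch {
  private final Map<String, ReentrantReadWriteLock> rowLocks = new HashMap<>();

  /** Takes the shared lock once per distinct row and returns how many locks were taken. */
  int lockBatch(List<String> rowsInBatch) {
    Set<String> alreadyLocked = new HashSet<>();
    int acquired = 0;
    for (String row : rowsInBatch) {
      if (alreadyLocked.add(row)) {
        rowLocks.computeIfAbsent(row, r -> new ReentrantReadWriteLock()).readLock().lock();
        acquired++;
      }
    }
    return acquired; // unlocking is omitted in this sketch
  }

  /** Takes the shared lock once per mutation and reports how many holds succeeded. */
  static int lockPerMutation(ReentrantReadWriteLock lock, int mutations) {
    int acquired = 0;
    try {
      for (int i = 0; i < mutations; i++) {
        lock.readLock().lock();
        acquired++;
      }
    } catch (Error maximumLockCountExceeded) {
      // Thrown once the read hold count reaches its documented maximum of 65535.
    }
    return acquired;
  }

  public static void main(String[] args) {
    int perMutation = lockPerMutation(new ReentrantReadWriteLock(), 70_000);
    System.out.println("holds before the Error: " + perMutation);
  }
}
```

With one acquisition per distinct row, the hold count on a row lock no longer depends on how many mutations in the batch target that row.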





[jira] [Updated] (HBASE-20141) Fix TooManyFiles exception when RefreshingChannels in FileIOEngine

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20141:
---
Fix Version/s: 1.3.3

> Fix TooManyFiles exception when RefreshingChannels in FileIOEngine
> --
>
> Key: HBASE-20141
> URL: https://issues.apache.org/jira/browse/HBASE-20141
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.4.0, 2.0.0-beta-1, 1.4.2
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.3, 2.0.0
>
> Attachments: HBASE-20141.master.001.patch, 
> HBASE-20141.master.002.patch, HBASE-20141.master.003.patch, 
> HBASE-20141.master.004.patch
>
>
> HBASE-19435 implements a fix for reopening file channels when they are 
> unexpectedly closed,
> to avoid disabling the BucketCache. However, it missed that the 
> channels might not
> actually be completely closed (the write or read channel might still be open;
> see 
> https://docs.oracle.com/javase/7/docs/api/java/nio/channels/ClosedChannelException.html).
> This commit closes any open channels before creating a new channel.
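
The "close before reopen" idea can be sketched with plain java.nio. This is an assumption-laden illustration rather than the FileIOEngine change itself: the class and method names are hypothetical, and it uses FileChannel.open on a single file instead of whatever file handling the engine really does.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch of closing the old channel before reopening; not the FileIOEngine code. */
final class ChannelRefreshSketch {
  private final Path file;
  private FileChannel channel;

  ChannelRefreshSketch(Path file) throws IOException {
    this.file = file;
    this.channel = open(file);
  }

  private static FileChannel open(Path file) throws IOException {
    return FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
  }

  /** Closes whatever is left of the old channel first so its file descriptor is released. */
  synchronized FileChannel refreshChannel() throws IOException {
    FileChannel old = channel;
    if (old != null && old.isOpen()) {
      old.close();
    }
    channel = open(file);
    return channel;
  }

  synchronized FileChannel current() {
    return channel;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("bucket-sketch", ".bin");
    ChannelRefreshSketch engine = new ChannelRefreshSketch(tmp);
    FileChannel before = engine.current();
    FileChannel after = engine.refreshChannel();
    System.out.println("old open: " + before.isOpen() + ", new open: " + after.isOpen());
    after.close();
    Files.delete(tmp);
  }
}
```

Without the close step, every refresh would leave the previous descriptor open, which is how repeated refreshes can end in a "too many open files" failure.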





[jira] [Updated] (HBASE-19435) Reopen Files for ClosedChannelException in BucketCache

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19435:
---
Fix Version/s: 1.3.3

> Reopen Files for ClosedChannelException in BucketCache
> --
>
> Key: HBASE-19435
> URL: https://issues.apache.org/jira/browse/HBASE-19435
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.1, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 1.4.0, 2.0.0-beta-1, 1.3.3, 2.0.0
>
> Attachments: HBASE-19435.branch-1.001.patch, 
> HBASE-19435.master.001.patch, HBASE-19435.master.002.patch, 
> HBASE-19435.master.003.patch, HBASE-19435.master.004.patch, 
> HBASE-19435.master.005.patch, HBASE-19435.master.006.patch, 
> HBASE-19435.master.007.patch, HBASE-19435.master.007.patch
>
>
> When using the FileIOEngine for BucketCache, the cache will be disabled if 
> the connection is interrupted or closed. HBase will then get 
> ClosedChannelExceptions trying to access the file. After 60s, the RS will 
> disable the cache. This causes severe read performance degradation for 
> workloads that rely on this cache. FileIOEngine never tries to reopen the 
> connection. This JIRA is to reopen files when the BucketCache encounters a 
> ClosedChannelException.





[jira] [Updated] (HBASE-17718) Difference between RS's servername and its ephemeral node cause SSH stop working

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17718:
---
Fix Version/s: (was: 1.3.3)

> Difference between RS's servername and its ephemeral node cause SSH stop 
> working
> 
>
> Key: HBASE-17718
> URL: https://issues.apache.org/jira/browse/HBASE-17718
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.4, 1.1.8, 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 1.4.0, 2.0.0
>
> Attachments: 0001-HBASE-17718-amendment.patch, 
> HBASE-17718.branch-1.001.patch, HBASE-17718.branch-1.002.patch, 
> HBASE-17718.branch-1.003.patch, HBASE-17718.master.001.patch, 
> HBASE-17718.master.002.patch, HBASE-17718.master.003.patch
>
>
> After HBASE-9593, the RS puts up an ephemeral node in ZK before reporting for 
> duty. But if the hosts config (/etc/hosts) differs between master and 
> RS, the RS's serverName can differ from the one stored in the ephemeral ZK 
> node. The email mentioned in HBASE-13753 
> (http://mail-archives.apache.org/mod_mbox/hbase-user/201505.mbox/%3CCANZDn9ueFEEuZMx=pZdmtLsdGLyZz=rrm1N6EQvLswYc1z-H=g...@mail.gmail.com%3E)
>  is exactly what happened in our production env. 
> But what the email didn't point out is that the difference between the serverName 
> in the RS and in the ZK node can cause SSH to stop working, as we can see from 
> the code in {{RegionServerTracker}}
> {code}
>   @Override
>   public void nodeDeleted(String path) {
>     if (path.startsWith(watcher.rsZNode)) {
>       String serverName = ZKUtil.getNodeName(path);
>       LOG.info("RegionServer ephemeral node deleted, processing expiration [" +
>         serverName + "]");
>       ServerName sn = ServerName.parseServerName(serverName);
>       if (!serverManager.isServerOnline(sn)) {
>         LOG.warn(serverName.toString() + " is not online or isn't known to the master."+
>          "The latter could be caused by a DNS misconfiguration.");
>         return;
>       }
>       remove(sn);
>       this.serverManager.expireServer(sn);
>     }
>   }
> {code}
> The server will not be processed by SSH/ServerCrashProcedure. The regions on 
> this server will not be assigned again until master restart or failover.
> I know HBASE-9593 was meant to fix the case where an RS reports for duty and 
> crashes before it can put up a ZK node. That is a very rare case (and 
> controllable: just fix the bug that makes the RS crash). But the issue I 
> mentioned can happen more often (and is uncontrollable: it can't be fixed in 
> HBase, being due to DNS, hosts config, etc.) and has more severe consequences.
> So here I offer some solutions to discuss:
> 1. Revert HBASE-9593 from all branches; Andrew Purtell has reverted it in 
> branch-0.98
> 2. Abort the RS if the master returns a different name, otherwise SSH can't work 
> properly
> 3. The master accepts whatever servername is reported by the RS and doesn't change it.
> 4. Correct the ZK node if the master returns another name (idea from Ted Yu)
>  
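
The effect of the guard in nodeDeleted can be shown with a toy model. This is only a sketch: server names are plain strings, the host names are made up, and the class and its methods are hypothetical rather than the real RegionServerTracker or ServerManager.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model; hypothetical names, with server names reduced to plain strings. */
final class ServerNameMismatchSketch {
  private final Set<String> onlineServers = new HashSet<>();
  private final Set<String> expiredServers = new HashSet<>();

  /** The master records the region server under the name the master resolved. */
  void regionServerReport(String nameSeenByMaster) {
    onlineServers.add(nameSeenByMaster);
  }

  /** Mirrors the guard quoted above: an unknown name is skipped, so nothing expires. */
  boolean nodeDeleted(String nameInZkNode) {
    if (!onlineServers.contains(nameInZkNode)) {
      return false; // no expiration, so no crash processing is scheduled
    }
    onlineServers.remove(nameInZkNode);
    expiredServers.add(nameInZkNode);
    return true;
  }

  boolean isExpired(String serverName) {
    return expiredServers.contains(serverName);
  }

  public static void main(String[] args) {
    ServerNameMismatchSketch master = new ServerNameMismatchSketch();
    master.regionServerReport("rs1.example.com,16020,1"); // name as the master sees it
    boolean processed = master.nodeDeleted("rs1,16020,1"); // name the RS wrote to ZK
    System.out.println("crash processed: " + processed);
  }
}
```

In this model the server stays in the online set indefinitely, which corresponds to regions not being reassigned until the master restarts or fails over.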





[jira] [Updated] (HBASE-18786) FileNotFoundException should not be silently handled for primary region replicas

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18786:
---
Fix Version/s: 1.3.3

> FileNotFoundException should not be silently handled for primary region 
> replicas
> 
>
> Key: HBASE-18786
> URL: https://issues.apache.org/jira/browse/HBASE-18786
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Ashu Pachauri
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18786-branch-1.3.patch, 
> HBASE-18786-branch-1.patch, HBASE-18786-branch-1.patch, HBASE-18786.patch, 
> HBASE-18786.patch
>
>
> This is a follow up for HBASE-18186.
> FileNotFoundException while scanning from a primary region replica can be 
> indicative of a more severe problem. Handling them silently can cause many 
> underlying issues to go undetected. We should either
> 1. Hard fail the regionserver if there is a FNFE on a primary region replica, 
> OR
> 2. Report these exceptions as some region / server level metric so that these 
> can be proactively investigated.





[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718349#comment-16718349
 ] 

Hadoop QA commented on HBASE-21564:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 35s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}257m 21s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
15s{color} | {color:green} hbase-backup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}325m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | 
hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleAsyncWAL |
|   | hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleWAL |
|   | hadoop.hbase.replication.TestReplicationEndpoint |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21564 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951415/HBASE-21564.master.004.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c851628ebf29 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 

[jira] [Updated] (HBASE-18248) Warn if monitored RPC task has been tied up beyond a configurable threshold

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18248:
---
Fix Version/s: 1.3.3

> Warn if monitored RPC task has been tied up beyond a configurable threshold
> ---
>
> Key: HBASE-18248
> URL: https://issues.apache.org/jira/browse/HBASE-18248
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18248-branch-1.patch, HBASE-18248-branch-1.patch, 
> HBASE-18248.patch, HBASE-18248.patch
>
>
> Warn if monitored task has been tied up beyond a configurable threshold. We 
> especially want to do this for RPC tasks. Use a separate threshold for 
> warning about stuck RPC tasks versus other types of tasks.
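
A minimal sketch of the two-threshold check described above. The class name, the constants and their values are hypothetical; the issue does not give the real configuration keys or defaults.

```java
final class StuckTaskCheck {
    // Hypothetical thresholds, chosen only for illustration.
    static final long RPC_WARN_MS = 10_000;
    static final long OTHER_WARN_MS = 60_000;

    /** True if a task has been running longer than the threshold for its type. */
    static boolean shouldWarn(boolean isRpcTask, long startTimeMs, long nowMs) {
        long threshold = isRpcTask ? RPC_WARN_MS : OTHER_WARN_MS;
        return nowMs - startTimeMs > threshold;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // A task that started 30 seconds ago: stuck by the RPC threshold, not by the general one.
        System.out.println("rpc task:   " + shouldWarn(true, now - 30_000, now));
        System.out.println("other task: " + shouldWarn(false, now - 30_000, now));
    }
}
```

Keeping the threshold lookup in one place means a monitor loop only has to know the task type and its start time.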





[jira] [Updated] (HBASE-15134) Add visibility into Flush and Compaction queues

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-15134:
---
Fix Version/s: 1.3.3

> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
>Priority: Major
> Fix For: 1.4.0, 2.0.0-alpha-2, 1.3.3, 2.0.0
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> During busy spurts, regionservers can build up large compaction queues. 
> It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> The same goes for flushes. There can be flushes in the queue that aren't being 
> run because of delayed flushes. There's no way to know from the metrics how 
> many flushes are queued for each region or how many are delayed.
> We should either add more metrics around this (num per region, max per 
> region, min per region) or add a UI page that lists the queued compactions 
> and flushes.
> Or both.
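
The per-region numbers proposed above (num, max and min per region) could be derived from a queue snapshot roughly as follows. This is only an illustrative sketch: `QueueStats`, its methods and the region names are hypothetical and not part of HBase.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class QueueStats {
    /** Counts queued requests per region name. */
    static Map<String, Integer> perRegion(List<String> queuedRegionNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String region : queuedRegionNames) {
            counts.merge(region, 1, Integer::sum);
        }
        return counts;
    }

    /** Min, max and count statistics over the per-region counts. */
    static IntSummaryStatistics summary(Map<String, Integer> counts) {
        return counts.values().stream().mapToInt(Integer::intValue).summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical queue contents: one entry per queued compaction.
        List<String> queue = Arrays.asList("region-a", "region-b", "region-a", "region-a");
        Map<String, Integer> counts = perRegion(queue);
        IntSummaryStatistics stats = summary(counts);
        System.out.println(counts + " max=" + stats.getMax() + " min=" + stats.getMin());
    }
}
```

A high max with a low min points at many requests for one region; uniformly high counts suggest the server is simply falling behind.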





[jira] [Commented] (HBASE-21587) Improve support for protobuf 3

2018-12-11 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718345#comment-16718345
 ] 

Duo Zhang commented on HBASE-21587:
---

We have already shaded a patched protobuf-java for hbase-2.x.

And for hbase-1.x, I do not know if there is a plan to upgrade the protobuf 
dependency. [~andrew.purt...@gmail.com].

Or maybe the simplest way is, as [~sduskis] said above, to try the shaded 
hbase client.

> Improve support for protobuf 3
> --
>
> Key: HBASE-21587
> URL: https://issues.apache.org/jira/browse/HBASE-21587
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mya Pitzeruse
>Priority: Major
>
> {{HBaseZeroCopyByteString}} extends {{LiteralByteString}} which was removed 
> in protobuf 3. The class was marked as package private, so hbase needed to do 
> a package trick to get to the underlying class.
> [https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/com/google/protobuf/HBaseZeroCopyByteString.java#L18]
> ejona86 references this problem in grpc-java:
> [https://github.com/grpc/grpc-java/issues/3035#issuecomment-360851817]
> The {{HBaseZeroCopyByteString}} class appears to only be used by 
> {{ByteStringer}} class.
> [https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java#L42-L49]
> I think a simple change can be made to the {{ByteStringer}} class to support 
> both proto2 and proto3.
> Proto3 offers an {{UnsafeByteOperations}} class that can be used in place of 
> the {{HBaseZeroCopyByteString}} class.
> [https://github.com/protocolbuffers/protobuf/blob/master/java/core/src/main/java/com/google/protobuf/UnsafeByteOperations.java#L97]
>  





[jira] [Updated] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection

2018-12-11 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21538:
--
Attachment: HBASE-21538-HBASE-21512-v2.patch

> Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
> ---
>
> Key: HBASE-21538
> URL: https://issues.apache.org/jira/browse/HBASE-21538
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-21538-HBASE-21512-v1.patch, 
> HBASE-21538-HBASE-21512-v2.patch, HBASE-21538-HBASE-21512.patch
>
>






[jira] [Updated] (HBASE-17991) Add more details about compaction queue on /dump

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17991:
---
Fix Version/s: 1.3.3

> Add more details about compaction queue on /dump
> 
>
> Key: HBASE-17991
> URL: https://issues.apache.org/jira/browse/HBASE-17991
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Minor
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17991-master-v1.patch, 
> HBASE-17991-master-v2.patch, HBASE-17991-master-v3.patch
>
>
> The RS dump information currently looks as follows:
> {code}
> RS Queue:
> ===
> Compaction/Split Queue summary: compaction_queue=(20:0), split_queue=0, merge_queue=0
> Compaction/Split Queue dump:
>   LargeCompation Queue:
>     Request = regionName=usertable,user4180497275766179957,1491904095205.1d4085e4438752f3611f66e7b043fe44., storeName=0, fileCount=1, fileSize=1.9 G (1.9 G), priority=1, time=21697920409804647
>     Request = regionName=usertable,user4568009753557153251,1491904099262.95bf004e3c9b35a58c60ca5d5b11d190., storeName=0, fileCount=1, fileSize=1.9 G (1.9 G), priority=1, time=21697920413223800
>   SmallCompation Queue:
>     Store = b, pri = 108
>     Store = b, pri = 108
>     Store = b, pri = 108
>     Store = b, pri = 108
>     Store = b, pri = 108
>     Store = b, pri = 109
> {code}
> Compaction queue information is displayed on the /dump page.
> If a compaction has already selected its files, the dump prints the details of 
> the compaction; otherwise it prints only the store name and priority (Store = b, 
> pri = 108), which is not useful.
> So we should also print more detailed information, such as regionName, 
> starttime, etc.





[jira] [Commented] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718271#comment-16718271
 ] 

Hadoop QA commented on HBASE-21586:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
21s{color} | {color:red} hbase-common: The patch generated 4 new + 140 
unchanged - 0 fixed = 144 total (was 140) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
29s{color} | {color:red} hbase-client: The patch generated 5 new + 126 
unchanged - 2 fixed = 131 total (was 128) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} hbase-zookeeper: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
1s{color} | {color:red} hbase-server: The patch generated 6 new + 23 unchanged 
- 1 fixed = 29 total (was 24) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} hbase-mapreduce: The patch generated 3 new + 16 
unchanged - 0 fixed = 19 total (was 16) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hbase-rest: The patch generated 1 new + 21 unchanged - 
0 fixed = 22 total (was 21) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 18s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
44s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
15s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit 

[jira] [Commented] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718265#comment-16718265
 ] 

Hadoop QA commented on HBASE-21575:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
50s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}260m 12s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}296m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.replication.TestSyncReplicationStandbyKillRS |
|   | hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas |
|   | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21575 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951409/HBASE-21575.01.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 0aaea8383c73 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 67d6d5084c |
| maven | version: Apache Maven 3.5.4 

[jira] [Updated] (HBASE-18058) Zookeeper retry sleep time should have an upper limit

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18058:
---
Fix Version/s: 1.3.3

> Zookeeper retry sleep time should have an upper limit
> -
>
> Key: HBASE-18058
> URL: https://issues.apache.org/jira/browse/HBASE-18058
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18058-branch-1.patch, 
> HBASE-18058-branch-1.v2.patch, HBASE-18058-branch-1.v3.patch, 
> HBASE-18058.patch, HBASE-18058.v2.patch
>
>
> Currently, in {{RecoverableZooKeeper}}, the retry backoff sleep time grows 
> exponentially, but it has no upper limit. This directly leads to a very 
> long recovery time after ZooKeeper goes down for a while and comes back.
> An example of the damage done by a high sleep time:
> If the disk of the server hosting ZooKeeper is full, the ZooKeeper quorum does 
> not actually go down, but it rejects all write requests. So on the HBase side, 
> new zk write requests fail with exceptions and are retried. But the connection 
> remains, so the session does not time out. Once the disk-full situation has been 
> resolved, the ZooKeeper quorum can work normally again. But the very high sleep 
> time causes some modules of the RegionServer/HMaster (for example, the balancer) 
> to keep sleeping for a long time before they resume working.
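
The idea of capping the exponential backoff can be sketched as below. This is not the code from the attached patches: `CappedBackoff`, `sleepTimeMs` and the 1 s / 60 s values are hypothetical and chosen only for illustration.

```java
final class CappedBackoff {
    /**
     * Sleep time in ms for a zero-based retry attempt: doubles per attempt
     * but never exceeds maxSleepMs. Assumes maxSleepMs is well below
     * Long.MAX_VALUE / 2 so the doubling cannot overflow.
     */
    static long sleepTimeMs(long baseSleepMs, int attempt, long maxSleepMs) {
        long sleep = baseSleepMs;
        // Stop doubling as soon as the cap is reached.
        for (int i = 0; i < attempt && sleep < maxSleepMs; i++) {
            sleep *= 2;
        }
        return Math.min(sleep, maxSleepMs);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 10; attempt++) {
            System.out.println("attempt " + attempt + " -> "
                + sleepTimeMs(1000, attempt, 60_000) + " ms");
        }
    }
}
```

With a cap in place, a caller that has been retrying for a long time wakes up at most one cap interval after ZooKeeper accepts writes again.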





[jira] [Updated] (HBASE-17924) Consider sorting the row order when processing multi() ops before taking rowlocks

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17924:
---
Fix Version/s: 1.3.3

> Consider sorting the row order when processing multi() ops before taking 
> rowlocks
> -
>
> Key: HBASE-17924
> URL: https://issues.apache.org/jira/browse/HBASE-17924
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.8, 2.0.0
>Reporter: Andrew Purtell
>Assignee: Allan Yang
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17924.patch, HBASE-17924.v0.patch, 
> HBASE-17924.v2.patch, HBASE-17924.v3.patch, HBASE-17924.v4.patch, 
> HBASE-17924.v5.patch
>
>
> When processing a batch mutation, we take row locks in whatever order the 
> mutations were added to the multi op by the client.
>  
> {noformat}
> RSRpcServices#multi -> RSRpcServices#mutateRows -> HRegion#mutateRow -> 
> HRegion#mutateRowsWithLocks -> HRegion#processRowsWithLocks
> {noformat}
> Or
> {noformat}
> RSRpcServices#multi -> RSRpcServices#doNonAtomicRegionMutation ->
>   HRegion#get 
> | HRegion#append 
> | HRegion#increment 
> | HRegionServer#doBatchOp -> HRegion#batchMutate -> 
> HRegion#doMiniBatchMutation
> {noformat}
>  
> multi() is fed by client APIs that accept a RowMutations object containing 
> actions for multiple rows. The container for ops inside RowMutations is an 
> ArrayList, which doesn't change the ordering of objects added to it. The 
> protobuf implementation of the messages for multi ops does not reorder the list 
> of actions. When processing multi ops we iterate over the actions in the 
> order rehydrated from protobuf.
> We should discuss sorting the order of ops by row key when processing multi() 
> ops before taking row locks. Does this make lock ordering more predictable 
> for server side operations? Yes, but potentially surprising for the client, 
> right? Is there any legitimate reason we should take locks out of row key 
> sorted order because the client has structured the request as such?
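
To make the proposal concrete, here is a small sketch of deriving a row-key-sorted lock order from a batch. It is only an illustration: `SortedRowLockOrder` and `lockOrder` are hypothetical names, and the real change would live in the HRegion code paths listed above. The comparator mirrors the unsigned lexicographic order HBase uses for row keys. As general background (not stated in the issue), taking locks in one fixed global order is a common way to keep two batches that touch the same rows from waiting on each other in opposite orders.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class SortedRowLockOrder {
    /** Unsigned lexicographic byte comparison, shorter key first on a common prefix. */
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    /** Returns the distinct rows of a batch in the order their locks would be taken. */
    static List<byte[]> lockOrder(List<byte[]> rowsInClientOrder) {
        TreeSet<byte[]> sorted = new TreeSet<>(ROW_ORDER);
        sorted.addAll(rowsInClientOrder);
        return new ArrayList<>(sorted);
    }

    public static void main(String[] args) {
        List<byte[]> batch = Arrays.asList(
            "row-c".getBytes(StandardCharsets.UTF_8),
            "row-a".getBytes(StandardCharsets.UTF_8),
            "row-b".getBytes(StandardCharsets.UTF_8),
            "row-a".getBytes(StandardCharsets.UTF_8));
        for (byte[] row : lockOrder(batch)) {
            System.out.println(new String(row, StandardCharsets.UTF_8));
        }
    }
}
```

Note that this only changes the order in which locks are acquired; whether the mutations themselves should still be applied in client order is the open question raised in the issue.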





[jira] [Updated] (HBASE-17118) StoreScanner leaked in KeyValueHeap

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17118:
---
Fix Version/s: 1.3.3

> StoreScanner leaked in KeyValueHeap
> ---
>
> Key: HBASE-17118
> URL: https://issues.apache.org/jira/browse/HBASE-17118
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.1.7, 1.2.4
>Reporter: binlijin
>Assignee: binlijin
>Priority: Major
> Fix For: 1.4.0, 1.2.5, 1.1.8, 1.3.3, 2.0.0
>
> Attachments: HBASE-17118-master_v1.patch, 
> HBASE-17118-master_v2.patch, HBASE-17118-master_v3.patch, 
> HBASE-17118-master_v4.patch, HBASE-17118-master_v5.patch, 
> HBASE-17118.branch-1.1.v1.patch, HBASE-17118.branch-1.2.v1.patch, 
> HBASE-17118.branch-1.addnumv1.patch, HBASE-17118.branch-1.v1.patch, 
> StoreScanner.png, StoreScannerLeakHeap.png
>
>
> KeyValueHeap#generalizedSeek
>   KeyValueScanner scanner = current;
>   while (scanner != null) {
>     Cell topKey = scanner.peek();
>     ..
>     boolean seekResult;
>     if (isLazy && heap.size() > 0) {
>       // If there is only one scanner left, we don't do lazy seek.
>       seekResult = scanner.requestSeek(seekKey, forward, useBloom);
>     } else {
>       seekResult = NonLazyKeyValueScanner.doRealSeek(scanner, seekKey,
>           forward);
>     }
>     ..
>     scanner = heap.poll();
>   }
> (1) scanner = heap.poll(); retrieves and removes the head of the queue, so the 
> heap no longer holds a reference to the scanner.
> (2) If scanner.requestSeek(seekKey, forward, useBloom) or 
> NonLazyKeyValueScanner.doRealSeek(scanner, seekKey, forward) then 
> throws an exception, the scanner never gets a chance to be closed, which causes 
> the scanner leak.
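
The leak pattern and one way to avoid it can be sketched in isolation. This is a simplified illustration, not the actual patch: `Scanner` is a hypothetical stand-in for `KeyValueScanner`, and `seekFirstMatching` omits most of what `generalizedSeek` does.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Queue;

final class SeekWithCleanup {
    /** Hypothetical stand-in for KeyValueScanner; only what the sketch needs. */
    interface Scanner {
        boolean seek() throws IOException;
        void close();
    }

    /** Seeks scanners polled from the heap; a scanner that throws is closed before rethrowing. */
    static Scanner seekFirstMatching(Queue<Scanner> heap) throws IOException {
        Scanner scanner;
        while ((scanner = heap.poll()) != null) {
            // The scanner has left the heap, so closing the heap later will not reach it.
            try {
                if (scanner.seek()) {
                    return scanner;
                }
            } catch (IOException e) {
                scanner.close();
                throw e;
            }
            scanner.close(); // seek found nothing, so this scanner is done
        }
        return null;
    }

    public static void main(String[] args) {
        Queue<Scanner> heap = new ArrayDeque<>();
        heap.add(new Scanner() {
            @Override public boolean seek() throws IOException { throw new IOException("seek failed"); }
            @Override public void close() { System.out.println("scanner closed"); }
        });
        try {
            seekFirstMatching(heap);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The key point is that ownership of the scanner moves to the local variable at `poll()`, so the code that polled it is responsible for closing it on every exit path.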





[jira] [Updated] (HBASE-17718) Difference between RS's servername and its ephemeral node cause SSH stop working

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17718:
---
Fix Version/s: 1.3.3

> Difference between RS's servername and its ephemeral node cause SSH stop 
> working
> 
>
> Key: HBASE-17718
> URL: https://issues.apache.org/jira/browse/HBASE-17718
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.4, 1.1.8, 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 0001-HBASE-17718-amendment.patch, 
> HBASE-17718.branch-1.001.patch, HBASE-17718.branch-1.002.patch, 
> HBASE-17718.branch-1.003.patch, HBASE-17718.master.001.patch, 
> HBASE-17718.master.002.patch, HBASE-17718.master.003.patch
>
>
> After HBASE-9593, the RS puts up an ephemeral node in ZK before reporting for 
> duty. But if the hosts config (/etc/hosts) differs between the master and the 
> RS, the RS's serverName can be different from the one stored in the ephemeral 
> zk node. The email mentioned in HBASE-13753 
> (http://mail-archives.apache.org/mod_mbox/hbase-user/201505.mbox/%3CCANZDn9ueFEEuZMx=pZdmtLsdGLyZz=rrm1N6EQvLswYc1z-H=g...@mail.gmail.com%3E)
>  is exactly what happened in our production env. 
> But what the email didn't point out is that the difference between the 
> serverName in the RS and in the zk node can cause SSH to stop working, as we 
> can see from the code in {{RegionServerTracker}}
> {code}
>   @Override
>   public void nodeDeleted(String path) {
>     if (path.startsWith(watcher.rsZNode)) {
>       String serverName = ZKUtil.getNodeName(path);
>       LOG.info("RegionServer ephemeral node deleted, processing expiration [" +
>         serverName + "]");
>       ServerName sn = ServerName.parseServerName(serverName);
>       if (!serverManager.isServerOnline(sn)) {
>         LOG.warn(serverName.toString() + " is not online or isn't known to the master."+
>          "The latter could be caused by a DNS misconfiguration.");
>         return;
>       }
>       remove(sn);
>       this.serverManager.expireServer(sn);
>     }
>   }
> {code}
> The server will not be processed by SSH/ServerCrashProcedure. The regions on 
> this server will not be assigned again until the master restarts or fails over.
> I know HBASE-9593 was meant to fix the case where the RS reports for duty and 
> crashes before it can put up a zk node. That is a very rare case (and 
> controllable: just fix the bug that makes the RS crash). But the issue I 
> mentioned can happen more often (and is uncontrollable: it can't be fixed in 
> HBase, since it is due to DNS, hosts config, etc.) and has more severe 
> consequences.
> So here I offer some solutions to discuss:
> 1. Revert HBASE-9593 from all branches; Andrew Purtell has reverted it in 
> branch-0.98
> 2. Abort the RS if the master returns a different name, otherwise SSH can't 
> work properly
> 3. The master accepts whatever servername the RS reports and doesn't change it.
> 4. Correct the zk node if the master returns another name (idea from Ted Yu)
>  





[jira] [Updated] (HBASE-17731) Fractional latency reporting in MultiThreadedAction

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17731:
---
Fix Version/s: 1.3.3

> Fractional latency reporting in MultiThreadedAction
> ---
>
> Key: HBASE-17731
> URL: https://issues.apache.org/jira/browse/HBASE-17731
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17731.patch
>
>
> When average latency is less than one millisecond, LoadTestTool 
> reports a latency of 0. Better to report a fraction out to a couple of 
> decimal places. 
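
The effect is easy to reproduce with integer division. The sketch below is illustrative only: `FractionalLatency` and `averageLatencyMs` are hypothetical names, not the code in MultiThreadedAction.

```java
import java.util.Locale;

final class FractionalLatency {
    /** Average latency in ms, formatted with two decimal places. */
    static String averageLatencyMs(long totalLatencyMs, long operations) {
        if (operations == 0) {
            return "0.00";
        }
        // Convert to double before dividing so the fractional part survives.
        double average = (double) totalLatencyMs / operations;
        return String.format(Locale.ROOT, "%.2f", average);
    }

    public static void main(String[] args) {
        long totalMs = 400;
        long ops = 1000;
        System.out.println("integer division: " + (totalMs / ops));
        System.out.println("fractional:       " + averageLatencyMs(totalMs, ops));
    }
}
```

Using `Locale.ROOT` keeps the decimal separator a period regardless of the JVM's default locale, which matters if the output is parsed by scripts.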





[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718184#comment-16718184
 ] 

Hudson commented on HBASE-21568:


Results for branch branch-2
[build #1552 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1552/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1552//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1552//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1552//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
> Issue Type: Bug
> Components: Client
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>
> [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow 
> callers to specify that they do not want to use a block cache when reading an 
> HFile.
> If the BucketCache is set up to use the FileSystem, we can have a situation 
> where the client tries to instantiate the BucketCache and is disallowed due 
> to filesystem permissions:
> {code:java}
> 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: Failed allocating cache on /mnt/hbase/cache.data
> java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
>   at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382)
>   at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621)
>   at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> LoadIncrementalHFiles should provide {{CacheConfig.DISABLED}}.





[jira] [Updated] (HBASE-18830) TestCanaryTool does not check Canary monitor's error code

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18830:
---
Fix Version/s: 1.3.3

> TestCanaryTool does not check Canary monitor's error code
> -
>
> Key: HBASE-18830
> URL: https://issues.apache.org/jira/browse/HBASE-18830
> Project: HBase
> Issue Type: Bug
> Reporter: Chinmay Kulkarni
> Assignee: Chinmay Kulkarni
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18830.001.patch
>
>
> None of the tests inside TestCanaryTool checks the Canary monitor's error code.
> Thus, it is possible that the monitor has registered an error and yet the 
> tests pass. We should check the value returned by the _ToolRunner.run()_ 
> method inside each unit test.





[jira] [Updated] (HBASE-19551) hbck -boundaries doesn't work correctly

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19551:
---
Fix Version/s: 1.3.3
   1.5.0

> hbck -boundaries doesn't work correctly
> ---
>
> Key: HBASE-19551
> URL: https://issues.apache.org/jira/browse/HBASE-19551
> Project: HBase
> Issue Type: Bug
> Components: hbck
> Reporter: Toshihiro Suzuki
> Assignee: Toshihiro Suzuki
> Priority: Major
> Fix For: 1.4.1, 1.5.0, 1.3.3
>
> Attachments: HBASE-19551.branch-1.patch, HBASE-19551.patch
>
>
> Currently, in HBaseFsck#checkRegionBoundaries(), it seems like keys from 
> reader.getFirstKey() and reader.getLastKey() are directly compared by 
> ByteArrayComparator:
> https://github.com/apache/hbase/blob/9d0c7c6dfbcba0907cbbc2244eac570fcc4d58a5/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java#L864-L865
> https://github.com/apache/hbase/blob/9d0c7c6dfbcba0907cbbc2244eac570fcc4d58a5/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java#L869-L870
> This is not correct, because each key consists of a key length followed by
> the key itself. We should compare row keys, or only the key itself (with the
> key length removed).
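
To illustrate why comparing the raw bytes can give the wrong order, here is a small self-contained Java sketch. It assumes the serialized key starts with a 2-byte big-endian row length followed by the row bytes, which is my reading of the KeyValue key layout rather than something stated in this issue. All class and method names are hypothetical and do not come from HBaseFsck or the attached patches.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RowKeyCompare {
    /** Hypothetical helper: builds [2-byte row length][row][rest of key]. */
    static byte[] serializedKey(String row, String rest) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] t = rest.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(2 + r.length + t.length)
            .putShort((short) r.length).put(r).put(t).array();
    }

    /** Strips the assumed 2-byte length prefix and returns only the row bytes. */
    static byte[] extractRow(byte[] key) {
        int rowLength = ByteBuffer.wrap(key).getShort() & 0xFFFF;
        return Arrays.copyOfRange(key, 2, 2 + rowLength);
    }

    /** Unsigned lexicographic byte comparison. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] k1 = serializedKey("b", "cf:q");  // row "b", length prefix 1
        byte[] k2 = serializedKey("aa", "cf:q"); // row "aa", length prefix 2
        // Raw comparison is decided by the length prefix, not by the row.
        System.out.println("raw keys: " + Integer.signum(compareBytes(k1, k2)));
        // Comparing the extracted rows gives the row-key order.
        System.out.println("row keys: "
            + Integer.signum(compareBytes(extractRow(k1), extractRow(k2))));
    }
}
```

With these two keys the raw comparison orders row "b" before row "aa" purely because of the shorter length prefix, while the row comparison orders them the other way.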





[jira] [Updated] (HBASE-17930) Avoid using Canary.sniff in HBaseTestingUtility

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17930:
---
Fix Version/s: 1.3.3

> Avoid using Canary.sniff in HBaseTestingUtility
> ---
>
> Key: HBASE-17930
> URL: https://issues.apache.org/jira/browse/HBASE-17930
> Project: HBase
> Issue Type: Bug
> Components: canary, test
> Affects Versions: 1.4.0, 2.0.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Labels: trivial
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17930.patch
>
>
> It will make it easier to rewrite Canary with async client in the future.
> And it is also very easy to write a simple sniff method for 
> HBaseTestingUtility.





[jira] [Updated] (HBASE-18762) Canary sink type cast error

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18762:
---
Fix Version/s: 1.3.3

> Canary sink type cast error
> ---
>
> Key: HBASE-18762
> URL: https://issues.apache.org/jira/browse/HBASE-18762
> Project: HBase
> Issue Type: Bug
> Reporter: Chinmay Kulkarni
> Assignee: Chinmay Kulkarni
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18762.001.patch, HBASE-18830-addendum.patch
>
>
>  When running the main method of Canary.java, we see the following error:
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.tool.Canary$RegionServerStdOutSink cannot be cast to 
> org.apache.hadoop.hbase.tool.Canary$RegionStdOutSink
>   at org.apache.hadoop.hbase.tool.Canary.newMonitor(Canary.java:911)
>   at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:796)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1571)
> This happens because we typecast the sink depending on the mode (zookeeper 
> mode/region server mode) that Canary is configured in. In case no mode is 
> specified, we typecast the sink into _RegionStdOutSink_. In general, it is 
> possible to provide inconsistent mode and sink types while running Canary. 
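
One way to avoid a raw ClassCastException like the one above is to validate the sink type before casting and fail with a descriptive message. The sketch below uses local stand-in types that only borrow the names from the stack trace. It is not Canary's actual code, and `requireSink` is a hypothetical helper.

```java
class SinkCastSketch {
    // Local stand-ins for the sink types named in the stack trace.
    interface Sink {}
    static class RegionStdOutSink implements Sink {}
    static class RegionServerStdOutSink implements Sink {}

    /**
     * Hypothetical helper: returns the sink as the type required by the
     * selected mode, or throws a descriptive error if the two disagree.
     */
    static <T extends Sink> T requireSink(Sink sink, Class<T> expected) {
        if (!expected.isInstance(sink)) {
            String actual = (sink == null) ? "null" : sink.getClass().getSimpleName();
            throw new IllegalArgumentException(
                "Configured sink " + actual + " is not a " + expected.getSimpleName());
        }
        return expected.cast(sink);
    }

    public static void main(String[] args) {
        Sink sink = new RegionServerStdOutSink();
        try {
            // Region mode expects a RegionStdOutSink, so this is rejected.
            requireSink(sink, RegionStdOutSink.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The mismatch is still an error, but it is reported as a configuration problem rather than surfacing as an unexplained cast failure.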





[jira] [Updated] (HBASE-17959) Canary timeout should be configurable on a per-table basis

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17959:
---
Fix Version/s: 1.3.3

> Canary timeout should be configurable on a per-table basis
> --
>
> Key: HBASE-17959
> URL: https://issues.apache.org/jira/browse/HBASE-17959
> Project: HBase
> Issue Type: Improvement
> Components: canary
> Reporter: Andrew Purtell
> Assignee: Chinmay Kulkarni
> Priority: Minor
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17959-branch-1.patch, HBASE-17959.002.patch, 
> HBASE-17959.003.patch, HBASE-17959.004.patch, HBASE-17959.patch
>
>
> The Canary read and write timeouts should be configurable on a per-table 
> basis, for cases where different tables have different latency SLAs. 





[jira] [Updated] (HBASE-17965) Canary tool should print the regionserver name on failure

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17965:
---
Fix Version/s: 1.3.3

> Canary tool should print the regionserver name on failure
> -
>
> Key: HBASE-17965
> URL: https://issues.apache.org/jira/browse/HBASE-17965
> Project: HBase
> Issue Type: Task
> Reporter: churro morales
> Assignee: Karan Mehta
> Priority: Minor
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17965-branch-1.patch, HBASE-17965.001.patch
>
>
> It would be nice when we have a canary failure for a region to print the 
> associated regionserver's name in the log as well. 





[jira] [Updated] (HBASE-16399) Provide an API to get list of failed regions and servername in Canary

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16399:
---
Fix Version/s: 1.3.3

> Provide an API to get list of failed regions and servername in Canary
> -
>
> Key: HBASE-16399
> URL: https://issues.apache.org/jira/browse/HBASE-16399
> Project: HBase
> Issue Type: Improvement
> Components: canary
> Affects Versions: 1.3.1, 0.98.21
> Reporter: Vishal Khandelwal
> Assignee: Vishal Khandelwal
> Priority: Major
> Fix For: 1.4.0, 0.98.22, 1.3.3, 2.0.0
>
> Attachments: HBASE-16399.0.98.00.patch, HBASE-16399.0.98.01.patch, 
> HBASE-16399.0.98.02.patch, HBASE-16399.00.patch, HBASE-16399.01.patch, 
> HBASE-16399.02.patch, HBASE-16399.03.patch, HBASE-16399.branch-1.00.patch, 
> HBASE-16399.branch-1.01.patch, HBASE-16399.branch-1.02.patch, 
> HBASE-16399.branch-1.03.patch
>
>
> At present the HBase Canary tool only prints failures as part of its logs. It
> does not provide an API to get the list of failures or a summary of them, so a
> caller cannot take action on the failed host. This Jira would add an
> additional API so that a caller can get read or write canary failures.





[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718101#comment-16718101
 ] 

Hadoop QA commented on HBASE-21577:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 14s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 15s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 31s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}135m 23s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}175m 30s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21577 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951407/HBASE-21577.master.002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e73aa808f97e 4.4.0-131-generic #157~14.04.1-Ubuntu SMP Fri Jul 13 08:53:17 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 67d6d5084c |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15249/testReport/ |
| Max. process+thread count | 4917 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Updated] (HBASE-17930) Avoid using Canary.sniff in HBaseTestingUtility

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17930:
---
Fix Version/s: (was: 1.3.3)

> Avoid using Canary.sniff in HBaseTestingUtility
> ---
>
> Key: HBASE-17930
> URL: https://issues.apache.org/jira/browse/HBASE-17930
> Project: HBase
> Issue Type: Bug
> Components: canary, test
> Affects Versions: 1.4.0, 2.0.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Labels: trivial
> Fix For: 1.4.0, 2.0.0
>
> Attachments: HBASE-17930.patch
>
>
> It will make it easier to rewrite Canary with async client in the future.
> And it is also very easy to write a simple sniff method for 
> HBaseTestingUtility.





[jira] [Updated] (HBASE-16091) Canary takes lot more time when there are delete markers in the table

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16091:
---
Fix Version/s: 1.3.3

> Canary takes lot more time when there are delete markers in the table
> -
>
> Key: HBASE-16091
> URL: https://issues.apache.org/jira/browse/HBASE-16091
> Project: HBase
> Issue Type: Bug
> Components: canary
> Affects Versions: 2.0.0
> Reporter: Vishal Khandelwal
> Assignee: Vishal Khandelwal
> Priority: Major
> Fix For: 1.4.0, 0.98.21, 1.3.3, 2.0.0
>
> Attachments: HBASE-16091.00.patch, HBASE-16091.01.patch, 
> HBASE-16091.02.patch
>
>
> We have a table with a lot of delete markers, and we run the Canary test at
> a regular interval. Sometimes tests time out because reading the first
> row has to skip all of these delete markers. Since the purpose of Canary is
> to check the health of the region, I think setting raw=true would not defeat
> that purpose and would provide a good performance improvement.
> The following is an example of one such scan.
> Without changing the code, it took 62.3 sec for one region scan:
> 2016-06-23 08:49:11,670 INFO  [pool-2-thread-1] tool.Canary - read from 
> region  . column family 0 in 62338ms
> whereas after setting raw=true, it dropped to 58ms:
> 2016-06-23 08:45:20,259 INFO  [pool-2-thread-1] tests.Canary - read from 
> region . column family 0 in 58ms
> Applying this across multiple tables with multiple regions would be a good
> performance gain.





[jira] [Updated] (HBASE-17912) Avoid major compactions on region server startup

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17912:
---
Fix Version/s: 1.3.3

> Avoid major compactions on region server startup
> 
>
> Key: HBASE-17912
> URL: https://issues.apache.org/jira/browse/HBASE-17912
> Project: HBase
> Issue Type: Improvement
> Components: Compaction
> Affects Versions: 1.3.1, 0.98.24, 2.0.0
> Reporter: Geoffrey Jacoby
> Assignee: Geoffrey Jacoby
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17912.patch
>
>
> The HRegionServer.CompactionChecker chore wakes up every 10s and, for each
> store in each region, mods its iteration count against a chore frequency (by
> default slightly under 3 hours) to see if it's time to check whether a major
> compaction is necessary for that store.
> Whether to run that check is decided by
> if (iteration % multiplier != 0) continue;
> where iteration is the number of times the chore has woken up.
> Because 0 % anything is 0, this will always check for necessary major 
> compactions on each store when this chore is first run after the region 
> server starts up. This can result in compaction storms when doing a rolling 
> restart, because, for example, the new instance of the region server might 
> get a lower jitter value than the old one had.
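
The arithmetic behind this is easy to reproduce in isolation. The sketch below is not the CompactionChecker code: the class, the method names and the multiplier value are made up for illustration, and starting the counter at 1 is shown only as one possible adjustment, not necessarily what the attached patch does.

```java
class CompactionCheckSketch {
    /** Mirrors the quoted condition: the check runs when the remainder is 0. */
    static boolean shouldCheck(long iteration, long multiplier) {
        return iteration % multiplier == 0;
    }

    /** First iteration at or after startIteration on which the check runs. */
    static long firstCheckIteration(long startIteration, long multiplier) {
        long iteration = startIteration;
        while (!shouldCheck(iteration, multiplier)) {
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        long multiplier = 1000; // hypothetical value, for illustration only
        // Counter starting at 0: the check fires on the very first wake-up.
        System.out.println("start at 0 -> first check at iteration "
            + firstCheckIteration(0, multiplier));
        // Counter starting at 1: the first check waits a full period.
        System.out.println("start at 1 -> first check at iteration "
            + firstCheckIteration(1, multiplier));
    }
}
```

With a counter that starts at 0, every restarted region server passes the modulo test on its first wake-up, which is the startup behavior the description warns about.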





[jira] [Updated] (HBASE-17930) Avoid using Canary.sniff in HBaseTestingUtility

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17930:
---
Fix Version/s: 1.3.3

> Avoid using Canary.sniff in HBaseTestingUtility
> ---
>
> Key: HBASE-17930
> URL: https://issues.apache.org/jira/browse/HBASE-17930
> Project: HBase
> Issue Type: Bug
> Components: canary, test
> Affects Versions: 1.4.0, 2.0.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Labels: trivial
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17930.patch
>
>
> It will make it easier to rewrite Canary with async client in the future.
> And it is also very easy to write a simple sniff method for 
> HBaseTestingUtility.





[jira] [Updated] (HBASE-17816) HRegion#mutateRowWithLocks should update writeRequestCount metric

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17816:
---
Fix Version/s: 1.3.3

> HRegion#mutateRowWithLocks should update writeRequestCount metric
> -
>
> Key: HBASE-17816
> URL: https://issues.apache.org/jira/browse/HBASE-17816
> Project: HBase
> Issue Type: Bug
> Components: metrics
> Reporter: Ashu Pachauri
> Assignee: Weizhan Zeng
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17816.branch-1.patch, 
> HBASE-17816.master.001.patch, HBASE-17816.master.002.patch
>
>
> Currently, none of the calls that go through HRegion#mutateRowWithLocks
> update the writeRequestCount metric. The base mutateRowWithLocks method
> should update the metric.
> Examples are checkAndMutate calls through RSRpcServices#multi, the
> Region#mutateRow API, and the MultiRowMutationProcessor coprocessor endpoint.





[jira] [Updated] (HBASE-15727) Canary Tool for Zookeeper

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-15727:
---
Fix Version/s: 1.3.3

> Canary Tool for Zookeeper
> -
>
> Key: HBASE-15727
> URL: https://issues.apache.org/jira/browse/HBASE-15727
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: churro morales
> Assignee: churro morales
> Priority: Major
> Fix For: 1.4.0, 0.98.20, 1.3.3, 2.0.0
>
> Attachments: HBASE-15727-v1.patch, HBASE-15727-v2.patch, 
> HBASE-15727-v3.patch, HBASE-15727.patch, HBASE-15727.v4.patch, 
> HBASE-15727.v5.patch
>
>
> It would be nice to have the canary tool also monitor ZooKeeper, with
> something simple like a getData() call on zookeeper.znode.parent.
> It would also be nice to create clients for every instance in the quorum, so
> that you could monitor overloaded or poorly behaving instances.





[jira] [Updated] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC; purge ThreadLocal usage

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17072:
---
Fix Version/s: 1.3.3

> CPU usage starts to climb up to 90-100% when using G1GC; purge ThreadLocal 
> usage
> 
>
> Key: HBASE-17072
> URL: https://issues.apache.org/jira/browse/HBASE-17072
> Project: HBase
> Issue Type: Bug
> Components: Performance, regionserver
> Affects Versions: 1.0.0, 1.2.0, 2.0.0
> Reporter: Eiichi Sato
> Assignee: Eiichi Sato
> Priority: Critical
> Fix For: 1.4.0, 0.98.24, 1.3.3, 2.0.0
>
> Attachments: HBASE-17072-0.98.patch, HBASE-17072.branch-1.001.patch, 
> HBASE-17072.master.001.patch, HBASE-17072.master.002.patch, 
> HBASE-17072.master.003.patch, HBASE-17072.master.004.patch, 
> HBASE-17072.master.005.patch, HBASE-17072.master.005.patch, 
> disable-block-header-cache.patch, mat-threadlocals.png, mat-threads.png, 
> metrics.png, slave1.svg, slave2.svg, slave3.svg, slave4.svg
>
>
> h5. Problem
> CPU usage of a region server in our CDH 5.4.5 cluster starts, at some point,
> to climb gradually to nearly 90-100% when using G1GC.  We've also run
> into this problem on CDH 5.7.3 and CDH 5.8.2.
> In our production cluster, it normally takes a few weeks for this to happen
> after restarting a RS.  We reproduced this on our test cluster and attached
> the results.  Please note that, to make it easy to reproduce, we did some
> "anti-tuning" on a table when running tests.
> In metrics.png, soon after we started running some workloads against a test
> cluster (CDH 5.8.2) at about 7 p.m., CPU usage of the two RSs started to rise.
> The Flame Graphs (slave1.svg to slave4.svg) were generated from jstack dumps
> of each RS process around 10:30 a.m. the next day.
> After investigating heap dumps from another occurrence on a test cluster
> running CDH 5.7.3, we found that the ThreadLocalMap contained a lot of
> contiguous entries of {{HFileBlock$PrefetchedHeader}}, probably due to primary
> clustering.  This caused more loop iterations in
> {{ThreadLocalMap#expungeStaleEntries()}}, consuming a certain amount of CPU
> time.  What is worse, the method is called from RPC metrics code,
> which means even a small amount of per-RPC time soon adds up to a huge amount
> of CPU time.
> This is very similar to the issue in HBASE-16616, but we have many
> {{HFileBlock$PrefetchedHeader}} instances, not only {{Counter$IndexHolder}}
> instances.  Here are some OQL counts from Eclipse Memory Analyzer (MAT).  They
> show the number of ThreadLocal instances in the ThreadLocalMap of a single
> handler thread.
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = 
> "org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"
> #=> 10980 instances
> {code}
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"
> #=> 2052 instances
> {code}
> Although, as described in HBASE-16616, this seems to be partly an issue on
> the G1GC side regarding weakly-reachable objects, we should keep ThreadLocal
> usage minimal and avoid creating an unbounded number (in this case, the number
> of HFiles) of ThreadLocal instances.
> HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve
> the issue (I have only read the patch and have not tested it), but the
> {{HFileBlock$PrefetchedHeader}} instances are still there in the
> ThreadLocalMap, which may cause issues again in the future.
> h5. Our Solution
> We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and,
> fortunately, we didn't notice any performance degradation for our production
> workloads.
> Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are
> handled randomly in any of the handlers, small Get or small Scan RPCs do not
> benefit from the caching (see HBASE-10676 and HBASE-11402 for the details).
> Probably, we need to see how many reads are saved by the caching for large
> Scan or Get RPCs, and especially for compactions, before we really remove the
> caching. It's probably better if we can remove ThreadLocals without breaking
> the current caching behavior.
> FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.





[jira] [Updated] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17798:
---
Fix Version/s: 1.3.3

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.2.4, 0.98.24, 2.0.0
> Reporter: Guangxu Cheng
> Assignee: Guangxu Cheng
> Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 17798-master-v2.patch, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch, connections.png
>
>
> In our production cluster (0.98), some requests could not be accepted
> because an RpcServer.Listener.Reader had aborted.
> getReader() returns the next reader to deal with a request.
> The implementation of getReader() is as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers aborts, requests that fall on that reader will never
> be dealt with.
> Why does RpcServer.Listener.Reader abort? We added debug logging to find out.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when dealing with a request in the reader, we should handle
> CancelledKeyException.
> --
> Versions 1.x and 2.0 log and return when dealing with the
> InterruptedException in Reader#doRunLoop after HBASE-10521. That leads to
> the same problem.
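
The idea of handling the exception per key, so that one cancelled key does not end the reader's loop, can be sketched without any real sockets. This is a simplified stand-in and not the RpcServer code or the attached patch: `KeyHandler` and `runLoop` are hypothetical names, and only `java.nio.channels.CancelledKeyException` is a real JDK class.

```java
import java.nio.channels.CancelledKeyException;
import java.util.Arrays;
import java.util.List;

class ReaderLoopSketch {
    /** Hypothetical stand-in for the work done on one ready selection key. */
    interface KeyHandler {
        void handle();
    }

    /**
     * Processes every ready key. A key cancelled between selection and use
     * is skipped instead of aborting the whole loop.
     * Returns the number of keys handled successfully.
     */
    static int runLoop(List<KeyHandler> readyKeys) {
        int processed = 0;
        for (KeyHandler key : readyKeys) {
            try {
                key.handle();
                processed++;
            } catch (CancelledKeyException e) {
                // The connection behind this key went away; keep serving the rest.
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<KeyHandler> keys = Arrays.<KeyHandler>asList(
            () -> { },
            () -> { throw new CancelledKeyException(); },
            () -> { });
        System.out.println("keys handled: " + runLoop(keys));
    }
}
```

Because the catch sits inside the per-key loop, the reader stays alive and the round-robin in getReader() never hands a connection to a dead reader.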





[jira] [Updated] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16616:
---
Fix Version/s: (was: 1.3.3)

> Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
> --
>
> Key: HBASE-16616
> URL: https://issues.apache.org/jira/browse/HBASE-16616
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.2
> Reporter: Tomu Tsuruhara
> Assignee: Tomu Tsuruhara
> Priority: Major
> Fix For: 1.4.0, 2.0.0
>
> Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, 
> HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png
>
>
> In our HBase 1.2.2 cluster, some regionservers showed a very poor
> "QueueCallTime_99th_percentile", exceeding 10 seconds.
> Most rpc handler threads were stuck in a ThreadLocalMap.expungeStaleEntry
> call at that time.
> {noformat}
> "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 
> os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000]
>    java.lang.Thread.State: RUNNABLE
> at java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617)
> at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499)
> at java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298)
> at java.lang.ThreadLocal.remove(ThreadLocal.java:222)
> at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81)
> at org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81)
> at org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59)
> at org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194)
> at org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java.
> {code}
> 616:    while (tab[h] != null)
> 617:        h = nextIndex(h, len);
> {code}
> So I hypothesized that there were too many consecutive entries in the {{tab}} array, 
> and I actually found them in the heap dump.
> !ScreenShot 2016-09-09 14.17.53.png|width=50%!
> Most of these entries pointed at instances of 
> {{org.apache.hadoop.hbase.util.Counter$1}},
> which is equivalent to the {{indexHolderThreadLocal}} instance variable in the 
> {{Counter}} class.
> Because the {{RpcServer$Connection}} class creates a {{Counter}} instance 
> {{rpcCount}} for every connection,
> it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances 
> in the RegionServer process
> when clients repeatedly connect and close. As a result, a ThreadLocalMap 
> can have lots of consecutive
> entries.
> Usually, since each entry is a {{WeakReference}}, these entries are collected 
> and removed
> by the garbage collector soon after the connection is closed.
> But if the connection lived long enough to survive a young GC, the entry won't 
> be collected until the old-gen collector runs.
> Furthermore, under a G1GC deployment, it may not be collected even 
> by an old-gen GC (mixed GC)
> if the entries sit in a region that doesn't contain much garbage.
> In fact, we were using G1GC when we encountered this problem.
> We should remove the entry from ThreadLocalMap by calling ThreadLocal#remove 
> explicitly.
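
The explicit cleanup proposed above can be illustrated with a minimal sketch. It is not the HBase patch: the class and method names are hypothetical, and the thread-local is only a stand-in for something like {{Counter#indexHolderThreadLocal}}. It shows that ThreadLocal#remove drops the calling thread's map entry immediately, instead of waiting for the garbage collector to clear the weak key.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalRemoveSketch {
    /** Returns how many times the initializer ran for the calling thread. */
    public static int initializationsAfterRemove() {
        AtomicInteger inits = new AtomicInteger();
        // Hypothetical stand-in for a per-connection thread-local holder.
        ThreadLocal<long[]> holder = ThreadLocal.withInitial(() -> {
            inits.incrementAndGet();
            return new long[1];
        });
        try {
            holder.get()[0]++;   // first access creates this thread's map entry
        } finally {
            holder.remove();     // explicit cleanup; does not rely on GC of the weak key
        }
        holder.get();            // the entry is gone, so the initializer runs again
        holder.remove();         // clean up the second entry as well
        return inits.get();
    }
}
```

Note that remove() only affects the calling thread's ThreadLocalMap, so in a handler pool each thread that touched the thread-local would need to clean up its own entry.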



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16616:
---
Fix Version/s: 1.3.3

> Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
> --
>
> Key: HBASE-16616
> URL: https://issues.apache.org/jira/browse/HBASE-16616
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Affects Versions: 1.2.2
>Reporter: Tomu Tsuruhara
>Assignee: Tomu Tsuruhara
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, 
> HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png
>
>





[jira] [Updated] (HBASE-17205) Add a metric for the duration of region in transition

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-17205:
---
Fix Version/s: 1.3.3

> Add a metric for the duration of region in transition
> -
>
> Key: HBASE-17205
> URL: https://issues.apache.org/jira/browse/HBASE-17205
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Affects Versions: 1.4.0, 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-17205-branch-1.patch, HBASE-17205-v1.patch, 
> HBASE-17205-v1.patch, HBASE-17205.patch
>
>
> While working on HBASE-17178, I found there is no metric for the overall 
> duration of a region in transition. When moving a region from A to B, the 
> region state transitions are PENDING_CLOSE => CLOSING => CLOSED => 
> PENDING_OPEN => OPENING => OPENED. Each transition from the old region state 
> to the new one updates the time stamp to the current time, so we can't get 
> the overall duration of the region in transition. Add a RIT duration to 
> RegionState to accumulate this metric.
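
A minimal sketch of the accumulation idea described above, assuming a simplified state holder rather than the real RegionState class (the class name, fields, and methods here are hypothetical). Each transition adds the time spent in the previous state to a running total before the time stamp is overwritten.

```java
public class RitDurationSketch {
    public enum State { PENDING_CLOSE, CLOSING, CLOSED, PENDING_OPEN, OPENING, OPENED }

    private State state;
    private long stamp;        // time of the most recent transition, in ms
    private long ritDuration;  // total ms spent in transition so far

    public RitDurationSketch(State initial, long nowMs) {
        this.state = initial;
        this.stamp = nowMs;
    }

    /** Moves to the next state and accumulates the time spent in the previous one. */
    public void transition(State next, long nowMs) {
        ritDuration += nowMs - stamp;  // keep the elapsed time before the stamp is reset
        state = next;
        stamp = nowMs;
    }

    public long getRitDuration() { return ritDuration; }

    public State getState() { return state; }
}
```

Timestamps are passed in explicitly so the sketch stays deterministic; real code would presumably read the clock at each transition.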





[jira] [Updated] (HBASE-16947) Some improvements for DumpReplicationQueues tool

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16947:
---
Fix Version/s: (was: 1.3.1)
   1.3.3

> Some improvements for DumpReplicationQueues tool
> 
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, Replication
>Affects Versions: 1.4.0, 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 1.4.0, 0.98.24, 1.3.3, 2.0.0
>
> Attachments: HBASE-16947-branch-1.patch, HBASE-16947-branch-1.patch, 
> HBASE-16947-branch-1.patch, HBASE-16947-v1.patch, HBASE-16947.patch
>
>
> Recently we hit a problem with too many replication WALs in our production cluster. 
> We needed the DumpReplicationQueues tool to analyze the replication queue info 
> in zookeeper, so I backported HBASE-16450 to our 0.98-based branch and made some 
> improvements to it.
> 1. Show the dead regionservers under the replication/rs znode. When there are too 
> many WALs under a znode, it can't be atomically transferred to the new rs znode, so the 
> dead rs znode is left on zookeeper.
> 2. Summarize all the queues that belong to peers that have been deleted. 
> 3. Aggregate the replication queue sizes of all regionservers. Currently each 
> regionserver reports ReplicationLoad to the master, but there is no aggregate 
> metric for replication.
> 4. Show how many WALs cannot be found on hdfs. The reason (WAL Not 
> Found) needs more time to investigate.
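
Items 3 and 4 above amount to two simple aggregations, sketched here under stated assumptions. The class, its methods, and the map-based input are hypothetical; the real tool reads queues from zookeeper and checks files on hdfs, neither of which is modeled.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class QueueSummarySketch {
    /** Total number of queued WALs across all regionservers. */
    public static int totalQueuedWals(Map<String, List<String>> walsByRegionServer) {
        int total = 0;
        for (List<String> wals : walsByRegionServer.values()) {
            total += wals.size();
        }
        return total;
    }

    /** Number of queued WALs whose names are absent from the set of files found on hdfs. */
    public static int missingWals(Map<String, List<String>> walsByRegionServer,
                                  Set<String> walsOnHdfs) {
        int missing = 0;
        for (List<String> wals : walsByRegionServer.values()) {
            for (String wal : wals) {
                if (!walsOnHdfs.contains(wal)) {
                    missing++;
                }
            }
        }
        return missing;
    }
}
```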





[jira] [Updated] (HBASE-19972) Should rethrow the RetriesExhaustedWithDetailsException when failed to apply the batch in ReplicationSink

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19972:
---
Fix Version/s: (was: 1.3.3)

> Should rethrow  the RetriesExhaustedWithDetailsException when failed to apply 
> the batch in ReplicationSink
> --
>
> Key: HBASE-19972
> URL: https://issues.apache.org/jira/browse/HBASE-19972
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2, 2.0.0
>
> Attachments: HBASE-19972-branch-1.4.patch, HBASE-19972.v1.patch, 
> HBASE-19972.v1.patch
>
>
> As [~Apache9] said in HBASE-12091, 
> in ReplicationSink#batch we swallow the RetriesExhaustedWithDetailsException 
> unless it is caused by a 
> TableNotFoundException. We should actually rethrow the exception. 
> {code:java}
> try {
>   Connection connection = getConnection();
>   table = connection.getTable(tableName);
>   for (List rows : allRows) {
>     table.batch(rows);
>   }
> } catch (RetriesExhaustedWithDetailsException rewde) {
>   for (Throwable ex : rewde.getCauses()) {
>     if (ex instanceof TableNotFoundException) {
>       throw new TableNotFoundException("'"+tableName+"'");
>     }
>   }
>   // Any other cause falls through here and the exception is silently swallowed.
> } 
> {code}
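
The proposed behavior can be sketched as follows. This is an illustration, not the committed patch: the two exception classes are hypothetical stand-ins for the HBase types of the same name so the example compiles on its own, and {{BatchCall}} and {{applyBatch}} are invented for the sketch.

```java
import java.io.IOException;
import java.util.List;

public class RethrowSketch {
    // Hypothetical stand-ins for the HBase exception types named above.
    public static class TableNotFoundException extends IOException {
        public TableNotFoundException(String msg) { super(msg); }
    }

    public static class RetriesExhaustedWithDetailsException extends IOException {
        private final List<Throwable> causes;
        public RetriesExhaustedWithDetailsException(List<Throwable> causes) {
            super("retries exhausted");
            this.causes = causes;
        }
        public List<Throwable> getCauses() { return causes; }
    }

    /** Hypothetical wrapper around the table.batch(rows) loop. */
    public interface BatchCall { void run() throws IOException; }

    public static void applyBatch(String tableName, BatchCall call) throws IOException {
        try {
            call.run();
        } catch (RetriesExhaustedWithDetailsException rewde) {
            for (Throwable ex : rewde.getCauses()) {
                if (ex instanceof TableNotFoundException) {
                    throw new TableNotFoundException("'" + tableName + "'");
                }
            }
            throw rewde;  // no longer swallowed: the caller sees the failure
        }
    }
}
```

With the final rethrow in place, a failed batch surfaces to the caller instead of being treated as applied.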





[jira] [Updated] (HBASE-19972) Should rethrow the RetriesExhaustedWithDetailsException when failed to apply the batch in ReplicationSink

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19972:
---
Fix Version/s: 1.3.3

> Should rethrow  the RetriesExhaustedWithDetailsException when failed to apply 
> the batch in ReplicationSink
> --
>
> Key: HBASE-19972
> URL: https://issues.apache.org/jira/browse/HBASE-19972
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 1.5.0, 2.0.0-beta-2, 1.3.3, 1.4.2, 2.0.0
>
> Attachments: HBASE-19972-branch-1.4.patch, HBASE-19972.v1.patch, 
> HBASE-19972.v1.patch
>
>





[jira] [Updated] (HBASE-19816) Replication sink list is not updated on UnknownHostException

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19816:
---
Fix Version/s: 1.3.3

> Replication sink list is not updated on UnknownHostException
> 
>
> Key: HBASE-19816
> URL: https://issues.apache.org/jira/browse/HBASE-19816
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0, 2.0.0
> Environment: We have two clusters set up with bi-directional 
> replication. The clusters are around 400 nodes each and hosted in AWS.
>Reporter: Scott Wilson
>Assignee: Scott Wilson
>Priority: Major
> Fix For: 1.4.1, 1.5.0, 2.0.0-beta-2, 1.3.3, 2.0.0
>
> Attachments: HBASE-19816.master.002.patch
>
>
> We have two clusters, call them 1 and 2. Cluster 1 was the current "primary" 
> cluster, taking all live traffic, which is replicated to cluster 2. We 
> decommissioned several instances in cluster 2, which involves deleting the 
> instance and its DNS record. After this happened, most of the region servers 
> in cluster 1 showed this message in their logs repeatedly. 
>  
> {code}
> 2018-01-12 23:49:36,507 WARN org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint: Can't replicate because of a local or network error:
> java.net.UnknownHostException: data-017b.hbase-2.prod
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.<init>(AbstractRpcClient.java:315)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.createBlockingRpcChannel(AbstractRpcClient.java:267)
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getAdmin(ConnectionManager.java:1737)
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getAdmin(ConnectionManager.java:1719)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager.getReplicationSink(ReplicationSinkManager.java:119)
>     at org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:339)
>     at org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.java:326)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> The host data-017b.hbase-2.prod was one of those that had been removed from 
> cluster 2. Next we observed our replication lag from cluster 1 to cluster 2 
> was elevated. Some region servers reported ageOfLastShippedOperation to be 
> close to an hour.
> The only way we found to clear the message was to restart the region servers 
> that showed this message in the log. Once we did, replication returned to 
> normal. Restarting the affected region servers in cluster 1 took several days 
> because we could not bring the cluster down.
> From reading the code it appears the cause was the zookeeper watch not being 
> triggered for the region server list change in cluster 2. We verified the 
> list in zookeeper for cluster 2 was correct and did not include the removed 
> nodes.
> One concrete improvement to make would be to force a refresh of the sink 
> cluster region server list when an {{UnknownHostException}} is found. This is 
> already done if there is a {{ConnectException}} in 
> {{HBaseInterClusterReplicationEndpoint.java}}
> {code:java}
> } else if (ioe instanceof ConnectException) {
>   LOG.warn("Peer is unavailable, rechecking all sinks: ", ioe);
>   replicationSinkMgr.chooseSinks();
> {code}
> I propose that should be extended to cover {{UnknownHostException}}.
> We observed this behavior on 1.2.0-cdh-5.11.1 but it appears the same code 
> still exists on the current master branch.
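
A minimal sketch of the proposed extension, assuming a simplified error handler. Only {{chooseSinks()}} is taken from the snippet quoted above; the {{SinkManager}} interface and the {{handleReplicationError}} method are hypothetical and exist only to make the example self-contained.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.UnknownHostException;

public class SinkRefreshSketch {
    /** Hypothetical stand-in for the replication sink manager. */
    public interface SinkManager { void chooseSinks(); }

    /** Returns true if the sink list was refreshed in response to the error. */
    public static boolean handleReplicationError(IOException ioe, SinkManager mgr) {
        // Treat an unresolvable peer host like an unreachable peer: recheck all sinks.
        if (ioe instanceof ConnectException || ioe instanceof UnknownHostException) {
            mgr.chooseSinks();
            return true;
        }
        return false;
    }
}
```

Both exception types are subclasses of IOException, so they can share one branch without changing how other I/O errors are handled.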
>  





[jira] [Commented] (HBASE-21587) Improve support for protobuf 3

2018-12-11 Thread Solomon Duskis (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718040#comment-16718040
 ] 

Solomon Duskis commented on HBASE-21587:


I commented on the grpc thread, and I'll also comment here.  This issue is 
related to a pretty complex issue relating to dependency management.  There are 
a variety of approaches on this issue that can be discussed at length, but the 
most pragmatic approach, IMHO, is to use 'hbase-shaded-client' to get around 
this issue.

> Improve support for protobuf 3
> --
>
> Key: HBASE-21587
> URL: https://issues.apache.org/jira/browse/HBASE-21587
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mya Pitzeruse
>Priority: Major
>
> {{HBaseZeroCopyByteString}} extends {{LiteralByteString}} which was removed 
> in protobuf 3. The class was marked as package private, so hbase needed to do 
> a package trick to get to the underlying class.
> [https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/com/google/protobuf/HBaseZeroCopyByteString.java#L18]
> ejona86 references this problem in grpc-java:
> [https://github.com/grpc/grpc-java/issues/3035#issuecomment-360851817]
> The {{HBaseZeroCopyByteString}} class appears to only be used by 
> {{ByteStringer}} class.
> [https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java#L42-L49]
> I think a simple change can be made to the {{ByteStringer}} class to support 
> both proto2 and proto3.
> Proto3 offers an {{UnsafeByteOperations}} class that can be used in place of 
> the {{HBaseZeroCopyByteString}} class.
> [https://github.com/protocolbuffers/protobuf/blob/master/java/core/src/main/java/com/google/protobuf/UnsafeByteOperations.java#L97]
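
One way a class could support both protobuf versions is to probe the classpath at runtime, sketched below. This is an assumption about how such a change might look, not the proposed patch: {{ZeroCopySupportSketch}} and its methods are hypothetical, and the sketch only detects which wrapping API is present without calling it.

```java
import java.lang.reflect.Method;

public class ZeroCopySupportSketch {
    /** Looks up a public method by reflection; returns null if the class or method is absent. */
    public static Method findMethod(String className, String methodName, Class<?>... params) {
        try {
            return Class.forName(className).getMethod(methodName, params);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return null;
        }
    }

    /** Names the wrapping strategy that matches the protobuf version on the classpath. */
    public static String chooseWrapStrategy() {
        Method unsafeWrap = findMethod(
            "com.google.protobuf.UnsafeByteOperations", "unsafeWrap", byte[].class);
        // Fall back to the proto2-era class when UnsafeByteOperations is not available.
        return unsafeWrap != null ? "UnsafeByteOperations" : "HBaseZeroCopyByteString";
    }
}
```

A real implementation would cache the lookup result and invoke the chosen method rather than return its name.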
>  





[jira] [Updated] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Leach updated HBASE-21586:
---
Attachment: HBASE-21586.master.002.patch
Status: Patch Available  (was: Open)

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch, 
> HBASE-21586.master.002.patch, HBASE-21586.master.002.patch
>
>
> Small nit but it is good to use the static empty arrays vs. creating new 
> ones.  
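
The point above can be shown with a small sketch. The class and method names are hypothetical; the idea is only that a zero-length array is immutable in practice, so one shared static instance can be returned instead of allocating a new one each time.

```java
public class EmptyArraySketch {
    /** Shared zero-length array; safe to share because it has no elements to mutate. */
    public static final byte[] EMPTY_BYTE_ARRAY = new byte[0];

    /** Returns a defensive copy of src, or the shared empty array when there is nothing to copy. */
    public static byte[] copyOrEmpty(byte[] src) {
        if (src == null || src.length == 0) {
            return EMPTY_BYTE_ARRAY;  // no allocation on this path
        }
        return src.clone();
    }
}
```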





[jira] [Updated] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Leach updated HBASE-21586:
---
Attachment: HBASE-21586.master.002.patch

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch, 
> HBASE-21586.master.002.patch
>
>
> Small nit but it is good to use the static empty arrays vs. creating new 
> ones.  





[jira] [Updated] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Leach updated HBASE-21586:
---
Status: Open  (was: Patch Available)

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch, 
> HBASE-21586.master.002.patch, HBASE-21586.master.002.patch
>
>
> Small nit but it is good to use the static empty arrays vs. creating new 
> ones.  





[jira] [Created] (HBASE-21587) Improve support for protobuf 3

2018-12-11 Thread Mya Pitzeruse (JIRA)
Mya Pitzeruse created HBASE-21587:
-

 Summary: Improve support for protobuf 3
 Key: HBASE-21587
 URL: https://issues.apache.org/jira/browse/HBASE-21587
 Project: HBase
  Issue Type: Improvement
Reporter: Mya Pitzeruse


{{HBaseZeroCopyByteString}} extends {{LiteralByteString}} which was removed in 
protobuf 3. The class was marked as package private, so hbase needed to do a 
package trick to get to the underlying class.

[https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/com/google/protobuf/HBaseZeroCopyByteString.java#L18]

ejona86 references this problem in grpc-java:

[https://github.com/grpc/grpc-java/issues/3035#issuecomment-360851817]

The {{HBaseZeroCopyByteString}} class appears to only be used by 
{{ByteStringer}} class.

[https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java#L42-L49]

I think a simple change can be made to the {{ByteStringer}} class to support 
both proto2 and proto3.

Proto3 offers an {{UnsafeByteOperations}} class that can be used in place of 
the {{HBaseZeroCopyByteString}} class.

[https://github.com/protocolbuffers/protobuf/blob/master/java/core/src/main/java/com/google/protobuf/UnsafeByteOperations.java#L97]

 





[jira] [Commented] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717957#comment-16717957
 ] 

Hadoop QA commented on HBASE-21406:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 11s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 23s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red}  0m  9s{color} | {color:red} The patch generated 25 new + 409 unchanged - 5 fixed = 434 total (was 414) {color} |
| {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange}  0m  3s{color} | {color:orange} The patch generated 1 new + 748 unchanged - 1 fixed = 749 total (was 749) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 14s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  2m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 26s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 30s{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 22s{color} | {color:green} hbase-protocol in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 10s{color} | 

[jira] [Commented] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717952#comment-16717952
 ] 

Hadoop QA commented on HBASE-21586:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 47s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m  8s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 15s{color} | {color:red} hbase-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 23s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 15s{color} | {color:red} hbase-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 23s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 21s{color} | {color:red} The patch fails to run checkstyle in hbase-common {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} hbase-client: The patch generated 5 new + 126 unchanged - 2 fixed = 131 total (was 128) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 13s{color} | {color:red} hbase-zookeeper: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 14s{color} | {color:red} The patch fails to run checkstyle in hbase-server {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 22s{color} | {color:red} hbase-mapreduce: The patch generated 3 new + 16 unchanged - 0 fixed = 19 total (was 16) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 18s{color} | {color:red} hbase-rest: The patch generated 1 new + 21 unchanged - 0 fixed = 22 total (was 21) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  1m 12s{color} | {color:red} patch has 14 errors when building our shaded downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  0m 54s{color} | {color:red} The patch causes 14 errors with Hadoop v2.7.4. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  1m 51s{color} | {color:red} The patch causes 14 errors with Hadoop v3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 14s{color} | {color:red} hbase-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 23s{color} | {color:red} hbase-server in the patch failed. {color} |
| 

[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21564:
-
Attachment: HBASE-21564.master.004.patch

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch, 
> HBASE-21564.master.004.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.
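
One way to read "the acknowledgment of roll requests should be atomic" is sketched below. This is a minimal illustration, not the attached patch: the class `RollRequestSketch` and its methods are hypothetical names, and the only assumption is that the roller consumes the pending request with a single compare-and-set before rolling, rather than resetting a flag in a finally block afterwards.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the roller's request flag; not HBase's LogRoller. */
final class RollRequestSketch {
  // True while a roll has been requested but not yet picked up by the roller.
  private final AtomicBoolean rollRequested = new AtomicBoolean(false);

  /** Called by the WAL side when it wants the log rolled. */
  void requestRoll() {
    rollRequested.set(true);
  }

  /**
   * Called by the roller thread. The request is consumed atomically before
   * the roll starts, so a request arriving while doRoll runs stays pending
   * instead of being wiped by a later blind reset.
   */
  boolean rollIfRequested(Runnable doRoll) {
    if (!rollRequested.compareAndSet(true, false)) {
      return false; // nothing pending
    }
    doRoll.run();
    return true;
  }

  public static void main(String[] args) {
    RollRequestSketch roller = new RollRequestSketch();
    roller.requestRoll();
    // Simulate the WAL asking for another roll while the first one is running.
    roller.rollIfRequested(roller::requestRoll);
    System.out.println("second roll still pending: " + roller.rollIfRequested(() -> { }));
  }
}
```

With this ordering, a request made during the roll is picked up on the roller's next pass rather than forgotten.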



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717915#comment-16717915
 ] 

Sergey Shelukhin commented on HBASE-21564:
--

After some discussion, removed the global lock and separated the data 
structures to make sync code clearer

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch, 
> HBASE-21564.master.004.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.





[jira] [Updated] (HBASE-18111) Replication stuck when cluster connection is closed

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18111:
---
Fix Version/s: 1.3.3

> Replication stuck when cluster connection is closed
> ---
>
> Key: HBASE-18111
> URL: https://issues.apache.org/jira/browse/HBASE-18111
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.3.1, 1.2.5, 0.98.24, 1.1.10, 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: HBASE-18111-v1.patch, HBASE-18111-v2.patch, 
> HBASE-18111.patch
>
>
> Log:
> {code}
> 2017-05-24,03:01:25,603 ERROR [regionserver13700-SendThread(hostxxx:11000)] 
> org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum 
> member failed: javax.security.sasl.SaslException: An error: 
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
> GSS initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Connection reset)]) occurred when evaluating Zookeeper 
> Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED 
> state.
> 2017-05-24,03:01:25,615 FATAL [regionserver13700-EventThread] 
> org.apache.hadoop.hbase.client.HConnectionImplementation: 
> hconnection-0x1148dd9b-0x35b6b4d4ca999c6, 
> quorum=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000,
>  baseZNode=/hbase/c3prc-xiaomi98 hconnection-0x1148dd9b-0x35b6b4d4ca999c6 
> received auth failed from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = 
> AuthFailed
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:425)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:333)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-05-24,03:01:25,615 INFO [regionserver13700-EventThread] 
> org.apache.hadoop.hbase.client.HConnectionImplementation: Closing zookeeper 
> sessionid=0x35b6b4d4ca999c6
> 2017-05-24,03:01:25,623 WARN [regionserver13700.replicationSource,800] 
> org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint:
>  Replicate edites to peer cluster failed.
> java.io.IOException: Call to hostxxx/10.136.22.6:24600 failed on local 
> exception: java.io.IOException: Connection closed
> {code}
> jstack
> {code}
>  java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.sleepForRetries(HBaseInterClusterReplicationEndpoint.java:127)
> at 
> org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:199)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:905)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:492)
> {code}
> The cluster connection was aborted when the ZooKeeperWatcher received an 
> AuthFailed event. The HBaseInterClusterReplicationEndpoint's replicate() 
> method then gets stuck in a while loop.
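
The hang described above is the usual shape of a retry loop with no exit condition for a dead connection. The sketch below is an illustration only, not the HBASE-18111 patch: `ReplicateRetrySketch`, `replicateWithRetries`, and the two suppliers are hypothetical names, and it assumes the caller can cheaply ask whether the connection has been closed.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical retry helper; not HBaseInterClusterReplicationEndpoint. */
final class ReplicateRetrySketch {
  /**
   * Retries attempt until it succeeds, but stops as soon as the connection
   * is reported closed. Returns true on success, false if it gave up.
   */
  static boolean replicateWithRetries(
      BooleanSupplier attempt, BooleanSupplier isConnectionClosed, long sleepMillis) {
    while (!isConnectionClosed.getAsBoolean()) {
      if (attempt.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(sleepMillis); // back off before the next attempt
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return false; // connection is gone, so retrying cannot succeed
  }

  public static void main(String[] args) {
    // The connection is already closed here, so the loop exits without retrying.
    boolean shipped = replicateWithRetries(() -> false, () -> true, 10L);
    System.out.println("shipped: " + shipped);
  }
}
```

Checking the connection state at the top of each iteration lets the caller react (for example by reconnecting or terminating) instead of sleeping and retrying forever.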





[jira] [Commented] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717905#comment-16717905
 ] 

Sean Busbey commented on HBASE-21586:
-

+1

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch
>
>
> Small nit, but it is better to use the static empty arrays than to create new 
> ones.
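
As a rough illustration of the idea, a zero-length array is immutable in practice, so a single shared instance can stand in for every `new byte[0]`. The class and method names below are hypothetical; HBase itself exposes shared constants such as `HConstants.EMPTY_BYTE_ARRAY`, which is what a real change would more likely use.

```java
/** Hypothetical holder for a shared empty array; a sketch, not HBase code. */
final class EmptyArraysSketch {
  // Allocated once; safe to share because a zero-length array has no mutable state.
  static final byte[] EMPTY_BYTE_ARRAY = new byte[0];

  /** Returns the value, or the shared empty array instead of allocating a new one. */
  static byte[] valueOrEmpty(byte[] value) {
    return value == null ? EMPTY_BYTE_ARRAY : value;
  }

  public static void main(String[] args) {
    byte[] a = valueOrEmpty(null);
    byte[] b = valueOrEmpty(null);
    System.out.println("same instance: " + (a == b));
  }
}
```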





[jira] [Updated] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sakthi updated HBASE-21569:
---
Status: Patch Available  (was: In Progress)

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
> Attachments: hbase-21569.master.001.patch
>
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Commented] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717889#comment-16717889
 ] 

Hadoop QA commented on HBASE-21569:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
0s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  0m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21569 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951413/hbase-21569.master.001.patch
 |
| Optional Tests |  dupname  asflicense  shellcheck  shelldocs  |
| uname | Linux c9482b3a6609 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 67d6d5084c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| shellcheck | v0.4.4 |
| Max. process+thread count | 48 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
> Attachments: hbase-21569.master.001.patch
>
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Commented] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717883#comment-16717883
 ] 

Sakthi commented on HBASE-21569:


I agree with your comment that it should have passed this time, but should have 
failed on all previous executions.

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
> Attachments: hbase-21569.master.001.patch
>
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Updated] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sakthi updated HBASE-21569:
---
Attachment: hbase-21569.master.001.patch

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
> Attachments: hbase-21569.master.001.patch
>
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Updated] (HBASE-16576) Shell add_peer doesn't allow setting cluster_key for custom endpoints

2018-12-11 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16576:
---
Fix Version/s: 1.3.3

> Shell add_peer doesn't allow setting cluster_key for custom endpoints
> -
>
> Key: HBASE-16576
> URL: https://issues.apache.org/jira/browse/HBASE-16576
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 1.1.5, 0.98.22, 2.0.0
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
> Fix For: 1.4.0, 0.98.23, 1.3.3, 2.0.0
>
> Attachments: HBASE-16576-0.98.patch, HBASE-16576.patch, 
> HBASE-16576.v1.patch
>
>
> The HBase shell allows a user to create a replication peer using the add_peer 
> method, which can take a peer id and a Ruby hash. It creates a 
> ReplicationPeerConfig and passes it through to the Java 
> ReplicationAdmin#addPeer. 
> The Ruby code makes an assumption that the Java API doesn't: that CLUSTER_KEY 
> and ENDPOINT_CLASSNAME are mutually exclusive. If both are specified, it 
> throws an error. If only ENDPOINT_CLASSNAME is set, the add_peer logic 
> derives a local dummy cluster key based on the local cluster's configuration. 
> CLUSTER_KEY shouldn't be required when an ENDPOINT_CLASSNAME is specified, 
> because a custom endpoint might not need it. The dummy default logic is fine. 
>  
> But if an endpoint does require a remote cluster key, it shouldn't be 
> forbidden to provide one, especially since the Java API permits it, and even 
> the custom replication endpoint Java tests rely on this. (See 
> TestReplicationEndpoint#testCustomReplicationEndpoint)
>   
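
The behavior the description asks for can be summarized as a small decision rule. The sketch below is only an illustration in Java rather than the shell's Ruby, and it is not the actual patch: `AddPeerArgsSketch` and `resolveClusterKey` are hypothetical names, and rejecting the case where neither argument is given is an assumption, not something stated in the issue.

```java
/** Hypothetical model of the add_peer argument handling; not the shell code. */
final class AddPeerArgsSketch {
  /**
   * Picks the cluster key for a new peer.
   * An explicit CLUSTER_KEY is always honored, even with a custom endpoint.
   * With only ENDPOINT_CLASSNAME, the local cluster's key serves as a dummy default.
   */
  static String resolveClusterKey(
      String clusterKey, String endpointClassname, String localClusterKey) {
    if (clusterKey != null) {
      return clusterKey;
    }
    if (endpointClassname != null) {
      return localClusterKey;
    }
    // Assumption for this sketch: at least one of the two must be supplied.
    throw new IllegalArgumentException("CLUSTER_KEY or ENDPOINT_CLASSNAME is required");
  }

  public static void main(String[] args) {
    // Both supplied: allowed, and the explicit key wins over the dummy default.
    System.out.println(resolveClusterKey("remote:2181:/hbase", "com.example.MyEndpoint", "local:2181:/hbase"));
  }
}
```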





[jira] [Commented] (HBASE-21545) NEW_VERSION_BEHAVIOR breaks Get/Scan with specified columns

2018-12-11 Thread Andrey Elenskiy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717870#comment-16717870
 ] 

Andrey Elenskiy commented on HBASE-21545:
-

[~jatsakthi] it looks like I uploaded the wrong patch for 4; I've fixed the 
compilation error in patch 5.

> NEW_VERSION_BEHAVIOR breaks Get/Scan with specified columns
> ---
>
> Key: HBASE-21545
> URL: https://issues.apache.org/jira/browse/HBASE-21545
> Project: HBase
>  Issue Type: Bug
>  Components: API
>Affects Versions: 2.0.0, 2.1.1
> Environment: HBase 2.1.1
> Hadoop 2.8.4
> Java 8
>Reporter: Andrey Elenskiy
>Assignee: Andrey Elenskiy
>Priority: Major
> Attachments: App.java, HBASE-21545.branch-2.1.0001.patch, 
> HBASE-21545.branch-2.1.0002.patch, HBASE-21545.branch-2.1.0003.patch, 
> HBASE-21545.branch-2.1.0004.patch, HBASE-21545.branch-2.1.0005.patch
>
>
> Setting NEW_VERSION_BEHAVIOR => 'true' on a column family causes only one 
> column to be returned when columns are specified in a Scan or Get query. The 
> result is always the first column in sorted order. I've attached a code 
> snippet to reproduce the issue that can be converted into a test.
> I've also validated with the hbase shell and the gohbase client, so it must be 
> a server-side issue.





[jira] [Updated] (HBASE-21545) NEW_VERSION_BEHAVIOR breaks Get/Scan with specified columns

2018-12-11 Thread Andrey Elenskiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Elenskiy updated HBASE-21545:

Attachment: HBASE-21545.branch-2.1.0005.patch

> NEW_VERSION_BEHAVIOR breaks Get/Scan with specified columns
> ---
>
> Key: HBASE-21545
> URL: https://issues.apache.org/jira/browse/HBASE-21545
> Project: HBase
>  Issue Type: Bug
>  Components: API
>Affects Versions: 2.0.0, 2.1.1
> Environment: HBase 2.1.1
> Hadoop 2.8.4
> Java 8
>Reporter: Andrey Elenskiy
>Assignee: Andrey Elenskiy
>Priority: Major
> Attachments: App.java, HBASE-21545.branch-2.1.0001.patch, 
> HBASE-21545.branch-2.1.0002.patch, HBASE-21545.branch-2.1.0003.patch, 
> HBASE-21545.branch-2.1.0004.patch, HBASE-21545.branch-2.1.0005.patch
>
>
> Setting NEW_VERSION_BEHAVIOR => 'true' on a column family causes only one 
> column to be returned when columns are specified in a Scan or Get query. The 
> result is always the first column in sorted order. I've attached a code 
> snippet to reproduce the issue that can be converted into a test.
> I've also validated with the hbase shell and the gohbase client, so it must be 
> a server-side issue.





[jira] [Commented] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717839#comment-16717839
 ] 

Sergey Shelukhin commented on HBASE-21575:
--

[~psomogyi] updated

> memstore above high watermark message is logged too much
> 
>
> Key: HBASE-21575
> URL: https://issues.apache.org/jira/browse/HBASE-21575
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21575.01.patch, HBASE-21575.patch
>
>
> 100s of MB of logs like this, in a tight loop:
> {noformat}
> 2018-12-08 10:27:00,462 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3646ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,467 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3651ms
> 2018-12-08 10:27:00,469 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3653ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,473 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3657ms
> 2018-12-08 10:27:00,474 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3658ms
> 2018-12-08 10:27:00,475 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3659ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,477 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3661ms
> 2018-12-08 10:27:00,477 WARN  
> 

[jira] [Commented] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Peter Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717840#comment-16717840
 ] 

Peter Somogyi commented on HBASE-21569:
---

I'm not sure if my assumption is correct, but I sent this to dev@ a few days ago: 
[https://s.apache.org/KW9j] 

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Commented] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717816#comment-16717816
 ] 

Sakthi commented on HBASE-21569:


I am wondering why the latest index.html file doesn't contain the summary, 
whereas the old ones do.

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21575:
-
Attachment: HBASE-21575.01.patch

> memstore above high watermark message is logged too much
> 
>
> Key: HBASE-21575
> URL: https://issues.apache.org/jira/browse/HBASE-21575
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21575.01.patch, HBASE-21575.patch
>
>
> 100s of MB of logs like this, in a tight loop:
> {noformat}
> 2018-12-08 10:27:00,462 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3646ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,467 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3651ms
> 2018-12-08 10:27:00,469 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3653ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,473 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3657ms
> 2018-12-08 10:27:00,474 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3658ms
> 2018-12-08 10:27:00,475 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3659ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,477 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3661ms
> 2018-12-08 10:27:00,477 WARN  
> 

[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717836#comment-16717836
 ] 

Sergey Shelukhin commented on HBASE-21577:
--

[~Apache9] updated

> do not close regions when RS is dying due to a broken WAL
> -
>
> Key: HBASE-21577
> URL: https://issues.apache.org/jira/browse/HBASE-21577
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21577.master.001.patch, 
> HBASE-21577.master.002.patch
>
>
> See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
> broken, some regions whose flushes are already in flight keep retrying, 
> resulting in minutes-long shutdown times. Since the WAL will be replayed anyway, 
> flushing regions doesn't provide much benefit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21577:
-
Attachment: HBASE-21577.master.002.patch

> do not close regions when RS is dying due to a broken WAL
> -
>
> Key: HBASE-21577
> URL: https://issues.apache.org/jira/browse/HBASE-21577
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21577.master.001.patch, 
> HBASE-21577.master.002.patch
>
>
> See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
> broken, some regions whose flushes are already in flight keep retrying, 
> resulting in minutes-long shutdown times. Since the WAL will be replayed anyway, 
> flushing regions doesn't provide much benefit.





[jira] [Work started] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-21569 started by Sakthi.
--
> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Commented] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717780#comment-16717780
 ] 

Hadoop QA commented on HBASE-21538:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  4m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} HBASE-21512 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
52s{color} | {color:green} HBASE-21512 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
27s{color} | {color:green} HBASE-21512 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
12s{color} | {color:green} HBASE-21512 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 6s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} HBASE-21512 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} HBASE-21512 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} hbase-client: The patch generated 0 new + 82 
unchanged - 5 fixed = 82 total (was 87) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
26s{color} | {color:red} hbase-server: The patch generated 1 new + 74 unchanged 
- 3 fixed = 75 total (was 77) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 1s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
42s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}273m 20s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
53s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}340m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.replication.TestReplicationKillSlaveRS |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.quotas.TestSpaceQuotas |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | 

[jira] [Commented] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717773#comment-16717773
 ] 

stack commented on HBASE-21586:
---

lgtm

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch
>
>
> Small nit, but it is better to use the static empty arrays than to create new 
> ones.
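
As a rough illustration of the pattern described above (not taken from the attached patch), the sketch below shares one zero-length array instead of allocating a new one on every call. The class, constant, and method names are hypothetical; sharing is safe because an empty array has no elements that could be mutated.

```java
import java.util.List;

// Hypothetical sketch: names are illustrative and not part of HBase.
final class EmptyArraySketch {
  // Allocated once and shared by every caller.
  static final byte[] EMPTY_BYTES = new byte[0];
  static final String[] EMPTY_STRINGS = new String[0];

  // Returns the shared empty array instead of "new byte[0]" when there is no value.
  static byte[] valueOrEmpty(byte[] value) {
    return value == null ? EMPTY_BYTES : value;
  }

  // Passing the shared array as the type token avoids a per-call "new String[0]".
  static String[] toArray(List<String> names) {
    return names.toArray(EMPTY_STRINGS);
  }
}
```

With a non-empty list, toArray allocates a correctly sized result as usual; only the throwaway zero-length argument is saved.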





[jira] [Updated] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21586:
--
Status: Patch Available  (was: Open)

Hitting the 'submit' button

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch
>
>
> Small nit, but it is better to use the static empty arrays than to create new 
> ones.





[jira] [Assigned] (HBASE-21569) Incorrect validation in check-website-links.sh

2018-12-11 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sakthi reassigned HBASE-21569:
--

Assignee: Sakthi

> Incorrect validation in check-website-links.sh
> --
>
> Key: HBASE-21569
> URL: https://issues.apache.org/jira/browse/HBASE-21569
> Project: HBase
>  Issue Type: Bug
>Reporter: Peter Somogyi
>Assignee: Sakthi
>Priority: Major
>  Labels: beginner
>
> HBase Website Link Checker job [failed 
> recently|https://builds.apache.org/job/HBase%20Website%20Link%20Checker/179]. 
> The if statement that validates the generated HTML file is incorrect.
> [https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/check-website-links.sh#L59-L65]
>  





[jira] [Updated] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Leach updated HBASE-21586:
---
Attachment: HBASE-21586.master.001.patch

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
> Attachments: HBASE-21586.master.001.patch
>
>
> Small nit, but it is better to use the static empty arrays than to create new 
> ones.





[jira] [Commented] (HBASE-21582) If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717704#comment-16717704
 ] 

Hadoop QA commented on HBASE-21582:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 2s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 50s{color} 
| {color:red} hbase-server generated 2 new + 186 unchanged - 2 fixed = 188 
total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
2s{color} | {color:red} hbase-server: The patch generated 2 new + 16 unchanged 
- 2 fixed = 18 total (was 18) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}246m 34s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}284m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21582 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951365/HBASE-21582.v1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9e17e670c709 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f88224ee34 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | 

[jira] [Updated] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-11 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-21568:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, Guanghao!

> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>
> [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow 
> callers to specify that they do not want to use a block cache when reading an 
> HFile.
> If the BucketCache is set up to use the FileSystem, we can have a situation 
> where the client tries to instantiate the BucketCache and is disallowed due 
> to filesystem permissions:
> {code:java}
> 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: 
> Failed allocating cache on /mnt/hbase/cache.data
> java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> LoadIncrementalHFiles should provide {{CacheConfig.DISABLE}}.
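
To make the failure mode above concrete, here is a self-contained sketch of the idea. It assumes a "disabled" sentinel that never runs the cache factory; it does not use the real HBase classes, and CacheConfigSketch, DISABLED and blockCache are hypothetical names. The throwing factory in the usage note stands in for the FileIOEngine allocation that hits "Permission denied".

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for a cache configuration; not the HBase CacheConfig API.
final class CacheConfigSketch {
  // Sentinel for callers that only read a file once and never need a block cache.
  static final CacheConfigSketch DISABLED = new CacheConfigSketch(null);

  // A null factory means caching is disabled and nothing is ever instantiated.
  private final Supplier<String> cacheFactory;

  CacheConfigSketch(Supplier<String> cacheFactory) {
    this.cacheFactory = cacheFactory;
  }

  // Builds the cache only when one is configured; DISABLED never touches the factory.
  Optional<String> blockCache() {
    return cacheFactory == null ? Optional.empty() : Optional.of(cacheFactory.get());
  }
}
```

A client-side tool that passes DISABLED gets an empty Optional and never reaches the allocation step, whereas a config built with a failing factory surfaces the error as soon as blockCache() is called.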





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-11 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717626#comment-16717626
 ] 

Josh Elser commented on HBASE-21246:


{quote} The property 'name' looks redundant to me in FSWALIdentity:
{quote}
I think that's by design. Abstractly, we need to have a globally "unique" name 
for a WAL. In the filesystem application, the path on the filesystem is already 
that, so we can just use the name of that file in HDFS. It looks a little silly 
now, but, if you consider other implementations which won't be backed by an 
explicit FileSystem (e.g. the Ratis Log facade), it might make some more sense 
why we have a "getName" here. Happy to try to explain more/differently if it's 
not clear :)

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.HBASE-20952.003.patch, HBASE-21246.master.001.patch, 
> HBASE-21246.master.002.patch, replication-src-creates-wal-reader.jpg, 
> wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, 
> wal-splitter-writer.jpg
>
>
> We are introducing the WALIdentity interface so that the WAL representation can 
> be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.
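
A minimal sketch of the decoupling described above, assuming an interface with a single getName method. Only the getName idea comes from the issue text; the interface and class names below are hypothetical, and java.nio.file.Path is used in place of the Hadoop filesystem path so the example stays self-contained.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of a WAL identity; not the interface from the patch.
interface WalIdentitySketch {
  // A name that identifies the WAL regardless of what backs it.
  String getName();
}

// Filesystem-backed identity: the file name of the path serves as the name.
final class FileBackedIdentity implements WalIdentitySketch {
  private final Path path;

  FileBackedIdentity(String path) {
    this.path = Paths.get(path);
  }

  @Override
  public String getName() {
    return path.getFileName().toString();
  }
}

// Stream-backed identity: there is no path at all, only the stream's name.
final class StreamBackedIdentity implements WalIdentitySketch {
  private final String streamName;

  StreamBackedIdentity(String streamName) {
    this.streamName = streamName;
  }

  @Override
  public String getName() {
    return streamName;
  }
}
```

Callers that only hold a WalIdentitySketch never see whether a file or a log stream sits behind it, which is the reason the seemingly redundant name exists in the filesystem case.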





[jira] [Assigned] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Leach reassigned HBASE-21586:
--

Assignee: John Leach

> Do not allocate empty arrays in hbase.
> --
>
> Key: HBASE-21586
> URL: https://issues.apache.org/jira/browse/HBASE-21586
> Project: HBase
>  Issue Type: Improvement
>Reporter: John Leach
>Assignee: John Leach
>Priority: Trivial
>
> Small nit, but it is better to use the static empty arrays than to create new 
> ones.





[jira] [Created] (HBASE-21586) Do not allocate empty arrays in hbase.

2018-12-11 Thread John Leach (JIRA)
John Leach created HBASE-21586:
--

 Summary: Do not allocate empty arrays in hbase.
 Key: HBASE-21586
 URL: https://issues.apache.org/jira/browse/HBASE-21586
 Project: HBase
  Issue Type: Improvement
Reporter: John Leach


Small nit, but it is better to use the static empty arrays than to create new ones.





[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717539#comment-16717539
 ] 

Hadoop QA commented on HBASE-21505:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch passed checkstyle in hbase-protocol-shaded 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch passed checkstyle in hbase-hadoop-compat 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch passed checkstyle in hbase-hadoop2-compat 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch passed checkstyle in hbase-protocol 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} The patch passed checkstyle in hbase-client {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} hbase-server: The patch generated 0 new + 85 
unchanged - 3 fixed = 85 total (was 88) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch passed checkstyle in hbase-shell {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red}  0m  
6s{color} | {color:red} The patch generated 55 new + 405 unchanged - 9 fixed = 
460 total (was 414) {color} |
| {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange}  
0m  2s{color} | {color:orange} The patch generated 3 new + 748 unchanged - 1 
fixed = 751 total (was 749) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 26s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
2m 

[jira] [Commented] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-11 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717531#comment-16717531
 ] 

Wellington Chevreuil commented on HBASE-21406:
--

Attached a new version addressing the reported checkstyle issues. The test 
failures don't seem related, but I increased the sleep time inside one of the 
tests in 'hadoop.hbase.replication.TestReplicationStatus', which passes 
locally but fails on Jenkins.

> "status 'replication'" should not show SINK if the cluster does not act as 
> sink
> ---
>
> Key: HBASE-21406
> URL: https://issues.apache.org/jira/browse/HBASE-21406
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daisuke Kobayashi
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-21406-branch-1.001.patch, 
> HBASE-21406-master.001.patch, HBASE-21406-master.002.patch, Screen Shot 
> 2018-10-31 at 18.12.54.png
>
>
> With one-way replication, from source to target, {{status 'replication'}} on 
> the source always dumps SINK with meaningless metrics. SINK only makes sense 
> when the command is run on the target cluster.
> For example, with {{status 'replication'}} on the source, {{AgeOfLastAppliedOp}} 
> is always zero and {{TimeStampsOfLastAppliedOp}} is not updated after the RS 
> starts, since the RS is not acting as a sink.
> {noformat}
> source-1.com
>SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, 
> TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0
>SINK  : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 
> 23:56:53 PDT 2018
> {noformat}
> {{status 'replication'}} on target works as expected. SOURCE is empty as it's 
> not acting as source:
> {noformat}
> target-1.com
>SOURCE:
>SINK  : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 
> 23:44:08 PDT 2018
> {noformat}
> This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always 
> returns a value (not null).
> 1.X
> https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204
> 2.X
> https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399
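
One way to picture the behaviour being requested is shown below, purely as a sketch and not the attached patch. It assumes the sink metrics are modelled as an Optional that is empty when the server has never acted as a sink; the class and method names are hypothetical, and the real formatting lives in the shell's Ruby code rather than in Java.

```java
import java.util.Optional;

// Hypothetical formatter; illustrates omitting SINK when there is nothing to report.
final class ReplicationStatusSketch {
  // The SOURCE line is always printed; the SINK line only when sink metrics exist.
  static String format(String host, String source, Optional<String> sink) {
    StringBuilder out = new StringBuilder(host).append('\n');
    out.append("   SOURCE: ").append(source).append('\n');
    sink.ifPresent(s -> out.append("   SINK  : ").append(s).append('\n'));
    return out.toString();
  }
}
```

Calling format with Optional.empty() for a pure source cluster yields only the host and SOURCE lines, while a target cluster that passes its metrics still gets the SINK line.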





[jira] [Updated] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-11 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21406:
-
Attachment: HBASE-21406-master.002.patch

> "status 'replication'" should not show SINK if the cluster does not act as 
> sink
> ---
>
> Key: HBASE-21406
> URL: https://issues.apache.org/jira/browse/HBASE-21406
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daisuke Kobayashi
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-21406-branch-1.001.patch, 
> HBASE-21406-master.001.patch, HBASE-21406-master.002.patch, Screen Shot 
> 2018-10-31 at 18.12.54.png
>
>
> With one-way replication, from source to target, {{status 'replication'}} on 
> the source always dumps SINK with meaningless metrics. SINK only makes sense 
> when the command is run on the target cluster.
> For example, with {{status 'replication'}} on the source, {{AgeOfLastAppliedOp}} 
> is always zero and {{TimeStampsOfLastAppliedOp}} is not updated after the RS 
> starts, since the RS is not acting as a sink.
> {noformat}
> source-1.com
>SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, 
> TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0
>SINK  : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 
> 23:56:53 PDT 2018
> {noformat}
> {{status 'replication'}} on target works as expected. SOURCE is empty as it's 
> not acting as source:
> {noformat}
> target-1.com
>SOURCE:
>SINK  : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 
> 23:44:08 PDT 2018
> {noformat}
> This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always 
> returns a value (not null).
> 1.X
> https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204
> 2.X
> https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399





[jira] [Commented] (HBASE-21512) Introduce an AsyncClusterConnection and replace the usage of ClusterConnection

2018-12-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717425#comment-16717425
 ] 

Hudson commented on HBASE-21512:


Results for branch HBASE-21512
[build #14 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/14/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/14//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/14//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/14//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Introduce an AsyncClusterConnection and replace the usage of ClusterConnection
> --
>
> Key: HBASE-21512
> URL: https://issues.apache.org/jira/browse/HBASE-21512
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>
> At least for the RSProcedureDispatcher, with CompletableFuture we do not need 
> to set a delay and use a thread pool any more, which could reduce the 
> resource usage and also the latency.
> Once this is done, I think we can remove the ClusterConnection completely, 
> and start to rewrite the old sync client based on the async client, which 
> could reduce the code base a lot for our client.
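
The point about CompletableFuture can be sketched as follows, under the assumption that the RPC layer hands back a future that it completes when the response arrives. The dispatcher then attaches a callback instead of parking a pooled thread or re-checking after a delay. AsyncDispatchSketch and dispatch are hypothetical names; this is not the RSProcedureDispatcher code.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of callback-style dispatch; not HBase code.
final class AsyncDispatchSketch {
  // Registers a completion callback on the RPC future; no thread waits for the result.
  static CompletableFuture<String> dispatch(
      String server, CompletableFuture<String> rpc, List<String> log) {
    return rpc.whenComplete((response, error) -> {
      if (error != null) {
        log.add(server + " failed: " + error.getMessage());
      } else {
        log.add(server + " -> " + response);
      }
    });
  }
}
```

The callback runs as soon as the future is completed, whether normally or exceptionally, so the latency of a fixed polling delay goes away along with the dedicated waiting thread.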




