[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21570:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+.

Thanks [~stack] for reviewing.

> Add write buffer periodic flush support for AsyncBufferedMutator
> 
>
> Key: HBASE-21570
> URL: https://issues.apache.org/jira/browse/HBASE-21570
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3
>
> Attachments: HBASE-21570-v1.patch, HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.
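
As background, a periodic flush sends buffered mutations once a time limit passes, even if the size threshold is never reached. Below is a minimal, self-contained Java sketch of that idea. It is not the code from the attached patches: the class `PeriodicFlushBufferSketch`, its constructor parameters, and the `sink` callback are hypothetical, and a plain `ScheduledExecutorService` stands in for whatever timer the real client uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write buffer that flushes on size or on a fixed period. */
final class PeriodicFlushBufferSketch implements AutoCloseable {
    private final List<String> buffer = new ArrayList<>();
    private final int maxBuffered;
    private final Consumer<List<String>> sink;
    private final ScheduledExecutorService timer;

    PeriodicFlushBufferSketch(int maxBuffered, long periodMs,
                              Consumer<List<String>> sink) {
        this.maxBuffered = maxBuffered;
        this.sink = sink;
        // Daemon thread so an unclosed buffer does not keep the JVM alive.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-flush");
            t.setDaemon(true);
            return t;
        });
        // Time-based trigger: flush whatever is buffered every periodMs.
        timer.scheduleAtFixedRate(this::flush, periodMs, periodMs,
                                  TimeUnit.MILLISECONDS);
    }

    /** Buffers one mutation; size-based trigger flushes when the buffer is full. */
    synchronized void mutate(String mutation) {
        buffer.add(mutation);
        if (buffer.size() >= maxBuffered) {
            flush();
        }
    }

    /** Hands a copy of the buffered mutations to the sink and clears the buffer. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    /** Stops the timer and flushes anything still buffered. */
    @Override
    public void close() {
        timer.shutdownNow();
        flush();
    }
}
```

With a small `maxBuffered` and a long period the size trigger dominates; with a large `maxBuffered` and a short period, slow writers still see their mutations sent within roughly one period.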



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21453) Convert ReadOnlyZKClient to DEBUG instead of INFO

2018-12-10 Thread Peter Somogyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi updated HBASE-21453:
--
   Resolution: Fixed
Fix Version/s: 2.1.3
   2.2.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-2.1+.

Thanks for the patch [~jatsakthi]!

> Convert ReadOnlyZKClient to DEBUG instead of INFO
> -
>
> Key: HBASE-21453
> URL: https://issues.apache.org/jira/browse/HBASE-21453
> Project: HBase
>  Issue Type: Bug
>  Components: logging, Zookeeper
>Reporter: stack
>Assignee: Sakthi
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.3
>
> Attachments: hbase-21453.master.001.patch
>
>
> Running commands in spark-shell, this is what it looks like on each 
> invocation:
> {code}
> scala> val count = rdd.count()
> 2018-11-07 21:01:46,026 INFO  [Executor task launch worker for task 1] 
> zookeeper.ReadOnlyZKClient: Connect 0x18f3d868 to localhost:2181 with session 
> timeout=9ms, retries 30, retry interval 1000ms, keepAlive=6ms
> 2018-11-07 21:01:46,027 INFO  [ReadOnlyZKClient-localhost:2181@0x18f3d868] 
> zookeeper.ZooKeeper: Initiating client connection, 
> connectString=localhost:2181 sessionTimeout=9 
> watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$20/1362339879@743dab9f
> 2018-11-07 21:01:46,030 INFO  
> [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] 
> zookeeper.ClientCnxn: Opening socket connection to server 
> localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL 
> (unknown error)
> 2018-11-07 21:01:46,031 INFO  
> [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] 
> zookeeper.ClientCnxn: Socket connection established to 
> localhost/127.0.0.1:2181, initiating session
> 2018-11-07 21:01:46,033 INFO  
> [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] 
> zookeeper.ClientCnxn: Session establishment complete on server 
> localhost/127.0.0.1:2181, sessionid = 0x166f1b283080005, negotiated timeout = 
> 4
> 2018-11-07 21:01:46,035 INFO  [Executor task launch worker for task 1] 
> mapreduce.TableInputFormatBase: Input split length: 0 bytes.
> [Stage 1:>  (0 + 1) / 
> 1]2018-11-07 21:01:48,074 INFO  [Executor task launch worker for task 1] 
> zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x18f3d868 to 
> localhost:2181
> 2018-11-07 21:01:48,075 INFO  [ReadOnlyZKClient-localhost:2181@0x18f3d868] 
> zookeeper.ZooKeeper: Session: 0x166f1b283080005 closed
> 2018-11-07 21:01:48,076 INFO  [ReadOnlyZKClient 
> -localhost:2181@0x18f3d868-EventThread] zookeeper.ClientCnxn: EventThread 
> shut down for session: 0x166f1b283080005
> count: Long = 10
> {code}
> Let me turn down the ReadOnlyZKClient log level.





[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-10 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: HBASE-21565.master.003.patch

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch
>
>
> During a cluster restart, two kinds of SCP can be scheduled for the same 
> server: one triggered by a ZK session timeout, and one triggered when a new 
> server reports in and causes the stale one to fail over. The only barrier 
> between these two kinds of SCP is a check of whether the server is already 
> in the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> The problem is that when the master finishes initialization, it deletes all 
> stale servers from the dead server list. Thus, when the SCP for the ZK session 
> timeout comes in, the barrier has already been removed.
> The following logs show how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Two SCPs are now scheduled for the same server, and the first procedure 
> finishes only after the second SCP has started.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> As a result, regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> The log below shows that the server was removed from the dead server list 
> before the second SCP started.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
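
The race described above can be reproduced in miniature. The following Java sketch is not HBase code: `DeadServerBarrierSketch` and its methods are hypothetical stand-ins for the dead server list and the `isDeadServer` check quoted in the report. It only illustrates why removing the entry early lets a second procedure through.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the dead-server barrier; not HBase's DeadServer class. */
final class DeadServerBarrierSketch {
    private final Set<String> deadServers = new HashSet<>();
    private int scheduledProcedures = 0;

    /**
     * Mirrors the quoted check: refuse to schedule a crash procedure
     * when the server is already in the dead server list.
     */
    synchronized boolean expireServer(String serverName) {
        if (deadServers.contains(serverName)) {
            return false; // crash processing already in progress
        }
        deadServers.add(serverName);
        scheduledProcedures++;
        return true;
    }

    /** Models the cleanup that runs when master initialization finishes. */
    synchronized void removeFromDeadList(String serverName) {
        deadServers.remove(serverName);
    }

    /** Number of crash procedures scheduled so far. */
    synchronized int scheduledProcedures() {
        return scheduledProcedures;
    }

    public static void main(String[] args) {
        DeadServerBarrierSketch barrier = new DeadServerBarrierSketch();
        String rs = "c4-hadoop-tst-st27.bj,29100,1544153846859";

        barrier.expireServer(rs);       // first SCP is scheduled
        barrier.removeFromDeadList(rs); // entry removed while that SCP still runs
        barrier.expireServer(rs);       // barrier is gone, second SCP is scheduled

        System.out.println("procedures scheduled: " + barrier.scheduledProcedures());
    }
}
```

In this model, keeping the entry in the set until the first procedure has finished makes the second `expireServer` call return `false`, which is the behaviour the report asks for.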





[jira] [Updated] (HBASE-21453) Convert ReadOnlyZKClient to DEBUG instead of INFO

2018-12-10 Thread Peter Somogyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi updated HBASE-21453:
--
Release Note: Log level of ReadOnlyZKClient moved to debug.

> Convert ReadOnlyZKClient to DEBUG instead of INFO
> -
>
> Key: HBASE-21453
> URL: https://issues.apache.org/jira/browse/HBASE-21453
> Project: HBase
>  Issue Type: Bug
>  Components: logging, Zookeeper
>Reporter: stack
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-21453.master.001.patch
>
>





[jira] [Updated] (HBASE-21578) Fix wrong throttling exception for capacity unit

2018-12-10 Thread Yi Mei (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Mei updated HBASE-21578:
---
Attachment: HBASE-21578.master.001.patch

> Fix wrong throttling exception for capacity unit
> 
>
> Key: HBASE-21578
> URL: https://issues.apache.org/jira/browse/HBASE-21578
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Mei
>Priority: Major
> Attachments: HBASE-21578.master.001.patch
>
>
> HBASE-21034 provides a new throttle type: capacity unit, but the throttling 
> exception is confusing: 
>  
> {noformat}
> 2018-12-11 14:38:41,503 DEBUG [Time-limited test] 
> client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=7, 
> started=0 ms ago, cancelled=false, 
> msg=org.apache.hadoop.hbase.quotas.RpcThrottlingException: write size limit 
> exceeded - wait 10sec
> at 
> org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwThrottlingException(RpcThrottlingException.java:106)
> at 
> org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwWriteSizeExceeded(RpcThrottlingException.java:96)
> at 
> org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:179){noformat}
> The exception message needs to be clearer.
>  
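
One way to keep such messages unambiguous is to derive the text from the throttle type that actually tripped. The Java sketch below is purely illustrative and is not the change in the attached patch: the `Kind` enum, its constant names, and the capacity-unit wording are assumptions. Only the `write size limit exceeded - wait 10sec` text comes from the log above.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical mapping from throttle kind to message; not HBase's API. */
final class ThrottleMessageSketch {
    /** Illustrative throttle kinds; the names are assumptions. */
    enum Kind { WRITE_SIZE, WRITE_CAPACITY_UNIT }

    private static final Map<Kind, String> MESSAGES = new EnumMap<>(Kind.class);
    static {
        MESSAGES.put(Kind.WRITE_SIZE, "write size limit exceeded");
        // Assumed wording for the capacity-unit case.
        MESSAGES.put(Kind.WRITE_CAPACITY_UNIT, "write capacity unit exceeded");
    }

    /** Builds the message for the given kind, with the wait rounded down to seconds. */
    static String message(Kind kind, long waitMillis) {
        return MESSAGES.get(kind) + " - wait " + (waitMillis / 1000) + "sec";
    }

    private ThrottleMessageSketch() {
    }
}
```

Keying the text on the kind means a capacity-unit throttle can no longer surface with the write-size wording.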





[jira] [Updated] (HBASE-21580) Support getting Hbck instance from AsyncClusterConnection

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21580:
--
Summary: Support getting Hbck instance from AsyncClusterConnection  (was: 
Support getting Hbck instance for AsyncClusterConnection)

> Support getting Hbck instance from AsyncClusterConnection
> -
>
> Key: HBASE-21580
> URL: https://issues.apache.org/jira/browse/HBASE-21580
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>






[jira] [Created] (HBASE-21580) Support getting Hbck instance for AsyncClusterConnection

2018-12-10 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21580:
-

 Summary: Support getting Hbck instance for AsyncClusterConnection
 Key: HBASE-21580
 URL: https://issues.apache.org/jira/browse/HBASE-21580
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang








[jira] [Created] (HBASE-21579) Use AsyncClusterConnection in Replication related classes

2018-12-10 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21579:
-

 Summary: Use AsyncClusterConnection in Replication related classes
 Key: HBASE-21579
 URL: https://issues.apache.org/jira/browse/HBASE-21579
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang








[jira] [Created] (HBASE-21578) Fix wrong throttling exception for capacity unit

2018-12-10 Thread Yi Mei (JIRA)
Yi Mei created HBASE-21578:
--

 Summary: Fix wrong throttling exception for capacity unit
 Key: HBASE-21578
 URL: https://issues.apache.org/jira/browse/HBASE-21578
 Project: HBase
  Issue Type: Bug
Reporter: Yi Mei


HBASE-21034 provides a new throttle type: capacity unit, but the throttling 
exception is confusing: 

 
{noformat}
2018-12-11 14:38:41,503 DEBUG [Time-limited test] 
client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=7, 
started=0 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.quotas.RpcThrottlingException: write size limit 
exceeded - wait 10sec
at 
org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwThrottlingException(RpcThrottlingException.java:106)
at 
org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwWriteSizeExceeded(RpcThrottlingException.java:96)
at 
org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:179){noformat}
The exception message needs to be clearer.

 





[jira] [Updated] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21538:
--
Assignee: Duo Zhang
  Status: Patch Available  (was: Open)

> Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
> ---
>
> Key: HBASE-21538
> URL: https://issues.apache.org/jira/browse/HBASE-21538
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-21538-HBASE-21512.patch
>
>






[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716354#comment-16716354
 ] 

Hadoop QA commented on HBASE-21577:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
52s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
53s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}260m  2s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}305m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.client.TestFromClientSide |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21577 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951296/HBASE-21577.master.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 157d38982940 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / da9508d427 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/15240/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 

[jira] [Updated] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21538:
--
Attachment: HBASE-21538-HBASE-21512.patch

> Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
> ---
>
> Key: HBASE-21538
> URL: https://issues.apache.org/jira/browse/HBASE-21538
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-21538-HBASE-21512.patch
>
>






[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716236#comment-16716236
 ] 

Hadoop QA commented on HBASE-21570:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  4m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
59s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} hbase-client: The patch generated 0 new + 3 
unchanged - 1 fixed = 3 total (was 4) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} The patch passed checkstyle in hbase-server {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
59s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
10s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}242m 35s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}296m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21570 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951289/HBASE-21570-v1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716206#comment-16716206
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 14 new or modified test 
files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
33s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
25s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
46s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} The patch passed checkstyle in hbase-common {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} hbase-server: The patch generated 0 new + 58 
unchanged - 1 fixed = 58 total (was 59) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
47s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
11s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
30s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 
2 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
43s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}130m  
7s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  org.apache.hadoop.hbase.wal.DisabledWALProvider$1.equals(Object) always 
returns true  At DisabledWALProvider.java:At DisabledWALProvider.java:[line 81] 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 

[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716174#comment-16716174
 ] 

Hudson commented on HBASE-21567:


Results for branch branch-2.1
[build #674 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716169#comment-16716169
 ] 

Hudson commented on HBASE-21567:


Results for branch branch-2.0
[build #1154 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org





[jira] [Updated] (HBASE-21526) Use AsyncClusterConnection in ServerManager for getRsAdmin

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21526:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HBASE-21512
   Status: Resolved  (was: Patch Available)

Rebased and pushed to branch HBASE-21512. Thanks [~stack] for reviewing.

> Use AsyncClusterConnection in ServerManager for getRsAdmin
> --
>
> Key: HBASE-21526
> URL: https://issues.apache.org/jira/browse/HBASE-21526
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: HBASE-21512
>
> Attachments: HBASE-21526-HBASE-21512-v1.patch, 
> HBASE-21526-HBASE-21512-v2.patch, HBASE-21526-HBASE-21512.patch
>
>






[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716071#comment-16716071
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #58 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"'ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716052#comment-16716052
 ] 

Reid Chan commented on HBASE-21246:
---

Overall LGTM!

A few suggestions:
{code}
public FSWALIdentity(Path path) 

public FSWALIdentity(String name)
{code}
Can we add a pre-null check, a NotNullable annotation, or javadoc to call 
attention to the non-null requirement?
Passing a null object to WALIdentity makes no sense to me.

The property 'name' looks redundant to me in FSWALIdentity:
{code}
  @Override
  public String getName() {
    // can always be replaced with path.getName(), no need of extra property.
    return name;
  }
{code}
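
A minimal sketch of both suggestions together, under stated assumptions: the WALIdentity interface below is a hypothetical stand-in with only getName, and java.nio.file.Path is used in place of org.apache.hadoop.fs.Path so the example compiles on its own. It is not the code in the patch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;

// Hypothetical stand-in for the WALIdentity interface discussed in this issue.
interface WALIdentity {
  String getName();
}

// Sketch only: java.nio.file.Path stands in for org.apache.hadoop.fs.Path.
final class FSWALIdentity implements WALIdentity {
  private final Path path;

  FSWALIdentity(Path path) {
    // Fail fast at construction instead of carrying a null identity around.
    this.path = Objects.requireNonNull(path, "path must not be null");
  }

  FSWALIdentity(String name) {
    this(Paths.get(Objects.requireNonNull(name, "name must not be null")));
  }

  @Override
  public String getName() {
    // Derived from the path, so no separate 'name' field is kept.
    Path fileName = path.getFileName();
    return fileName == null ? "" : fileName.toString();
  }
}
```

With this shape, new FSWALIdentity((String) null) throws a NullPointerException at the call site rather than failing later when the identity is used.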

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.HBASE-20952.003.patch, HBASE-21246.master.001.patch, 
> HBASE-21246.master.002.patch, replication-src-creates-wal-reader.jpg, 
> wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, 
> wal-splitter-writer.jpg
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716035#comment-16716035
 ] 

Sean Busbey commented on HBASE-21553:
-

It looks like the addition of a shared lock check for the namespace came in 
HBASE-15105, which means branch-1.2 doesn't have the missed lock release.

Cleaning up to use try/finally for unlocks is probably still a good idea, but 
it is probably better done as a different JIRA so that folks don't think the 
same risk of deadlock is getting fixed.
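
For reference, a minimal sketch of the try/finally unlock pattern being discussed. SketchScheduler and its methods are hypothetical names for illustration, not the MasterProcedureScheduler code; the point is only that an early return inside the try block still releases the lock.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical scheduler used only to illustrate the locking pattern.
final class SketchScheduler {
  private final ReentrantLock schedLock = new ReentrantLock();
  private final Queue<String> runnables = new ArrayDeque<>();

  void add(String proc) {
    schedLock.lock();
    try {
      runnables.add(proc);
    } finally {
      schedLock.unlock();
    }
  }

  // Returns null early when nothing is queued; the finally block still unlocks.
  String poll() {
    schedLock.lock();
    try {
      if (runnables.isEmpty()) {
        return null;
      }
      return runnables.remove();
    } finally {
      schedLock.unlock();
    }
  }

  // Exposed so callers can check that no code path leaked the lock.
  boolean isLockedByCurrentThread() {
    return schedLock.isHeldByCurrentThread();
  }
}
```

Taking the lock immediately before the try keeps lock() and unlock() paired even when a new early return is added later.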

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21553:

Component/s: proc-v2

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21553:

Issue Type: Bug  (was: Improvement)

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21553:

Priority: Critical  (was: Major)

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Critical
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Commented] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716029#comment-16716029
 ] 

Hadoop QA commented on HBASE-21575:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
35s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}135m 30s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestMultiColumnScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21575 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951281/HBASE-21575.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7ac9b1373e6a 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / da9508d427 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/15238/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 

[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716028#comment-16716028
 ] 

Hudson commented on HBASE-21553:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #508 (See 
[https://builds.apache.org/job/HBase-1.3-IT/508/])
HBASE-21553 schedLock not released in MasterProcedureScheduler (apurtell: rev 
b9adb955cde19746219b3efd8500c7ba7239ae56)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java


> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715996#comment-16715996
 ] 

Duo Zhang commented on HBASE-21577:
---

Is it possible to just set fsOk to false when there is a 
DroppedSnapshotException?

> do not close regions when RS is dying due to a broken WAL
> -
>
> Key: HBASE-21577
> URL: https://issues.apache.org/jira/browse/HBASE-21577
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21577.master.001.patch
>
>
> See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
> broken, some regions whose flushes are already in flight keep retrying, 
> resulting in minutes-long shutdown times. Since the WAL will be replayed 
> anyway, flushing regions doesn't provide much benefit.





[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716012#comment-16716012
 ] 

Andrew Purtell commented on HBASE-21553:


[~busbey] Do you want this in 1.2?

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21553:
---
Fix Version/s: 1.3.3

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715995#comment-16715995
 ] 

Karan Mehta commented on HBASE-21553:
-

Is this not going into branch-1.3 or branch-1.2?

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21246:
--
Attachment: HBASE-21246.HBASE-20952.003.patch

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.HBASE-20952.003.patch, HBASE-21246.master.001.patch, 
> HBASE-21246.master.002.patch, replication-src-creates-wal-reader.jpg, 
> wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, 
> wal-splitter-writer.jpg
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21553:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.10
   1.5.0
   Status: Resolved  (was: Patch Available)

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock which can cause deadlock.
> Besides this, there are other places in this class that handle 
> schedLock.unlock in a risky manner. I'd like to move them into finally blocks 
> to improve the robustness of lock handling.





[jira] [Created] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21577:


 Summary: do not close regions when RS is dying due to a broken WAL
 Key: HBASE-21577
 URL: https://issues.apache.org/jira/browse/HBASE-21577
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
broken, some regions whose flushes are already in flight keep retrying, 
resulting in minutes-long shutdown times. Since the WAL will be replayed 
anyway, flushing regions doesn't provide much benefit.





[jira] [Updated] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21577:
-
Attachment: HBASE-21577.master.001.patch

> do not close regions when RS is dying due to a broken WAL
> -
>
> Key: HBASE-21577
> URL: https://issues.apache.org/jira/browse/HBASE-21577
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21577.master.001.patch
>
>
> See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
> broken, some regions whose flushes are already in flight keep retrying, 
> resulting in minutes-long shutdown times. Since the WAL will be replayed 
> anyway, flushing regions doesn't provide much benefit.





[jira] [Updated] (HBASE-21577) do not close regions when RS is dying due to a broken WAL

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21577:
-
Status: Patch Available  (was: Open)

> do not close regions when RS is dying due to a broken WAL
> -
>
> Key: HBASE-21577
> URL: https://issues.apache.org/jira/browse/HBASE-21577
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21577.master.001.patch
>
>
> See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is 
> broken, some regions whose flushes are already in flight keep retrying, 
> resulting in minutes-long shutdown times. Since the WAL will be replayed 
> anyway, flushing regions doesn't provide much benefit.





[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-10 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715927#comment-16715927
 ] 

Guanghao Zhang commented on HBASE-21568:


+1.

> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>
> [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow 
> callers to specify that they do not want to use a block cache when reading an 
> HFile.
> If the BucketCache is set up to use the FileSystem, we can have a situation 
> where the client tries to instantiate the BucketCache and is disallowed due 
> to filesystem permissions:
> {code:java}
> 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: 
> Failed allocating cache on /mnt/hbase/cache.data
> java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> LoadIncrementalHFiles should provide {{CacheConfig.DISABLED}}.





[jira] [Commented] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715926#comment-16715926
 ] 

Hadoop QA commented on HBASE-21406:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
57s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
22s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} hbase-hadoop2-compat: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
4s{color} | {color:red} hbase-server: The patch generated 3 new + 8 unchanged - 
0 fixed = 11 total (was 8) {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red}  0m  
7s{color} | {color:red} The patch generated 25 new + 409 unchanged - 5 fixed = 
434 total (was 414) {color} |
| {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange}  
0m  2s{color} | {color:orange} The patch generated 1 new + 748 unchanged - 1 
fixed = 749 total (was 749) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
52s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 31s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
2m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-hadoop2-compat in the 

[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715918#comment-16715918
 ] 

Hadoop QA commented on HBASE-21564:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 8s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
55s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}276m 26s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
36s{color} | {color:green} hbase-backup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}336m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.TestReplicationEndpoint |
|   | hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectory |
|   | hadoop.hbase.client.TestFromClientSide3 |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas |
|   | hadoop.hbase.master.TestAssignmentManagerMetrics |
|   | hadoop.hbase.TestClientOperationTimeout |
|   | 
hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleAsyncWAL |
|   | hadoop.hbase.client.TestAdmin1 |
|   | 
hadoop.hbase.master.replication.TestTransitPeerSyncReplicationStateProcedureRetry
 |
|   | 

[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21570:
--
Attachment: HBASE-21570-v1.patch

> Add write buffer periodic flush support for AsyncBufferedMutator
> 
>
> Key: HBASE-21570
> URL: https://issues.apache.org/jira/browse/HBASE-21570
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3
>
> Attachments: HBASE-21570-v1.patch, HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.
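
For readers who want a picture of what "write buffer periodic flush" means in general, here is a minimal sketch of a buffer that flushes when it fills up and also on a timer. It is not the code from the attached patch and it uses no HBase classes. `PeriodicFlushBuffer`, its constructor parameters, and the `sink` callback are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write buffer for illustration; not an HBase class. */
class PeriodicFlushBuffer<T> implements AutoCloseable {
  private final int maxBuffered;
  private final Consumer<List<T>> sink;
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();
  private final ScheduledFuture<?> periodic;
  private List<T> buffer = new ArrayList<>();

  PeriodicFlushBuffer(int maxBuffered, long flushPeriodMs, Consumer<List<T>> sink) {
    this.maxBuffered = maxBuffered;
    this.sink = sink;
    // Timer-based flush: runs even if the buffer never reaches maxBuffered.
    this.periodic = timer.scheduleAtFixedRate(
        this::flush, flushPeriodMs, flushPeriodMs, TimeUnit.MILLISECONDS);
  }

  /** Buffers one item and flushes immediately once the size limit is hit. */
  synchronized void add(T item) {
    buffer.add(item);
    if (buffer.size() >= maxBuffered) {
      flush();
    }
  }

  /** Hands the buffered items to the sink as one batch; no-op when empty. */
  synchronized void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    List<T> batch = buffer;
    buffer = new ArrayList<>();
    sink.accept(batch);
  }

  /** Stops the timer, then flushes whatever is still buffered. */
  @Override
  public void close() {
    periodic.cancel(false);
    timer.shutdown();
    flush();
  }
}
```

Without the timer, a slow writer could leave items sitting in the buffer indefinitely. The periodic task bounds how long an item can wait. In this sketch an exception thrown by `sink` inside the timer task would stop further periodic runs, so a real implementation would need to handle that.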





[jira] [Updated] (HBASE-21576) master should proactively reassign meta when killing a RS with it

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21576:
-
Description: 
Master has killed an RS that was hosting meta due to some HDFS issue (most 
likely; I've lost the RS logs due to HBASE-21575).
RS took a very long time to die (again, might be a separate bug, I'll file if I 
see repro), and a long time to restart; meanwhile master never tried to 
reassign meta, and eventually killed itself not being able to update it.
It seems like a RS on a bad machine would be especially prone to slow 
abort/startup, as well as to issues causing master to kill it, so it would make 
sense for master to immediately relocate meta once meta-hosting RS is dead 
after a kill; or even when killing the RS. In the former case (if the RS needs 
to die for meta to be reassigned safely), perhaps the RS hosting meta in 
particular should try to die fast in such circumstances, and not do any cleanup.
{noformat}
2018-12-08 04:52:55,144 WARN  
[RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] 
master.MasterRpcServices: ,17020,1544264858183 reported a fatal error:
* ABORTING region server ,17020,1544264858183: Replay of WAL 
required. Forcing server shutdown *
 [aborting for ~7 minutes]
2018-12-08 04:53:44,190 INFO  [PEWorker-7] client.RpcRetryingCallerImpl: Call 
exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server 
,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' 
at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [starting for ~5]
2018-12-08 04:59:58,574 INFO  
[RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] 
client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, 
started=392702 ms ago, cancelled=false, msg=Call to  failed on 
connection exception: 
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out: , details=row '...' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [re-initializing for at least ~7]
2018-12-08 05:04:17,271 INFO  [hconnection-0x4d58bcd4-shared-pool3-t1877] 
client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, 
started=41137 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server 
,17020,1544274145387 is not running yet
...
2018-12-08 05:11:18,470 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: 
* ABORTING master ...,17000,1544230401860: FAILED persisting region=... 
state=OPEN *
{noformat}

There are no signs of meta assignment activity at all in the master logs.

  was:
Master has killed an RS that was hosting meta due to some internal error (still 
need to see if it's a separate bug or just a machine/HDFS issue, I've lost the 
RS logs due to HBASE-21575).
RS took a very long time to die (again, might be a separate bug, I'll file if I 
see repro), and a long time to restart; meanwhile master never tried to 
reassign meta, and eventually killed itself not being able to update it.
It seems like a RS on a bad machine would be especially prone to slow 
abort/startup, as well as to issues causing master to kill it, so it would make 
sense for master to immediately relocate meta once meta-hosting RS is dead 
after a kill; or even when killing the RS. In the former case (if the RS needs 
to die for meta to be reassigned safely), perhaps the RS hosting meta in 
particular should try to die fast in such circumstances, and not do any cleanup.
{noformat}
2018-12-08 04:52:55,144 WARN  
[RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] 
master.MasterRpcServices: ,17020,1544264858183 reported a fatal error:
* ABORTING region server ,17020,1544264858183: Replay of WAL 
required. Forcing server shutdown *
 [aborting for ~7 minutes]
2018-12-08 04:53:44,190 INFO  [PEWorker-7] client.RpcRetryingCallerImpl: Call 
exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server 
,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' 
at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [starting for ~5]
2018-12-08 04:59:58,574 INFO  
[RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] 
client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, 
started=392702 ms ago, cancelled=false, msg=Call to  failed on 
connection exception: 
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out: , details=row '...' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [re-initializing for at least ~7]
2018-12-08 05:04:17,271 INFO  

[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715849#comment-16715849
 ] 

Hadoop QA commented on HBASE-21246:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 14 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
46s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 1 new + 58 unchanged 
- 1 fixed = 59 total (was 59) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
50s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
11s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
27s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 
2 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
34s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}128m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}171m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  org.apache.hadoop.hbase.wal.DisabledWALProvider$1 defines 
compareTo(Object) and uses Object.equals()  At 
DisabledWALProvider.java:Object.equals()  At DisabledWALProvider.java:[line 67] 
|
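
For context on the FindBugs row above: the warning concerns a class that defines `compareTo` but inherits `Object.equals()`, so two instances can compare as equal without being `equals`. A minimal sketch of the usual remedy follows. `LogId` is a hypothetical class used only for illustration; it is not the anonymous class in `DisabledWALProvider` and not taken from the patch.

```java
import java.util.Objects;

/** Hypothetical identity type for illustration; not a class from the patch. */
final class LogId implements Comparable<LogId> {
  private final String name;

  LogId(String name) {
    this.name = Objects.requireNonNull(name);
  }

  @Override
  public int compareTo(LogId other) {
    return name.compareTo(other.name);
  }

  // equals() is overridden so that compareTo(x) == 0 exactly when equals(x).
  @Override
  public boolean equals(Object o) {
    return o instanceof LogId && name.equals(((LogId) o).name);
  }

  // hashCode() must agree with equals().
  @Override
  public int hashCode() {
    return name.hashCode();
  }
}
```

Keeping the three methods consistent means sorted collections such as `TreeSet` and hash-based collections such as `HashSet` agree on which elements are duplicates.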
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951267/HBASE-21246.master.002.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  

[jira] [Updated] (HBASE-21576) master should proactively reassign meta when killing a RS with it

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21576:
-
Description: 
Master has killed an RS that was hosting meta due to some internal error (still 
need to see if it's a separate bug or just a machine/HDFS issue, I've lost the 
RS logs due to HBASE-21575).
RS took a very long time to die (again, might be a separate bug, I'll file if I 
see repro), and a long time to restart; meanwhile master never tried to 
reassign meta, and eventually killed itself not being able to update it.
It seems like a RS on a bad machine would be especially prone to slow 
abort/startup, as well as to issues causing master to kill it, so it would make 
sense for master to immediately relocate meta once meta-hosting RS is dead 
after a kill; or even when killing the RS. In the former case (if the RS needs 
to die for meta to be reassigned safely), perhaps the RS hosting meta in 
particular should try to die fast in such circumstances, and not do any cleanup.
{noformat}
2018-12-08 04:52:55,144 WARN  
[RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] 
master.MasterRpcServices: ,17020,1544264858183 reported a fatal error:
* ABORTING region server ,17020,1544264858183: Replay of WAL 
required. Forcing server shutdown *
 [aborting for ~7 minutes]
2018-12-08 04:53:44,190 INFO  [PEWorker-7] client.RpcRetryingCallerImpl: Call 
exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server 
,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' 
at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [starting for ~5]
2018-12-08 04:59:58,574 INFO  
[RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] 
client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, 
started=392702 ms ago, cancelled=false, msg=Call to  failed on 
connection exception: 
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out: , details=row '...' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [re-initializing for at least ~7]
2018-12-08 05:04:17,271 INFO  [hconnection-0x4d58bcd4-shared-pool3-t1877] 
client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, 
started=41137 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server 
,17020,1544274145387 is not running yet
...
2018-12-08 05:11:18,470 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: 
* ABORTING master ...,17000,1544230401860: FAILED persisting region=... 
state=OPEN *
{noformat}

There are no signs of meta assignment activity at all in the master logs.

  was:
Master has killed an RS that was hosting meta due to some internal error (still 
need to see if it's a separate bug or just a machine/HDFS issue, I've lost the 
RS logs due to HBASE-21575).
RS took a very long time to die (again, might be a separate bug, I'll file if I 
see repro), and a long time to restart; meanwhile master never tried to 
reassign meta, and eventually killed itself not being able to update it.
It seems like a RS on a bad machine would be especially prone to slow 
abort/startup, as well as to issues causing master to kill it, so it would make 
sense for master to immediately relocate meta once meta-hosting RS is dead; or 
even when killing the RS. In the former case (if the RS needs to die for meta 
to be reassigned safely), perhaps the RS hosting meta in particular should try 
to die fast in such circumstances, and not do any cleanup.
{noformat}
2018-12-08 04:52:55,144 WARN  
[RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] 
master.MasterRpcServices: ,17020,1544264858183 reported a fatal error:
* ABORTING region server ,17020,1544264858183: Replay of WAL 
required. Forcing server shutdown *
 [aborting for ~7 minutes]
2018-12-08 04:53:44,190 INFO  [PEWorker-7] client.RpcRetryingCallerImpl: Call 
exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server 
,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' 
at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [starting for ~5]
2018-12-08 04:59:58,574 INFO  
[RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] 
client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, 
started=392702 ms ago, cancelled=false, msg=Call to  failed on 
connection exception: 
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out: , details=row '...' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [re-initializing for at least ~7]
2018-12-08 

[jira] [Created] (HBASE-21576) master should proactively reassign meta when killing a RS with it

2018-12-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21576:


 Summary: master should proactively reassign meta when killing a RS 
with it
 Key: HBASE-21576
 URL: https://issues.apache.org/jira/browse/HBASE-21576
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Master has killed an RS that was hosting meta due to some internal error (still 
need to see if it's a separate bug or just a machine/HDFS issue, I've lost the 
RS logs due to HBASE-21575).
RS took a very long time to die (again, might be a separate bug, I'll file if I 
see repro), and a long time to restart; meanwhile master never tried to 
reassign meta, and eventually killed itself not being able to update it.
It seems like a RS on a bad machine would be especially prone to slow 
abort/startup, as well as to issues causing master to kill it, so it would make 
sense for master to immediately relocate meta once meta-hosting RS is dead; or 
even when killing the RS. In the former case (if the RS needs to die for meta 
to be reassigned safely), perhaps the RS hosting meta in particular should try 
to die fast in such circumstances, and not do any cleanup.
{noformat}
2018-12-08 04:52:55,144 WARN  
[RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] 
master.MasterRpcServices: ,17020,1544264858183 reported a fatal error:
* ABORTING region server ,17020,1544264858183: Replay of WAL 
required. Forcing server shutdown *
 [aborting for ~7 minutes]
2018-12-08 04:53:44,190 INFO  [PEWorker-7] client.RpcRetryingCallerImpl: Call 
exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server 
,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' 
at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [starting for ~5]
2018-12-08 04:59:58,574 INFO  
[RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] 
client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, 
started=392702 ms ago, cancelled=false, msg=Call to  failed on 
connection exception: 
org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: 
connection timed out: , details=row '...' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, 
seqNum=-1
... [re-initializing for at least ~7]
2018-12-08 05:04:17,271 INFO  [hconnection-0x4d58bcd4-shared-pool3-t1877] 
client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, 
started=41137 ms ago, cancelled=false, 
msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server 
,17020,1544274145387 is not running yet
...
2018-12-08 05:11:18,470 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: 
* ABORTING master ...,17000,1544230401860: FAILED persisting region=... 
state=OPEN *
{noformat}

There are no signs of meta assignment activity at all in the master logs.
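
To put a number on the window covered by the log excerpt above, the gap between the first line (the RS abort report) and the last line (the master abort) can be computed from the two timestamps. This is a small sketch using only `java.time`; the class name `OutageWindow` and the method `between` are hypothetical, and the timestamps are copied from the excerpt.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for illustration: elapsed time between two log lines. */
class OutageWindow {
  // Matches the timestamp layout in the excerpt, e.g. "2018-12-08 04:52:55,144".
  private static final DateTimeFormatter LOG_TS =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

  static Duration between(String first, String last) {
    return Duration.between(
        LocalDateTime.parse(first, LOG_TS), LocalDateTime.parse(last, LOG_TS));
  }

  public static void main(String[] args) {
    Duration d = between("2018-12-08 04:52:55,144", "2018-12-08 05:11:18,470");
    // Prints "18 min 23 s"
    System.out.println(d.toMinutes() + " min " + (d.getSeconds() % 60) + " s");
  }
}
```

So roughly 18 minutes pass between the RS reporting the fatal error and the master aborting.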





[jira] [Resolved] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HBASE-21283.
-
  Resolution: Fixed
Release Note: 


The HBase `shell` now includes a command to list regions currently in 
transition.

```
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
21:05:50 UTC 2018

hbase(main):001:0> help 'rit'
List all regions in transition.
Examples:
  hbase> rit

hbase(main):002:0> create ...
0 row(s) in 2.5150 seconds
=> Hbase::Table - IntegrationTestBigLinkedList

hbase(main):003:0> rit
0 row(s) in 0.0340 seconds

hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
0 row(s) in 0.0540 seconds

hbase(main):005:0> rit
IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1. state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null
1 row(s) in 0.0170 seconds
```

> Add new shell command 'rit' for listing regions in transition
> -
>
> Key: HBASE-21283
> URL: https://issues.apache.org/jira/browse/HBASE-21283
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, shell
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, 
> HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, 
> HBASE-21283.patch
>
>
> The 'status' shell command shows regions in transition but sometimes an 
> operator may want to retrieve a simple list of regions in transition. Here's 
> a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I 
> have run TestShell and it passes, so the command is verified to meet minimum 
> requirements, like help text, and manually verified with branch-1 (shell in 
> branch-2 and up doesn't return until TransitRegionProcedure has completed so 
> by that time no RIT):
> {noformat}
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct  8 
> 21:05:50 UTC 2018
> hbase(main):001:0> help 'rit'
> List all regions in transition.
> Examples:
>   hbase> rit
> hbase(main):002:0> create ...
> 0 row(s) in 2.5150 seconds
> => Hbase::Table - IntegrationTestBigLinkedList
> hbase(main):003:0> rit
> 0 row(s) in 0.0340 seconds
> hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1'
> 0 row(s) in 0.0540 seconds
> hbase(main):005:0> rit 
> IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1.
>  state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null   
>   
>   
> 
> 1 row(s) in 0.0170 seconds
> {noformat}





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715802#comment-16715802
 ] 

Hudson commented on HBASE-21567:


Results for branch branch-2
[build #1550 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org
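
For illustration only, here is a minimal Java sketch of how `-Dkey=value` arguments like the one above could be collected into a set of configuration overrides. The class `ShellConfigOverrides` and its `parse` method are hypothetical names made up for this example; this is not the code from the attached patches, and the real shell may handle the flags differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; NOT the actual bin/hbase or shell implementation. */
final class ShellConfigOverrides {

    private ShellConfigOverrides() {
    }

    /**
     * Collects every argument of the form -Dkey=value into an ordered map.
     * Arguments without the -D prefix, without '=', or with an empty key are skipped.
     */
    static Map<String, String> parse(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-D")) {
                continue; // not a property override
            }
            int eq = arg.indexOf('=');
            if (eq <= 2) {
                continue; // "-Dkey" with no value, or "-D=value" with no key
            }
            // Key is the text between "-D" and '='; value is everything after '='.
            overrides.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return overrides;
    }
}
```

Each entry of the returned map could then be applied on top of the configuration loaded from hbase-site.xml, so that a value given on the command line wins over the local install's setting.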





[jira] [Reopened] (HBASE-21283) Add new shell command 'rit' for listing regions in transition

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reopened HBASE-21283:
-






[jira] [Commented] (HBASE-21410) A helper page that help find all problematic regions and procedures

2018-12-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715798#comment-16715798
 ] 

Sean Busbey commented on HBASE-21410:
-

Please reopen and then resolve again so you can add a release note calling this 
out.

> A helper page that help find all problematic regions and procedures
> ---
>
> Key: HBASE-21410
> URL: https://issues.apache.org/jira/browse/HBASE-21410
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2
>
> Attachments: HBASE-21410.branch-2.1.001.patch, 
> HBASE-21410.branch-2.1.002.patch, HBASE-21410.master.001.patch, 
> HBASE-21410.master.002.patch, HBASE-21410.master.003.patch, 
> HBASE-21410.master.004.patch, Screenshot from 2018-10-30 19-06-21.png, 
> Screenshot from 2018-10-30 19-06-42.png, Screenshot from 2018-10-31 
> 10-11-38.png, Screenshot from 2018-10-31 10-11-56.png, Screenshot from 
> 2018-11-01 17-56-02.png, Screenshot from 2018-11-01 17-56-15.png
>
>
> *This page mainly focuses on finding regions stuck in a state in which they 
> cannot be assigned. My proposal for the page is as follows:*
> !Screenshot from 2018-10-30 19-06-21.png!
> *From this page we can see all regions in the RIT queue and their related 
> procedures. If we determine that these regions' states are abnormal, we 
> can click the link 'Procedures as TXT' to get a full list of procedure IDs to 
> bypass. Then click 'Regions as TXT' to get a full list of encoded region 
> names to assign.*
> !Screenshot from 2018-10-30 19-06-42.png!
> *Some region names are covered by the navigation bar; I'll fix that later.*





[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21575:
-
Description: 
100s of MB of logs like this:
{noformat}
2018-12-08 10:27:00,462 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
2018-12-08 10:27:00,463 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
2018-12-08 10:27:00,463 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
2018-12-08 10:27:00,464 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3648ms
2018-12-08 10:27:00,464 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3648ms
2018-12-08 10:27:00,465 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3649ms
2018-12-08 10:27:00,465 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3649ms
2018-12-08 10:27:00,466 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3650ms
2018-12-08 10:27:00,466 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3650ms
2018-12-08 10:27:00,467 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3651ms
2018-12-08 10:27:00,469 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3653ms
2018-12-08 10:27:00,470 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3654ms
2018-12-08 10:27:00,470 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3654ms
2018-12-08 10:27:00,471 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3655ms
2018-12-08 10:27:00,471 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3655ms
2018-12-08 10:27:00,472 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3656ms
2018-12-08 10:27:00,472 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3656ms
2018-12-08 10:27:00,473 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3657ms
2018-12-08 10:27:00,474 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3658ms
2018-12-08 10:27:00,475 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3659ms
2018-12-08 10:27:00,476 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3660ms
2018-12-08 10:27:00,476 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3660ms
2018-12-08 10:27:00,477 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3661ms
2018-12-08 10:27:00,477 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3661ms
2018-12-08 10:27:00,478 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3662ms
2018-12-08 10:27:00,479 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3663ms
2018-12-08 10:27:00,479 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3663ms
2018-12-08 10:27:00,480 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 

[jira] [Updated] (HBASE-21410) A helper page that help find all problematic regions and procedures

2018-12-10 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-21410:

Fix Version/s: (was: 2.1.0)
   2.1.2






[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21575:
-
Status: Patch Available  (was: Open)

> memstore above high watermark message is logged too much
> 
>
> Key: HBASE-21575
> URL: https://issues.apache.org/jira/browse/HBASE-21575
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21575.patch
>
>
> 100s of MB of logs like this, in a tight loop:
> {noformat}
> 2018-12-08 10:27:00,462 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3646ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,463 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3647ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,464 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3648ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,465 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3649ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,466 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3650ms
> 2018-12-08 10:27:00,467 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3651ms
> 2018-12-08 10:27:00,469 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3653ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,470 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3654ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,471 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3655ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,472 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3656ms
> 2018-12-08 10:27:00,473 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3657ms
> 2018-12-08 10:27:00,474 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3658ms
> 2018-12-08 10:27:00,475 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3659ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,476 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3660ms
> 2018-12-08 10:27:00,477 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
> regionserver.MemStoreFlusher: Memstore is above high water mark and block 
> 3661ms
> 2018-12-08 10:27:00,477 WARN  
> 

[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21575:
-
Attachment: HBASE-21575.patch


[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21575:
-
Description: 
100s of MB of logs like this, in a tight loop:
{noformat}
2018-12-08 10:27:00,462 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
2018-12-08 10:27:00,463 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
2018-12-08 10:27:00,463 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
2018-12-08 10:27:00,464 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3648ms
2018-12-08 10:27:00,464 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3648ms
2018-12-08 10:27:00,465 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3649ms
2018-12-08 10:27:00,465 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3649ms
2018-12-08 10:27:00,466 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3650ms
2018-12-08 10:27:00,466 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3650ms
2018-12-08 10:27:00,467 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3651ms
2018-12-08 10:27:00,469 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3653ms
2018-12-08 10:27:00,470 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3654ms
2018-12-08 10:27:00,470 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3654ms
2018-12-08 10:27:00,471 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3655ms
2018-12-08 10:27:00,471 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3655ms
2018-12-08 10:27:00,472 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3656ms
2018-12-08 10:27:00,472 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3656ms
2018-12-08 10:27:00,473 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3657ms
2018-12-08 10:27:00,474 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3658ms
2018-12-08 10:27:00,475 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3659ms
2018-12-08 10:27:00,476 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3660ms
2018-12-08 10:27:00,476 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3660ms
2018-12-08 10:27:00,477 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3661ms
2018-12-08 10:27:00,477 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3661ms
2018-12-08 10:27:00,478 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3662ms
2018-12-08 10:27:00,479 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3663ms
2018-12-08 10:27:00,479 WARN  
[RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 3663ms
2018-12-08 10:27:00,480 WARN  

[jira] [Created] (HBASE-21575) memstore above high watermark message is logged too much

2018-12-10 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21575:


 Summary: memstore above high watermark message is logged too much
 Key: HBASE-21575
 URL: https://issues.apache.org/jira/browse/HBASE-21575
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


100s of MB of logs like this:
{noformat}
2018-12-08 10:29:07,603 WARN  
[RpcServer.default.FPBQ.Fifo.handler=12,queue=2,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 
103076ms
2018-12-08 10:29:07,603 WARN  
[RpcServer.default.FPBQ.Fifo.handler=44,queue=4,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 
150781ms
2018-12-08 10:29:07,603 WARN  
[RpcServer.default.FPBQ.Fifo.handler=14,queue=4,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 
150792ms
2018-12-08 10:29:07,603 WARN  
[RpcServer.default.FPBQ.Fifo.handler=23,queue=3,port=17020] 
regionserver.MemStoreFlusher: Memstore is above high water mark and block 
150780ms
{noformat}
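
One common way to tame a message like this is to rate-limit it, so that a handler blocked in a loop emits the warning at most once per interval. The Java sketch below shows the idea with plain JDK classes. `ThrottledWarning` and its one-second interval are assumptions made for illustration; this is not necessarily what HBASE-21575.patch does.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical log throttle; NOT taken from the HBASE-21575 patch. */
final class ThrottledWarning {

    private final long minIntervalMs;
    // Long.MIN_VALUE is a sentinel meaning "nothing has been logged yet".
    private final AtomicLong lastLoggedMs = new AtomicLong(Long.MIN_VALUE);

    ThrottledWarning(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /**
     * Returns true if the caller should emit the warning at time nowMs.
     * At most one caller wins per interval, even with concurrent handlers.
     */
    boolean shouldLog(long nowMs) {
        long last = lastLoggedMs.get();
        boolean due = last == Long.MIN_VALUE || nowMs - last >= minIntervalMs;
        // compareAndSet fails if another thread logged first; that thread's message suffices.
        return due && lastLoggedMs.compareAndSet(last, nowMs);
    }
}
```

A caller would wrap the existing warning in `if (throttle.shouldLog(System.currentTimeMillis())) { ... }`. With a 1000 ms interval, the four entries above, all stamped 10:29:07,603, would collapse to a single line.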





[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available

2018-12-10 Thread Cosmin Lehene (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715748#comment-16715748
 ] 

Cosmin Lehene commented on HBASE-21574:
---

The effective number of retries seems to be affected by 
ReadOnlyZKClient.RECOVERY_RETRY, which defaults to 30, while the rest of the 
timeouts seem to be ignored:

{code}

callTimeout=1000, callDuration=149107: 
org.apache.hadoop.hbase.shaded.org.apache.zookeeper.KeeperException$ConnectionLossException:
 KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server row 

{code} 

 

see https://issues.apache.org/jira/browse/HBASE-21573

 

> createConnection / getTable should not return if there's no cluster available
> -
>
> Key: HBASE-21574
> URL: https://issues.apache.org/jira/browse/HBASE-21574
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 2.1.1
>Reporter: Cosmin Lehene
>Priority: Major
> Fix For: 2.1.2
>
>
> You can get a connection / table successfully with no cluster (no zk, hms, 
> hrs) and it also says it's open (closed = false)
> {code}
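> // Both calls return normally even though no ZooKeeper, master or region server is running.
> Connection con = ConnectionFactory.createConnection(getConfiguration());
> Table table = con.getTable(TableName.valueOf(customersTable));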
> {code}
> {code}
> con = \{ConnectionImplementation@5192} "hconnection-0x32093c94"
>  hostnamesCanChange = true
>  pause = 100
>  pauseForCQTBE = 100
>  useMetaReplicas = false
>  metaReplicaCallTimeoutScanInMicroSecond = 100
>  numTries = 16
>  rpcTimeout = 6
>  asyncProcess = \{AsyncProcess@5242} 
>  stats = null
>  closed = false
>  aborted = false
>  clusterStatusListener = null
>  metaRegionLock = \{Object@5249} 
>  masterLock = \{Object@5250} 
>  batchPool = \{ThreadPoolExecutor@5240} 
> "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, 
> active threads = 0, queued tasks = 0, completed tasks = 0]"
>  metaLookupPool = null
>  cleanupPool = true
>  conf = \{Configuration@5238} "Configuration: core-default.xml, 
> core-site.xml, hbase-default.xml, hbase-site.xml"
>  connectionConfig = \{ConnectionConfiguration@5239} 
>  rpcClient = \{NettyRpcClient@5251} 
>  metaCache = \{MetaCache@5252} 
>  metrics = null
>  user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)"
>  rpcCallerFactory = \{RpcRetryingCallerFactory@5243} 
>  rpcControllerFactory = \{RpcControllerFactory@5244} 
>  interceptor = \{NoOpRetryableCallerInterceptor@5254} 
> "NoOpRetryableCallerInterceptor"
>  registry = \{ZKAsyncRegistry@5255} 
>  backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} 
>  alternateBufferedMutatorClassName = null
>  userRegionLock = \{ReentrantLock@5257} 
> "java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]"
>  clusterId = "default-cluster"
>  stubs = \{ConcurrentHashMap@5259} size = 0
>  masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} 
> "MasterService"
> table = \{HTable@5193} "customers;hconnection-0x32093c94"
>  connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94"
>  tableName = \{TableName@5237} "customers"
>  configuration = \{Configuration@5238} "Configuration: core-default.xml, 
> core-site.xml, hbase-default.xml, hbase-site.xml"
>  connConfiguration = \{ConnectionConfiguration@5239} 
>  closed = false
>  scannerCaching = 2147483647
>  scannerMaxResultSize = 2097152
>  pool = \{ThreadPoolExecutor@5240} 
> "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, 
> active threads = 0, queued tasks = 0, completed tasks = 0]"
>  operationTimeoutMs = 1200000
>  rpcTimeoutMs = 60000
>  readRpcTimeoutMs = 60000
>  writeRpcTimeoutMs = 60000
>  cleanupPoolOnClose = false
>  locator = \{HRegionLocator@5241} 
>  multiAp = \{AsyncProcess@5242} 
>  rpcCallerFactory = \{RpcRetryingCallerFactory@5243} 
>  rpcControllerFactory = \{RpcControllerFactory@5244}
> {code}





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715716#comment-16715716
 ] 

Hudson commented on HBASE-21567:


Results for branch branch-1.3
[build #571 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org
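
The kind of override requested above can be pictured as follows. This is only a sketch, in Java, of how -Dkey=value arguments might be collected into configuration overrides. The class name ShellOverrides is hypothetical, and since the HBase shell itself is written in JRuby, the actual patch will look different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not the shell's implementation: collects -Dkey=value arguments. */
final class ShellOverrides {
  private ShellOverrides() {}

  /** Returns the overrides in argument order; arguments that are not -Dkey=value are skipped. */
  static Map<String, String> parse(String... args) {
    Map<String, String> overrides = new LinkedHashMap<>();
    for (String arg : args) {
      if (!arg.startsWith("-D")) {
        continue;
      }
      int eq = arg.indexOf('=');
      if (eq <= 2) {
        continue; // no '=' at all, or an empty key such as "-D=value"
      }
      overrides.put(arg.substring(2, eq), arg.substring(eq + 1));
    }
    return overrides;
  }
}
```

Each parsed entry would then be applied on top of the configuration loaded from hbase-site.xml, so that a later value wins over the file.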





[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715708#comment-16715708
 ] 

Hadoop QA commented on HBASE-21564:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
1s{color} | {color:red} hbase-server: The patch generated 2 new + 65 unchanged 
- 0 fixed = 67 total (was 65) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m  3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 27s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
10s{color} | {color:green} hbase-backup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}188m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleWAL |
|   | hadoop.hbase.replication.TestReplicationEndpoint |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21564 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951250/HBASE-21564.master.003.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 0163c4f49346 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available

2018-12-10 Thread Cosmin Lehene (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715695#comment-16715695
 ] 

Cosmin Lehene commented on HBASE-21574:
---

The ConnectionImplementation constructor tries to retrieve the cluster ID:

[https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L297]

The lookup fails (without honoring the configured max retries), but the 
failure is ignored, see:

[https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L563-L570]

The constructor then returns a connection as if everything were fine, and 
getTable returns successfully too.

All of this happens when you run the client code without any cluster whatsoever.
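
The difference between the behaviour described above and a fail-fast alternative can be sketched as follows. This is an illustration only, not the HBase implementation: the class ClusterIdResolver and both methods are hypothetical, and the "default-cluster" fallback is taken from the clusterId value in the debugger dump quoted in the issue.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch, not HBase code: two ways to react to a failed cluster ID lookup. */
final class ClusterIdResolver {
  static final String DEFAULT_CLUSTER_ID = "default-cluster";

  private ClusterIdResolver() {}

  /** Lenient: a failed or empty lookup is swallowed and a default ID is used instead. */
  static String resolveLenient(Callable<String> lookup) {
    try {
      String id = lookup.call();
      return id != null ? id : DEFAULT_CLUSTER_ID;
    } catch (Exception e) {
      return DEFAULT_CLUSTER_ID;
    }
  }

  /** Fail fast: a failed or empty lookup is reported to the caller. */
  static String resolveStrict(Callable<String> lookup) {
    String id;
    try {
      id = lookup.call();
    } catch (Exception e) {
      throw new IllegalStateException("could not retrieve cluster id", e);
    }
    if (id == null) {
      throw new IllegalStateException("cluster id not available");
    }
    return id;
  }
}
```

With the lenient variant a caller cannot tell a reachable cluster from a missing one; the strict variant surfaces the problem at connection creation time.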

 



[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Ankit Singhal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715585#comment-16715585
 ] 

Ankit Singhal commented on HBASE-21246:
---

[~elserj], 
[HBASE-21246.master.002|https://issues.apache.org/jira/secure/attachment/12951267/HBASE-21246.master.002.patch]
 fixes the test case failures and checkstyle errors. 

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.master.001.patch, HBASE-21246.master.002.patch, 
> replication-src-creates-wal-reader.jpg, wal-factory-providers.png, 
> wal-providers.png, wal-splitter-reader.jpg, wal-splitter-writer.jpg
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.
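
A minimal sketch of the idea in the description above, under stated assumptions: only the interface name WALIdentity and the getName method come from the issue text. The two implementing classes are hypothetical, and the interface on the HBASE-20952 branch may declare more than this.

```java
/** Sketch only: names a WAL without tying it to a filesystem path. */
interface WALIdentity {
  String getName();
}

/** Hypothetical implementation for a WAL stored as a file in a distributed filesystem. */
final class FileBackedWALIdentity implements WALIdentity {
  private final String path;

  FileBackedWALIdentity(String path) {
    this.path = java.util.Objects.requireNonNull(path, "path");
  }

  @Override
  public String getName() {
    // The identity is the file name, i.e. the last component of the path.
    int slash = path.lastIndexOf('/');
    return slash >= 0 ? path.substring(slash + 1) : path;
  }
}

/** Hypothetical implementation for a WAL backed by a log stream. */
final class StreamBackedWALIdentity implements WALIdentity {
  private final String streamName;

  StreamBackedWALIdentity(String streamName) {
    this.streamName = java.util.Objects.requireNonNull(streamName, "streamName");
  }

  @Override
  public String getName() {
    return streamName;
  }
}
```

Code that only needs to tell WALs apart can then depend on WALIdentity and stay unaware of which backend is in use.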





[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21246:
--
Attachment: HBASE-21246.master.002.patch



[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available

2018-12-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715573#comment-16715573
 ] 

stack commented on HBASE-21574:
---

Agree this is confusing.



[jira] [Commented] (HBASE-21573) More sensible client default timeout values

2018-12-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715572#comment-16715572
 ] 

stack commented on HBASE-21573:
---

We should fix, or at least surface, what it takes to make the client fail fast 
(and important configs should be in hbase-default where folks will go looking 
for them ...)

I think if we made stuff fail fast, we'd surface some interesting assumptions 
we've been depending on for a good while now.

> More sensible client default timeout values
> ---
>
> Key: HBASE-21573
> URL: https://issues.apache.org/jira/browse/HBASE-21573
> Project: HBase
>  Issue Type: Wish
>  Components: Client
>Affects Versions: 2.1.1
>Reporter: Cosmin Lehene
>Priority: Major
> Fix For: 2.1.2
>
>
> I guess the goal is to have operations allow enough time to recover from 
> major failures.
> While this may make sense for large jobs, it's a PITA for OLTP scenarios and 
> could probably benefit from a faster failure mode in default
>  
> hbase.rpc.timeout = 60000
> hbase.client.operation.timeout = 1200000
> hbase.client.meta.operation.timeout = 1200000
> The client meta ops timeout is not in defaults-xml and not documented in the 
> book either.
> [https://hbase.apache.org/book.html#config_timeouts]
>  
> Would it make sense to have aggressive defaults instead?
>  





[jira] [Created] (HBASE-21574) createConnection / getTable should not return if there's no cluster available

2018-12-10 Thread Cosmin Lehene (JIRA)
Cosmin Lehene created HBASE-21574:
-

 Summary: createConnection / getTable should not return if there's 
no cluster available
 Key: HBASE-21574
 URL: https://issues.apache.org/jira/browse/HBASE-21574
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 2.1.1
Reporter: Cosmin Lehene
 Fix For: 2.1.2


You can get a connection / table successfully with no cluster (no zk, hms, hrs) 
and it also says it's open (closed = false)

{code}

Connection con = ConnectionFactory.createConnection(getConfiguration());
con.getTable(TableName.valueOf(customersTable));

{code}

{code}

con = \{ConnectionImplementation@5192} "hconnection-0x32093c94"
 hostnamesCanChange = true
 pause = 100
 pauseForCQTBE = 100
 useMetaReplicas = false
 metaReplicaCallTimeoutScanInMicroSecond = 1000000
 numTries = 16
 rpcTimeout = 60000
 asyncProcess = \{AsyncProcess@5242} 
 stats = null
 closed = false
 aborted = false
 clusterStatusListener = null
 metaRegionLock = \{Object@5249} 
 masterLock = \{Object@5250} 
 batchPool = \{ThreadPoolExecutor@5240} 
"java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, 
active threads = 0, queued tasks = 0, completed tasks = 0]"
 metaLookupPool = null
 cleanupPool = true
 conf = \{Configuration@5238} "Configuration: core-default.xml, core-site.xml, 
hbase-default.xml, hbase-site.xml"
 connectionConfig = \{ConnectionConfiguration@5239} 
 rpcClient = \{NettyRpcClient@5251} 
 metaCache = \{MetaCache@5252} 
 metrics = null
 user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)"
 rpcCallerFactory = \{RpcRetryingCallerFactory@5243} 
 rpcControllerFactory = \{RpcControllerFactory@5244} 
 interceptor = \{NoOpRetryableCallerInterceptor@5254} 
"NoOpRetryableCallerInterceptor"
 registry = \{ZKAsyncRegistry@5255} 
 backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} 
 alternateBufferedMutatorClassName = null
 userRegionLock = \{ReentrantLock@5257} 
"java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]"
 clusterId = "default-cluster"
 stubs = \{ConcurrentHashMap@5259} size = 0
 masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} 
"MasterService"
table = \{HTable@5193} "customers;hconnection-0x32093c94"
 connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94"
 tableName = \{TableName@5237} "customers"
 configuration = \{Configuration@5238} "Configuration: core-default.xml, 
core-site.xml, hbase-default.xml, hbase-site.xml"
 connConfiguration = \{ConnectionConfiguration@5239} 
 closed = false
 scannerCaching = 2147483647
 scannerMaxResultSize = 2097152
 pool = \{ThreadPoolExecutor@5240} 
"java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, 
active threads = 0, queued tasks = 0, completed tasks = 0]"
 operationTimeoutMs = 1200000
 rpcTimeoutMs = 60000
 readRpcTimeoutMs = 60000
 writeRpcTimeoutMs = 60000
 cleanupPoolOnClose = false
 locator = \{HRegionLocator@5241} 
 multiAp = \{AsyncProcess@5242} 
 rpcCallerFactory = \{RpcRetryingCallerFactory@5243} 
 rpcControllerFactory = \{RpcControllerFactory@5244}

{code}
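
Until the client itself fails fast, a caller could guard against the situation above by running a cheap real request against the cluster under a deadline. The following is a sketch of that idea only: BoundedProbe is a hypothetical helper, and the probe itself is deliberately left abstract, since which request is suitable depends on the application.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of HBase: runs a probe and gives up after a deadline. */
final class BoundedProbe {
  private BoundedProbe() {}

  /** Returns true only if the probe completes without an exception before the timeout. */
  static boolean succeedsWithin(Callable<?> probe, long timeout, TimeUnit unit) {
    // A daemon thread, so that a stuck probe cannot keep the JVM alive.
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "cluster-probe");
      t.setDaemon(true);
      return t;
    });
    try {
      Future<?> result = executor.submit(probe);
      result.get(timeout, unit);
      return true;
    } catch (TimeoutException | ExecutionException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      executor.shutdownNow(); // interrupts a probe that is still running
    }
  }
}
```

A caller would treat a false result as "no cluster reachable" and close the connection instead of handing it to application code.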





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715568#comment-16715568
 ] 

Hudson commented on HBASE-21567:


Results for branch branch-1.2
[build #582 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Created] (HBASE-21573) More sensible client default timeout values

2018-12-10 Thread Cosmin Lehene (JIRA)
Cosmin Lehene created HBASE-21573:
-

 Summary: More sensible client default timeout values
 Key: HBASE-21573
 URL: https://issues.apache.org/jira/browse/HBASE-21573
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 2.1.1
Reporter: Cosmin Lehene
 Fix For: 2.1.2


I guess the goal is to have operations allow enough time to recover from major 
failures.

While this may make sense for large jobs, it's a PITA for OLTP scenarios, which 
could probably benefit from a faster failure mode by default.

 

hbase.rpc.timeout = 60000

hbase.client.operation.timeout = 1200000

hbase.client.meta.operation.timeout = 1200000

The client meta ops timeout is not in hbase-default.xml and is not documented 
in the book either.

[https://hbase.apache.org/book.html#config_timeouts]

 

Would it make sense to have aggressive defaults instead?
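
To illustrate what "aggressive" could mean here, the sketch below compares two timeout profiles. It assumes the current defaults are 60000 ms for hbase.rpc.timeout and 1200000 ms for the two operation timeouts; the fail-fast numbers are made up for illustration and are not a proposal from this issue. TimeoutProfiles is a hypothetical helper, and the calculation ignores pauses between retries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of HBase: compares client timeout profiles (milliseconds). */
final class TimeoutProfiles {
  private TimeoutProfiles() {}

  /** Values assumed to be the current defaults. */
  static Map<String, Long> assumedDefaults() {
    return profile(60_000L, 1_200_000L);
  }

  /** Made-up fail-fast values for an OLTP-style client. */
  static Map<String, Long> illustrativeFailFast() {
    return profile(2_000L, 10_000L);
  }

  private static Map<String, Long> profile(long rpcTimeout, long operationTimeout) {
    Map<String, Long> m = new LinkedHashMap<>();
    m.put("hbase.rpc.timeout", rpcTimeout);
    m.put("hbase.client.operation.timeout", operationTimeout);
    m.put("hbase.client.meta.operation.timeout", operationTimeout);
    return m;
  }

  /** How many RPCs that each run into the RPC timeout fit into one operation timeout. */
  static long timedOutRpcsPerOperation(Map<String, Long> profile) {
    Long rpc = profile.get("hbase.rpc.timeout");
    Long operation = profile.get("hbase.client.operation.timeout");
    if (rpc == null || operation == null || rpc <= 0) {
      throw new IllegalArgumentException("profile needs positive rpc and operation timeouts");
    }
    return operation / rpc;
  }
}
```

Under the assumed defaults a single operation can absorb 20 consecutive RPC timeouts, that is, 20 minutes, before the caller sees a failure; the illustrative profile gives up after 10 seconds.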

 





[jira] [Updated] (HBASE-21573) More sensible client default timeout values

2018-12-10 Thread Cosmin Lehene (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cosmin Lehene updated HBASE-21573:
--
Issue Type: Wish  (was: Bug)



[jira] [Updated] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-10 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21406:
-
Status: Patch Available  (was: In Progress)

Attached the first patch version. It adds a new metric to differentiate sink 
startup time from the time the last OP was applied. The original behaviour was 
to always set TimestampsOfLastAppliedOp to the startup time and always show it 
in the "status 'replication'" command, regardless of whether the sink had ever 
applied any OP. This was confusing, especially where the cluster was acting 
only as a source: the output could suggest that the sink was not applying 
edits or that replication was stuck. With the new metric, we now compare the 
two values and assume that, if they are equal, no OP has ever been shipped to 
the given sink, so the output reflects this more clearly, for example:
{noformat}
SINK: TimeStampStarted=Thu Dec 06 23:59:47 GMT 2018, Waiting for 
OPs...{noformat}

For the replication source issues described earlier, there is an ongoing JIRA: 
HBASE-21505.
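
The comparison of the two sink metrics described in this comment can be sketched roughly as follows. This is not the patch: SinkStatusFormatter and its parameter names are hypothetical, and the output strings are modelled on the examples in this issue.

```java
import java.util.Date;

/** Hypothetical sketch, not the patch: formats the SINK line of "status 'replication'". */
final class SinkStatusFormatter {
  private SinkStatusFormatter() {}

  static String format(long timestampStarted, long timestampOfLastAppliedOp,
      long ageOfLastAppliedOp) {
    // Equal timestamps are taken to mean that no OP has been applied since the sink started.
    if (timestampOfLastAppliedOp == timestampStarted) {
      return "SINK: TimeStampStarted=" + new Date(timestampStarted) + ", Waiting for OPs...";
    }
    return "SINK: AgeOfLastAppliedOp=" + ageOfLastAppliedOp
        + ", TimeStampsOfLastAppliedOp=" + new Date(timestampOfLastAppliedOp);
  }
}
```

The check relies on the last-applied timestamp being initialised to the startup time, as the comment describes; an OP applied in the very same millisecond as startup would be indistinguishable from no OP at all.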

> "status 'replication'" should not show SINK if the cluster does not act as 
> sink
> ---
>
> Key: HBASE-21406
> URL: https://issues.apache.org/jira/browse/HBASE-21406
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daisuke Kobayashi
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-21406-branch-1.001.patch, 
> HBASE-21406-master.001.patch, Screen Shot 2018-10-31 at 18.12.54.png
>
>
> When replicating in 1 way, from source to target, {{status 'replication'}} on 
> source always dumps SINK with meaningless metrics. It only makes sense when 
> running the command on target cluster.
> {{status 'replication'}} on source, for example. {{AgeOfLastAppliedOp}} is 
> always zero and {{TimeStampsOfLastAppliedOp}} does not get updated from the 
> time the RS started since it's not acting as sink.
> {noformat}
> source-1.com
>SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, 
> TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0
>SINK  : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 
> 23:56:53 PDT 2018
> {noformat}
> {{status 'replication'}} on target works as expected. SOURCE is empty as it's 
> not acting as source:
> {noformat}
> target-1.com
>SOURCE:
>SINK  : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 
> 23:44:08 PDT 2018
> {noformat}
> This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always 
> returns a value (not null).
> 1.X
> https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204
> 2.X
> https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399





[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21564:
-
Attachment: HBASE-21564.master.003.patch

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.
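
One way to picture the "atomic acknowledgment" asked for above is the minimal Java sketch below. It is not the attached patch: RollRequestFlag and its method names are hypothetical stand-ins, and it only models the roller-side flag, not AsyncWAL's own rollRequested field or its callback. The roller consumes the pending request with a single atomic getAndSet before it starts rolling, and never resets the flag in a finally block, so a request that arrives mid-roll stays visible for the next pass.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the roller's request flag; not the actual LogRoller code.
class RollRequestFlag {
  private final AtomicBoolean rollRequested = new AtomicBoolean(false);

  // Called by the WAL when it wants a roll.
  // Returns true if this call newly set the flag, false if a request was already pending.
  boolean request() {
    return rollRequested.compareAndSet(false, true);
  }

  // Called by the roller thread before it starts rolling. Reads and clears the
  // flag in one atomic step, so a request made after this point sets the flag
  // again instead of being overwritten later.
  boolean acknowledge() {
    return rollRequested.getAndSet(false);
  }

  boolean isPending() {
    return rollRequested.get();
  }
}

class RollRequestFlagDemo {
  public static void main(String[] args) {
    RollRequestFlag flag = new RollRequestFlag();
    flag.request();                      // WAL asks for a roll
    boolean picked = flag.acknowledge(); // roller picks the request up atomically
    flag.request();                      // WAL asks again while the roll is in progress
    // The roller finishes here and does NOT touch the flag, so the second
    // request is still pending for the next loop iteration.
    System.out.println("picked=" + picked + ", stillPending=" + flag.isPending());
  }
}
```

The point of the sketch is only the ordering: acknowledge first, roll second, and no unconditional reset afterwards.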





[jira] [Updated] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink

2018-12-10 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21406:
-
Attachment: HBASE-21406-master.001.patch

> "status 'replication'" should not show SINK if the cluster does not act as 
> sink
> ---
>
> Key: HBASE-21406
> URL: https://issues.apache.org/jira/browse/HBASE-21406
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daisuke Kobayashi
>Assignee: Wellington Chevreuil
>Priority: Minor
> Attachments: HBASE-21406-branch-1.001.patch, 
> HBASE-21406-master.001.patch, Screen Shot 2018-10-31 at 18.12.54.png
>
>
> When replicating one way, from source to target, {{status 'replication'}} on the 
> source always dumps SINK with meaningless metrics. SINK only makes sense when 
> running the command on the target cluster.
> For example, with {{status 'replication'}} on the source, {{AgeOfLastAppliedOp}} is 
> always zero and {{TimeStampsOfLastAppliedOp}} is never updated after the 
> time the RS started, since the RS is not acting as a sink.
> {noformat}
> source-1.com
>SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, 
> TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0
>SINK  : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 
> 23:56:53 PDT 2018
> {noformat}
> {{status 'replication'}} on target works as expected. SOURCE is empty as it's 
> not acting as source:
> {noformat}
> target-1.com
>SOURCE:
>SINK  : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 
> 23:44:08 PDT 2018
> {noformat}
> This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always 
> returns a value (not null).
> 1.X
> https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204
> 2.X
> https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399





[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715465#comment-16715465
 ] 

Sergey Shelukhin commented on HBASE-21564:
--

Fixed warnings, addressed some RB feedback; I cannot repro the test failures, 
the logs for both have connection errors...

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.





[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21564:
-
Attachment: (was: HBASE-21564.master.003.patch)

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.





[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21564:
-
Attachment: HBASE-21564.master.003.patch

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Ankit Singhal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715410#comment-16715410
 ] 

Ankit Singhal commented on HBASE-21246:
---

Thanks [~elserj] for the review.

{quote}I see RecoveredReplicationSource.java still needs some "unraveling" from 
Path.

WALEntryStream is in a similar position (a couple of others than just those 
pulled out above). 
{quote}
We may not need to change these Path usages to WALIdentity once these classes are 
refactored to abstract away the FS-based implementation. The Path-related code is 
expected to move into the FS-based implementation.

{quote} Should DisabledWALProvider have its own implementation of WALIdentity? 
Looks like we just pass a "special" Path in the FS-based case now – maybe we 
just make some special implementation of WALIdentity for it instead.{quote}
Let me introduce a new WALIdentity implementation for it.


{quote}As long as we can spin out the above refactorings into some follow-on 
work, I would be happy to land this on the feature branch.{quote}
Yes, these refactorings will go into another follow-on JIRA. Let me just upload 
another patch fixing the checkstyle and the test failure before you commit.




> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.master.001.patch, replication-src-creates-wal-reader.jpg, 
> wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, 
> wal-splitter-writer.jpg
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.
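
A rough Java sketch of the idea in the description follows. Only the getName method and the FSWALIdentity name with its getPath accessor come from this thread; StreamWALIdentity is hypothetical, java.nio.file.Path is used here as a stand-in for Hadoop's Path, and returning the file name from getName is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// The issue states that the interface provides getName; nothing else is assumed here.
interface WALIdentity {
  String getName();
}

// Filesystem-backed identity. java.nio.file.Path stands in for Hadoop's Path.
final class FSWALIdentity implements WALIdentity {
  private final Path path;

  FSWALIdentity(Path path) {
    this.path = path;
  }

  Path getPath() {
    return path;
  }

  @Override
  public String getName() {
    // Assumption: the name is the last path component, i.e. the WAL file name.
    return path.getFileName().toString();
  }
}

// Hypothetical identity for a WAL backed by a log stream rather than a file.
final class StreamWALIdentity implements WALIdentity {
  private final String streamName;

  StreamWALIdentity(String streamName) {
    this.streamName = streamName;
  }

  @Override
  public String getName() {
    return streamName;
  }
}

class WALIdentityDemo {
  public static void main(String[] args) {
    WALIdentity[] ids = {
      new FSWALIdentity(Paths.get("/hbase/WALs/rs1/wal.123")),
      new StreamWALIdentity("wal-stream-7")
    };
    // Callers only need the name and never touch a filesystem path.
    for (WALIdentity id : ids) {
      System.out.println(id.getName());
    }
  }
}
```

Code that only needs a name can work against WALIdentity, while filesystem-specific code keeps the cast to FSWALIdentity until it is moved behind the FS-based implementation.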





[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky

2018-12-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715414#comment-16715414
 ] 

stack commented on HBASE-21559:
---

This cleaned up the failures nicely. Thanks [~openinx].

> The RestoreSnapshotFromClientTestBase related UT are flaky
> --
>
> Key: HBASE-21559
> URL: https://issues.apache.org/jira/browse/HBASE-21559
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21559.v1.patch, HBASE-21559.v2.patch, 
> TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml,
>  
> org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt,
>  
> org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt
>
>
> The related UTs are: 
> * TestRestoreSnapshotFromClientAfterSplittingRegions
> * TestRestoreSnapshotFromClientWithRegionReplicas
> * TestMobRestoreSnapshotFromClientAfterSplittingRegions
> I guess the main problem is a deadlock between SplitTableRegionProcedure 
> and SnapshotProcedure. 
> Attached logs from the failed UT. 





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715352#comment-16715352
 ] 

Hudson commented on HBASE-21567:


FAILURE: Integrated in Jenkins build HBase-1.2-IT #1188 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1188/])
HBASE-21567 Allow overriding configs starting up the shell (stack: rev 
0cdd8f972f9cb82e883de95435958ea824fc636a)
* (edit) bin/hirb.rb
* (edit) src/main/asciidoc/_chapters/shell.adoc


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org
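
For illustration only, the sketch below shows what such an override mechanism has to do with the command line: pick out -Dkey=value arguments and turn them into configuration entries. It is written in Java, whereas the actual change is in bin/hirb.rb (Ruby); the class and method names are hypothetical and this is not the committed code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of HBase.
class ShellOverrides {
  // Collects -Dkey=value arguments into a map, in order of appearance.
  // Arguments that are not of that shape are ignored.
  static Map<String, String> parse(String[] args) {
    Map<String, String> overrides = new LinkedHashMap<>();
    for (String arg : args) {
      if (!arg.startsWith("-D")) {
        continue;
      }
      int eq = arg.indexOf('=');
      if (eq <= 2) {
        continue; // no '=' at all, or an empty key such as "-D=value"
      }
      overrides.put(arg.substring(2, eq), arg.substring(eq + 1));
    }
    return overrides;
  }

  public static void main(String[] args) {
    String[] sample = {
      "shell",
      "-Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org"
    };
    // Each entry would then be applied on top of the loaded configuration.
    parse(sample).forEach((k, v) -> System.out.println(k + " -> " + v));
  }
}
```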





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715346#comment-16715346
 ] 

Hudson commented on HBASE-21567:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #507 (See 
[https://builds.apache.org/job/HBase-1.3-IT/507/])
HBASE-21567 Allow overriding configs starting up the shell (stack: rev 
d27c835b1cfbbe2c59f0698d3a286b19e7f63471)
* (edit) src/main/asciidoc/_chapters/shell.adoc
* (edit) bin/hirb.rb


> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org





[jira] [Resolved] (HBASE-19805) NPE in HMaster while issuing a sequence of table splits

2018-12-10 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-19805.

   Resolution: Incomplete
 Assignee: (was: Sergey Soldatov)
Fix Version/s: (was: 3.0.0)

Stale.

> NPE in HMaster while issuing a sequence of table splits
> ---
>
> Key: HBASE-19805
> URL: https://issues.apache.org/jira/browse/HBASE-19805
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 2.0.0-beta-1
>Reporter: Josh Elser
>Priority: Critical
>
> I wrote a toy program to test the client tarball in HBASE-19735. After the 
> first few region splits, I see the following error in the Master log. 
> {noformat}
> 2018-01-16 14:07:52,797 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] master.HMaster: 
> Client=jelser//192.168.1.23 split 
> myTestTable,1,1516129669054.8313b755f74092118f9dd30a4190ee23.
> 2018-01-16 14:07:52,797 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] ipc.RpcServer: 
> Unexpected throwable object
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:229)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getAdmin(ConnectionImplementation.java:1175)
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getAdmin(ConnectionUtils.java:149)
>   at 
> org.apache.hadoop.hbase.master.assignment.Util.getRegionInfoResponse(Util.java:59)
>   at 
> org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkSplittable(SplitTableRegionProcedure.java:146)
>   at 
> org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.<init>(SplitTableRegionProcedure.java:103)
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.createSplitProcedure(AssignmentManager.java:761)
>   at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1626)
>   at 
> org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:134)
>   at org.apache.hadoop.hbase.master.HMaster.splitRegion(HMaster.java:1618)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.splitRegion(MasterRpcServices.java:778)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {noformat}
> {code}
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>         Admin admin = conn.getAdmin()) {
>       // Start from a clean table.
>       final TableName tn = TableName.valueOf("myTestTable");
>       if (admin.tableExists(tn)) {
>         admin.disableTable(tn);
>         admin.deleteTable(tn);
>       }
>       final TableDescriptor desc = TableDescriptorBuilder.newBuilder(tn)
>           .addColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("f1")).build())
>           .build();
>       admin.createTable(desc);
>       List<String> splitPoints = new ArrayList<>(16);
>       for (int i = 1; i <= 16; i++) {
>         splitPoints.add(Integer.toString(i, 16));
>       }
>
>       System.out.println("Splits: " + splitPoints);
>       int numRegions = admin.getRegions(tn).size();
>       for (String splitPoint : splitPoints) {
>         System.out.println("Splitting on " + splitPoint);
>         admin.split(tn, Bytes.toBytes(splitPoint));
>         Thread.sleep(200);
>         int newRegionSize = admin.getRegions(tn).size();
>         while (numRegions == newRegionSize) {
>           Thread.sleep(50);
>           newRegionSize = admin.getRegions(tn).size();
>         }
>       }
>     }
>   }
> {code}
> At a quick glance, it looks like {{Util.getRegionInfoResponse}} is to blame.
> {code}
>   static GetRegionInfoResponse getRegionInfoResponse(final MasterProcedureEnv env,
>       final ServerName regionLocation, final RegionInfo hri, boolean includeBestSplitRow)
>       throws IOException {
>     // TODO: There is no timeout on this controller. Set one!
>     HBaseRpcController controller = env.getMasterServices().getClusterConnection().
>         getRpcControllerFactory().newController();
>     final AdminService.BlockingInterface admin =
>         env.getMasterServices().getClusterConnection().getAdmin(regionLocation);
> {code}
> We don't validate that we have a non-null {{ServerName regionLocation}}.
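
The missing validation could look roughly like the Java sketch below. RegionLocationGuard and requireLocation are hypothetical names, and nothing here is taken from an actual HBase patch (this issue was closed as stale). The idea is simply to fail early with a descriptive IOException rather than let a null ServerName reach ConnectionUtils.getStubKey.

```java
import java.io.IOException;

// Hypothetical guard; not part of HBase.
class RegionLocationGuard {
  // Returns the location unchanged when it is known, otherwise throws an
  // IOException that names the region instead of surfacing as an NPE later.
  static <T> T requireLocation(T regionLocation, String regionName) throws IOException {
    if (regionLocation == null) {
      throw new IOException("No location known for region " + regionName
          + "; cannot ask a RegionServer for its region info");
    }
    return regionLocation;
  }

  public static void main(String[] args) {
    try {
      requireLocation(null, "myTestTable,1,1516129669054");
    } catch (IOException e) {
      // The caller can now report or retry the split instead of hitting an NPE.
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

A caller such as getRegionInfoResponse, which already declares IOException, would run this check on regionLocation before asking the connection for an admin stub.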

[jira] [Updated] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21567:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.10
   1.2.10
   1.3.3
   1.5.0
   Status: Resolved  (was: Patch Available)

Pushed it to 1.2+.

> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org





[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell

2018-12-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715266#comment-16715266
 ] 

stack commented on HBASE-21567:
---

bq. Ok, looks like rubocop is complaining for almost everything.

Yeah, someone once said that if you don't like the contributor, make them fix 
the rubocop warnings (smile)!

Let me push! Thanks for review [~psomogyi] (and [~Apache9])

> Allow overriding configs starting up the shell
> --
>
> Key: HBASE-21567
> URL: https://issues.apache.org/jira/browse/HBASE-21567
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.3
>
> Attachments: HBASE-21567.master.001.patch, 
> HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to 
> be able to do this:
> ${HBASE_HOME}/bin/hbase shell 
> -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org





[jira] [Commented] (HBASE-21526) Use AsyncClusterConnection in ServerManager for getRsAdmin

2018-12-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715113#comment-16715113
 ] 

stack commented on HBASE-21526:
---

I +1'd it up on rb (thanks for class rename).

> Use AsyncClusterConnection in ServerManager for getRsAdmin
> --
>
> Key: HBASE-21526
> URL: https://issues.apache.org/jira/browse/HBASE-21526
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-21526-HBASE-21512-v1.patch, 
> HBASE-21526-HBASE-21512-v2.patch, HBASE-21526-HBASE-21512.patch
>
>






[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-12-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715060#comment-16715060
 ] 

Josh Elser commented on HBASE-21246:


{quote}used master for pre-commit jenkin build as HBASE-20952 yet to be rebased
{quote}
Just rebased that branch now.
{code:java}
-PriorityBlockingQueue newPaths =
-new PriorityBlockingQueue(queueSizePerGroup, new LogsComparator());
-pathsLoop: for (Path path : queue) {
-  if (fs.exists(path)) { // still in same location, don't need to do anything
-newPaths.add(path);
+PriorityBlockingQueue newWalIds =
+new PriorityBlockingQueue(queueSizePerGroup, new LogsComparator());
+pathsLoop: for (WALIdentity walId : queue) {
+  if (fs.exists(((FSWALIdentity)walId).getPath())) { // still in same location, don't need to do anything
+newWalIds.add(walId);{code}
I see RecoveredReplicationSource.java still needs some "unraveling" from Path.
{code:java}
-  stat = fs.getFileStatus(this.currentPath);
+  stat = fs.getFileStatus(((FSWALIdentity)this.currentWAlIdentity).getPath());{code}
{code:java}
-Path archivedLog = getArchivedLog(path);
-if (!path.equals(archivedLog)) {
+FSWALIdentity archivedLog = new FSWALIdentity(getArchivedLog(walId.getPath()));
+if (!walId.equals(archivedLog)) {{code}
WALEntryStream is in a similar position (a couple of others than just those 
pulled out above).
{code:java}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java
index 75439fe6c5..ad9f6bda30 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java
@@ -63,7 +63,7 @@ class DisabledWALProvider implements WALProvider {
 if (null == providerId) {
   providerId = "defaultDisabled";
 }
-disabled = new DisabledWAL(new Path(FSUtils.getWALRootDir(conf), providerId), conf, null);
+disabled = new DisabledWAL(new FSWALIdentity(new Path(FSUtils.getWALRootDir(conf), providerId)), conf, null);{code}
{code:java}
-protected final Path path;
+protected final FSWALIdentity walId;{code}
Should DisabledWALProvider have its own implementation of WALIdentity? Looks 
like we just pass a "special" Path in the FS-based case now – maybe we just 
make some special implementation of WALIdentity for it instead.

Overall, I think this is a really nice middle-ground of changing "enough" 
without changing too much. As long as we can spin out the above refactorings 
into some follow-on work, I would be happy to land this on the feature branch.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, 
> 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, 
> 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, 
> HBASE-21246.master.001.patch, replication-src-creates-wal-reader.jpg, 
> wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, 
> wal-splitter-writer.jpg
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.





[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715005#comment-16715005
 ] 

Hadoop QA commented on HBASE-21570:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 42s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 28s{color} | {color:green} hbase-client: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} The patch passed checkstyle in hbase-server {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  9s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}129m 18s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 29s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21570 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951202/HBASE-21570.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux cde26df8f748 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714933#comment-16714933
 ] 

Hadoop QA commented on HBASE-21505:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
15s{color} | {color:red} hbase-server: The patch generated 2 new + 85 unchanged 
- 3 fixed = 87 total (was 88) {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red}  0m  
7s{color} | {color:red} The patch generated 55 new + 405 unchanged - 9 fixed = 
460 total (was 414) {color} |
| {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange}  
0m  4s{color} | {color:orange} The patch generated 3 new + 748 unchanged - 1 
fixed = 751 total (was 749) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
2m 47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
19s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 
2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
30s{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit 

[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles

2018-12-10 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714860#comment-16714860
 ] 

Josh Elser commented on HBASE-21568:


Ok, thanks Guanghao!

Just to be clear, is this your +1 for the current patch as well?

> Disable use of BlockCache for LoadIncrementalHFiles
> ---
>
> Key: HBASE-21568
> URL: https://issues.apache.org/jira/browse/HBASE-21568
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21568.001.branch-2.0.patch
>
>
> [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow 
> callers to specify that they do not want to use a block cache when reading an 
> HFile.
> If the BucketCache is set up to use the FileSystem, we can have a situation 
> where the client tries to instantiate the BucketCache and is disallowed due 
> to filesystem permissions:
> {code:java}
> 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: 
> Failed allocating cache on /mnt/hbase/cache.data
> java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633)
>   at 
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663)
>   at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621)
>   at 
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> LoadIncrementalHFiles should pass {{CacheConfig.DISABLED}} so that no block 
> cache is instantiated.
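
The idea in the description above can be illustrated with a minimal, self-contained Java sketch. It deliberately does not use HBase classes: ToyCacheConfig and ToyBulkLoadClient are hypothetical stand-ins invented for this example, and the "Permission denied" failure is simulated with a boolean flag. It only shows why a shared "disabled" configuration avoids touching the cache file at all.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Toy model only: hypothetical stand-ins, not HBase classes.
final class ToyCacheConfig {
    /** Shared instance that never tries to create a cache. */
    static final ToyCacheConfig DISABLED = new ToyCacheConfig(false);

    private final boolean cacheEnabled;

    private ToyCacheConfig(boolean cacheEnabled) {
        this.cacheEnabled = cacheEnabled;
    }

    /** Mimics building a file-backed cache; fails when the path is not writable. */
    static ToyCacheConfig fileBacked(boolean pathWritable) throws IOException {
        if (!pathWritable) {
            throw new FileNotFoundException("cache file (Permission denied)");
        }
        return new ToyCacheConfig(true);
    }

    boolean isCacheEnabled() {
        return cacheEnabled;
    }
}

final class ToyBulkLoadClient {
    /** A one-shot client-side read gains nothing from a block cache. */
    static String readWith(ToyCacheConfig config) {
        return config.isCacheEnabled() ? "read with cache" : "read without cache";
    }
}
```

With this shape, a client that calls ToyBulkLoadClient.readWith(ToyCacheConfig.DISABLED) never reaches the code path that opens the cache file, so filesystem permissions on that file no longer matter to it.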



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21512) Introduce an AsyncClusterConnection and replace the usage of ClusterConnection

2018-12-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714772#comment-16714772
 ] 

Hudson commented on HBASE-21512:


Results for branch HBASE-21512
[build #12 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Introduce an AsyncClusterConnection and replace the usage of ClusterConnection
> --
>
> Key: HBASE-21512
> URL: https://issues.apache.org/jira/browse/HBASE-21512
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>
> At least for the RSProcedureDispatcher, with CompletableFuture we do not need 
> to set a delay and use a thread pool any more, which could reduce the 
> resource usage and also the latency.
> Once this is done, I think we can remove the ClusterConnection completely, 
> and start to rewrite the old sync client based on the async client, which 
> could reduce the code base a lot for our client.
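
A small Java sketch of the pattern mentioned in the description above, chaining follow-up work onto a CompletableFuture instead of waiting with a delay on a pool thread. The class and method names (AsyncDispatchSketch, sendRequest, dispatch) are assumptions made up for this illustration and are not taken from RSProcedureDispatcher or any HBase API; the remote call is simulated by completing the future immediately.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative sketch only; none of these names come from the HBase code base.
final class AsyncDispatchSketch {

    /**
     * Hypothetical asynchronous remote call. A real async client would complete
     * the future from its response callback; here it is completed immediately.
     */
    static CompletableFuture<String> sendRequest(String serverName) {
        CompletableFuture<String> future = new CompletableFuture<>();
        future.complete("ack from " + serverName);
        return future;
    }

    /**
     * Chains the follow-up step onto the future, so no thread sleeps or polls
     * while waiting for the response.
     */
    static CompletableFuture<Integer> dispatch(String serverName) {
        return sendRequest(serverName).thenApply(String::length);
    }
}
```

The caller receives a future and can attach further stages to it; the follow-up runs when the response arrives rather than after a fixed delay.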





[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714717#comment-16714717
 ] 

Duo Zhang commented on HBASE-21570:
---

Review board link:

https://reviews.apache.org/r/69539/

> Add write buffer periodic flush support for AsyncBufferedMutator
> 
>
> Key: HBASE-21570
> URL: https://issues.apache.org/jira/browse/HBASE-21570
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3
>
> Attachments: HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.





[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21570:
--
Attachment: HBASE-21570.patch

> Add write buffer periodic flush support for AsyncBufferedMutator
> 
>
> Key: HBASE-21570
> URL: https://issues.apache.org/jira/browse/HBASE-21570
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.





[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator

2018-12-10 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21570:
--
 Assignee: Duo Zhang
Fix Version/s: 2.1.3
   2.0.4
   2.2.0
   3.0.0
   Status: Patch Available  (was: Open)

> Add write buffer periodic flush support for AsyncBufferedMutator
> 
>
> Key: HBASE-21570
> URL: https://issues.apache.org/jira/browse/HBASE-21570
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3
>
> Attachments: HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.





[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714616#comment-16714616
 ] 

Wellington Chevreuil commented on HBASE-21505:
--

Resolved conflict from last patch.

> Several inconsistencies on information reported for Replication Sources by 
> hbase shell status 'replication' command.
> 
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: 
> 0001-HBASE-21505-initial-version-for-more-detailed-report.patch, 
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch, 
> HBASE-21505-master.003.patch, HBASE-21505-master.004.patch
>
>
> While reviewing hbase shell status 'replication' command, noticed the 
> following issues related to replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when 
> no new edits were added to source, so nothing was really shipped. Test steps 
> performed:
> 1.1) Source cluster with only one table targeted to replication;
> 1.2) Added a new row, confirmed the row appeared in Target cluster;
> 1.3) Issued status 'replication' command in source, TimeStampsOfLastShippedOp 
> shows current timestamp T1.
> 1.4) Waited 30 seconds, no new data added to source. Issued status 
> 'replication' command, now shows timestamp T2.
> 2) When replication is stuck due to connectivity issues or target 
> unavailability and new edits are added in source, the reported 
> AgeOfLastShippedOp wrongly shows the same value as "Replication Lag". This is 
> incorrect: AgeOfLastShippedOp should not change until another edit is actually 
> shipped to target. Test steps performed:
> 2.1) Source cluster with only one table targeted to replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on source. Running status 'replication' command does show 
> lag increasing. TimeStampsOfLastShippedOp seems correct also, no further 
> updates as described on bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit had taken some 
> time before it got finally shipped to target. Test steps performed:
> 3.1) Source cluster with only one table targeted to replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on source. 
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> T1:
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restart target cluster RS and verified the new row appeared there. No 
> new edit added, but status 'replication' command reports AgeOfLastShippedOp 
> as 0, while it should be the diff between the time it concluded shipping at 
> target and the time it was added in source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
> 4) When replication is stuck due to connectivity issues or target 
> unavailability and the RS is restarted, then once the recovered queue source 
> starts, TimeStampsOfLastShippedOp is set to the initial Java date (Thu Jan 01 
> 01:00:00 GMT 1970, for example), so "Replication Lag" also gives a completely 
> inaccurate value. 
> Tests performed:
> 4.1) Source cluster with only one table targeted to replication;
> 4.2) Stopped target cluster RS;
> 4.3) Put a new row on source, restart RS on source, waited a few seconds for 
> recovery queue source to startup, then it gives:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Thu Jan 01 01:00:00 GMT 1970, Replication 
> Lag=9223372036854775807
> {noformat}
> Also, we should report the status of all running sources; the current output 
> format gives the impression there is only one, even when there are recovery 
> queues, for instance. 
> Here is a list of ideas on how the command should 

[jira] [Updated] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21505:
-
Attachment: HBASE-21505-master.004.patch

> Several inconsistencies on information reported for Replication Sources by 
> hbase shell status 'replication' command.
> 
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: 
> 0001-HBASE-21505-initial-version-for-more-detailed-report.patch, 
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch, 
> HBASE-21505-master.003.patch, HBASE-21505-master.004.patch
>
>
> While reviewing hbase shell status 'replication' command, noticed the 
> following issues related to replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when 
> no new edits were added to source, so nothing was really shipped. Test steps 
> performed:
> 1.1) Source cluster with only one table targeted to replication;
> 1.2) Added a new row, confirmed the row appeared in Target cluster;
> 1.3) Issued status 'replication' command in source, TimeStampsOfLastShippedOp 
> shows current timestamp T1.
> 1.4) Waited 30 seconds, no new data added to source. Issued status 
> 'replication' command, now shows timestamp T2.
> 2) When replication is stuck due to connectivity issues or target 
> unavailability and new edits are added in source, the reported 
> AgeOfLastShippedOp wrongly shows the same value as "Replication Lag". This is 
> incorrect: AgeOfLastShippedOp should not change until another edit is actually 
> shipped to target. Test steps performed:
> 2.1) Source cluster with only one table targeted to replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on source. Running status 'replication' command does show 
> lag increasing. TimeStampsOfLastShippedOp seems correct also, no further 
> updates as described on bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit had taken some 
> time before it got finally shipped to target. Test steps performed:
> 3.1) Source cluster with only one table targeted to replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on source. 
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> T1:
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restart target cluster RS and verified the new row appeared there. No 
> new edit added, but status 'replication' command reports AgeOfLastShippedOp 
> as 0, while it should be the diff between the time it concluded shipping at 
> target and the time it was added in source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
> 4) When replication is stuck due to connectivity issues or target 
> unavailability and the RS is restarted, then once the recovered queue source 
> starts, TimeStampsOfLastShippedOp is set to the initial Java date (Thu Jan 01 
> 01:00:00 GMT 1970, for example), so "Replication Lag" also gives a completely 
> inaccurate value. 
> Tests performed:
> 4.1) Source cluster with only one table targeted to replication;
> 4.2) Stopped target cluster RS;
> 4.3) Put a new row on source, restart RS on source, waited a few seconds for 
> recovery queue source to startup, then it gives:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Thu Jan 01 01:00:00 GMT 1970, Replication 
> Lag=9223372036854775807
> {noformat}
> Also, we should report the status of all running sources; the current output 
> format gives the impression there is only one, even when there are recovery 
> queues, for instance. 
> Here is a list of ideas on how the command should report under different 
> 

[jira] [Commented] (HBASE-21572) The "progress" object in "Compactor" is not thread-safe, this may cause the misleading progress information on the web UI.

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714556#comment-16714556
 ] 

Hadoop QA commented on HBASE-21572:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m 25s{color} 
| {color:red} hbase-server generated 5 new + 183 unchanged - 5 fixed = 188 
total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
21s{color} | {color:red} hbase-server: The patch generated 4 new + 24 unchanged 
- 0 fixed = 28 total (was 24) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 38s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}256m 43s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}306m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestMajorCompaction |
|   | hadoop.hbase.client.TestFromClientSide3 |
|   | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.client.TestFromClientSide |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21572 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951159/HBASE-21572.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 6920c71ea1ef 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 79d90c87b5 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| 

[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714533#comment-16714533
 ] 

Hadoop QA commented on HBASE-21505:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HBASE-21505 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-21505 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951174/HBASE-21505-master.003.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/15231/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Several inconsistencies on information reported for Replication Sources by 
> hbase shell status 'replication' command.
> 
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: 
> 0001-HBASE-21505-initial-version-for-more-detailed-report.patch, 
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch, 
> HBASE-21505-master.003.patch
>
>
> While reviewing hbase shell status 'replication' command, noticed the 
> following issues related to replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when 
> no new edits were added to source, so nothing was really shipped. Test steps 
> performed:
> 1.1) Source cluster with only one table targeted to replication;
> 1.2) Added a new row, confirmed the row appeared in Target cluster;
> 1.3) Issued status 'replication' command in source, TimeStampsOfLastShippedOp 
> shows current timestamp T1.
> 1.4) Waited 30 seconds, no new data added to source. Issued status 
> 'replication' command, now shows timestamp T2.
> 2) When replication is stuck due to connectivity issues or target 
> unavailability and new edits are added in source, the reported 
> AgeOfLastShippedOp wrongly shows the same value as "Replication Lag". This is 
> incorrect: AgeOfLastShippedOp should not change until another edit is actually 
> shipped to target. Test steps performed:
> 2.1) Source cluster with only one table targeted to replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on source. Running status 'replication' command does show 
> lag increasing. TimeStampsOfLastShippedOp seems correct also, no further 
> updates as described on bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit had taken some 
> time before it got finally shipped to target. Test steps performed:
> 3.1) Source cluster with only one table targeted to replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on source. 
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> T1:
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restart target cluster RS and verified the new row appeared there. No 
> new edit added, but status 'replication' command reports AgeOfLastShippedOp 
> as 0, while it should be the diff between the time it concluded shipping at 
> target and the time it was added in source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
> 4) When replication is stuck due to some connectivity issues or target 
> unavailability, if RS is restarted, once recovered queue source is started, 
> TimeStampsOfLastShippedOp is set to initial java date (Thu Jan 01 

[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Wellington Chevreuil (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714528#comment-16714528
 ] 

Wellington Chevreuil commented on HBASE-21505:
--

Third patch addressing issues from last build.

> Several inconsistencies on information reported for Replication Sources by 
> hbase shell status 'replication' command.
> 
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: 
> 0001-HBASE-21505-initial-version-for-more-detailed-report.patch, 
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch, 
> HBASE-21505-master.003.patch
>
>
> While reviewing hbase shell status 'replication' command, noticed the 
> following issues related to replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when 
> no new edits were added to source, so nothing was really shipped. Test steps 
> performed:
> 1.1) Source cluster with only one table targeted to replication;
> 1.2) Added a new row, confirmed the row appeared in Target cluster;
> 1.3) Issued status 'replication' command in source, TimeStampsOfLastShippedOp 
> shows current timestamp T1.
> 1.4) Waited 30 seconds, no new data added to source. Issued status 
> 'replication' command, now shows timestamp T2.
> 2) When replication is stuck due to some connectivity issues or target 
> unavailability, if new edits are added in source, reported AgeOfLastShippedOp 
> wrongly shows the same value as "Replication Lag". This is incorrect: 
> AgeOfLastShippedOp should not change until there's indeed another edit 
> shipped to target. Test steps performed:
> 2.1) Source cluster with only one table targeted to replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on source. Running status 'replication' command does show 
> lag increasing. TimeStampsOfLastShippedOp seems correct also, no further 
> updates as described on bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit had taken some 
> time before it got finally shipped to target. Test steps performed:
> 3.1) Source cluster with only one table targeted to replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on source. 
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> T1:
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restarted target cluster RS and verified the new row appeared there. No 
> new edit added, but status 'replication' command reports AgeOfLastShippedOp 
> as 0, while it should be the diff between the time it concluded shipping at 
> target and the time it was added in source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
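
Item 3.5 above describes the expected AgeOfLastShippedOp as the difference between the time shipping concluded at the target and the time the edit was added in the source. A minimal sketch of that arithmetic follows; it is not taken from the HBase code base, and the class name, method name, and sample timestamps are hypothetical, chosen only for illustration.

```java
public class AgeOfLastShippedOpSketch {

    // Hypothetical helper (not an HBase API): the age of a shipped edit is the
    // time shipping concluded minus the time the edit was written on the source.
    static long ageOfLastShippedOp(long shippedAtMillis, long writtenAtMillis) {
        return shippedAtMillis - writtenAtMillis;
    }

    public static void main(String[] args) {
        long writtenAt = 1_000L; // illustrative write time on the source, in ms
        long shippedAt = 9_586L; // illustrative time shipping concluded, in ms
        // Under this definition the age is non-zero whenever shipping was delayed.
        System.out.println(ageOfLastShippedOp(shippedAt, writtenAt));
    }
}
```

With this definition, an edit that waited for the target to come back would report a non-zero age after it ships, rather than the 0 shown in the output above.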
> 4) When replication is stuck due to some connectivity issues or target 
> unavailability, if RS is restarted, once recovered queue source is started, 
> TimeStampsOfLastShippedOp is set to initial java date (Thu Jan 01 01:00:00 
> GMT 1970, for example), thus "Replication Lag" also gives a completely 
> inaccurate value. 
> Tests performed:
> 4.1) Source cluster with only one table targeted to replication;
> 4.2) Stopped target cluster RS;
> 4.3) Put a new row on source, restart RS on source, waited a few seconds for 
> recovery queue source to startup, then it gives:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Thu Jan 01 01:00:00 GMT 1970, Replication 
> Lag=9223372036854775807
> {noformat}
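
The two odd values in the output above are recognizable Java defaults: 9223372036854775807 is `Long.MAX_VALUE`, and a timestamp of 0 ms is the Unix epoch, which prints as a date on 1 January 1970. The sketch below only checks for those two values; it says nothing about how HBase arrives at them, and the class and helper names are hypothetical.

```java
import java.util.Date;

public class SentinelValuesSketch {

    // Hypothetical check: a lag equal to Long.MAX_VALUE is almost certainly
    // a placeholder rather than a measured duration.
    static boolean isUnsetLag(long lagMillis) {
        return lagMillis == Long.MAX_VALUE;
    }

    // Hypothetical check: a Date holding 0 ms is the epoch, i.e. a timestamp
    // that was never set to a real shipping time.
    static boolean isEpoch(Date timestamp) {
        return timestamp.getTime() == 0L;
    }

    public static void main(String[] args) {
        System.out.println(isUnsetLag(9223372036854775807L)); // value from the report
        System.out.println(isEpoch(new Date(0L)));
    }
}
```

A reporting command could use checks of this kind to print a placeholder such as "unknown" instead of a misleading number, though whether that suits the shell output is a design decision for the patch.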
> Also, we should report status for all running sources; the current output 
> format gives the impression there’s only one, even when there are recovery 
> queues, for instance. 
> Here is a list of ideas on how the command should report under 

[jira] [Updated] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.

2018-12-10 Thread Wellington Chevreuil (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-21505:
-
Attachment: HBASE-21505-master.003.patch

> Several inconsistencies on information reported for Replication Sources by 
> hbase shell status 'replication' command.
> 
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
>  Issue Type: Bug
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Major
> Attachments: 
> 0001-HBASE-21505-initial-version-for-more-detailed-report.patch, 
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch, 
> HBASE-21505-master.003.patch
>
>
> While reviewing hbase shell status 'replication' command, noticed the 
> following issues related to replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when 
> no new edits were added to source, so nothing was really shipped. Test steps 
> performed:
> 1.1) Source cluster with only one table targeted to replication;
> 1.2) Added a new row, confirmed the row appeared in Target cluster;
> 1.3) Issued status 'replication' command in source, TimeStampsOfLastShippedOp 
> shows current timestamp T1.
> 1.4) Waited 30 seconds, no new data added to source. Issued status 
> 'replication' command, now shows timestamp T2.
> 2) When replication is stuck due to some connectivity issues or target 
> unavailability, if new edits are added in source, reported AgeOfLastShippedOp 
> wrongly shows the same value as "Replication Lag". This is incorrect: 
> AgeOfLastShippedOp should not change until there's indeed another edit 
> shipped to target. Test steps performed:
> 2.1) Source cluster with only one table targeted to replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on source. Running status 'replication' command does show 
> lag increasing. TimeStampsOfLastShippedOp seems correct also, no further 
> updates as described on bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit had taken some 
> time before it got finally shipped to target. Test steps performed:
> 3.1) Source cluster with only one table targeted to replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on source. 
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag, even 
> though there's no new edit shipped to target:
> {noformat}
> T1:
> ...
>  SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restarted target cluster RS and verified the new row appeared there. No 
> new edit added, but status 'replication' command reports AgeOfLastShippedOp 
> as 0, while it should be the diff between the time it concluded shipping at 
> target and the time it was added in source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
> 4) When replication is stuck due to some connectivity issues or target 
> unavailability, if RS is restarted, once recovered queue source is started, 
> TimeStampsOfLastShippedOp is set to initial java date (Thu Jan 01 01:00:00 
> GMT 1970, for example), thus "Replication Lag" also gives a completely 
> inaccurate value. 
> Tests performed:
> 4.1) Source cluster with only one table targeted to replication;
> 4.2) Stopped target cluster RS;
> 4.3) Put a new row on source, restart RS on source, waited a few seconds for 
> recovery queue source to startup, then it gives:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1, 
> TimeStampsOfLastShippedOp=Thu Jan 01 01:00:00 GMT 1970, Replication 
> Lag=9223372036854775807
> {noformat}
> Also, we should report status for all running sources; the current output 
> format gives the impression there’s only one, even when there are recovery 
> queues, for instance. 
> Here is a list of ideas on how the command should report under different 
> states of replication:
> a) Source