[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-21570:
------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Pushed to branch-2.0+. Thanks [~stack] for reviewing.

> Add write buffer periodic flush support for AsyncBufferedMutator
> ----------------------------------------------------------------
>
>                 Key: HBASE-21570
>                 URL: https://issues.apache.org/jira/browse/HBASE-21570
>             Project: HBase
>          Issue Type: Sub-task
>          Components: asyncclient, Client
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3
>
>         Attachments: HBASE-21570-v1.patch, HBASE-21570.patch
>
>
> Align with the BufferedMutator interface.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
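The periodic-flush idea behind HBASE-21570 can be illustrated with a small hypothetical sketch (this is not the actual AsyncBufferedMutator API): a buffer flushes when it reaches a size threshold, and a timer also flushes it periodically so that small batches do not sit unflushed indefinitely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a write buffer with both a size trigger
// and a periodic timer trigger, mirroring the idea in HBASE-21570.
public class PeriodicFlushBuffer {
  private final List<String> buffer = new ArrayList<>();
  private final int sizeThreshold;
  private int flushCount = 0;

  public PeriodicFlushBuffer(int sizeThreshold) {
    this.sizeThreshold = sizeThreshold;
  }

  public synchronized void add(String mutation) {
    buffer.add(mutation);
    if (buffer.size() >= sizeThreshold) {
      flush();  // size-triggered flush
    }
  }

  // Called by the size check above and by the periodic timer below.
  public synchronized void flush() {
    if (!buffer.isEmpty()) {
      buffer.clear();  // stand-in for actually submitting the mutations
      flushCount++;
    }
  }

  public synchronized int getFlushCount() {
    return flushCount;
  }

  // Schedule the time-based flush; without it, a buffer that never
  // reaches the size threshold would never be written out.
  public ScheduledExecutorService startPeriodicFlush(long periodMs) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(this::flush, periodMs, periodMs, TimeUnit.MILLISECONDS);
    return timer;
  }
}
```

The real interface aligns with BufferedMutator's behavior as the issue description says; the class and method names here are invented for illustration.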
[jira] [Updated] (HBASE-21453) Convert ReadOnlyZKClient to DEBUG instead of INFO
[ https://issues.apache.org/jira/browse/HBASE-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Somogyi updated HBASE-21453:
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.3
                   2.2.0
                   3.0.0
           Status: Resolved  (was: Patch Available)

Pushed to branch-2.1+. Thanks for the patch [~jatsakthi]!

> Convert ReadOnlyZKClient to DEBUG instead of INFO
> -------------------------------------------------
>
>                 Key: HBASE-21453
>                 URL: https://issues.apache.org/jira/browse/HBASE-21453
>             Project: HBase
>          Issue Type: Bug
>          Components: logging, Zookeeper
>            Reporter: stack
>            Assignee: Sakthi
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.1.3
>
>         Attachments: hbase-21453.master.001.patch
>
>
> Running commands in spark-shell, this is what it looks like on each invocation:
> {code}
> scala> val count = rdd.count()
> 2018-11-07 21:01:46,026 INFO [Executor task launch worker for task 1] zookeeper.ReadOnlyZKClient: Connect 0x18f3d868 to localhost:2181 with session timeout=9ms, retries 30, retry interval 1000ms, keepAlive=6ms
> 2018-11-07 21:01:46,027 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=9 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$20/1362339879@743dab9f
> 2018-11-07 21:01:46,030 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-11-07 21:01:46,031 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
> 2018-11-07 21:01:46,033 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x166f1b283080005, negotiated timeout = 4
> 2018-11-07 21:01:46,035 INFO [Executor task launch worker for task 1] mapreduce.TableInputFormatBase: Input split length: 0 bytes.
> [Stage 1:> (0 + 1) / 1]2018-11-07 21:01:48,074 INFO [Executor task launch worker for task 1] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x18f3d868 to localhost:2181
> 2018-11-07 21:01:48,075 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868] zookeeper.ZooKeeper: Session: 0x166f1b283080005 closed
> 2018-11-07 21:01:48,076 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x166f1b283080005
> count: Long = 10
> {code}
> Let me shut down the ReadOnlyZKClient log level.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
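Before the HBASE-21453 fix (which moves ReadOnlyZKClient's own messages to DEBUG), a user could also suppress this noise from the client side with a logging override. A sketch, assuming the log4j 1.x `log4j.properties` mechanism that HBase clients of this era use:

```properties
# Raise the threshold for the noisy ZooKeeper-related client loggers.
# Logger names are taken from the log output quoted above.
log4j.logger.org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient=WARN
log4j.logger.org.apache.zookeeper=WARN
```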
[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server
[ https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21565:
---------------------------------
    Attachment: HBASE-21565.master.003.patch

> Delete dead server from dead server list too early leads to concurrent Server Crash Procedures (SCP) for a same server
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21565
>                 URL: https://issues.apache.org/jira/browse/HBASE-21565
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Critical
>         Attachments: HBASE-21565.master.001.patch, HBASE-21565.master.002.patch, HBASE-21565.master.003.patch
>
>
> Two kinds of SCP can be scheduled for the same server during cluster restart: one from ZK session timeout, and one because a new server reporting in causes the stale one to be failed over. The only barrier between these two SCPs is checking whether the server is in the dead server list.
> {code}
>     if (this.deadservers.isDeadServer(serverName)) {
>       LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>       return false;
>     }
> {code}
> The problem is that when the master finishes initialization, it deletes all stale servers from the dead server list. Thus, when the SCP for the ZK session timeout comes in, the barrier is already removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see two SCPs are scheduled for the same server.
> And the first procedure finishes after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, state=SUCCESS, hasLock=false; ServerCrashProcedure server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false in 30.5340sec
> {code}
> This leads to regions being assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see the server is removed from the dead server list before the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately. A patch to fix this problem will be uploaded later.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
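The race described in HBASE-21565 comes down to *when* the server name leaves the dead-server set. A minimal hypothetical sketch of the barrier (not the actual ServerManager/DeadServer code) shows why removing the entry before the first crash procedure finishes lets a second one through:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the dead-server barrier from HBASE-21565:
// a second crash procedure for the same server is rejected while the
// first is still in flight, so the entry must not be removed too early.
public class DeadServerBarrier {
  private final Set<String> deadServers = new HashSet<>();

  // Returns true only for the first expiration of a given server name.
  public synchronized boolean expireServer(String serverName) {
    if (deadServers.contains(serverName)) {
      // Mirrors the quoted check: crash processing already in progress.
      return false;
    }
    deadServers.add(serverName);
    return true;  // caller may now schedule the crash procedure
  }

  // Safe removal point: only after the crash procedure has finished.
  // Removing earlier (as the master did at end of initialization)
  // reopens the door to a concurrent duplicate procedure.
  public synchronized void finishProcessing(String serverName) {
    deadServers.remove(serverName);
  }
}
```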
[jira] [Updated] (HBASE-21453) Convert ReadOnlyZKClient to DEBUG instead of INFO
[ https://issues.apache.org/jira/browse/HBASE-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Somogyi updated HBASE-21453:
----------------------------------
    Release Note: Log level of ReadOnlyZKClient moved to debug.

> Convert ReadOnlyZKClient to DEBUG instead of INFO
> -------------------------------------------------
>
>                 Key: HBASE-21453
>                 URL: https://issues.apache.org/jira/browse/HBASE-21453
>             Project: HBase
>          Issue Type: Bug
>          Components: logging, Zookeeper
>            Reporter: stack
>            Assignee: Sakthi
>            Priority: Major
>         Attachments: hbase-21453.master.001.patch
>
>
> Running commands in spark-shell, this is what it looks like on each invocation:
> {code}
> scala> val count = rdd.count()
> 2018-11-07 21:01:46,026 INFO [Executor task launch worker for task 1] zookeeper.ReadOnlyZKClient: Connect 0x18f3d868 to localhost:2181 with session timeout=9ms, retries 30, retry interval 1000ms, keepAlive=6ms
> 2018-11-07 21:01:46,027 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=9 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$20/1362339879@743dab9f
> 2018-11-07 21:01:46,030 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-11-07 21:01:46,031 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
> 2018-11-07 21:01:46,033 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x166f1b283080005, negotiated timeout = 4
> 2018-11-07 21:01:46,035 INFO [Executor task launch worker for task 1] mapreduce.TableInputFormatBase: Input split length: 0 bytes.
> [Stage 1:> (0 + 1) / 1]2018-11-07 21:01:48,074 INFO [Executor task launch worker for task 1] zookeeper.ReadOnlyZKClient: Close zookeeper connection 0x18f3d868 to localhost:2181
> 2018-11-07 21:01:48,075 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868] zookeeper.ZooKeeper: Session: 0x166f1b283080005 closed
> 2018-11-07 21:01:48,076 INFO [ReadOnlyZKClient-localhost:2181@0x18f3d868-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x166f1b283080005
> count: Long = 10
> {code}
> Let me shut down the ReadOnlyZKClient log level.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HBASE-21578) Fix wrong throttling exception for capacity unit
[ https://issues.apache.org/jira/browse/HBASE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yi Mei updated HBASE-21578:
---------------------------
    Attachment: HBASE-21578.master.001.patch

> Fix wrong throttling exception for capacity unit
> ------------------------------------------------
>
>                 Key: HBASE-21578
>                 URL: https://issues.apache.org/jira/browse/HBASE-21578
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yi Mei
>            Priority: Major
>         Attachments: HBASE-21578.master.001.patch
>
>
> HBASE-21034 provides a new throttle type: capacity unit, but the throttling exception is confusing:
> {noformat}
> 2018-12-11 14:38:41,503 DEBUG [Time-limited test] client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=7, started=0 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.quotas.RpcThrottlingException: write size limit exceeded - wait 10sec
> at org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwThrottlingException(RpcThrottlingException.java:106)
> at org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwWriteSizeExceeded(RpcThrottlingException.java:96)
> at org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:179){noformat}
> Need to make the exception clearer.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
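One way to make the HBASE-21578 message clearer, sketched hypothetically (this is not the actual RpcThrottlingException code), is to name the specific throttle type in the message instead of reporting every write-quota rejection as a generic "write size limit exceeded":

```java
// Hypothetical sketch: build the throttling message from the throttle
// type that was actually exceeded, so a capacity-unit quota is not
// reported as a write size limit.
public class ThrottleMessages {
  public static String format(String throttleType, long waitMs) {
    // e.g. "write capacity unit limit exceeded - wait 10000ms"
    return String.format("%s exceeded - wait %dms", throttleType, waitMs);
  }
}
```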
[jira] [Updated] (HBASE-21580) Support getting Hbck instance from AsyncClusterConnection
[ https://issues.apache.org/jira/browse/HBASE-21580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-21580:
------------------------------
    Summary: Support getting Hbck instance from AsyncClusterConnection  (was: Support getting Hbck instance for AsyncClusterConnection)

> Support getting Hbck instance from AsyncClusterConnection
> ---------------------------------------------------------
>
>                 Key: HBASE-21580
>                 URL: https://issues.apache.org/jira/browse/HBASE-21580
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HBASE-21580) Support getting Hbck instance for AsyncClusterConnection
Duo Zhang created HBASE-21580:
------------------------------

             Summary: Support getting Hbck instance for AsyncClusterConnection
                 Key: HBASE-21580
                 URL: https://issues.apache.org/jira/browse/HBASE-21580
             Project: HBase
          Issue Type: Sub-task
            Reporter: Duo Zhang

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HBASE-21579) Use AsyncClusterConnection in Replication related classes
Duo Zhang created HBASE-21579:
------------------------------

             Summary: Use AsyncClusterConnection in Replication related classes
                 Key: HBASE-21579
                 URL: https://issues.apache.org/jira/browse/HBASE-21579
             Project: HBase
          Issue Type: Sub-task
            Reporter: Duo Zhang

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HBASE-21578) Fix wrong throttling exception for capacity unit
Yi Mei created HBASE-21578:
---------------------------

             Summary: Fix wrong throttling exception for capacity unit
                 Key: HBASE-21578
                 URL: https://issues.apache.org/jira/browse/HBASE-21578
             Project: HBase
          Issue Type: Bug
            Reporter: Yi Mei

HBASE-21034 provides a new throttle type: capacity unit, but the throttling exception is confusing:

{noformat}
2018-12-11 14:38:41,503 DEBUG [Time-limited test] client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=7, started=0 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.quotas.RpcThrottlingException: write size limit exceeded - wait 10sec
at org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwThrottlingException(RpcThrottlingException.java:106)
at org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwWriteSizeExceeded(RpcThrottlingException.java:96)
at org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:179){noformat}

Need to make the exception clearer.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
[ https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-21538:
------------------------------
    Assignee: Duo Zhang
      Status: Patch Available  (was: Open)

> Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
> ---------------------------------------------------------------
>
>                 Key: HBASE-21538
>                 URL: https://issues.apache.org/jira/browse/HBASE-21538
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>         Attachments: HBASE-21538-HBASE-21512.patch
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
[ https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716354#comment-16716354 ]

Hadoop QA commented on HBASE-21577:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 53s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}260m 2s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}305m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
| | hadoop.hbase.client.TestFromClientSide |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21577 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951296/HBASE-21577.master.001.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 157d38982940 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / da9508d427 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/15240/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results |
[jira] [Updated] (HBASE-21538) Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
[ https://issues.apache.org/jira/browse/HBASE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-21538:
------------------------------
    Attachment: HBASE-21538-HBASE-21512.patch

> Rewrite RegionReplicaFlushHandler to use AsyncClusterConnection
> ---------------------------------------------------------------
>
>                 Key: HBASE-21538
>                 URL: https://issues.apache.org/jira/browse/HBASE-21538
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Priority: Major
>         Attachments: HBASE-21538-HBASE-21512.patch
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716236#comment-16716236 ]

Hadoop QA commented on HBASE-21570:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 59s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} hbase-client: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s{color} | {color:green} The patch passed checkstyle in hbase-server {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 59s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 10s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}242m 35s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}296m 13s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
| | hadoop.hbase.client.TestFromClientSide |
| | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21570 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951289/HBASE-21570-v1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716206#comment-16716206 ]

Hadoop QA commented on HBASE-21246:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 14 new or modified test files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 33s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 25s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 51s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 46s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} The patch passed checkstyle in hbase-common {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s{color} | {color:green} hbase-server: The patch generated 0 new + 58 unchanged - 1 fixed = 58 total (was 59) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 47s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 11s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 30s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 43s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}130m 7s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 54s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 17s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.wal.DisabledWALProvider$1.equals(Object) always returns true At DisabledWALProvider.java:At DisabledWALProvider.java:[line 81] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21246 |
| JIRA Patch URL |
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716174#comment-16716174 ]

Hudson commented on HBASE-21567:
--------------------------------

Results for branch branch-2.1
[build #674 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/674//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Allow overriding configs starting up the shell
> ----------------------------------------------
>
>                 Key: HBASE-21567
>                 URL: https://issues.apache.org/jira/browse/HBASE-21567
>             Project: HBase
>          Issue Type: Improvement
>          Components: shell
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
>         Attachments: HBASE-21567.master.001.patch, HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to be able to do this:
> ${HBASE_HOME}/bin/hbase shell -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716169#comment-16716169 ]

Hudson commented on HBASE-21567:
--------------------------------

Results for branch branch-2.0
[build #1154 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1154//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

> Allow overriding configs starting up the shell
> ----------------------------------------------
>
>                 Key: HBASE-21567
>                 URL: https://issues.apache.org/jira/browse/HBASE-21567
>             Project: HBase
>          Issue Type: Improvement
>          Components: shell
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3
>
>         Attachments: HBASE-21567.master.001.patch, HBASE-21567.master.002.patch, HBASE-21567.master.003.patch
>
>
> Needed to be able to point a local install at a remote cluster. I wanted to be able to do this:
> ${HBASE_HOME}/bin/hbase shell -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HBASE-21526) Use AsyncClusterConnection in ServerManager for getRsAdmin
[ https://issues.apache.org/jira/browse/HBASE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21526: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HBASE-21512 Status: Resolved (was: Patch Available) Rebased and pushed to branch HBASE-21512. Thanks [~stack] for reviewing. > Use AsyncClusterConnection in ServerManager for getRsAdmin > -- > > Key: HBASE-21526 > URL: https://issues.apache.org/jira/browse/HBASE-21526 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: HBASE-21512 > > Attachments: HBASE-21526-HBASE-21512-v1.patch, > HBASE-21526-HBASE-21512-v2.patch, HBASE-21526-HBASE-21512.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716071#comment-16716071 ] Hudson commented on HBASE-20952: Results for branch HBASE-20952 [build #58 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/58//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Re-visit the WAL API > > > Key: HBASE-20952 > URL: https://issues.apache.org/jira/browse/HBASE-20952 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Josh Elser >Priority: Major > Attachments: 20952.v1.txt > > > Take a step back from the current WAL implementations and think about what an > HBase WAL API should look like. What are the primitive calls that we require > to guarantee durability of writes with a high degree of performance? > The API needs to take the current implementations into consideration. We > should also have a mind for what is happening in the Ratis LogService (but > the LogService should not dictate what HBase's WAL API looks like RATIS-272). > Other "systems" inside of HBase that use WALs are replication and > backup. Replication has the use-case for "tail"'ing the WAL which we > should provide via our new API. Backup doesn't do anything fancy (IIRC). 
We > should make sure all consumers are generally going to be OK with the API we > create. > The API may be "OK" (or OK in part). We need to also consider other methods > which were "bolted" on such as {{AbstractFSWAL}} and > {{WALFileLengthProvider}}. Other corners of "WAL use" (like the > {{WALSplitter}}) should also be looked at to use WAL-APIs only. > We also need to make sure that adequate interface audience and stability > annotations are chosen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716052#comment-16716052 ] Reid Chan commented on HBASE-21246: --- Overall LGTM! Few suggestions, {code} public FSWALIdentity(Path path) public FSWALIdentity(String name) {code} Can we add a pre-null check or annotation NotNullable or javadoc to raise attention of no-null? Passing a null object to WALIdentity makes no sense to me. The property 'name' looks redundant to me in FSWALIdentity: {code} @Override public String getName() { return name; // can always be replaced with path.getName(), no need of extra property. } {code} > Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.HBASE-20952.003.patch, HBASE-21246.master.001.patch, > HBASE-21246.master.002.patch, replication-src-creates-wal-reader.jpg, > wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, > wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
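Reid's two review suggestions could be sketched as follows. This is a hypothetical simplification, not the actual HBase class: `java.nio.file.Path` stands in for `org.apache.hadoop.fs.Path`, and the real `FSWALIdentity` may carry more state.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;

// Hypothetical sketch of the reviewed constructors, not the real HBase class.
class FSWALIdentity {
  private final Path path;

  // Fail fast on null instead of letting a null identity propagate,
  // per the suggestion that a null WALIdentity makes no sense.
  FSWALIdentity(Path path) {
    this.path = Objects.requireNonNull(path, "path must not be null");
  }

  FSWALIdentity(String name) {
    this(Paths.get(Objects.requireNonNull(name, "name must not be null")));
  }

  // No separate 'name' field: derive the name from the path,
  // which is the redundancy Reid points out.
  String getName() {
    return path.getFileName().toString();
  }
}
```

The second point is the interesting one for maintenance: keeping both a `name` field and a `path` field invites the two drifting apart, while deriving one from the other leaves a single source of truth.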
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716035#comment-16716035 ] Sean Busbey commented on HBASE-21553: - it looks like the addition of a shared lock check for the namespace came in HBASE-15105, which means branch-1.2 doesn't have the missed lock release. Clean up to use try/finally for unlocks is probably still a good idea, but probably better done as a different JIRA so that folks don't think there's the same risk of deadlock getting fixed. > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Critical > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
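The bug class being discussed, and the try/finally cleanup Sean suggests doing in a follow-up, can be illustrated with a toy example (this is not the actual MasterProcedureScheduler code, just the pattern):

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy illustration of the lock-leak pattern behind HBASE-21553:
// an early return (or exception) between lock() and unlock() leaves
// the lock held forever; try/finally guarantees release on every path.
class SchedLockDemo {
  private final ReentrantLock schedLock = new ReentrantLock();

  // Risky shape: the early return skips unlock(), so later callers deadlock.
  boolean riskyCheck(boolean earlyExit) {
    schedLock.lock();
    if (earlyExit) {
      return false;            // BUG: returns with schedLock still held
    }
    schedLock.unlock();
    return true;
  }

  // Robust shape: unlock in finally, so every exit path releases the lock.
  boolean safeCheck(boolean earlyExit) {
    schedLock.lock();
    try {
      return !earlyExit;
    } finally {
      schedLock.unlock();
    }
  }

  boolean lockHeld() {
    return schedLock.isLocked();
  }
}
```

The fix is mechanical but easy to miss in review, which is why moving all `schedLock.unlock()` calls into `finally` blocks is worth doing even where no leak has been observed yet.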
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-21553: Component/s: proc-v2 > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Critical > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-21553: Issue Type: Bug (was: Improvement) > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-21553: Priority: Critical (was: Major) > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Critical > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716029#comment-16716029 ] Hadoop QA commented on HBASE-21575: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 55s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 7s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 35s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}135m 30s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}176m 40s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestMultiColumnScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21575 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951281/HBASE-21575.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7ac9b1373e6a 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / da9508d427 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC3 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/15238/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results |
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716028#comment-16716028 ] Hudson commented on HBASE-21553: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #508 (See [https://builds.apache.org/job/HBase-1.3-IT/508/]) HBASE-21553 schedLock not released in MasterProcedureScheduler (apurtell: rev b9adb955cde19746219b3efd8500c7ba7239ae56) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
[ https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715996#comment-16715996 ] Duo Zhang commented on HBASE-21577: --- Is it possible to just set fsOk to false when there is a DroppedSnapshotException? > do not close regions when RS is dying due to a broken WAL > - > > Key: HBASE-21577 > URL: https://issues.apache.org/jira/browse/HBASE-21577 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21577.master.001.patch > > > See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is > broken, some regions whose flushes are already in flight keep retrying, > resulting in minutes-long shutdown times. Since WAL will be replayed anyway > flushing regions doesn't provide much benefit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
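Duo's suggestion amounts to latching a health flag when a flush fails and consulting it on shutdown. A hypothetical sketch of that shape (the names `fsOk`, `DroppedSnapshotException`, and the methods here are illustrative, not the actual regionserver code):

```java
// Hypothetical sketch of "set fsOk to false on DroppedSnapshotException":
// a failed flush marks the filesystem/WAL unusable, and the shutdown path
// then skips per-region flushes instead of retrying them for minutes,
// relying on WAL replay instead.
class DyingServerDemo {
  // Stand-in for HBase's DroppedSnapshotException.
  static class DroppedSnapshotException extends RuntimeException {}

  private volatile boolean fsOk = true;

  void flushRegion(boolean walBroken) {
    try {
      if (walBroken) {
        throw new DroppedSnapshotException(); // flush failed against a broken WAL
      }
      // ... normal flush work would go here ...
    } catch (DroppedSnapshotException e) {
      fsOk = false; // remember that the filesystem/WAL is unusable
    }
  }

  // Region close only bothers flushing while the filesystem is healthy.
  boolean shouldFlushOnClose() {
    return fsOk;
  }
}
```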
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716012#comment-16716012 ] Andrew Purtell commented on HBASE-21553: [~busbey] Do you want this in 1.2? > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21553: --- Fix Version/s: 1.3.3 > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715995#comment-16715995 ] Karan Mehta commented on HBASE-21553: - Is this not going into branch-1.3 or branch-1.2? > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated HBASE-21246: -- Attachment: HBASE-21246.HBASE-20952.003.patch > Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.HBASE-20952.003.patch, HBASE-21246.master.001.patch, > HBASE-21246.master.002.patch, replication-src-creates-wal-reader.jpg, > wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, > wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21553: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.10 1.5.0 Status: Resolved (was: Patch Available) > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock which can cause deadlock. > Besides this, there are other places in this class handles schedLock.unlock > in a risky manner. I'd like to move them to finally block to improve the > robustness of handling locks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
Sergey Shelukhin created HBASE-21577: Summary: do not close regions when RS is dying due to a broken WAL Key: HBASE-21577 URL: https://issues.apache.org/jira/browse/HBASE-21577 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is broken, some regions whose flushes are already in flight keep retrying, resulting in minutes-long shutdown times. Since WAL will be replayed anyway flushing regions doesn't provide much benefit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
[ https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21577: - Attachment: HBASE-21577.master.001.patch > do not close regions when RS is dying due to a broken WAL > - > > Key: HBASE-21577 > URL: https://issues.apache.org/jira/browse/HBASE-21577 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21577.master.001.patch > > > See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is > broken, some regions whose flushes are already in flight keep retrying, > resulting in minutes-long shutdown times. Since WAL will be replayed anyway > flushing regions doesn't provide much benefit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
[ https://issues.apache.org/jira/browse/HBASE-21577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21577: - Status: Patch Available (was: Open) > do not close regions when RS is dying due to a broken WAL > - > > Key: HBASE-21577 > URL: https://issues.apache.org/jira/browse/HBASE-21577 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21577.master.001.patch > > > See HBASE-21576. DroppedSnapshot can be an FS failure; also, when WAL is > broken, some regions whose flushes are already in flight keep retrying, > resulting in minutes-long shutdown times. Since WAL will be replayed anyway > flushing regions doesn't provide much benefit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles
[ https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715927#comment-16715927 ] Guanghao Zhang commented on HBASE-21568: +1. > Disable use of BlockCache for LoadIncrementalHFiles > --- > > Key: HBASE-21568 > URL: https://issues.apache.org/jira/browse/HBASE-21568 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Fix For: 2.2.0, 2.1.2, 2.0.4 > > Attachments: HBASE-21568.001.branch-2.0.patch > > > [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow > callers to specify that they do not want to use a block cache when reading an > HFile. > If the BucketCache is set up to use the FileSystem, we can have a situation > where the client tries to instantiate the BucketCache and is disallowed due > to filesystem permissions: > {code:java} > 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: > Failed allocating cache on /mnt/hbase/cache.data > java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124) > at > org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81) > at > org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382) > at > org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262) > at > org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633) > at > org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663) > at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250) > at > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713) > at > 
org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621) > at > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > LoadIncrementalHfiles should provide the {{CacheConfig.DISABLE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
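The fix direction in HBASE-21568 is a "disabled" cache configuration that a client-side tool can pass so no block cache is ever instantiated. The underlying pattern can be sketched with made-up names (the real classes are HBase's `CacheConfig` and `BucketCache`; none of the identifiers below are actual HBase API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of the disabled-singleton pattern: a shared config instance that
// reports "no cache" and allocates nothing, so a client-side bulk-load tool
// never opens a cache backing file and never hits the permission error above.
class DemoCacheConfig {
  // Shared "disabled" instance; holds no cache at all.
  static final DemoCacheConfig DISABLED = new DemoCacheConfig(null);

  private final Map<String, byte[]> blockCache;

  private DemoCacheConfig(Map<String, byte[]> cache) {
    this.blockCache = cache;
  }

  // Server-side callers would build a config with a live cache.
  static DemoCacheConfig withCache() {
    return new DemoCacheConfig(new HashMap<>());
  }

  Optional<Map<String, byte[]>> getBlockCache() {
    return Optional.ofNullable(blockCache);
  }
}

class DemoHFileReader {
  // Readers consult the config; with DISABLED they skip caching entirely.
  static boolean cachesBlocks(DemoCacheConfig conf) {
    return conf.getBlockCache().isPresent();
  }
}
```

The point of routing everything through the config object is that the decision is made once, at the call site that knows it is a client (here, the bulk-load tool), rather than deep inside the reader where the cache would otherwise be lazily instantiated.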
[jira] [Commented] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink
[ https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715926#comment-16715926 ] Hadoop QA commented on HBASE-21406: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 3m 57s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 36s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 51s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 22s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 52s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} hbase-hadoop2-compat: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 4s{color} | {color:red} hbase-server: The patch generated 3 new + 8 unchanged - 0 fixed = 11 total (was 8) {color} | | {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 7s{color} | {color:red} The patch generated 25 new + 409 unchanged - 5 fixed = 434 total (was 414) {color} | | {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange} 0m 2s{color} | {color:orange} The patch generated 1 new + 748 unchanged - 1 fixed = 749 total (was 749) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 52s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 31s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s{color} | {color:green} hbase-hadoop2-compat in the
[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715918#comment-16715918 ] Hadoop QA commented on HBASE-21564: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 8s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 55s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}276m 26s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 36s{color} | {color:green} hbase-backup in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}336m 20s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.replication.TestReplicationEndpoint | | | hadoop.hbase.regionserver.TestSplitTransactionOnCluster | | | hadoop.hbase.client.TestFromClientSideWithCoprocessor | | | hadoop.hbase.client.TestSnapshotTemporaryDirectory | | | hadoop.hbase.client.TestFromClientSide3 | | | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas | | | hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas | | | hadoop.hbase.master.TestAssignmentManagerMetrics | | | hadoop.hbase.TestClientOperationTimeout | | | hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleAsyncWAL | | | hadoop.hbase.client.TestAdmin1 | | | hadoop.hbase.master.replication.TestTransitPeerSyncReplicationStateProcedureRetry | | |
[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21570: -- Attachment: HBASE-21570-v1.patch > Add write buffer periodic flush support for AsyncBufferedMutator > > > Key: HBASE-21570 > URL: https://issues.apache.org/jira/browse/HBASE-21570 > Project: HBase > Issue Type: Sub-task > Components: asyncclient, Client >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3 > > Attachments: HBASE-21570-v1.patch, HBASE-21570.patch > > > Align with the BufferedMutator interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
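The behavior this issue aligns with BufferedMutator can be illustrated with a minimal, self-contained sketch (plain Java, not HBase's actual AsyncBufferedMutator code; the class and method names below are illustrative only): a write buffer that flushes when it hits a size threshold, plus a background timer that also flushes whatever has accumulated after a configurable period, so small trickles of writes do not sit in the buffer indefinitely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of size-based + periodic flushing; not HBase code.
class PeriodicFlushBuffer<T> implements AutoCloseable {
  private final List<T> buffer = new ArrayList<>();
  private final int maxSize;
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();
  // Each completed flush is recorded as one batch, for inspection.
  private final List<List<T>> flushed = new CopyOnWriteArrayList<>();

  PeriodicFlushBuffer(int maxSize, long periodMs) {
    this.maxSize = maxSize;
    // Periodic flush: drain whatever has accumulated, even if under maxSize.
    timer.scheduleAtFixedRate(this::flush, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  synchronized void mutate(T item) {
    buffer.add(item);
    if (buffer.size() >= maxSize) {
      flush(); // size-based flush, the pre-existing behavior
    }
  }

  synchronized void flush() {
    if (!buffer.isEmpty()) {
      flushed.add(new ArrayList<>(buffer));
      buffer.clear();
    }
  }

  List<List<T>> batches() {
    return flushed;
  }

  @Override
  public synchronized void close() {
    timer.shutdownNow();
    flush(); // final drain on close
  }
}
```

In the actual patch the period is configured on the async mutator's builder/configuration rather than in a constructor like this; the sketch only shows the two flush triggers coexisting.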
[jira] [Updated] (HBASE-21576) master should proactively reassign meta when killing a RS with it
[ https://issues.apache.org/jira/browse/HBASE-21576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21576: - Description: Master has killed an RS that was hosting meta due to some HDFS issue (most likely; I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead after a kill; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. {noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... 
[starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08 05:04:17,271 INFO [hconnection-0x4d58bcd4-shared-pool3-t1877] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41137 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server ,17020,1544274145387 is not running yet ... 2018-12-08 05:11:18,470 ERROR [RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: * ABORTING master ...,17000,1544230401860: FAILED persisting region=... state=OPEN *^M {noformat} There are no signs of meta assignment activity at all in master logs was: Master has killed an RS that was hosting meta due to some internal error (still need to see if it's a separate bug or just a machine/HDFS issue, I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead after a kill; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. 
{noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08 05:04:17,271 INFO
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715849#comment-16715849 ] Hadoop QA commented on HBASE-21246: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 14 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 9s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 46s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 5s{color} | {color:red} hbase-server: The patch generated 1 new + 58 unchanged - 1 fixed = 59 total (was 59) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 50s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 11s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 34s{color} | {color:green} hbase-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}128m 41s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}171m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hbase-server | | | org.apache.hadoop.hbase.wal.DisabledWALProvider$1 defines compareTo(Object) and uses Object.equals() At DisabledWALProvider.java:Object.equals() At DisabledWALProvider.java:[line 67] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21246 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951267/HBASE-21246.master.002.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs
[jira] [Updated] (HBASE-21576) master should proactively reassign meta when killing a RS with it
[ https://issues.apache.org/jira/browse/HBASE-21576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21576: - Description: Master has killed an RS that was hosting meta due to some internal error (still need to see if it's a separate bug or just a machine/HDFS issue, I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead after a kill; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. {noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... 
[starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08 05:04:17,271 INFO [hconnection-0x4d58bcd4-shared-pool3-t1877] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41137 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server ,17020,1544274145387 is not running yet ... 2018-12-08 05:11:18,470 ERROR [RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: * ABORTING master ...,17000,1544230401860: FAILED persisting region=... state=OPEN *^M {noformat} There are no signs of meta assignment activity at all in master logs was: Master has killed an RS that was hosting meta due to some internal error (still need to see if it's a separate bug or just a machine/HDFS issue, I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. 
{noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08
[jira] [Created] (HBASE-21576) master should proactively reassign meta when killing a RS with it
Sergey Shelukhin created HBASE-21576: Summary: master should proactively reassign meta when killing a RS with it Key: HBASE-21576 URL: https://issues.apache.org/jira/browse/HBASE-21576 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Master has killed an RS that was hosting meta due to some internal error (still need to see if it's a separate bug or just a machine/HDFS issue, I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. {noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... 
[starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08 05:04:17,271 INFO [hconnection-0x4d58bcd4-shared-pool3-t1877] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41137 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server ,17020,1544274145387 is not running yet ... 2018-12-08 05:11:18,470 ERROR [RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: * ABORTING master ...,17000,1544230401860: FAILED persisting region=... state=OPEN *^M {noformat} There are no signs of meta assignment activity at all in master logs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21283) Add new shell command 'rit' for listing regions in transition
[ https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey resolved HBASE-21283. - Resolution: Fixed Release Note: The HBase `shell` now includes a command to list regions currently in transition. ``` HBase Shell Use "help" to get list of supported commands. Use "exit" to quit this interactive shell. Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct 8 21:05:50 UTC 2018 hbase(main):001:0> help 'rit' List all regions in transition. Examples: hbase> rit hbase(main):002:0> create ... 0 row(s) in 2.5150 seconds => Hbase::Table - IntegrationTestBigLinkedList hbase(main):003:0> rit 0 row(s) in 0.0340 seconds hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1' 0 row(s) in 0.0540 seconds hbase(main):005:0> rit IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1. state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null 1 row(s) in 0.0170 seconds ``` > Add new shell command 'rit' for listing regions in transition > - > > Key: HBASE-21283 > URL: https://issues.apache.org/jira/browse/HBASE-21283 > Project: HBase > Issue Type: Improvement > Components: Operability, shell >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, > HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, > HBASE-21283.patch > > > The 'status' shell command shows regions in transition but sometimes an > operator may want to retrieve a simple list of regions in transition. Here's > a patch that adds a new 'rit' command to the TOOLS group that does just that. 
> No test, because it seems hard to mock RITs from the ruby test code, but I > have run TestShell and it passes, so the command is verified to meet minimum > requirements, like help text, and manually verified with branch-1 (shell in > branch-2 and up doesn't return until TransitRegionProcedure has completed so > by that time no RIT): > {noformat} > HBase Shell > Use "help" to get list of supported commands. > Use "exit" to quit this interactive shell. > Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct 8 > 21:05:50 UTC 2018 > hbase(main):001:0> help 'rit' > List all regions in transition. > Examples: > hbase> rit > hbase(main):002:0> create ... > 0 row(s) in 2.5150 seconds > => Hbase::Table - IntegrationTestBigLinkedList > hbase(main):003:0> rit > 0 row(s) in 0.0340 seconds > hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1' > 0 row(s) in 0.0540 seconds > hbase(main):005:0> rit > IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1. > state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null > > > > 1 row(s) in 0.0170 seconds > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715802#comment-16715802 ] Hudson commented on HBASE-21567: Results for branch branch-2 [build #1550 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1550//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21283) Add new shell command 'rit' for listing regions in transition
[ https://issues.apache.org/jira/browse/HBASE-21283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey reopened HBASE-21283: - > Add new shell command 'rit' for listing regions in transition > - > > Key: HBASE-21283 > URL: https://issues.apache.org/jira/browse/HBASE-21283 > Project: HBase > Issue Type: Improvement > Components: Operability, shell >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-21283-branch-1.patch, HBASE-21283-branch-1.patch, > HBASE-21283-branch-1.patch, HBASE-21283.patch, HBASE-21283.patch, > HBASE-21283.patch > > > The 'status' shell command shows regions in transition but sometimes an > operator may want to retrieve a simple list of regions in transition. Here's > a patch that adds a new 'rit' command to the TOOLS group that does just that. > No test, because it seems hard to mock RITs from the ruby test code, but I > have run TestShell and it passes, so the command is verified to meet minimum > requirements, like help text, and manually verified with branch-1 (shell in > branch-2 and up doesn't return until TransitRegionProcedure has completed so > by that time no RIT): > {noformat} > HBase Shell > Use "help" to get list of supported commands. > Use "exit" to quit this interactive shell. > Version 1.5.0-SNAPSHOT, r9bb6d2fa8b760f16cd046657240ebd4ad91cb6de, Mon Oct 8 > 21:05:50 UTC 2018 > hbase(main):001:0> help 'rit' > List all regions in transition. > Examples: > hbase> rit > hbase(main):002:0> create ... > 0 row(s) in 2.5150 seconds > => Hbase::Table - IntegrationTestBigLinkedList > hbase(main):003:0> rit > 0 row(s) in 0.0340 seconds > hbase(main):004:0> unassign '56f0c38c81ae453d19906ce156a2d6a1' > 0 row(s) in 0.0540 seconds > hbase(main):005:0> rit > IntegrationTestBigLinkedList,L\xCC\xCC\xCC\xCC\xCC\xCC\xCB,1539117183224.56f0c38c81ae453d19906ce156a2d6a1. 
> state=PENDING_CLOSE, ts=Tue Oct 09 20:33:34 UTC 2018 (0s ago), server=null > > > > 1 row(s) in 0.0170 seconds > {noformat}
[jira] [Commented] (HBASE-21410) A helper page that help find all problematic regions and procedures
[ https://issues.apache.org/jira/browse/HBASE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715798#comment-16715798 ] Sean Busbey commented on HBASE-21410: - please reopen and then resolve again so you can add a release note calling this out. > A helper page that help find all problematic regions and procedures > --- > > Key: HBASE-21410 > URL: https://issues.apache.org/jira/browse/HBASE-21410 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1 >Reporter: Jingyun Tian >Assignee: Jingyun Tian >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.2 > > Attachments: HBASE-21410.branch-2.1.001.patch, > HBASE-21410.branch-2.1.002.patch, HBASE-21410.master.001.patch, > HBASE-21410.master.002.patch, HBASE-21410.master.003.patch, > HBASE-21410.master.004.patch, Screenshot from 2018-10-30 19-06-21.png, > Screenshot from 2018-10-30 19-06-42.png, Screenshot from 2018-10-31 > 10-11-38.png, Screenshot from 2018-10-31 10-11-56.png, Screenshot from > 2018-11-01 17-56-02.png, Screenshot from 2018-11-01 17-56-15.png > > > *This page is mainly focus on finding the regions stuck in some state that > cannot be assigned. My proposal of the page is as follows:* > !Screenshot from 2018-10-30 19-06-21.png! > *From this page we can see all regions in RIT queue and their related > procedures. If we can determine that these regions' state are abnormal, we > can click the link 'Procedures as TXT' to get a full list of procedure IDs to > bypass them. Then click 'Regions as TXT' to get a full list of encoded region > names to assign.* > !Screenshot from 2018-10-30 19-06-42.png! > *Some region names are covered by the navigator bar, I'll fix it later.* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21575: - Description: 100s of Mb of logs like this: {noformat}
2018-12-08 10:27:00,462 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
2018-12-08 10:27:00,463 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
2018-12-08 10:27:00,464 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3648ms
...
2018-12-08 10:27:00,479 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3663ms
{noformat}
[jira] [Updated] (HBASE-21410) A helper page that help find all problematic regions and procedures
[ https://issues.apache.org/jira/browse/HBASE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-21410: Fix Version/s: (was: 2.1.0) 2.1.2 > A helper page that help find all problematic regions and procedures > --- > > Key: HBASE-21410 > URL: https://issues.apache.org/jira/browse/HBASE-21410 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1 >Reporter: Jingyun Tian >Assignee: Jingyun Tian >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.2
[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21575: - Status: Patch Available (was: Open) > memstore above high watermark message is logged too much > > > Key: HBASE-21575 > URL: https://issues.apache.org/jira/browse/HBASE-21575 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21575.patch > > > 100s of Mb of logs like this, in a tight loop: > {noformat}
> 2018-12-08 10:27:00,462 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
> 2018-12-08 10:27:00,463 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
> ...
> {noformat}
[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21575: - Attachment: HBASE-21575.patch > memstore above high watermark message is logged too much > > > Key: HBASE-21575 > URL: https://issues.apache.org/jira/browse/HBASE-21575 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21575.patch > > > 100s of Mb of logs like this, in a tight loop: > {noformat}
> 2018-12-08 10:27:00,462 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
> 2018-12-08 10:27:00,463 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
> ...
> {noformat}
[jira] [Updated] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21575: - Description: 100s of Mb of logs like this, in a tight loop: {noformat}
2018-12-08 10:27:00,462 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3646ms
2018-12-08 10:27:00,463 WARN [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 3647ms
...
{noformat}
[jira] [Created] (HBASE-21575) memstore above high watermark message is logged too much
Sergey Shelukhin created HBASE-21575: Summary: memstore above high watermark message is logged too much Key: HBASE-21575 URL: https://issues.apache.org/jira/browse/HBASE-21575 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin 100s of Mb of logs like this: {noformat}
2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=12,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 103076ms
2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=44,queue=4,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150781ms
2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=14,queue=4,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150792ms
2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=23,queue=3,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150780ms
{noformat}
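The problem the report describes is a WARN fired once per blocked write in a tight loop. One plausible remedy, sketched in plain Java and not to be confused with the attached HBASE-21575.patch, is to throttle repeated messages: emit at most one per interval and roll the rest into a suppressed-repeat count.

```java
// Minimal sketch of per-message log throttling (assumption: this is
// one generic way to fix the flood, not the actual HBase change).
public class ThrottledLogger {
    private final long intervalMs;
    // Far enough in the past that the first call always emits,
    // without risking long overflow in the subtraction below.
    private long lastEmitMs = Long.MIN_VALUE / 2;
    private long suppressed = 0;

    public ThrottledLogger(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Returns the line to emit, or null when this call is throttled. */
    public synchronized String warn(String msg, long nowMs) {
        if (nowMs - lastEmitMs < intervalMs) {
            suppressed++;          // swallow the repeat, but remember it
            return null;
        }
        String line = (suppressed > 0)
            ? msg + " (" + suppressed + " similar messages suppressed)"
            : msg;
        lastEmitMs = nowMs;
        suppressed = 0;
        return line;
    }
}
```

A real implementation would hang this off the logger or the flusher and make the interval configurable; the sketch only shows the throttling arithmetic.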
[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available
[ https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715748#comment-16715748 ] Cosmin Lehene commented on HBASE-21574: --- The effective number of retries seems to be affected by ReadOnlyZKClient.RECOVERY_RETRY which defaults to 30 while the rest of the timeouts seem to be ignored {code} callTimeout=1000, callDuration=149107: org.apache.hadoop.hbase.shaded.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server row {code} see https://issues.apache.org/jira/browse/HBASE-21573 > createConnection / getTable should not return if there's no cluster available > - > > Key: HBASE-21574 > URL: https://issues.apache.org/jira/browse/HBASE-21574 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.1.1 >Reporter: Cosmin Lehene >Priority: Major > Fix For: 2.1.2 > > > You can get a connection / table successfully with no cluster (no zk, hms, > hrs) and it also says it's open (closed = false) > {code} > Connection con = ConnectionFactory.createConnection(getConfiguration()); > con.getTable(TableName.valueOf(customersTable)); > {code} > {code} > con = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > hostnamesCanChange = true > pause = 100 > pauseForCQTBE = 100 > useMetaReplicas = false > metaReplicaCallTimeoutScanInMicroSecond = 100 > numTries = 16 > rpcTimeout = 6 > asyncProcess = \{AsyncProcess@5242} > stats = null > closed = false > aborted = false > clusterStatusListener = null > metaRegionLock = \{Object@5249} > masterLock = \{Object@5250} > batchPool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > metaLookupPool = null > cleanupPool = true > conf = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connectionConfig = 
\{ConnectionConfiguration@5239} > rpcClient = \{NettyRpcClient@5251} > metaCache = \{MetaCache@5252} > metrics = null > user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)" > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > interceptor = \{NoOpRetryableCallerInterceptor@5254} > "NoOpRetryableCallerInterceptor" > registry = \{ZKAsyncRegistry@5255} > backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} > alternateBufferedMutatorClassName = null > userRegionLock = \{ReentrantLock@5257} > "java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]" > clusterId = "default-cluster" > stubs = \{ConcurrentHashMap@5259} size = 0 > masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} > "MasterService" > table = \{HTable@5193} "customers;hconnection-0x32093c94" > connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > tableName = \{TableName@5237} "customers" > configuration = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connConfiguration = \{ConnectionConfiguration@5239} > closed = false > scannerCaching = 2147483647 > scannerMaxResultSize = 2097152 > pool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > operationTimeoutMs = 120 > rpcTimeoutMs = 6 > readRpcTimeoutMs = 6 > writeRpcTimeoutMs = 6 > cleanupPoolOnClose = false > locator = \{HRegionLocator@5241} > multiAp = \{AsyncProcess@5242} > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > {code}
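The debugger dump above shows the connection coming back looking healthy (closed = false) with nothing behind it. Until the client fails fast on its own, a caller can force an early, bounded liveness probe before trusting the handle. This is a generic sketch of that pattern, not HBase API: `factory` stands in for something like ConnectionFactory.createConnection and `probe` for some cheap remote call, both hypothetical placeholders.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Fail-fast wrapper: build the handle, then refuse to hand it out
// unless a bounded probe confirms something is actually listening.
public final class FailFast {
    public static <T> T withProbe(Supplier<T> factory,
                                  Callable<Boolean> probe,
                                  long timeoutMs) throws Exception {
        T handle = factory.get();
        ExecutorService ex = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> f = ex.submit(probe);
            if (!f.get(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("cluster probe failed");
            }
            return handle;  // probe succeeded within the deadline
        } catch (TimeoutException e) {
            throw new IllegalStateException(
                "cluster unreachable after " + timeoutMs + "ms", e);
        } finally {
            ex.shutdownNow();
        }
    }
}
```

In real code the handle should also be closed when the probe fails; the sketch omits that to stay short.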
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715716#comment-16715716 ] Hudson commented on HBASE-21567: Results for branch branch-1.3 [build #571 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/571//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org
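The feature being built and tested above is passing `-Dkey=value` through the shell launcher as configuration overrides. The mechanics behind such launchers can be sketched generically; this is a toy illustration with a hypothetical `overrides` helper, not the actual bin/hbase argument handling.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch: peel -Dkey=value arguments off into configuration
// overrides, leaving everything else as arguments for the
// application itself (here, the shell).
public class DSplit {
    public static Map<String, String> overrides(String[] args, List<String> rest) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String a : args) {
            int eq = a.indexOf('=');
            if (a.startsWith("-D") && eq > 2) {
                props.put(a.substring(2, eq), a.substring(eq + 1)); // override
            } else {
                rest.add(a);  // left for the application
            }
        }
        return props;
    }
}
```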
[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715708#comment-16715708 ] Hadoop QA commented on HBASE-21564: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 37s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 1s{color} | {color:red} hbase-server: The patch generated 2 new + 65 unchanged - 0 fixed = 67 total (was 65) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 32s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}137m 27s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 10s{color} | {color:green} hbase-backup in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}188m 54s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.replication.multiwal.TestReplicationEndpointWithMultipleWAL | | | hadoop.hbase.replication.TestReplicationEndpoint | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21564 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951250/HBASE-21564.master.003.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 0163c4f49346 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available
[ https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715695#comment-16715695 ] Cosmin Lehene commented on HBASE-21574: --- ConnectionImplementation's constructor tries to retrieve the cluster id: [https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L297] The retrieval fails (while also ignoring max retries), but the failure itself is swallowed; see [https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L563-L570] The constructor then returns a connection as if everything were fine, and getTable succeeds too. All of this happens when you run the client code without any cluster whatsoever. > createConnection / getTable should not return if there's no cluster available > - > > Key: HBASE-21574 > URL: https://issues.apache.org/jira/browse/HBASE-21574 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.1.1 >Reporter: Cosmin Lehene >Priority: Major > Fix For: 2.1.2 > > > You can get a connection / table successfully with no cluster (no zk, hms, > hrs) and it also says it's open (closed = false) > {code} > Connection con = ConnectionFactory.createConnection(getConfiguration()); > con.getTable(TableName.valueOf(customersTable)); > {code} > {code} > con = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > hostnamesCanChange = true > pause = 100 > pauseForCQTBE = 100 > useMetaReplicas = false > metaReplicaCallTimeoutScanInMicroSecond = 100 > numTries = 16 > rpcTimeout = 6 > asyncProcess = \{AsyncProcess@5242} > stats = null > closed = false > aborted = false > clusterStatusListener = null > metaRegionLock = \{Object@5249} > masterLock = \{Object@5250} > batchPool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > metaLookupPool = null
> cleanupPool = true > conf = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connectionConfig = \{ConnectionConfiguration@5239} > rpcClient = \{NettyRpcClient@5251} > metaCache = \{MetaCache@5252} > metrics = null > user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)" > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > interceptor = \{NoOpRetryableCallerInterceptor@5254} > "NoOpRetryableCallerInterceptor" > registry = \{ZKAsyncRegistry@5255} > backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} > alternateBufferedMutatorClassName = null > userRegionLock = \{ReentrantLock@5257} > "java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]" > clusterId = "default-cluster" > stubs = \{ConcurrentHashMap@5259} size = 0 > masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} > "MasterService" > table = \{HTable@5193} "customers;hconnection-0x32093c94" > connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > tableName = \{TableName@5237} "customers" > configuration = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connConfiguration = \{ConnectionConfiguration@5239} > closed = false > scannerCaching = 2147483647 > scannerMaxResultSize = 2097152 > pool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > operationTimeoutMs = 120 > rpcTimeoutMs = 6 > readRpcTimeoutMs = 6 > writeRpcTimeoutMs = 6 > cleanupPoolOnClose = false > locator = \{HRegionLocator@5241} > multiAp = \{AsyncProcess@5242} > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
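The swallow-and-continue behaviour described in the comment above can be modelled in isolation. The following is a minimal sketch, not the actual ConnectionImplementation code: `lookupClusterId` is a hypothetical stand-in for the registry call, and `DEFAULT_CLUSTER_ID` mirrors the "default-cluster" value visible in the debugger dump.

```java
import java.io.IOException;

public class ClusterIdSketch {
    static final String DEFAULT_CLUSTER_ID = "default-cluster";

    // Stand-in for the registry lookup; throws when no cluster is reachable.
    static String lookupClusterId(boolean clusterUp) throws IOException {
        if (!clusterUp) {
            throw new IOException("Can't connect to ZooKeeper");
        }
        return "real-cluster-id";
    }

    // Reported behaviour: swallow the failure and fall back to a default,
    // so the caller gets a "working" connection against nothing.
    static String retrieveClusterIdLenient(boolean clusterUp) {
        try {
            return lookupClusterId(clusterUp);
        } catch (IOException e) {
            return DEFAULT_CLUSTER_ID; // failure ignored
        }
    }

    // Fail-fast alternative: let the failure propagate to the caller.
    static String retrieveClusterIdStrict(boolean clusterUp) throws IOException {
        return lookupClusterId(clusterUp);
    }

    public static void main(String[] args) {
        // With no cluster, the lenient variant still "succeeds".
        System.out.println(retrieveClusterIdLenient(false));
        // The strict variant surfaces the problem immediately.
        try {
            retrieveClusterIdStrict(false);
        } catch (IOException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```

With the lenient shape, everything downstream (getTable, the closed=false state above) looks healthy until the first actual RPC; the strict shape is what the issue title asks for.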
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715585#comment-16715585 ] Ankit Singhal commented on HBASE-21246: --- [~elserj], [HBASE-21246.master.002|https://issues.apache.org/jira/secure/attachment/12951267/HBASE-21246.master.002.patch], Fixed the test case failures and checkstyle errors. > Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.master.001.patch, HBASE-21246.master.002.patch, > replication-src-creates-wal-reader.jpg, wal-factory-providers.png, > wal-providers.png, wal-splitter-reader.jpg, wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Singhal updated HBASE-21246: -- Attachment: HBASE-21246.master.002.patch > Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.master.001.patch, HBASE-21246.master.002.patch, > replication-src-creates-wal-reader.jpg, wal-factory-providers.png, > wal-providers.png, wal-splitter-reader.jpg, wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21574) createConnection / getTable should not return if there's no cluster available
[ https://issues.apache.org/jira/browse/HBASE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715573#comment-16715573 ] stack commented on HBASE-21574: --- Agree this is confusing. > createConnection / getTable should not return if there's no cluster available > - > > Key: HBASE-21574 > URL: https://issues.apache.org/jira/browse/HBASE-21574 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 2.1.1 >Reporter: Cosmin Lehene >Priority: Major > Fix For: 2.1.2 > > > You can get a connection / table successfully with no cluster (no zk, hms, > hrs) and it also says it's open (closed = false) > {code} > Connection con = ConnectionFactory.createConnection(getConfiguration()); > con.getTable(TableName.valueOf(customersTable)); > {code} > {code} > con = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > hostnamesCanChange = true > pause = 100 > pauseForCQTBE = 100 > useMetaReplicas = false > metaReplicaCallTimeoutScanInMicroSecond = 100 > numTries = 16 > rpcTimeout = 6 > asyncProcess = \{AsyncProcess@5242} > stats = null > closed = false > aborted = false > clusterStatusListener = null > metaRegionLock = \{Object@5249} > masterLock = \{Object@5250} > batchPool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > metaLookupPool = null > cleanupPool = true > conf = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connectionConfig = \{ConnectionConfiguration@5239} > rpcClient = \{NettyRpcClient@5251} > metaCache = \{MetaCache@5252} > metrics = null > user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)" > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > interceptor = \{NoOpRetryableCallerInterceptor@5254} > "NoOpRetryableCallerInterceptor" > registry = \{ZKAsyncRegistry@5255} > 
backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} > alternateBufferedMutatorClassName = null > userRegionLock = \{ReentrantLock@5257} > "java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]" > clusterId = "default-cluster" > stubs = \{ConcurrentHashMap@5259} size = 0 > masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} > "MasterService" > table = \{HTable@5193} "customers;hconnection-0x32093c94" > connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94" > tableName = \{TableName@5237} "customers" > configuration = \{Configuration@5238} "Configuration: core-default.xml, > core-site.xml, hbase-default.xml, hbase-site.xml" > connConfiguration = \{ConnectionConfiguration@5239} > closed = false > scannerCaching = 2147483647 > scannerMaxResultSize = 2097152 > pool = \{ThreadPoolExecutor@5240} > "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, > active threads = 0, queued tasks = 0, completed tasks = 0]" > operationTimeoutMs = 120 > rpcTimeoutMs = 6 > readRpcTimeoutMs = 6 > writeRpcTimeoutMs = 6 > cleanupPoolOnClose = false > locator = \{HRegionLocator@5241} > multiAp = \{AsyncProcess@5242} > rpcCallerFactory = \{RpcRetryingCallerFactory@5243} > rpcControllerFactory = \{RpcControllerFactory@5244} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21573) More sensible client default timeout values
[ https://issues.apache.org/jira/browse/HBASE-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715572#comment-16715572 ] stack commented on HBASE-21573: --- We should fix, or at least surface, what it takes to make the client fail fast (and important configs should be in hbase-default where folks will go looking for them ...). I think if we made stuff fail fast, we'd surface some interesting assumptions we've been depending on for a good while now. > More sensible client default timeout values > --- > > Key: HBASE-21573 > URL: https://issues.apache.org/jira/browse/HBASE-21573 > Project: HBase > Issue Type: Wish > Components: Client >Affects Versions: 2.1.1 >Reporter: Cosmin Lehene >Priority: Major > Fix For: 2.1.2 > > > I guess the goal is to have operations allow enough time to recover from > major failures. > While this may make sense for large jobs, it's a PITA for OLTP scenarios and > could probably benefit from a faster failure mode in default > > hbase.rpc.timeout = 6 > hbase.client.operation.timeout = 120 > hbase.client.meta.operation.timeout = 120 > The client meta ops timeout is not in defaults-xml and not documented in the > book either. > [https://hbase.apache.org/book.html#config_timeouts] > > Would it make sense to have aggressive defaults instead? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
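For reference, a fail-fast client override along the lines discussed above might look like the following hbase-site.xml fragment. The property names are real HBase client settings, but the values here are purely illustrative assumptions for an OLTP-style workload, not recommended defaults:

```xml
<!-- Illustrative fail-fast client overrides; tune for your workload. -->
<property>
  <name>hbase.client.retries.number</name>
  <value>3</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>10000</value>
</property>
<property>
  <name>hbase.client.operation.timeout</name>
  <value>30000</value>
</property>
<property>
  <name>zookeeper.recovery.retry</name>
  <value>1</value>
</property>
```

Note the operation timeout caps the whole retry loop, while the RPC timeout bounds each individual call, so the two need to be set together.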
[jira] [Created] (HBASE-21574) createConnection / getTable should not return if there's no cluster available
Cosmin Lehene created HBASE-21574: - Summary: createConnection / getTable should not return if there's no cluster available Key: HBASE-21574 URL: https://issues.apache.org/jira/browse/HBASE-21574 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.1.1 Reporter: Cosmin Lehene Fix For: 2.1.2 You can get a connection / table successfully with no cluster (no zk, hms, hrs) and it also says it's open (closed = false) {code} Connection con = ConnectionFactory.createConnection(getConfiguration()); con.getTable(TableName.valueOf(customersTable)); {code} {code} con = \{ConnectionImplementation@5192} "hconnection-0x32093c94" hostnamesCanChange = true pause = 100 pauseForCQTBE = 100 useMetaReplicas = false metaReplicaCallTimeoutScanInMicroSecond = 100 numTries = 16 rpcTimeout = 6 asyncProcess = \{AsyncProcess@5242} stats = null closed = false aborted = false clusterStatusListener = null metaRegionLock = \{Object@5249} masterLock = \{Object@5250} batchPool = \{ThreadPoolExecutor@5240} "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]" metaLookupPool = null cleanupPool = true conf = \{Configuration@5238} "Configuration: core-default.xml, core-site.xml, hbase-default.xml, hbase-site.xml" connectionConfig = \{ConnectionConfiguration@5239} rpcClient = \{NettyRpcClient@5251} metaCache = \{MetaCache@5252} metrics = null user = \{User$SecureHadoopUser@5253} "clehene (auth:SIMPLE)" rpcCallerFactory = \{RpcRetryingCallerFactory@5243} rpcControllerFactory = \{RpcControllerFactory@5244} interceptor = \{NoOpRetryableCallerInterceptor@5254} "NoOpRetryableCallerInterceptor" registry = \{ZKAsyncRegistry@5255} backoffPolicy = \{ClientBackoffPolicyFactory$NoBackoffPolicy@5256} alternateBufferedMutatorClassName = null userRegionLock = \{ReentrantLock@5257} "java.util.concurrent.locks.ReentrantLock@4d368ebc[Unlocked]" clusterId = "default-cluster" stubs = \{ConcurrentHashMap@5259} size = 0 
masterServiceState = \{ConnectionImplementation$MasterServiceState@5260} "MasterService" table = \{HTable@5193} "customers;hconnection-0x32093c94" connection = \{ConnectionImplementation@5192} "hconnection-0x32093c94" tableName = \{TableName@5237} "customers" configuration = \{Configuration@5238} "Configuration: core-default.xml, core-site.xml, hbase-default.xml, hbase-site.xml" connConfiguration = \{ConnectionConfiguration@5239} closed = false scannerCaching = 2147483647 scannerMaxResultSize = 2097152 pool = \{ThreadPoolExecutor@5240} "java.util.concurrent.ThreadPoolExecutor@5bb40116[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]" operationTimeoutMs = 120 rpcTimeoutMs = 6 readRpcTimeoutMs = 6 writeRpcTimeoutMs = 6 cleanupPoolOnClose = false locator = \{HRegionLocator@5241} multiAp = \{AsyncProcess@5242} rpcCallerFactory = \{RpcRetryingCallerFactory@5243} rpcControllerFactory = \{RpcControllerFactory@5244} {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715568#comment-16715568 ] Hudson commented on HBASE-21567: Results for branch branch-1.2 [build #582 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/582//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21573) More sensible client default timeout values
Cosmin Lehene created HBASE-21573: - Summary: More sensible client default timeout values Key: HBASE-21573 URL: https://issues.apache.org/jira/browse/HBASE-21573 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.1.1 Reporter: Cosmin Lehene Fix For: 2.1.2 I guess the goal is to have operations allow enough time to recover from major failures. While this may make sense for large jobs, it's a PITA for OLTP scenarios and could probably benefit from a faster failure mode in default hbase.rpc.timeout = 6 hbase.client.operation.timeout = 120 hbase.client.meta.operation.timeout = 120 The client meta ops timeout is not in defaults-xml and not documented in the book either. [https://hbase.apache.org/book.html#config_timeouts] Would it make sense to have aggressive defaults instead? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21573) More sensible client default timeout values
[ https://issues.apache.org/jira/browse/HBASE-21573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cosmin Lehene updated HBASE-21573: -- Issue Type: Wish (was: Bug) > More sensible client default timeout values > --- > > Key: HBASE-21573 > URL: https://issues.apache.org/jira/browse/HBASE-21573 > Project: HBase > Issue Type: Wish > Components: Client >Affects Versions: 2.1.1 >Reporter: Cosmin Lehene >Priority: Major > Fix For: 2.1.2 > > > I guess the goal is to have operations allow enough time to recover from > major failures. > While this may make sense for large jobs, it's a PITA for OLTP scenarios and > could probably benefit from a faster failure mode in default > > hbase.rpc.timeout = 6 > hbase.client.operation.timeout = 120 > hbase.client.meta.operation.timeout = 120 > The client meta ops timeout is not in defaults-xml and not documented in the > book either. > [https://hbase.apache.org/book.html#config_timeouts] > > Would it make sense to have aggressive defaults instead? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink
[ https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21406: - Status: Patch Available (was: In Progress) Attached first patch version. Basically, it adds a new metric to differentiate the sink startup time from the last OP applied time. The original behaviour was to always initialise TimestampsOfLastAppliedOp to the startup time and always show it in the "status 'replication'" command output, regardless of whether the sink had ever applied any OP. This was confusing, especially for scenarios where the cluster was acting only as a source: the output could lead to wrong interpretations about the sink not applying edits or replication being stuck. With the new metric, we now compare the two values, assuming that if both are the same, no OP has ever been shipped to the given sink, so the output reflects that more clearly, for example: {noformat} SINK: TimeStampStarted=Thu Dec 06 23:59:47 GMT 2018, Waiting for OPs...{noformat} The replication source issues described earlier have an ongoing jira: HBASE-21505. > "status 'replication'" should not show SINK if the cluster does not act as > sink > --- > > Key: HBASE-21406 > URL: https://issues.apache.org/jira/browse/HBASE-21406 > Project: HBase > Issue Type: Improvement >Reporter: Daisuke Kobayashi >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HBASE-21406-branch-1.001.patch, > HBASE-21406-master.001.patch, Screen Shot 2018-10-31 at 18.12.54.png > > > When replicating in 1 way, from source to target, {{status 'replication'}} on > source always dumps SINK with meaningless metrics. It only makes sense when > running the command on target cluster. > {{status 'replication'}} on source, for example. {{AgeOfLastAppliedOp}} is > always zero and {{TimeStampsOfLastAppliedOp}} does not get updated from the > time the RS started since it's not acting as sink.
> {noformat} > source-1.com >SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, > TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0 >SINK : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 > 23:56:53 PDT 2018 > {noformat} > {{status 'replication'}} on target works as expected. SOURCE is empty as it's > not acting as source: > {noformat} > target-1.com >SOURCE: >SINK : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 > 23:44:08 PDT 2018 > {noformat} > This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always > returns a value (not null). > 1.X > https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204 > 2.X > https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
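The comparison described in the patch summary above (treating a sink whose last-applied timestamp never moved past its startup timestamp as one that has never applied an OP) can be sketched in plain Java. `formatSinkLine` is a hypothetical helper for illustration, not the actual patch code:

```java
public class SinkStatusSketch {
    // If the last-applied timestamp never moved past the startup timestamp,
    // assume no OP was ever shipped to this sink.
    static String formatSinkLine(long startupTs, long lastAppliedTs, long ageOfLastAppliedOp) {
        if (lastAppliedTs == startupTs) {
            return "SINK: TimeStampStarted=" + startupTs + ", Waiting for OPs...";
        }
        return "SINK: AgeOfLastAppliedOp=" + ageOfLastAppliedOp
            + ", TimeStampsOfLastAppliedOp=" + lastAppliedTs;
    }

    public static void main(String[] args) {
        long started = 1544140787000L; // sink startup time (epoch millis)
        // Source-only cluster: nothing was ever applied, so say so.
        System.out.println(formatSinkLine(started, started, 0));
        // Genuine sink: the last applied op moved past startup.
        System.out.println(formatSinkLine(started, started + 60000L, 70));
    }
}
```

The equality check is what removes the misleading "stale TimestampsOfLastAppliedOp" reading on source-only clusters.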
[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21564: - Attachment: HBASE-21564.master.003.patch > race condition in WAL rolling resulting in size-based rolling getting stuck > --- > > Key: HBASE-21564 > URL: https://issues.apache.org/jira/browse/HBASE-21564 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21564.master.001.patch, > HBASE-21564.master.002.patch, HBASE-21564.master.003.patch > > > Manifests at least with AsyncFsWriter. > There's a window after LogRoller replaces the writer in the WAL, but before > it sets the rollLog boolean to false in the finally, where the WAL class can > request another log roll (it can happen in particular when the logs are > getting archived in the LogRoller thread, and there's high write volume > causing the logs to roll quickly). > LogRoller will blindly reset the rollLog flag in finally and "forget" about > this request. > AsyncWAL in turn never requests it again because its own rollRequested field > is set and it expects a callback. Logs don't get rolled until a periodic roll > is triggered after that. > The acknowledgment of roll requests by LogRoller should be atomic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
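The fix direction stated in the description above (making the acknowledgment of roll requests atomic) can be illustrated with `java.util.concurrent.atomic.AtomicBoolean`. This is a simplified, hypothetical model of the LogRoller/WAL interaction, not the actual patch: the lossy variant clears the flag blindly in a finally block, while the atomic variant consumes the request up front with `getAndSet`, so a request arriving mid-roll survives.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RollRequestSketch {
    final AtomicBoolean rollLog = new AtomicBoolean(false);

    // Lossy pattern: decide to roll, do the work, then blindly clear the
    // flag in a finally block. A request that arrived mid-roll is dropped.
    void rollLossy(Runnable midRollEvent) {
        if (rollLog.get()) {
            try {
                midRollEvent.run(); // e.g. archiving triggers a new roll request
            } finally {
                rollLog.set(false); // "forgets" the mid-roll request
            }
        }
    }

    // Atomic pattern: consume the pending request before rolling, so a
    // request set during the roll is still pending for the next iteration.
    void rollAtomic(Runnable midRollEvent) {
        if (rollLog.getAndSet(false)) {
            midRollEvent.run(); // a new request here simply sets the flag again
        }
    }

    public static void main(String[] args) {
        RollRequestSketch lossy = new RollRequestSketch();
        lossy.rollLog.set(true);
        lossy.rollLossy(() -> lossy.rollLog.set(true)); // request arrives mid-roll
        System.out.println("lossy pending=" + lossy.rollLog.get());   // lost

        RollRequestSketch atomic = new RollRequestSketch();
        atomic.rollLog.set(true);
        atomic.rollAtomic(() -> atomic.rollLog.set(true));
        System.out.println("atomic pending=" + atomic.rollLog.get()); // kept
    }
}
```

In the lossy variant the size-based roll request is silently swallowed, matching the "rolling getting stuck until a periodic roll" symptom in the report.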
[jira] [Updated] (HBASE-21406) "status 'replication'" should not show SINK if the cluster does not act as sink
[ https://issues.apache.org/jira/browse/HBASE-21406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21406: - Attachment: HBASE-21406-master.001.patch > "status 'replication'" should not show SINK if the cluster does not act as > sink > --- > > Key: HBASE-21406 > URL: https://issues.apache.org/jira/browse/HBASE-21406 > Project: HBase > Issue Type: Improvement >Reporter: Daisuke Kobayashi >Assignee: Wellington Chevreuil >Priority: Minor > Attachments: HBASE-21406-branch-1.001.patch, > HBASE-21406-master.001.patch, Screen Shot 2018-10-31 at 18.12.54.png > > > When replicating in 1 way, from source to target, {{status 'replication'}} on > source always dumps SINK with meaningless metrics. It only makes sense when > running the command on target cluster. > {{status 'replication'}} on source, for example. {{AgeOfLastAppliedOp}} is > always zero and {{TimeStampsOfLastAppliedOp}} does not get updated from the > time the RS started since it's not acting as sink. > {noformat} > source-1.com >SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=0, > TimeStampsOfLastShippedOp=Mon Oct 29 23:44:14 PDT 2018, Replication Lag=0 >SINK : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Thu Oct 25 > 23:56:53 PDT 2018 > {noformat} > {{status 'replication'}} on target works as expected. SOURCE is empty as it's > not acting as source: > {noformat} > target-1.com >SOURCE: >SINK : AgeOfLastAppliedOp=70, TimeStampsOfLastAppliedOp=Mon Oct 29 > 23:44:08 PDT 2018 > {noformat} > This is because {{getReplicationLoadSink}}, called in {{admin.rb}}, always > returns a value (not null). > 1.X > https://github.com/apache/hbase/blob/rel/1.4.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L194-L204 > 2.X > https://github.com/apache/hbase/blob/rel/2.0.0/hbase-client/src/main/java/org/apache/hadoop/hbase/ServerLoad.java#L392-L399 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715465#comment-16715465 ] Sergey Shelukhin commented on HBASE-21564: -- Fixed warnings, addressed some RB feedback; I cannot repro the test failures, the logs for both have connection errors... > race condition in WAL rolling resulting in size-based rolling getting stuck > --- > > Key: HBASE-21564 > URL: https://issues.apache.org/jira/browse/HBASE-21564 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21564.master.001.patch, > HBASE-21564.master.002.patch, HBASE-21564.master.003.patch > > > Manifests at least with AsyncFsWriter. > There's a window after LogRoller replaces the writer in the WAL, but before > it sets the rollLog boolean to false in the finally, where the WAL class can > request another log roll (it can happen in particular when the logs are > getting archived in the LogRoller thread, and there's high write volume > causing the logs to roll quickly). > LogRoller will blindly reset the rollLog flag in finally and "forget" about > this request. > AsyncWAL in turn never requests it again because its own rollRequested field > is set and it expects a callback. Logs don't get rolled until a periodic roll > is triggered after that. > The acknowledgment of roll requests by LogRoller should be atomic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21564: - Attachment: (was: HBASE-21564.master.003.patch) > race condition in WAL rolling resulting in size-based rolling getting stuck > --- > > Key: HBASE-21564 > URL: https://issues.apache.org/jira/browse/HBASE-21564 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21564.master.001.patch, > HBASE-21564.master.002.patch, HBASE-21564.master.003.patch > > > Manifests at least with AsyncFsWriter. > There's a window after LogRoller replaces the writer in the WAL, but before > it sets the rollLog boolean to false in the finally, where the WAL class can > request another log roll (it can happen in particular when the logs are > getting archived in the LogRoller thread, and there's high write volume > causing the logs to roll quickly). > LogRoller will blindly reset the rollLog flag in finally and "forget" about > this request. > AsyncWAL in turn never requests it again because its own rollRequested field > is set and it expects a callback. Logs don't get rolled until a periodic roll > is triggered after that. > The acknowledgment of roll requests by LogRoller should be atomic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck
[ https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-21564: - Attachment: HBASE-21564.master.003.patch > race condition in WAL rolling resulting in size-based rolling getting stuck > --- > > Key: HBASE-21564 > URL: https://issues.apache.org/jira/browse/HBASE-21564 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21564.master.001.patch, > HBASE-21564.master.002.patch, HBASE-21564.master.003.patch > > > Manifests at least with AsyncFsWriter. > There's a window after LogRoller replaces the writer in the WAL, but before > it sets the rollLog boolean to false in the finally, where the WAL class can > request another log roll (it can happen in particular when the logs are > getting archived in the LogRoller thread, and there's high write volume > causing the logs to roll quickly). > LogRoller will blindly reset the rollLog flag in finally and "forget" about > this request. > AsyncWAL in turn never requests it again because its own rollRequested field > is set and it expects a callback. Logs don't get rolled until a periodic roll > is triggered after that. > The acknowledgment of roll requests by LogRoller should be atomic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715410#comment-16715410 ] Ankit Singhal commented on HBASE-21246: --- Thanks [~elserj] for the review. {quote}I see RecoveredReplicationSource.java still needs some "unraveling" from Path. WALEntryStream is in a similar position (a couple of others than just those pulled out above). {quote} We may not need to change these Path references to WALIdentity once these classes are refactored to an abstract FS-based implementation; the Path-related code is expected to move into the FS-based implementation. {quote} Should DisabledWALProvider have its own implementation of WALIdentity? Looks like we just pass a "special" Path in the FS-based case now – maybe we just make some special implementation of WALIdentity for it instead.{quote} Let me introduce a new WALIdentity implementation for it. {quote}As long as we can spin out the above refactorings into some follow-on work, I would be happy to land this on the feature branch.{quote} Yes, these refactorings will go in a follow-on jira; let me upload another patch fixing the checkstyle and test failures before you commit.
> Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.master.001.patch, replication-src-creates-wal-reader.jpg, > wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, > wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
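The contract described in the issue (a getName method whose value may be a filename on a distributed filesystem or the name of a backing log stream) can be sketched as a minimal interface. Only the WALIdentity/getName names come from the issue; the two implementations below are hypothetical illustrations, not the patch's classes:

```java
public class WalIdentitySketch {
    // Minimal form of the proposed abstraction: a WAL's identity is just a
    // provider-interpreted name, not a filesystem Path.
    interface WALIdentity {
        String getName();
    }

    // An FS-backed WAL can expose its file name...
    static final class FsWALIdentity implements WALIdentity {
        private final String path;
        FsWALIdentity(String path) { this.path = path; }
        public String getName() { return path; }
    }

    // ...while a stream-backed WAL exposes the stream name.
    static final class StreamWALIdentity implements WALIdentity {
        private final String stream;
        StreamWALIdentity(String stream) { this.stream = stream; }
        public String getName() { return stream; }
    }

    public static void main(String[] args) {
        WALIdentity fs = new FsWALIdentity("/hbase/WALs/rs1/wal.1544140787000");
        WALIdentity stream = new StreamWALIdentity("wal-stream-rs1");
        System.out.println(fs.getName());
        System.out.println(stream.getName());
    }
}
```

Callers such as replication sources would then depend only on WALIdentity, which is the decoupling from the distributed filesystem that the issue describes.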
[jira] [Commented] (HBASE-21559) The RestoreSnapshotFromClientTestBase related UT are flaky
[ https://issues.apache.org/jira/browse/HBASE-21559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715414#comment-16715414 ] stack commented on HBASE-21559: --- This cleaned up the failures nicely. Thanks [~openinx]. > The RestoreSnapshotFromClientTestBase related UT are flaky > -- > > Key: HBASE-21559 > URL: https://issues.apache.org/jira/browse/HBASE-21559 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4 > > Attachments: HBASE-21559.v1.patch, HBASE-21559.v2.patch, > TEST-org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.xml, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions-output.txt, > > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientAfterSplittingRegions.txt > > > The related UT are: > * TestRestoreSnapshotFromClientAfterSplittingRegions > * TestRestoreSnapshotFromClientWithRegionReplicas > * TestMobRestoreSnapshotFromClientAfterSplittingRegions > I guess the main problem is: a deadlock between SplitTableRegionProcedure > and SnapshotProcedure. > Attached logs from the failed UT. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715352#comment-16715352 ] Hudson commented on HBASE-21567: FAILURE: Integrated in Jenkins build HBase-1.2-IT #1188 (See [https://builds.apache.org/job/HBase-1.2-IT/1188/]) HBASE-21567 Allow overriding configs starting up the shell (stack: rev 0cdd8f972f9cb82e883de95435958ea824fc636a) * (edit) bin/hirb.rb * (edit) src/main/asciidoc/_chapters/shell.adoc > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715346#comment-16715346 ] Hudson commented on HBASE-21567: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #507 (See [https://builds.apache.org/job/HBase-1.3-IT/507/]) HBASE-21567 Allow overriding configs starting up the shell (stack: rev d27c835b1cfbbe2c59f0698d3a286b19e7f63471) * (edit) src/main/asciidoc/_chapters/shell.adoc * (edit) bin/hirb.rb > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-19805) NPE in HMaster while issuing a sequence of table splits
[ https://issues.apache.org/jira/browse/HBASE-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser resolved HBASE-19805. Resolution: Incomplete Assignee: (was: Sergey Soldatov) Fix Version/s: (was: 3.0.0) Stale. > NPE in HMaster while issuing a sequence of table splits > --- > > Key: HBASE-19805 > URL: https://issues.apache.org/jira/browse/HBASE-19805 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0-beta-1 >Reporter: Josh Elser >Priority: Critical > > I wrote a toy program to test the client tarball in HBASE-19735. After the > first few region splits, I see the following error in the Master log. > {noformat} > 2018-01-16 14:07:52,797 INFO > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] master.HMaster: > Client=jelser//192.168.1.23 split > myTestTable,1,1516129669054.8313b755f74092118f9dd30a4190ee23. > 2018-01-16 14:07:52,797 ERROR > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=16000] ipc.RpcServer: > Unexpected throwable object > java.lang.NullPointerException > at > org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:229) > at > org.apache.hadoop.hbase.client.ConnectionImplementation.getAdmin(ConnectionImplementation.java:1175) > at > org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getAdmin(ConnectionUtils.java:149) > at > org.apache.hadoop.hbase.master.assignment.Util.getRegionInfoResponse(Util.java:59) > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.checkSplittable(SplitTableRegionProcedure.java:146) > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.<init>(SplitTableRegionProcedure.java:103) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.createSplitProcedure(AssignmentManager.java:761) > at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1626) > at > 
org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:134) > at org.apache.hadoop.hbase.master.HMaster.splitRegion(HMaster.java:1618) > at > org.apache.hadoop.hbase.master.MasterRpcServices.splitRegion(MasterRpcServices.java:778) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > {noformat} > {code} > public static void main(String[] args) throws Exception { > Configuration conf = HBaseConfiguration.create(); > try (Connection conn = ConnectionFactory.createConnection(conf); > Admin admin = conn.getAdmin()) { > final TableName tn = TableName.valueOf("myTestTable"); > if (admin.tableExists(tn)) { > admin.disableTable(tn); > admin.deleteTable(tn); > } > final TableDescriptor desc = TableDescriptorBuilder.newBuilder(tn) > > .addColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("f1")).build()) > .build(); > admin.createTable(desc); > List<String> splitPoints = new ArrayList<>(16); > for (int i = 1; i <= 16; i++) { > splitPoints.add(Integer.toString(i, 16)); > } > > System.out.println("Splits: " + splitPoints); > int numRegions = admin.getRegions(tn).size(); > for (String splitPoint : splitPoints) { > System.out.println("Splitting on " + splitPoint); > admin.split(tn, Bytes.toBytes(splitPoint)); > Thread.sleep(200); > int newRegionSize = admin.getRegions(tn).size(); > while (numRegions == newRegionSize) { > Thread.sleep(50); > newRegionSize = admin.getRegions(tn).size(); > } > } > {code} > A quick glance, looks like {{Util.getRegionInfoResponse}} is to blame. 
> {code} > static GetRegionInfoResponse getRegionInfoResponse(final MasterProcedureEnv > env, > final ServerName regionLocation, final RegionInfo hri, boolean > includeBestSplitRow) > throws IOException { > // TODO: There is no timeout on this controller. Set one! > HBaseRpcController controller = > env.getMasterServices().getClusterConnection(). > getRpcControllerFactory().newController(); > final AdminService.BlockingInterface admin = > > env.getMasterServices().getClusterConnection().getAdmin(regionLocation); > {code} > We don't validate that we have a non-null {{ServerName regionLocation}}.
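The missing validation Josh points at can be sketched as a fail-fast precondition: reject a null location with a descriptive exception before it reaches getStubKey(). This is an illustrative plain-Java sketch, not the actual Util.java fix; the class and method names here are invented for the example.

```java
// Hypothetical precondition helper: turn the bare NPE from an unassigned region
// into a descriptive failure at the call site. Names are illustrative only.
public class RegionLocationPrecondition {

    /** Returns the location unchanged, or throws with a useful message when it is null. */
    public static String requireRegionLocation(String regionLocation, String regionName) {
        if (regionLocation == null) {
            // The region may be in transition right after a split, so no server hosts it yet.
            throw new IllegalStateException(
                "No server currently hosting region " + regionName
                    + "; it may be in transition after a recent split");
        }
        return regionLocation;
    }

    public static void main(String[] args) {
        // Happy path: a known location passes through untouched.
        System.out.println(requireRegionLocation("rs1.example.org,16020,1", "myTestTable,1"));
        // Unassigned region: descriptive failure instead of an NPE deep in ConnectionUtils.
        try {
            requireRegionLocation(null, "myTestTable,2");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A guard like this would let the master return a retryable error to the client instead of logging "Unexpected throwable object".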
[jira] [Updated] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-21567: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.10 1.2.10 1.3.3 1.5.0 Status: Resolved (was: Patch Available) Pushed it to 1.2+. > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.2.10, 1.4.10, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21567) Allow overriding configs starting up the shell
[ https://issues.apache.org/jira/browse/HBASE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715266#comment-16715266 ] stack commented on HBASE-21567: --- bq. Ok, looks like rubocop is complaining for almost everything. Yeah, someone once said that if you don't like the contributor, make them fix the rubocop warnings (smile)! Let me push! Thanks for review [~psomogyi] (and [~Apache9]) > Allow overriding configs starting up the shell > -- > > Key: HBASE-21567 > URL: https://issues.apache.org/jira/browse/HBASE-21567 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.3 > > Attachments: HBASE-21567.master.001.patch, > HBASE-21567.master.002.patch, HBASE-21567.master.003.patch > > > Needed to be able to point a local install at a remote cluster. I wanted to > be able to do this: > ${HBASE_HOME}/bin/hbase shell > -Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org,ZK1.remote.cluster.example.org,ZK2.remote.cluster.example.org -- This message was sent by Atlassian JIRA (v7.6.3#76005)
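The mechanism behind the feature is simple: pull -Dkey=value pairs off the shell's argument list and apply them as configuration overrides before anything else starts. The actual change lives in bin/hirb.rb (Ruby); the sketch below is an illustrative Java rendering of the same idea, with invented names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (the real implementation is Ruby in bin/hirb.rb): split
// "-Dkey=value" arguments out of the command line so they can override the
// cluster configuration, leaving the remaining arguments for the shell itself.
public class ShellConfigOverrides {

    /** Extracts -Dkey=value pairs from args; all other arguments are ignored here. */
    public static Map<String, String> parseOverrides(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                overrides.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Map<String, String> o = parseOverrides(new String[] {
            "-Dhbase.zookeeper.quorum=ZK0.remote.cluster.example.org", "shell" });
        System.out.println(o);
    }
}
```

Each extracted pair would then be set on the Configuration object the shell hands to its Connection, which is what makes the remote-quorum use case from the issue work.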
[jira] [Commented] (HBASE-21526) Use AsyncClusterConnection in ServerManager for getRsAdmin
[ https://issues.apache.org/jira/browse/HBASE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715113#comment-16715113 ] stack commented on HBASE-21526: --- I +1'd it up on rb (thanks for class rename). > Use AsyncClusterConnection in ServerManager for getRsAdmin > -- > > Key: HBASE-21526 > URL: https://issues.apache.org/jira/browse/HBASE-21526 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-21526-HBASE-21512-v1.patch, > HBASE-21526-HBASE-21512-v2.patch, HBASE-21526-HBASE-21512.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface
[ https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715060#comment-16715060 ] Josh Elser commented on HBASE-21246: {quote}used master for pre-commit jenkin build as HBASE-20952 yet to be rebased {quote} Just rebased that branch now. {code:java} -PriorityBlockingQueue<Path> newPaths = -new PriorityBlockingQueue<Path>(queueSizePerGroup, new LogsComparator()); -pathsLoop: for (Path path : queue) { - if (fs.exists(path)) { // still in same location, don't need to do anything -newPaths.add(path); +PriorityBlockingQueue<WALIdentity> newWalIds = +new PriorityBlockingQueue<WALIdentity>(queueSizePerGroup, new LogsComparator()); +pathsLoop: for (WALIdentity walId : queue) { + if (fs.exists(((FSWALIdentity)walId).getPath())) { // still in same location, don't need to do anything +newWalIds.add(walId);{code} I see RecoveredReplicationSource.java still needs some "unraveling" from Path. {code:java} - stat = fs.getFileStatus(this.currentPath); + stat = fs.getFileStatus(((FSWALIdentity)this.currentWAlIdentity).getPath());{code} {code:java} -Path archivedLog = getArchivedLog(path); -if (!path.equals(archivedLog)) { +FSWALIdentity archivedLog = new FSWALIdentity(getArchivedLog(walId.getPath())); +if (!walId.equals(archivedLog)) {{code} WALEntryStream is in a similar position (a couple of others than just those pulled out above). 
{code:java} diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java index 75439fe6c5..ad9f6bda30 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java @@ -63,7 +63,7 @@ class DisabledWALProvider implements WALProvider { if (null == providerId) { providerId = "defaultDisabled"; } -disabled = new DisabledWAL(new Path(FSUtils.getWALRootDir(conf), providerId), conf, null); +disabled = new DisabledWAL(new FSWALIdentity(new Path(FSUtils.getWALRootDir(conf), providerId)), conf, null);{code} {code:java} -protected final Path path; +protected final FSWALIdentity walId;{code} Should DisabledWALProvider have its own implementation of WALIdentity? Looks like we just pass a "special" Path in the FS-based case now – maybe we just make some special implementation of WALIdentity for it instead. Overall, I think this is a really nice middle-ground of changing "enough" without changing too much. As long as we can spin out the above refactorings into some follow-on work, I would be happy to land this on the feature branch. 
> Introduce WALIdentity interface > --- > > Key: HBASE-21246 > URL: https://issues.apache.org/jira/browse/HBASE-21246 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Major > Fix For: HBASE-20952 > > Attachments: 21246.003.patch, 21246.20.txt, 21246.21.txt, > 21246.23.txt, 21246.24.txt, 21246.25.txt, 21246.26.txt, 21246.34.txt, > 21246.37.txt, 21246.39.txt, 21246.41.txt, 21246.43.txt, > 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, > 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, > 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch, > HBASE-21246.master.001.patch, replication-src-creates-wal-reader.jpg, > wal-factory-providers.png, wal-providers.png, wal-splitter-reader.jpg, > wal-splitter-writer.jpg > > > We are introducing WALIdentity interface so that the WAL representation can > be decoupled from distributed filesystem. > The interface provides getName method whose return value can represent > filename in distributed filesystem environment or, the name of the stream > when the WAL is backed by log stream. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
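The shape of the interface described above can be sketched in plain Java: a WALIdentity whose getName() is a file name in the FS-backed case and a stream name in the stream-backed case. The stand-ins below use Strings instead of org.apache.hadoop.fs.Path and are illustrative, not the patch's actual types.

```java
// Hedged sketch of the WALIdentity idea from HBASE-21246: decouple "which WAL is
// this" from the distributed filesystem. Class bodies are invented for illustration.
public class WalIdentitySketch {

    public interface WALIdentity {
        /** File name in the FS-backed case, stream name in the stream-backed case. */
        String getName();
    }

    /** Filesystem-backed identity: wraps what would be a Path in the real patch. */
    public static class FSWALIdentity implements WALIdentity {
        private final String path;
        public FSWALIdentity(String path) { this.path = path; }
        public String getName() {
            int slash = path.lastIndexOf('/');
            return slash < 0 ? path : path.substring(slash + 1);
        }
    }

    /** Stream-backed identity, e.g. a WAL kept in a log service instead of HDFS. */
    public static class StreamWALIdentity implements WALIdentity {
        private final String streamName;
        public StreamWALIdentity(String streamName) { this.streamName = streamName; }
        public String getName() { return streamName; }
    }
}
```

Josh's DisabledWALProvider suggestion fits this shape too: rather than a "special" Path, the provider could carry its own no-op WALIdentity implementation.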
[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715005#comment-16715005 ] Hadoop QA commented on HBASE-21570: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 42s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 52s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} hbase-client: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} The patch passed checkstyle in hbase-server {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 8m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 9s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}129m 18s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 53s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}173m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-21570 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951202/HBASE-21570.patch | | Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux cde26df8f748 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714933#comment-16714933 ] Hadoop QA commented on HBASE-21505: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 11s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 13s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} hbase-hadoop2-compat in master has 18 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 27s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 15s{color} | {color:red} hbase-server: The patch generated 2 new + 85 unchanged - 3 fixed = 87 total (was 88) {color} | | {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 7s{color} | {color:red} The patch generated 55 new + 405 unchanged - 9 fixed = 460 total (was 414) {color} | | {color:orange}-0{color} | {color:orange} ruby-lint {color} | {color:orange} 0m 4s{color} | {color:orange} The patch generated 3 new + 748 unchanged - 1 fixed = 751 total (was 749) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 14s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 2m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 19s{color} | {color:red} hbase-server generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 34s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s{color} | {color:green} hbase-hadoop-compat in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s{color} | {color:green} hbase-hadoop2-compat in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit
[jira] [Commented] (HBASE-21568) Disable use of BlockCache for LoadIncrementalHFiles
[ https://issues.apache.org/jira/browse/HBASE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714860#comment-16714860 ] Josh Elser commented on HBASE-21568: Ok, thanks Guanghao! Just to be clear, is this your +1 for the current patch as well? > Disable use of BlockCache for LoadIncrementalHFiles > --- > > Key: HBASE-21568 > URL: https://issues.apache.org/jira/browse/HBASE-21568 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Fix For: 2.2.0, 2.1.2, 2.0.4 > > Attachments: HBASE-21568.001.branch-2.0.patch > > > [~vrodionov] added some API to {{CacheConfig}} via HBASE-17151 to allow > callers to specify that they do not want to use a block cache when reading an > HFile. > If the BucketCache is set up to use the FileSystem, we can have a situation > where the client tries to instantiate the BucketCache and is disallowed due > to filesystem permissions: > {code:java} > 2018-12-03 16:22:03,032 ERROR [LoadIncrementalHFiles-0] bucket.FileIOEngine: > Failed allocating cache on /mnt/hbase/cache.data > java.io.FileNotFoundException: /mnt/hbase/cache.data (Permission denied) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) > at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124) > at > org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.<init>(FileIOEngine.java:81) > at > org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:382) > at > org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:262) > at > org.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:633) > at > org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:663) > at org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:250) > at > 
org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:713) > at > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:621) > at > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles$3.call(LoadIncrementalHFiles.java:617) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > LoadIncrementalHfiles should provide the {{CacheConfig.DISABLE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21512) Introduce an AsyncClusterConnection and replace the usage of ClusterConnection
[ https://issues.apache.org/jira/browse/HBASE-21512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714772#comment-16714772 ] Hudson commented on HBASE-21512: Results for branch HBASE-21512 [build #12 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/12//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Introduce an AsyncClusterConnection and replace the usage of ClusterConnection > -- > > Key: HBASE-21512 > URL: https://issues.apache.org/jira/browse/HBASE-21512 > Project: HBase > Issue Type: Umbrella >Reporter: Duo Zhang >Priority: Major > Fix For: 3.0.0 > > > At least for the RSProcedureDispatcher, with CompletableFuture we do not need > to set a delay and use a thread pool any more, which could reduce the > resource usage and also the latency. > Once this is done, I think we can remove the ClusterConnection completely, > and start to rewrite the old sync client based on the async client, which > could reduce the code base a lot for our client. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714717#comment-16714717 ] Duo Zhang commented on HBASE-21570: --- Review board link: https://reviews.apache.org/r/69539/ > Add write buffer periodic flush support for AsyncBufferedMutator > > > Key: HBASE-21570 > URL: https://issues.apache.org/jira/browse/HBASE-21570 > Project: HBase > Issue Type: Sub-task > Components: asyncclient, Client >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3 > > Attachments: HBASE-21570.patch > > > Align with the BufferedMutator interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21570: -- Attachment: HBASE-21570.patch > Add write buffer periodic flush support for AsyncBufferedMutator > > > Key: HBASE-21570 > URL: https://issues.apache.org/jira/browse/HBASE-21570 > Project: HBase > Issue Type: Sub-task > Components: asyncclient, Client >Reporter: Duo Zhang >Priority: Major > Attachments: HBASE-21570.patch > > > Align with the BufferedMutator interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21570) Add write buffer periodic flush support for AsyncBufferedMutator
[ https://issues.apache.org/jira/browse/HBASE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21570: -- Assignee: Duo Zhang Fix Version/s: 2.1.3 2.0.4 2.2.0 3.0.0 Status: Patch Available (was: Open) > Add write buffer periodic flush support for AsyncBufferedMutator > > > Key: HBASE-21570 > URL: https://issues.apache.org/jira/browse/HBASE-21570 > Project: HBase > Issue Type: Sub-task > Components: asyncclient, Client >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.0.4, 2.1.3 > > Attachments: HBASE-21570.patch > > > Align with the BufferedMutator interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714616#comment-16714616 ] Wellington Chevreuil commented on HBASE-21505:
--
Resolved conflict from last patch.
> Several inconsistencies on information reported for Replication Sources by
> hbase shell status 'replication' command.
>
> Key: HBASE-21505
> URL: https://issues.apache.org/jira/browse/HBASE-21505
> Project: HBase
> Issue Type: Bug
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Major
> Attachments: 0001-HBASE-21505-initial-version-for-more-detailed-report.patch,
> HBASE-21505-master.001.patch, HBASE-21505-master.002.patch,
> HBASE-21505-master.003.patch, HBASE-21505-master.004.patch
>
> While reviewing the hbase shell status 'replication' command, I noticed the
> following issues in the replication source section:
> 1) TimeStampsOfLastShippedOp keeps getting updated and increasing even when
> no new edits were added to the source, so nothing was actually shipped.
> Test steps performed:
> 1.1) Source cluster with only one table targeted for replication;
> 1.2) Added a new row, confirmed the row appeared in the target cluster;
> 1.3) Issued the status 'replication' command on the source;
> TimeStampsOfLastShippedOp shows current timestamp T1.
> 1.4) Waited 30 seconds with no new data added to the source. Issued the
> status 'replication' command again; it now shows timestamp T2.
> 2) When replication is stuck due to connectivity issues or target
> unavailability, if new edits are added on the source, the reported
> AgeOfLastShippedOp wrongly shows the same value as "Replication Lag". This
> is incorrect; AgeOfLastShippedOp should not change until another edit is
> indeed shipped to the target. Test steps performed:
> 2.1) Source cluster with only one table targeted for replication;
> 2.2) Stopped target cluster RS;
> 2.3) Put a new row on the source. Running the status 'replication' command
> does show the lag increasing. TimeStampsOfLastShippedOp seems correct as
> well, with no further updates as described in bullet #1 above.
> 2.4) AgeOfLastShippedOp keeps increasing together with Replication Lag,
> even though no new edit was shipped to the target:
> {noformat}
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3) AgeOfLastShippedOp gets set to 0 even when a given edit took some time
> before it was finally shipped to the target. Test steps performed:
> 3.1) Source cluster with only one table targeted for replication;
> 3.2) Stopped target cluster RS;
> 3.3) Put a new row on the source.
> 3.4) AgeOfLastShippedOp keeps increasing together with Replication Lag,
> even though no new edit was shipped to the target:
> {noformat}
> T1:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=5581, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=5581
> ...
> T2:
> ...
> SOURCE: PeerID=1, AgeOfLastShippedOp=8586, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=8586
> ...
> {noformat}
> 3.5) Restarted the target cluster RS and verified the new row appeared
> there. No new edit was added, but the status 'replication' command reports
> AgeOfLastShippedOp as 0, while it should be the difference between the time
> shipping completed at the target and the time the edit was added on the
> source:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Wed Nov 21 02:50:23 GMT 2018, Replication Lag=0
> {noformat}
> 4) When replication is stuck due to connectivity issues or target
> unavailability and the RS is restarted, once the recovered queue source is
> started, TimeStampsOfLastShippedOp is set to the initial java date (Thu Jan
> 01 01:00:00 GMT 1970, for example), so "Replication Lag" also gives a
> completely inaccurate value.
> Tests performed:
> 4.1) Source cluster with only one table targeted for replication;
> 4.2) Stopped target cluster RS;
> 4.3) Put a new row on the source, restarted the RS on the source, waited a
> few seconds for the recovery queue source to start up, then it gives:
> {noformat}
> SOURCE: PeerID=1, AgeOfLastShippedOp=0, SizeOfLogQueue=1,
> TimeStampsOfLastShippedOp=Thu Jan 01 01:00:00 GMT 1970, Replication
> Lag=9223372036854775807
> {noformat}
> Also, we should report status for all running sources; the current output
> format gives the impression there is only one, even when there are recovery
> queues, for instance.
> Here is a list of ideas on how the command should
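The intended metric semantics described above can be modeled in a few lines: AgeOfLastShippedOp should be computed once, at ship time, as the difference between ship completion and edit creation, and then stay fixed until the next shipment, while Replication Lag should grow only while unshipped edits remain queued and drop to 0 once the source is caught up. A stdlib-only sketch under those assumptions (class and field names are hypothetical, not HBase's actual MetricsSource API):

```java
// Hypothetical model of the replication source metrics discussed in this
// issue. Times are epoch milliseconds.
class ReplicationSourceMetrics {
  private long ageOfLastShippedOp = 0;       // frozen until the next ship
  private long timestampOfLastShippedOp = 0; // creation time of last shipped edit
  private long oldestUnshippedEditTs = -1;   // -1 when the queue is empty

  // A new edit entered the replication queue.
  void onEditQueued(long editCreationTs) {
    if (oldestUnshippedEditTs < 0) {
      oldestUnshippedEditTs = editCreationTs;
    }
  }

  // A batch finished shipping to the target. For simplicity this sketch
  // assumes the batch drained the whole queue.
  void onBatchShipped(long editCreationTs, long shipCompletedTs) {
    // Age is the time the edit spent waiting, measured once, at ship time.
    ageOfLastShippedOp = shipCompletedTs - editCreationTs;
    timestampOfLastShippedOp = editCreationTs;
    oldestUnshippedEditTs = -1;
  }

  long getAgeOfLastShippedOp() {
    return ageOfLastShippedOp;
  }

  long getTimestampOfLastShippedOp() {
    return timestampOfLastShippedOp;
  }

  // Lag is 0 when fully caught up; otherwise, how long the oldest pending
  // edit has been waiting.
  long getReplicationLag(long now) {
    return oldestUnshippedEditTs < 0 ? 0 : now - oldestUnshippedEditTs;
  }
}
```

With this separation, a stuck source shows a growing lag while AgeOfLastShippedOp stays put, and a recovered source reports the real wait time of the last shipped edit instead of resetting to 0.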
[jira] [Updated] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21505:
--
Attachment: HBASE-21505-master.004.patch
[jira] [Commented] (HBASE-21572) The "progress" object in "Compactor" is not thread-safe, this may cause the misleading progress information on the web UI.
[ https://issues.apache.org/jira/browse/HBASE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714556#comment-16714556 ] Hadoop QA commented on HBASE-21572:
--
| (x) -1 overall |
|| Vote || Subsystem || Runtime || Comment ||
|| Prechecks ||
| 0 | reexec | 3m 53s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| master Compile Tests ||
| +1 | mvninstall | 5m 12s | master passed |
| +1 | compile | 2m 22s | master passed |
| +1 | checkstyle | 1m 20s | master passed |
| +1 | shadedjars | 4m 47s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 44s | master passed |
| +1 | javadoc | 0m 38s | master passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 5m 3s | the patch passed |
| +1 | compile | 2m 25s | the patch passed |
| -1 | javac | 2m 25s | hbase-server generated 5 new + 183 unchanged - 5 fixed = 188 total (was 188) |
| -1 | checkstyle | 1m 21s | hbase-server: The patch generated 4 new + 24 unchanged - 0 fixed = 28 total (was 24) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 48s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 38s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 46s | the patch passed |
| +1 | javadoc | 0m 37s | the patch passed |
|| Other Tests ||
| -1 | unit | 256m 43s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 306m 18s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestMajorCompaction |
| | hadoop.hbase.client.TestFromClientSide3 |
| | hadoop.hbase.client.TestAdmin1 |
| | hadoop.hbase.client.TestFromClientSide |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21572 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951159/HBASE-21572.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 6920c71ea1ef 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 79d90c87b5 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714533#comment-16714533 ] Hadoop QA commented on HBASE-21505:
--
| (x) -1 overall |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 6s | HBASE-21505 does not apply to master. Rebase required? Wrong branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. |
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-21505 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12951174/HBASE-21505-master.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15231/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated.
[jira] [Commented] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714528#comment-16714528 ] Wellington Chevreuil commented on HBASE-21505:
--
Third patch addressing issues from last build.
[jira] [Updated] (HBASE-21505) Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.
[ https://issues.apache.org/jira/browse/HBASE-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wellington Chevreuil updated HBASE-21505:
--
Attachment: HBASE-21505-master.003.patch