[jira] [Created] (HBASE-24958) CompactingMemStore.timeOfOldestEdit error update

2020-08-26 Thread wenfeiyi666 (Jira)
wenfeiyi666 created HBASE-24958:
---

 Summary: CompactingMemStore.timeOfOldestEdit error update
 Key: HBASE-24958
 URL: https://issues.apache.org/jira/browse/HBASE-24958
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.2.5, 2.3.1, 3.0.0-alpha-1
Reporter: wenfeiyi666
Assignee: wenfeiyi666
 Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.2


When in-memory flush is enabled, timeOfOldestEdit is updated on every in-memory 
flush. As a result, PeriodicMemStoreFlusher never takes effect and WALs are not 
freed: they back up until maxlogs triggers a forced flush, which makes failure 
recovery slower.
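
The effect can be illustrated with a toy model. This is only a sketch, not the 
CompactingMemStore code: the class and all method names below are hypothetical. 
It assumes a periodic flusher that compares the age of the oldest unflushed 
edit against an interval.

```java
// Toy model only; hypothetical names, not the HBase implementation.
final class OldestEditTrackerSketch {
  private static final long NONE = Long.MAX_VALUE;
  private long timeOfOldestEdit = NONE;

  // Record the time of the first edit since the last flush to disk.
  void add(long nowMs) {
    if (timeOfOldestEdit == NONE) {
      timeOfOldestEdit = nowMs;
    }
  }

  // An in-memory flush keeps the data in memory, so the oldest-edit time is
  // deliberately left alone. Resetting it here would reproduce the reported
  // behaviour: the periodic check below would keep seeing "young" data.
  void flushInMemory() {
    // no reset
  }

  // Only a flush to disk makes the oldest edit durable outside the WAL.
  void flushToDisk() {
    timeOfOldestEdit = NONE;
  }

  // The check a periodic flusher could run.
  boolean needsPeriodicFlush(long nowMs, long intervalMs) {
    return timeOfOldestEdit != NONE && nowMs - timeOfOldestEdit > intervalMs;
  }
}
```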



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-24957) ZKTableStateClientSideReader#isDisabledTable doesn't check if table exists or not.

2020-08-26 Thread Rushabh Shah (Jira)
Rushabh Shah created HBASE-24957:


 Summary: ZKTableStateClientSideReader#isDisabledTable doesn't 
check if table exists or not.
 Key: HBASE-24957
 URL: https://issues.apache.org/jira/browse/HBASE-24957
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 1.6.0
Reporter: Rushabh Shah
Assignee: Rushabh Shah


The following bug exists only in branch-1 and below.

ZKTableStateClientSideReader#isDisabledTable returns false even if the table 
does not exist.

Below is the code snippet:

 {code:title=ZKTableStateClientSideReader.java|borderStyle=solid}
  public static boolean isDisabledTable(final ZooKeeperWatcher zkw,
      final TableName tableName)
      throws KeeperException, InterruptedException {
    ZooKeeperProtos.Table.State state = getTableState(zkw, tableName);
    // ---> We should check here whether state is null.
    return isTableState(ZooKeeperProtos.Table.State.DISABLED, state);
  }
{code}
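
One way the missing check could look is sketched below. It is self-contained 
and uses hypothetical stand-ins (a map instead of ZooKeeper, a local State 
enum, IllegalStateException for the "table not found" case), so it shows the 
shape of the null check rather than the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are not the HBase or ZooKeeper classes.
final class TableStateSketch {
  enum State { ENABLED, DISABLED }

  // Simulates the state lookup: a missing entry means the table has no state.
  private final Map<String, State> states = new HashMap<>();

  void setState(String table, State state) {
    states.put(table, state);
  }

  // Returns null when the table is unknown, like the lookup in the report.
  State getTableState(String table) {
    return states.get(table);
  }

  // Fails loudly for an unknown table instead of silently returning false.
  boolean isDisabledTable(String table) {
    State state = getTableState(table);
    if (state == null) {
      // Assumption: the caller wants an error here; the exception type is illustrative.
      throw new IllegalStateException("Table not found: " + table);
    }
    return state == State.DISABLED;
  }
}
```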

 





[jira] [Resolved] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24689.

Resolution: Fixed

> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
> -
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>






[jira] [Created] (HBASE-24956) ConnectionManager#userRegionLock waits for lock indefinitely.

2020-08-26 Thread Rushabh Shah (Jira)
Rushabh Shah created HBASE-24956:


 Summary: ConnectionManager#userRegionLock waits for lock 
indefinitely.
 Key: HBASE-24956
 URL: https://issues.apache.org/jira/browse/HBASE-24956
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 1.3.2
Reporter: Rushabh Shah
Assignee: Rushabh Shah


One of our customers experienced high latencies (on the order of 3-4 minutes) 
for point lookup queries (we use Phoenix on top of HBase).

We have different threads sharing the same hconnection. It looks like multiple 
threads are stuck at the same place: 
[https://github.com/apache/hbase/blob/branch-1.3/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java#L1282]
 

We have set the following configuration parameters to ensure that queries fail 
within a reasonable SLA:

1. hbase.client.meta.operation.timeout

2. hbase.client.operation.timeout

3. hbase.client.scanner.timeout.period

But since userRegionLock can wait for the lock indefinitely, the call will not 
fail within the SLA.
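
A bounded wait can be sketched with java.util.concurrent alone. The class, the 
method name and the choice of IOException below are assumptions made for 
illustration; only ReentrantLock.tryLock(long, TimeUnit) is the real JDK API. 
The timeout would presumably come from one of the operation timeouts listed 
above.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch; not the ConnectionManager code.
final class BoundedLockSketch {
  private final ReentrantLock userRegionLock = new ReentrantLock();

  // Runs the action while holding the lock, or fails once timeoutMs has elapsed.
  <T> T withLock(long timeoutMs, Supplier<T> action) throws IOException {
    boolean acquired;
    try {
      // Unlike lock(), tryLock with a timeout gives up after the deadline.
      acquired = userRegionLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting for userRegionLock", e);
    }
    if (!acquired) {
      throw new IOException("Could not acquire userRegionLock within " + timeoutMs + " ms");
    }
    try {
      return action.get();
    } finally {
      userRegionLock.unlock();
    }
  }
}
```

With this shape a caller that cannot get the lock fails after the timeout 
instead of blocking behind the other threads.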





[jira] [Created] (HBASE-24955) Clarify patch upgrade compatibility guarantees

2020-08-26 Thread Bharath Vissapragada (Jira)
Bharath Vissapragada created HBASE-24955:


 Summary: Clarify patch upgrade compatibility guarantees
 Key: HBASE-24955
 URL: https://issues.apache.org/jira/browse/HBASE-24955
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.0.0-alpha-1, 2.3.3, 1.7.0
Reporter: Bharath Vissapragada


The [compatibility|https://hbase.apache.org/book.html#hbase.versioning] 
guidelines (specifically the section "Client-Server wire protocol 
compatibility") say: "We could only allow upgrading the server first. I.e. the 
server would be backward compatible to an old client, that way new APIs are OK."

This gives the impression that it is fine to break API compatibility in patch 
upgrades and to expect users to upgrade server binaries before upgrading 
clients. However, when considering a back-port of HBASE-24765, it was noted by 
[~zhangduo] and [~ndimiduk] that this compatibility shouldn't be broken. This 
seems like something that should be clarified in the docs.





[jira] [Created] (HBASE-24954) incorrect value for AuthUtil.HBASE_CLIENT_KERBEROS_PRINCIPAL

2020-08-26 Thread Jason Plurad (Jira)
Jason Plurad created HBASE-24954:


 Summary: incorrect value for 
AuthUtil.HBASE_CLIENT_KERBEROS_PRINCIPAL
 Key: HBASE-24954
 URL: https://issues.apache.org/jira/browse/HBASE-24954
 Project: HBase
  Issue Type: Bug
  Components: asyncclient, Client, security
Affects Versions: 2.2.0, 3.0.0-alpha-1
Reporter: Jason Plurad


[HBASE-20886|https://issues.apache.org/jira/browse/HBASE-20886] introduced 
constants for HBASE_CLIENT_KEYTAB_FILE and HBASE_CLIENT_KERBEROS_PRINCIPAL. 
However, the value for HBASE_CLIENT_KERBEROS_PRINCIPAL is incorrectly assigned 
as "hbase.client.keytab.principal". The correct value should be 
"hbase.client.kerberos.principal".

"hbase.client.keytab.principal" is inconsistent with the [previous 
code|https://github.com/apache/hbase/blob/rel/2.1.9/hbase-common/src/main/java/org/apache/hadoop/hbase/AuthUtil.java#L96],
 so clients migrating to 2.2.0 would need to update their configurations to 
match the incorrect value.
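
For illustration only, a lookup that prefers the correct key and falls back to 
the misnamed one could look like the sketch below. It uses java.util.Properties 
rather than the HBase Configuration class, and the class and method names are 
hypothetical; only the two key strings come from this report.

```java
import java.util.Properties;

// Hypothetical sketch; not the AuthUtil code.
final class PrincipalKeySketch {
  static final String CORRECT_KEY = "hbase.client.kerberos.principal";
  static final String MISNAMED_KEY = "hbase.client.keytab.principal";

  // Prefer the correct key; fall back to the misnamed key so that
  // configurations written against the incorrect value keep working.
  static String resolvePrincipal(Properties conf) {
    String principal = conf.getProperty(CORRECT_KEY);
    return principal != null ? principal : conf.getProperty(MISNAMED_KEY);
  }
}
```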





[VOTE] The second HBase 2.2.6 release candidate (RC1) is available

2020-08-26 Thread Guanghao Zhang
Please vote on this release candidate (RC) for Apache HBase 2.2.6.

The VOTE will remain open for at least 72 hours.

[ ] +1 Release this package as Apache HBase 2.2.6
[ ] -1 Do not release this package because ...

The tag to be voted on is 2.2.6RC1. The release files, including
signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/

Maven artifacts are available in a staging repository at:
https://repository.apache.org/content/repositories/orgapachehbase-1406/

Signatures used for HBase RCs can be found in this file:
https://dist.apache.org/repos/dist/release/hbase/KEYS

The list of bug fixes going into 2.2.6 can be found in included
CHANGES.md and RELEASENOTES.md available here:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/CHANGES.md
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/RELEASENOTES.md

A detailed source and binary compatibility report for this release is
available at:
https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/api_compare_2.2.6RC1_to_2.2.5.html

To learn more about Apache HBase, please see http://hbase.apache.org/

Thanks,
Guanghao Zhang


[jira] [Reopened] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-24689:


> Generate CHANGES.md and RELEASENOTES.md for 2.2.6
> -
>
> Key: HBASE-24689
> URL: https://issues.apache.org/jira/browse/HBASE-24689
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>






[jira] [Resolved] (HBASE-24897) RegionReplicaFlushHandler should handle NoServerForRegionException to avoid aborting RegionServer

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24897.

Fix Version/s: 2.2.6
   Resolution: Fixed

> RegionReplicaFlushHandler should handle NoServerForRegionException to avoid 
> aborting RegionServer
> -
>
> Key: HBASE-24897
> URL: https://issues.apache.org/jira/browse/HBASE-24897
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> While debugging the flaky test TestRegionReplicaReplicationEndpoint, I found 
> that the RS aborted because the RegionReplicaFlushHandler flush failed. When 
> creating a new table with region replicas, the assignment order may be:
>  # assign 0002 replica region and trigger primary region flush.
>  # assign 0001 replica region and trigger primary region flush.
>  # assign primary region.
> But the primary region flush may fail because the primary region is not open 
> yet, so it may abort the RS.
>  
> {code:java}
> 2020-08-18 16:56:30,041 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0002.66e9757a05fbae7623cfea3369fc8354.
> 2020-08-18 16:56:30,558 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463_0001.22ff45423b0f1f0e93794f673449d140.
> 2020-08-18 16:56:31,192 INFO 
> [RS_OPEN_REGION-regionserver/hao-OptiPlex-7050:0-0] 
> handler.AssignRegionHandler(141): Opened 
> testRegionReplicaReplicationIgnoresDisabledTables_drop_false_disabledReplication_false,,1597740978463.901f9cd06bbf27ef7c2d70b5af725cd2.
> 2020-08-18 16:58:53,857 ERROR 
> [RS_REGION_REPLICA_FLUSH_OPS-regionserver/hao-OptiPlex-7050:0-0] 
> helpers.MarkerIgnoringBase(159): * ABORTING region server 
> hao-optiplex-7050,36368,1597740961432: ServerAborting because an exception 
> was thrown *
> org.apache.hadoop.hbase.client.NoServerForRegionException: No server address 
> listed in hbase:meta for region 
> testRegionReplicaReplicationWithReplicas_10,,1597741128945.0f541dc1a7ca64797c4cf054adb9edfb.
>  containing row 
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:926)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:784)
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:140)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:147)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:98)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:84)
>   at 
> org.apache.hadoop.hbase.client.FlushRegionCallable.prepare(FlushRegionCallable.java:62)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.triggerFlushInPrimaryRegion(RegionReplicaFlushHandler.java:129)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.RegionReplicaFlushHandler.process(RegionReplicaFlushHandler.java:78)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> I think the fix should be to assign the primary region first when the region 
> replica feature is enabled. I will check the implementation of region replica.
>  
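
One possible reading of the issue title is that the flush trigger should treat 
"no server for region" as a retryable condition instead of letting it abort 
the server. The sketch below shows that shape only. Everything in it is 
hypothetical, including the local NoServerForRegionException stand-in, the 
method name and the retry parameters; it is not the committed fix.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch; not the RegionReplicaFlushHandler code.
final class FlushRetrySketch {
  // Local stand-in for the client exception seen in the stack trace above.
  static final class NoServerForRegionException extends Exception {
    NoServerForRegionException(String message) {
      super(message);
    }
  }

  // Returns true if the flush call succeeded within maxAttempts, false if
  // every attempt found no server. The exception is never propagated, so the
  // caller has no reason to abort on it.
  static boolean triggerFlushWithRetry(Callable<Void> flush, int maxAttempts, long pauseMs)
      throws Exception {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        flush.call();
        return true;
      } catch (NoServerForRegionException e) {
        // The primary region may simply not be assigned yet: pause and retry.
        if (attempt < maxAttempts) {
          Thread.sleep(pauseMs);
        }
      }
    }
    return false;
  }
}
```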





[jira] [Resolved] (HBASE-24881) Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24881.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Fix flaky TestMasterAbortAndRSGotKilled for branch-2.2
> --
>
> Key: HBASE-24881
> URL: https://issues.apache.org/jira/browse/HBASE-24881
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> I hit this problem on branch-2.2 too. It happens because of DelayCloseCP. 
> The order of events is:
>  # Close the region. Because of DelayCloseCP, it closes only after 10 
> seconds.
>  # Finish the UT and shut down the cluster.
>  # Shut down the master.
>  # Shut down the RS and call waitOnAllRegionsToClose. abortRequested is 
> still false at this point.
>  # The region close fails because the master is down and the report to the 
> master errors out. The RegionServer then aborts and sets abortRequested to true.
>  # waitOnAllRegionsToClose hangs because the set of online regions never 
> becomes empty.
>  
> waitOnAllRegionsToClose(final boolean abort) already considers the abort case, 
> but the problem is that abortRequested is false when the method is called. I 
> think the fix should be to keep checking abortRequested inside 
> waitOnAllRegionsToClose.
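
The idea of re-checking the flag inside the wait loop can be sketched as 
follows. This is not the HRegionServer code: the class, the fields and the 
polling interval are hypothetical, and the real method does more than poll.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; not the HRegionServer code.
final class WaitOnRegionsSketch {
  final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();
  final AtomicBoolean abortRequested = new AtomicBoolean(false);

  // Returns true if all regions closed, false if the wait was abandoned
  // because an abort was requested in the meantime.
  boolean waitOnAllRegionsToClose(long pollMs) throws InterruptedException {
    while (!onlineRegions.isEmpty()) {
      // Re-read the flag on every iteration: it may flip to true after the
      // method was entered, which a one-time snapshot would miss.
      if (abortRequested.get()) {
        return false;
      }
      Thread.sleep(pollMs);
    }
    return true;
  }
}
```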





[jira] [Resolved] (HBASE-24870) Ignore TestAsyncTableRSCrashPublish

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24870.

Fix Version/s: 2.2.6
   Resolution: Fixed

> Ignore TestAsyncTableRSCrashPublish
> ---
>
> Key: HBASE-24870
> URL: https://issues.apache.org/jira/browse/HBASE-24870
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.2.6
>
>
> [ERROR] Failures: 
> [ERROR] TestAsyncTableRSCrashPublish.test:94 Waiting timed out after [60,000] 
> msec
>  
> I have hit this failure many times when running with runAllTests, and other 
> developers have hit it too when voting on RCs. Let's ignore this test for now 
> and re-enable it after the parent issue is resolved.





[jira] [Resolved] (HBASE-23987) NettyRpcClientConfigHelper will not share event loop by default which is incorrect

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-23987.

Fix Version/s: 2.2.6
   Resolution: Fixed

> NettyRpcClientConfigHelper will not share event loop by default which is 
> incorrect
> --
>
> Key: HBASE-23987
> URL: https://issues.apache.org/jira/browse/HBASE-23987
> Project: HBase
>  Issue Type: Bug
>  Components: Client, rpc
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.0
>
>






[jira] [Resolved] (HBASE-24928) balanceRSGroup should skip generating balance plan for disabled table and splitParent region

2020-08-26 Thread Guanghao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-24928.

Fix Version/s: 2.3.2
   2.2.6
   Resolution: Fixed

> balanceRSGroup should skip generating balance plan for disabled table and 
> splitParent region
> 
>
> Key: HBASE-24928
> URL: https://issues.apache.org/jira/browse/HBASE-24928
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: niuyulin
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.6, 2.3.2
>
>
> Currently we generate balance plans for disabled tables, which is useless:
> {code:java}
> 2020-08-20,20:47:54,702 WARN 
> [RpcServer.default.RWQ.Fifo.read.handler=310,queue=6,port=22500] 
> org.apache.hadoop.hbase.master.HMaster: Failed balance plan: 
> hri=aa325467924edc865ab2ef6d82f9e2a7, 
> source=tj1-hadoop-staging-st02.kscn,22600,1572403947348, destination=, just 
> skip it
> org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state 
> for rit=CLOSED, location=tj1-hadoop-staging-st02.kscn,22600,1572403947348, 
> table=galaxysds:sds_staging_258z, region=aa325467924edc865ab2ef6d82f9e2a7
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:580)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:635)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:652)
> at 
> org.apache.hadoop.hbase.master.HMaster.executeRegionPlansWithThrottling(HMaster.java:1776)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.balanceRSGroup(RSGroupAdminServer.java:486)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.balanceRSGroup(RSGroupAdminEndpoint.java:293)
> at 
> org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:13890)
> at 
> org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:908)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:135)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
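
The skip described in the title amounts to a filter applied before plans are 
generated. The sketch below is a simplified illustration with hypothetical 
types (a plain Region class and a set of disabled table names); it is not the 
RSGroupAdminServer code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the RSGroupAdminServer code.
final class BalancePlanFilterSketch {
  static final class Region {
    final String name;
    final String table;
    final boolean splitParent;

    Region(String name, String table, boolean splitParent) {
      this.name = name;
      this.table = table;
      this.splitParent = splitParent;
    }
  }

  // Keeps only the regions worth planning a move for: regions of disabled
  // tables and split parents are dropped before any plan is built.
  static List<Region> regionsToBalance(List<Region> regions, Set<String> disabledTables) {
    List<Region> result = new ArrayList<>();
    for (Region region : regions) {
      if (disabledTables.contains(region.table) || region.splitParent) {
        continue;
      }
      result.add(region);
    }
    return result;
  }
}
```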


