[jira] [Commented] (HBASE-28508) Remove the need for ADMIN permissions for RSRpcServices#execRegionServerService

2024-04-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835915#comment-17835915
 ] 

Viraj Jasani commented on HBASE-28508:
--

{quote}Also, since this coproc invocation happens on the client side, we would 
have to ensure that all clients have ADMIN permissions, which might not be true 
in current deployments.
{quote}
Yeah, that's the sad part.
{quote}In short, for custom coprocs, we need to leave it up to the 
implementation to enforce permissions.
{quote}
Agree
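
For illustration only, here is a minimal sketch of what "leave it up to the 
implementation" could look like: a custom regionserver coprocessor endpoint 
performing its own access check before doing any work. The class, method, and 
single-allowed-user policy below are hypothetical and not an existing HBase or 
Phoenix API; only RpcServer.getRequestUser(), User, and AccessDeniedException 
are real HBase classes.
{code:java}
// Hypothetical sketch: a custom regionserver coproc endpoint enforcing its own
// permission check instead of relying on the generic ADMIN pre-check.
import java.util.Optional;
import org.apache.hadoop.hbase.ipc.RpcServer;
import org.apache.hadoop.hbase.security.AccessDeniedException;
import org.apache.hadoop.hbase.security.User;

public final class EndpointAccessCheck {

  // Call this at the top of the endpoint's service method, before doing any work.
  public static void checkCaller(String allowedUser) throws AccessDeniedException {
    Optional<User> caller = RpcServer.getRequestUser();
    if (!caller.isPresent()) {
      throw new AccessDeniedException("No authenticated caller for this endpoint");
    }
    // Hypothetical policy: only one configured user may invoke this endpoint.
    String shortName = caller.get().getShortName();
    if (!allowedUser.equals(shortName)) {
      throw new AccessDeniedException(
        "Insufficient permissions for user '" + shortName + "'");
    }
  }
}
{code}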

> Remove the need for ADMIN permissions for 
> RSRpcServices#execRegionServerService
> ---
>
> Key: HBASE-28508
> URL: https://issues.apache.org/jira/browse/HBASE-28508
> Project: HBase
>  Issue Type: Bug
>  Components: acl
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>  Labels: pull-request-available
>
> We have introduced a new regionserver coproc within Phoenix, and all the 
> permission-related tests are failing with the following exception.
> {noformat}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.security.AccessDeniedException):
>  org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient 
> permissions for user 'groupUser_N42' (global, action=ADMIN)
>   at 
> org.apache.hadoop.hbase.security.access.AccessChecker.requireGlobalPermission(AccessChecker.java:152)
>   at 
> org.apache.hadoop.hbase.security.access.AccessChecker.requirePermission(AccessChecker.java:125)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.requirePermission(RSRpcServices.java:1318)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.rpcPreCheck(RSRpcServices.java:584)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execRegionServerService(RSRpcServices.java:3804)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45016)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>   at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
>   at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
> {noformat}
> This check is failing. 
> [RSRpcServices|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3815]
> {code}
>   @Override
>   public CoprocessorServiceResponse execRegionServerService(RpcController 
> controller,
> CoprocessorServiceRequest request) throws ServiceException {
> rpcPreCheck("execRegionServerService");
> return server.execRegionServerService(controller, request);
>   }
>   private void rpcPreCheck(String requestName) throws ServiceException {
> try {
>   checkOpen();
>   requirePermission(requestName, Permission.Action.ADMIN);
> } catch (IOException ioe) {
>   throw new ServiceException(ioe);
> }
>   }
> {code}
> Why do we need ADMIN permissions to call a regionserver coproc? We don't need 
> ADMIN permissions to call all region coprocs; we only require ADMIN permissions 
> to execute some of them (compactionSwitch, clearRegionBlockCache). Can we 
> change the permission to READ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28508) Remove the need for ADMIN permissions for RSRpcServices#execRegionServerService

2024-04-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835892#comment-17835892
 ] 

Viraj Jasani commented on HBASE-28508:
--

Yes, if we can find a way to customize it per coproc endpoint, that would be 
better. Otherwise changing the generic permission for execRegionServerService 
to READ might even be treated as a security concern.
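
As a rough sketch of the per-endpoint customization idea (hypothetical code, 
not a committed change; the service names and the policy class below are made 
up), the generic pre-check could look up the required action by coprocessor 
service name instead of always demanding ADMIN, defaulting to ADMIN for 
anything unknown:
{code:java}
// Hypothetical sketch: choose the required permission per coprocessor service
// name instead of a blanket ADMIN check. Service names here are illustrative.
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hbase.security.access.Permission;

public final class PerServicePermissionPolicy {

  // Illustrative, hard-coded mapping; a real implementation would likely be configurable.
  private static final Map<String, Permission.Action> REQUIRED_ACTION = new HashMap<>();
  static {
    REQUIRED_ACTION.put("ExampleReadOnlyService", Permission.Action.READ);
    REQUIRED_ACTION.put("ExampleAdminService", Permission.Action.ADMIN);
  }

  // Default to ADMIN for unknown services so the behavior stays conservative.
  public static Permission.Action requiredAction(String serviceName) {
    return REQUIRED_ACTION.getOrDefault(serviceName, Permission.Action.ADMIN);
  }
}
{code}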

> Remove the need for ADMIN permissions for 
> RSRpcServices#execRegionServerService
> ---
>
> Key: HBASE-28508
> URL: https://issues.apache.org/jira/browse/HBASE-28508
> Project: HBase
>  Issue Type: Bug
>  Components: acl
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>  Labels: pull-request-available
>
> We have introduced a new regionserver coproc within Phoenix, and all the 
> permission-related tests are failing with the following exception.
> {noformat}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.security.AccessDeniedException):
>  org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient 
> permissions for user 'groupUser_N42' (global, action=ADMIN)
>   at 
> org.apache.hadoop.hbase.security.access.AccessChecker.requireGlobalPermission(AccessChecker.java:152)
>   at 
> org.apache.hadoop.hbase.security.access.AccessChecker.requirePermission(AccessChecker.java:125)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.requirePermission(RSRpcServices.java:1318)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.rpcPreCheck(RSRpcServices.java:584)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execRegionServerService(RSRpcServices.java:3804)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45016)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>   at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
>   at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
> {noformat}
> This check is failing. 
> [RSRpcServices|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3815]
> {code}
>   @Override
>   public CoprocessorServiceResponse execRegionServerService(RpcController 
> controller,
> CoprocessorServiceRequest request) throws ServiceException {
> rpcPreCheck("execRegionServerService");
> return server.execRegionServerService(controller, request);
>   }
>   private void rpcPreCheck(String requestName) throws ServiceException {
> try {
>   checkOpen();
>   requirePermission(requestName, Permission.Action.ADMIN);
> } catch (IOException ioe) {
>   throw new ServiceException(ioe);
> }
>   }
> {code}
> Why do we need ADMIN permissions to call a regionserver coproc? We don't need 
> ADMIN permissions to call all region coprocs; we only require ADMIN permissions 
> to execute some of them (compactionSwitch, clearRegionBlockCache). Can we 
> change the permission to READ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28428) ConnectionRegistry APIs should have timeout

2024-04-10 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28428:


Assignee: Divneet Kaur  (was: Lokesh Khurana)

> ConnectionRegistry APIs should have timeout
> ---
>
> Key: HBASE-28428
> URL: https://issues.apache.org/jira/browse/HBASE-28428
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.8
>Reporter: Viraj Jasani
>Assignee: Divneet Kaur
>Priority: Major
>
> Came across a couple of instances where an active master failover happens 
> around the same time as a Zookeeper leader failover, leading to a stuck HBase 
> client if one of the threads is blocked on one of the ConnectionRegistry rpc 
> calls. ConnectionRegistry APIs are wrapped with CompletableFuture. However, 
> their usages do not have any timeouts, which can potentially leave the entire 
> client stuck indefinitely because we take some global locks. For instance, 
> _getKeepAliveMasterService()_ takes {_}masterLock{_}, so if getting the active 
> master from _masterAddressZNode_ gets stuck, we can block any admin operation 
> that needs {_}getKeepAliveMasterService(){_}.
>  
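For illustration, a minimal, self-contained sketch (hypothetical helper, not 
the actual HBase change) of how a CompletableFuture-based registry call could 
be bounded with a timeout, so a stuck lookup fails fast instead of blocking the 
caller indefinitely:
{code:java}
// Hypothetical sketch: bound a ConnectionRegistry-style CompletableFuture call
// so a stuck ZK/master lookup cannot block the caller forever.
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public final class RegistryTimeouts {

  // Wait for the future, but give up after timeoutMs instead of blocking indefinitely.
  public static <T> T getWithTimeout(CompletableFuture<T> future, long timeoutMs)
      throws IOException {
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // best effort; the caller should fail fast instead of hanging
      throw new IOException("Registry call timed out after " + timeoutMs + " ms", e);
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting for registry call", e);
    }
  }
}
{code}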
> Sample stack trace of a thread that blocked all client operations requiring a 
> table descriptor from Admin:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.access$?
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStubNoRetries
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub
> org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService
> org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster
> org.apache.hadoop.hbase.client.MasterCallable.prepare
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable
> org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor
> org.apache.hadoop.hbase.client.HTable.getDescriptor
> org.apache.phoenix.query.ConnectionQueryServicesImpl.getTableDescriptor
> org.apache.phoenix.query.DelegateConnectionQueryServices.getTableDescriptor
> org.apache.phoenix.util.IndexUtil.isGlobalIndexCheckerEnabled
> org.apache.phoenix.execute.MutationState.filterIndexCheckerMutations
> org.apache.phoenix.execute.MutationState.sendBatch
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.commit
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.call.CallRunner.run
> org.apache.phoenix.jdbc.PhoenixConnection.commit {code}
> Another similar incident is captured in PHOENIX-7233. In this case, 
> retrieving the clusterId from the ZNode got stuck, which blocked the client 
> from creating any more HBase Connections. Stack trace for reference:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
> org.apache.hadoop.hbase.client.ConnectionImplementation.<init>
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
> java.lang.reflect.Constructor.newInstance
> org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
> org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
> java.security.AccessController.doPrivileged
> javax.security.auth.Subject.doAs
> org.apache.hadoop.security.UserGroupInformation.doAs
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$?
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.util.PhoenixContextExecutor.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init
> 

[jira] [Updated] (HBASE-28420) Aborting Active HMaster is not rejecting remote Procedure Reports

2024-04-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28420:
-
Affects Version/s: 2.5.8
   2.4.17
   (was: 2.5.7)

> Aborting Active HMaster is not rejecting remote Procedure Reports
> -
>
> Key: HBASE-28420
> URL: https://issues.apache.org/jira/browse/HBASE-28420
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Umesh Kumar Kumawat
>Assignee: Umesh Kumar Kumawat
>Priority: Critical
>  Labels: pull-request-available
>
> When the active HMaster is in the process of aborting and another HMaster is 
> becoming the active HMaster, if a region server reports the completion of a 
> remote procedure at the same time, the report generally goes to the old active 
> HMaster because of the cached value of rssStub -> 
> [code|https://github.com/apache/hbase/blob/branch-2.5/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2829]
>  ([caller 
> method|https://github.com/apache/hbase/blob/branch-2.5/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L3941]).
>  On the master side 
> ([code|https://github.com/apache/hbase/blob/branch-2.5/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java#L2381]),
>  it does check whether the service is started, but that check returns true 
> even while the master is aborting (I didn't see this flag being set to false 
> during the abort).
> This issue becomes *critical* when a *ServerCrash of the meta-hosting RS and a 
> master failover* happen at the same time and hbase:meta gets stuck in the 
> OFFLINE state.
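For illustration only, a hedged sketch of the kind of guard being discussed; 
the helper class below is hypothetical and not the actual patch. The idea is 
that the RPC path handling remote procedure reports would reject the call when 
the receiving master is aborting or stopped, so the regionserver retries 
against the new active master:
{code:java}
// Hypothetical sketch, not the actual HBase fix: reject remote procedure
// completion reports on a master that is aborting or stopped.
import java.io.IOException;
import org.apache.hadoop.hbase.master.HMaster;

public final class RemoteProcedureReportGuard {

  // Reject the report if this master is no longer a live, active master.
  public static void checkMasterIsLive(HMaster master) throws IOException {
    // isStopped()/isAborted() come from Stoppable/Abortable, both implemented by HMaster.
    if (master.isStopped() || master.isAborted()) {
      throw new IOException("Master " + master.getServerName()
          + " is aborting or stopped; rejecting remote procedure report");
    }
  }
}
{code}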
> Logs for the start of the HMaster abort:
> {noformat}
> 2024-02-02 07:33:11,581 ERROR [PEWorker-6] master.HMaster - * ABORTING 
> master server4-1xxx,61000,1705169084562:
> FAILED persisting region=52d36581218e00a2668776cfea897132 state=CLOSING 
> *{noformat}
> {noformat}
> 2024-02-02 07:33:40,999 INFO [master/server4-1xxx:61000] 
> regionserver.HRegionServer - Exiting; 
> stopping=hbase2b-mnds4-1-ia2.ops.sfdc.net,61000,1705169084562; zookeeper 
> connection closed.{noformat}
> It took almost 30 seconds to abort the HMaster.
>  
> Logs of starting the SCP for the meta-carrying host (this SCP is started by 
> the new active HMaster):
> {noformat}
> 2024-02-02 07:33:32,622 INFO [aster/server3-1xxx61000:becomeActiveMaster] 
> assignment.AssignmentManager - Scheduled
> ServerCrashProcedure pid=3305546 for server5-1xxx61020,1706857451955 
> (carryingMeta=true) server5-1-
> xxx61020,1706857451955/CRASHED/regionCount=1/lock=java.util.concurrent.locks.ReentrantReadWriteLock@1b0a5293[Write
>  
> locks = 1, Read locks = 0], oldState=ONLINE.{noformat}
> Initialization of the remote procedure:
> {noformat}
> 2024-02-02 07:33:33,178 INFO [PEWorker-4] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[{pid=3305548, 
> ppid=3305547, state=RUNNABLE; SplitWALRemoteProcedure server5-1-
> t%2C61020%2C1706857451955.meta.1706858156058.meta, 
> worker=server4-1-,61020,1705169180881}]{noformat}
> Logs of remote procedure handling on the old active HMaster 
> (server4-1xxx,61000), which is in the process of aborting:
> {noformat}
> 2024-02-02 07:33:37,990 DEBUG 
> [r.default.FPBQ.Fifo.handler=243,queue=9,port=61000] master.HMaster - Remote 
> procedure 
> done, pid=3305548{noformat}
> This should have been handled by the new active HMaster so that it could wake 
> up the suspended procedure there. Since the new active HMaster was not able to 
> wake it up, the SCP got stuck and meta stayed OFFLINE.
>  
> Logs of the HMaster trying to become the active master but getting stuck:
> {noformat}
> 2024-02-02 07:33:43,159 WARN [aster/server3-1-ia2:61000:becomeActiveMaster] 
> master.HMaster - hbase:meta,,1.1588230740 
> is NOT online; state={1588230740 state=OPEN, ts=1706859212481, 
> server=server5-1-xxx,61020,1706857451955}; 
> ServerCrashProcedures=true. Master startup cannot progress, in 
> holding-pattern until region onlined.{noformat}
> After this, the master was stuck until we performed an HMaster failover to 
> come out of this situation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ https://issues.apache.org/jira/browse/HBASE-28405 ]


Viraj Jasani deleted comment on HBASE-28405:
--

was (Author: vjasani):
Btw, in this whole investigation, we know that we do have a real RIT, because 
the region assign that is part of the "region merge rollback" could not be 
completed, and this definitely needs to be fixed.

However, from the HBase client perspective, reads/writes should not be affected 
on the merging region, right? The region state is OPEN even in meta; only the 
master's in-memory image has the state as MERGING. This doesn't change the fact 
that the RIT needs to be fixed: it's definitely a bug, triggers alerts, and 
requires manual hbck intervention, which we need to minimize as much as 
possible. But I hope that at least clients are fine in this whole situation.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 

[jira] [Updated] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28405:
-
Component/s: Region Assignment

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[\\{pid=26675798, ppid=26674602, state=RUNNABLE; 
> OpenRegionProcedure a92008b76ccae47d55c590930b837036, 
> server=rs-210,60020,1707596461539}]_
> _2024-02-11 10:53:59,074 WARN [REGION-regionserver/rs-210:60020-10] 
> handler.AssignRegionHandler - Received OPEN for 
> table1,r1,1685436252488.a92008b76ccae47d55c590930b837036. which is already 
> online_



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28405:
-
Affects Version/s: 2.5.8
   2.4.17
   (was: 2.5.7)

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[\\{pid=26675798, ppid=26674602, state=RUNNABLE; 
> OpenRegionProcedure a92008b76ccae47d55c590930b837036, 
> server=rs-210,60020,1707596461539}]_
> _2024-02-11 10:53:59,074 WARN [REGION-regionserver/rs-210:60020-10] 
> handler.AssignRegionHandler - Received OPEN for 
> table1,r1,1685436252488.a92008b76ccae47d55c590930b837036. which is already 
> online_



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834715#comment-17834715
 ] 

Viraj Jasani commented on HBASE-28405:
--

Btw, in this whole investigation, we know that we do have a real RIT, because 
the region assign that is part of the "region merge rollback" could not be 
completed, and this definitely needs to be fixed.

However, from the HBase client perspective, reads/writes should not be affected 
on the merging region, right? The region state is OPEN even in meta; only the 
master's in-memory image has the state as MERGING. This doesn't change the fact 
that the RIT needs to be fixed: it's definitely a bug, triggers alerts, and 
requires manual hbck intervention, which we need to minimize as much as 
possible. But I hope that at least clients are fine in this whole situation.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, 

[jira] [Comment Edited] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834623#comment-17834623
 ] 

Viraj Jasani edited comment on HBASE-28405 at 4/7/24 7:24 AM:
--

checkOnlineRegionsReport() is only called by reportOnlineRegions(), which is 
only called by regionServerReport().

Though the report-region-transition path is different from the regionserver 
report, without coordination between the two it might be tricky to get out of 
this mess. I edited my above comment.


was (Author: vjasani):
checkOnlineRegionsReport() is only called by reportOnlineRegions(), which is 
only called by regionServerReport().

They are different but without coordination among them, it might be tricky to 
get out of this mess. I edited my above comment.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> 

[jira] [Comment Edited] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834623#comment-17834623
 ] 

Viraj Jasani edited comment on HBASE-28405 at 4/7/24 7:23 AM:
--

checkOnlineRegionsReport() is only called by reportOnlineRegions(), which is 
only called by regionServerReport().

They are different but without coordination among them, it might be tricky to 
get out of this mess. I edited my above comment.


was (Author: vjasani):
checkOnlineRegionsReport() is only called by reportOnlineRegions(), which is 
only called by regionServerReport().

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[\\{pid=26675798, ppid=26674602, state=RUNNABLE; 
> OpenRegionProcedure a92008b76ccae47d55c590930b837036, 
> server=rs-210,60020,1707596461539}]_
> _2024-02-11 

[jira] [Comment Edited] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834621#comment-17834621
 ] 

Viraj Jasani edited comment on HBASE-28405 at 4/7/24 7:21 AM:
--

[~zhangduo] the problem here is that the master will receive a region report 
that does not match the state:
{code:java}
2024-04-05 18:56:13,176 INFO  [PEWorker-7] procedure2.ProcedureExecutor - 
Rolled back pid=36790760, state=ROLLEDBACK, 
exception=org.apache.hadoop.hbase.HBaseIOException via 
master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
region state=MERGING, location=xyz-200,61020,1712163093095, table=T1, 
region=2dccaed62a347e3cfd8515650c902de9 is currently in transition, give up; 
MergeTableRegionsProcedure table=T1, regions=[fc6965af717fcd35182f85f0f192965e, 
2dccaed62a347e3cfd8515650c902de9], force=false exec-time=56.0760 sec

2024-04-05 18:56:55,397 WARN  
[iority.RWQ.Fifo.write.handler=0,queue=0,port=61000] 
assignment.AssignmentManager - Reporting xyz-12,61020,1712280779333 state does 
not match state=MERGING, location=xyz-12,61020,1712280779333, table=T1, 
region=fc6965af717fcd35182f85f0f192965e (time since last update=42224ms)

2024-04-05 18:56:56,370 INFO  [PEWorker-55] 
assignment.TransitRegionStateProcedure - Starting pid=36794357, 
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, locked=true; 
TransitRegionStateProcedure table=T1, region=fc6965af717fcd35182f85f0f192965e, 
ASSIGN; state=MERGING, location=xyz-12,61020,1712280779333; forceNewPlan=false, 
retain=false {code}
So we need to coordinate "merge transition rollback" with "regionserver 
reports".

If the regionserver report already reports the region as online (while the 
master's in-memory state has it as MERGING), but its corresponding merge 
procedure was successfully rolled back, we are good to set the state back to 
ONLINE.
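
As a purely illustrative, self-contained sketch of that reconciliation rule 
(the class and enum below are hypothetical, not AssignmentManager APIs): only 
flip the master's in-memory MERGING state back to OPEN when the merge procedure 
that set it has already been rolled back.
{code:java}
// Hypothetical sketch of the reconciliation rule; not an HBase API.
public final class RegionReportReconciler {

  enum MasterState { OPEN, MERGING, CLOSED }

  // Only flip MERGING back to OPEN when the merge procedure that set MERGING
  // has already been rolled back; otherwise keep the current in-memory state.
  static MasterState reconcile(MasterState current, boolean mergeRolledBack) {
    if (current == MasterState.MERGING && mergeRolledBack) {
      return MasterState.OPEN;
    }
    return current;
  }

  public static void main(String[] args) {
    System.out.println(reconcile(MasterState.MERGING, true));  // OPEN
    System.out.println(reconcile(MasterState.MERGING, false)); // MERGING
  }
}
{code}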


was (Author: vjasani):
[~zhangduo] the problem here is that the master will receive a region report 
that does not match the state:
{code:java}
2024-04-05 18:56:55,397 WARN  
[iority.RWQ.Fifo.write.handler=0,queue=0,port=61000] 
assignment.AssignmentManager - Reporting xyz-12,61020,1712280779333 state does 
not match state=MERGING, location=xyz-12,61020,1712280779333, table=T1, 
region=fc6965af717fcd35182f85f0f192965e (time since last update=42224ms)

2024-04-05 18:56:56,370 INFO  [PEWorker-55] 
assignment.TransitRegionStateProcedure - Starting pid=36794357, 
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, locked=true; 
TransitRegionStateProcedure table=T1, region=fc6965af717fcd35182f85f0f192965e, 
ASSIGN; state=MERGING, location=xyz-12,61020,1712280779333; forceNewPlan=false, 
retain=false {code}

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at 

[jira] [Commented] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834623#comment-17834623
 ] 

Viraj Jasani commented on HBASE-28405:
--

checkOnlineRegionsReport() is only called by reportOnlineRegions(), which is 
only called by regionServerReport().

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where the 
> region stays in the OPEN state until the RS holding it is stopped.*
> Rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, since it was the creation of the 
> procedure to close regions that had thrown the exception, not the execution of 
> the procedure. When we run the TRSP, it sends an OpenRegionProcedure, which is 
> handled by AssignRegionHandler. This handler, on execution, reports that the 
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[\\{pid=26675798, ppid=26674602, state=RUNNABLE; 
> OpenRegionProcedure a92008b76ccae47d55c590930b837036, 
> server=rs-210,60020,1707596461539}]_
> _2024-02-11 10:53:59,074 WARN [REGION-regionserver/rs-210:60020-10] 
> handler.AssignRegionHandler - Received OPEN for 
> table1,r1,1685436252488.a92008b76ccae47d55c590930b837036. which is already 
> online_



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834621#comment-17834621
 ] 

Viraj Jasani commented on HBASE-28405:
--

[~zhangduo] the problem here is that the master will receive a region report 
that does not match the state:
{code:java}
2024-04-05 18:56:55,397 WARN  
[iority.RWQ.Fifo.write.handler=0,queue=0,port=61000] 
assignment.AssignmentManager - Reporting xyz-12,61020,1712280779333 state does 
not match state=MERGING, location=xyz-12,61020,1712280779333, table=T1, 
region=fc6965af717fcd35182f85f0f192965e (time since last update=42224ms)

2024-04-05 18:56:56,370 INFO  [PEWorker-55] 
assignment.TransitRegionStateProcedure - Starting pid=36794357, 
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, locked=true; 
TransitRegionStateProcedure table=T1, region=fc6965af717fcd35182f85f0f192965e, 
ASSIGN; state=MERGING, location=xyz-12,61020,1712280779333; forceNewPlan=false, 
retain=false {code}

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now when we roll back the failed merge operation, we see an issue where the 
> region is in the opened state until the RS holding it is stopped.*
> The rollback creates a TRSP as below
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we did not close the region, because it was the creation of the 
> procedure to close the regions that threw the exception, not its execution. 
> When we run the TRSP it sends an OpenRegionProcedure which is handled by 
> AssignRegionHandler. This handler, on execution, reports that the region is 
> already online*
> The sequence of events is as follows
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta 

[jira] [Comment Edited] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834600#comment-17834600
 ] 

Viraj Jasani edited comment on HBASE-28405 at 4/7/24 5:21 AM:
--

I believe this should fix the issue:
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
index a9ab6f502a..6beb0fcab7 100644
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
@@ -99,9 +99,16 @@ public class AssignRegionHandler extends EventHandler {
     String encodedName = regionInfo.getEncodedName();
     byte[] encodedNameBytes = regionInfo.getEncodedNameAsBytes();
     String regionName = regionInfo.getRegionNameAsString();
-    Region onlineRegion = rs.getRegion(encodedName);
+    HRegion onlineRegion = rs.getRegion(encodedName);
     if (onlineRegion != null) {
       LOG.warn("Received OPEN for {} which is already online", regionName);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.OPENED, 
onlineRegion.getOpenSeqNum(),
+          openProcId, masterSystemTime, onlineRegion.getRegionInfo()))) {
+        throw new IOException(
+          "Failed to report opened region to master: " + 
onlineRegion.getRegionInfo()
+            .getRegionNameAsString());
+      }
+      rs.finishRegionProcedure(openProcId);
       // Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
       // For normal case, it could happen that the rpc call to schedule this 
handler is succeeded,
       // but before returning to master the connection is broken. And when 
master tries again, we {code}
This would make the region assign operation idempotent.

 

And of course, we need to remove this comment block because we now know that it 
is no longer relevant :)
{code:java}
// Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
// For normal case, it could happen that the rpc call to schedule this handler 
is succeeded,
// but before returning to master the connection is broken. And when master 
tries again, we
// have already finished the opening. For this case we do not need to call
// reportRegionStateTransition any more.{code}


was (Author: vjasani):
I believe this should fix the issue:
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
index a9ab6f502a..6beb0fcab7 100644
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
@@ -99,9 +99,16 @@ public class AssignRegionHandler extends EventHandler {
     String encodedName = regionInfo.getEncodedName();
     byte[] encodedNameBytes = regionInfo.getEncodedNameAsBytes();
     String regionName = regionInfo.getRegionNameAsString();
-    Region onlineRegion = rs.getRegion(encodedName);
+    HRegion onlineRegion = rs.getRegion(encodedName);
     if (onlineRegion != null) {
       LOG.warn("Received OPEN for {} which is already online", regionName);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.OPENED, 
onlineRegion.getOpenSeqNum(),
+          openProcId, masterSystemTime, onlineRegion.getRegionInfo()))) {
+        throw new IOException(
+          "Failed to report opened region to master: " + 
onlineRegion.getRegionInfo()
+            .getRegionNameAsString());
+      }
+      rs.finishRegionProcedure(openProcId);
       // Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
       // For normal case, it could happen that the rpc call to schedule this 
handler is succeeded,
       // but before returning to master the connection is broken. And when 
master tries again, we {code}
This would make the region assign operation idempotent.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> 

[jira] [Comment Edited] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834600#comment-17834600
 ] 

Viraj Jasani edited comment on HBASE-28405 at 4/7/24 5:16 AM:
--

I believe this should fix the issue:
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
index a9ab6f502a..6beb0fcab7 100644
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
@@ -99,9 +99,16 @@ public class AssignRegionHandler extends EventHandler {
     String encodedName = regionInfo.getEncodedName();
     byte[] encodedNameBytes = regionInfo.getEncodedNameAsBytes();
     String regionName = regionInfo.getRegionNameAsString();
-    Region onlineRegion = rs.getRegion(encodedName);
+    HRegion onlineRegion = rs.getRegion(encodedName);
     if (onlineRegion != null) {
       LOG.warn("Received OPEN for {} which is already online", regionName);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.OPENED, 
onlineRegion.getOpenSeqNum(),
+          openProcId, masterSystemTime, onlineRegion.getRegionInfo()))) {
+        throw new IOException(
+          "Failed to report opened region to master: " + 
onlineRegion.getRegionInfo()
+            .getRegionNameAsString());
+      }
+      rs.finishRegionProcedure(openProcId);
       // Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
       // For normal case, it could happen that the rpc call to schedule this 
handler is succeeded,
       // but before returning to master the connection is broken. And when 
master tries again, we {code}
This would make the region assign operation idempotent.


was (Author: vjasani):
I believe this should fix the issue:
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
index a9ab6f502a..6beb0fcab7 100644
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
@@ -99,9 +99,16 @@ public class AssignRegionHandler extends EventHandler {
     String encodedName = regionInfo.getEncodedName();
     byte[] encodedNameBytes = regionInfo.getEncodedNameAsBytes();
     String regionName = regionInfo.getRegionNameAsString();
-    Region onlineRegion = rs.getRegion(encodedName);
+    HRegion onlineRegion = rs.getRegion(encodedName);
     if (onlineRegion != null) {
       LOG.warn("Received OPEN for {} which is already online", regionName);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.OPENED, 
onlineRegion.getOpenSeqNum(),
+          openProcId, masterSystemTime, onlineRegion.getRegionInfo()))) {
+        throw new IOException(
+          "Failed to report opened region to master: " + 
onlineRegion.getRegionInfo()
+            .getRegionNameAsString());
+      }
       // Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
       // For normal case, it could happen that the rpc call to schedule this 
handler is succeeded,
       // but before returning to master the connection is broken. And when 
master tries again, we {code}
This would make the region assign operation idempotent.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> 

[jira] [Commented] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834600#comment-17834600
 ] 

Viraj Jasani commented on HBASE-28405:
--

I believe this should fix the issue:
{code:java}
diff --git 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
index a9ab6f502a..6beb0fcab7 100644
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
@@ -99,9 +99,16 @@ public class AssignRegionHandler extends EventHandler {
     String encodedName = regionInfo.getEncodedName();
     byte[] encodedNameBytes = regionInfo.getEncodedNameAsBytes();
     String regionName = regionInfo.getRegionNameAsString();
-    Region onlineRegion = rs.getRegion(encodedName);
+    HRegion onlineRegion = rs.getRegion(encodedName);
     if (onlineRegion != null) {
       LOG.warn("Received OPEN for {} which is already online", regionName);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.OPENED, 
onlineRegion.getOpenSeqNum(),
+          openProcId, masterSystemTime, onlineRegion.getRegionInfo()))) {
+        throw new IOException(
+          "Failed to report opened region to master: " + 
onlineRegion.getRegionInfo()
+            .getRegionNameAsString());
+      }
       // Just follow the old behavior, do we need to call 
reportRegionStateTransition? Maybe not?
       // For normal case, it could happen that the rpc call to schedule this 
handler is succeeded,
       // but before returning to master the connection is broken. And when 
master tries again, we {code}
This would make the region assign operation idempotent.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now when we roll back the failed merge operation, we see an issue where the 
> region is in the opened state until the RS holding it is stopped.*
> The rollback creates a TRSP as below
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 

[jira] [Commented] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834599#comment-17834599
 ] 

Viraj Jasani commented on HBASE-28405:
--

{quote}Also, there is a related and interesting finding.
Here:
{noformat}
2024-02-11 10:53:59,074 WARN [REGION-regionserver/rs-210:60020-10] 
handler.AssignRegionHandler -
Received OPEN for table1,r1,1685436252488.a92008b76ccae47d55c590930b837036. 
which is already online
{noformat}
One option here is the RS can tell the master the assign succeeded, because the 
OPEN request is idempotent when the region is already open on the RS.
{quote}
Yes, assign should have been idempotent, but it's not, so we need to fix this. 
Came across 3 similar incidents this week, 2 of which were similar to this Jira, 
i.e. the merge transition was rolled back because one of the parent regions was 
in transition. However, the rollback did not complete successfully because 
assign is somehow not treated as idempotent.

 
{quote}It would be ideal if the master does not make redundant requests, but if 
it does make one, the RS should handle the request and return success to the 
master because the request to open an already open region on the same server is 
idempotent with the earlier request that caused the region to be opened there 
in the first place.
{quote}
+1, if assign were idempotent, we would not have run into the merge rollback 
getting stuck indefinitely.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.5.7
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now when we roll back the failed merge operation, we see an issue where the 
> region is in the opened state until the RS holding it is stopped.*
> The rollback creates a TRSP as below
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region 

[jira] [Resolved] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-04-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28366.
--
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-2
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules an SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions that were previously 
> hosted on the old regionserver (which is still alive) to other regionservers, 
> the old rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it is a temporary state, since 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
>     at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
>     at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
>  {code}
>  
> Several region transition failure report logs:
> {code:java}
> 2024-01-29 16:55:13,029 INFO  [_REGION-regionserver/server1-114:61020-0] 
> regionserver.HRegionServer - Failed report transition server { host_name: 
> "server1-114.xyz" port: 61020 

[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829327#comment-17829327
 ] 

Viraj Jasani commented on HBASE-28447:
--

Yes, that is correct. This is regarding introducing a global site configuration.
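
For illustration only, here is a minimal sketch of the precedence resolution 
described in the issue below; the config key name (hbase.hfile.block.size) and 
the helper class are assumptions, not the actual patch:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;

// Sketch only: resolves the effective HFile block size using the precedence
// from the issue description. The site key name and this helper are hypothetical.
public final class BlockSizeResolver {

  // Assumed name for the new site-level override.
  public static final String HFILE_BLOCK_SIZE_KEY = "hbase.hfile.block.size";

  private BlockSizeResolver() {
  }

  public static int resolve(Configuration siteConf, ColumnFamilyDescriptor family) {
    // 1. ColumnFamilyDescriptor.BLOCKSIZE wins when it differs from the default
    //    (simplification: treats "equal to default" as "not explicitly set").
    int cfBlockSize = family.getBlocksize();
    if (cfBlockSize != HConstants.DEFAULT_BLOCKSIZE) {
      return cfBlockSize;
    }
    // 2. Schema-level configuration override set on the table/column family.
    String schemaOverride = family.getConfigurationValue(HFILE_BLOCK_SIZE_KEY);
    if (schemaOverride != null) {
      return Integer.parseInt(schemaOverride);
    }
    // 3. Site configuration, falling back to 4. HConstants.DEFAULT_BLOCKSIZE.
    return siteConf.getInt(HFILE_BLOCK_SIZE_KEY, HConstants.DEFAULT_BLOCKSIZE);
  }
}
{code}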

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE. We need a new config which can control this 
> value.
> Since BLOCKSIZE is tracked at the column family level, we will need to respect 
> the CFD value first. Configuration settings can also be set in the schema, at 
> the column or table level, and will override the relevant values from the site 
> file. Below is the precedence order we can use to get the final blocksize 
> value:
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28428) ConnectionRegistry APIs should have timeout

2024-03-11 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825489#comment-17825489
 ] 

Viraj Jasani commented on HBASE-28428:
--

Yes, we have plans to migrate to MasterRegistry (RpcConnectionRegistry).

However, as of today, we do have zookeeper timeouts, though maybe not very 
aggressive ones.

A timeout for the CompletableFuture based connection registry APIs would be very 
useful in case the client thread somehow gets stuck due to network or OS level 
issues. The idea here is to provide a timeout to Future#get for all connection 
registry APIs.
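
As a rough sketch (not the actual patch), the call sites could go through a 
small helper that bounds the wait, for example:
{code:java}
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: every ConnectionRegistry future is read through here so
// the client blocks for at most timeoutMs instead of waiting indefinitely.
final class RegistryFutureUtil {

  private RegistryFutureUtil() {
  }

  static <T> T getWithTimeout(CompletableFuture<T> future, long timeoutMs) throws IOException {
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // give up on the registry call rather than hanging the caller
      throw new IOException("ConnectionRegistry call timed out after " + timeoutMs + " ms", e);
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting on ConnectionRegistry call", e);
    }
  }
}
{code}
For example, the result of _getClusterId()_ or _getActiveMaster()_ would be read 
via this helper instead of a bare _Future#get_, so a stuck zookeeper read 
surfaces as an IOException after the configured timeout.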

> ConnectionRegistry APIs should have timeout
> ---
>
> Key: HBASE-28428
> URL: https://issues.apache.org/jira/browse/HBASE-28428
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.8
>Reporter: Viraj Jasani
>Assignee: Lokesh Khurana
>Priority: Major
>
> Came across a couple of instances where active master failover happens around 
> the same time as Zookeeper leader failover, leading to a stuck HBase client if 
> one of the threads is blocked on one of the ConnectionRegistry rpc calls. 
> ConnectionRegistry APIs are wrapped with CompletableFuture. However, their 
> usages do not have any timeouts, which can potentially leave the entire client 
> stuck indefinitely since we take some global locks. For instance, 
> _getKeepAliveMasterService()_ takes {_}masterLock{_}, hence if getting the 
> active master from _masterAddressZNode_ gets stuck, we can block any admin 
> operation that needs {_}getKeepAliveMasterService(){_}.
>  
> Sample stacktrace that blocked all client operations that required table 
> descriptor from Admin:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.access$?
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStubNoRetries
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub
> org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService
> org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster
> org.apache.hadoop.hbase.client.MasterCallable.prepare
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable
> org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor
> org.apache.hadoop.hbase.client.HTable.getDescriptororg.apache.phoenix.query.ConnectionQueryServicesImpl.getTableDescriptor
> org.apache.phoenix.query.DelegateConnectionQueryServices.getTableDescriptor
> org.apache.phoenix.util.IndexUtil.isGlobalIndexCheckerEnabled
> org.apache.phoenix.execute.MutationState.filterIndexCheckerMutations
> org.apache.phoenix.execute.MutationState.sendBatch
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.commit
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.call.CallRunner.run
> org.apache.phoenix.jdbc.PhoenixConnection.commit {code}
> Another similar incident is captured on PHOENIX-7233. In this case, 
> retrieving clusterId from ZNode got stuck and that blocked the client from 
> creating any more HBase Connections. Stacktrace for reference:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
> org.apache.hadoop.hbase.client.ConnectionImplementation.
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
> java.lang.reflect.Constructor.newInstance
> org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
> org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
> java.security.AccessController.doPrivileged
> javax.security.auth.Subject.doAs
> org.apache.hadoop.security.UserGroupInformation.doAs
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> 

[jira] [Resolved] (HBASE-28424) Set correct Result to RegionActionResult for successful Put/Delete mutations

2024-03-10 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28424.
--
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-2
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Set correct Result to RegionActionResult for successful Put/Delete mutations
> 
>
> Key: HBASE-28424
> URL: https://issues.apache.org/jira/browse/HBASE-28424
> Project: HBase
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> While returning the response of multi(), RSRpcServices builds the 
> RegionActionResult with a Result or Exception (ClientProtos.ResultOrException). 
> It sets the Exception on this class in all cases where the operation fails 
> with a corresponding exception type, e.g. NoSuchColumnFamilyException or 
> FailedSanityCheckException.
> In case of the atomic mutations Increment and Append, we add the Result object 
> to ClientProtos.ResultOrException, which is used by the client to retrieve the 
> result from the batch API: {_}Table#batch(List actions, Object[] 
> results){_}.
> Phoenix performs the atomic mutation for Put using the _preBatchMutate()_ 
> endpoint. Hence, returning the Result object with ResultOrException is 
> important for returning the result back to the client as part of the atomic 
> operation. Even if Phoenix returns the OperationStatus (with Result) to 
> MiniBatchOperationInProgress, since HBase uses the empty Result for the 
> Success case, the client would not be able to get the expected result.
> {code:java}
> case SUCCESS:
>   builder.addResultOrException(
> getResultOrException(ClientProtos.Result.getDefaultInstance(), index));
>   break; {code}
> If the OperationStatus returned by _Region#batchMutate_ has a valid Result 
> object, it should be used by RSRpcServices while returning the response.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28428) ConnectionRegistry APIs should have timeout

2024-03-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28428:


Assignee: Lokesh Khurana

> ConnectionRegistry APIs should have timeout
> ---
>
> Key: HBASE-28428
> URL: https://issues.apache.org/jira/browse/HBASE-28428
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.8
>Reporter: Viraj Jasani
>Assignee: Lokesh Khurana
>Priority: Major
>
> Came across a couple of instances where active master failover happens around 
> the same time as Zookeeper leader failover, leading to a stuck HBase client if 
> one of the threads is blocked on one of the ConnectionRegistry rpc calls. 
> ConnectionRegistry APIs are wrapped with CompletableFuture. However, their 
> usages do not have any timeouts, which can potentially leave the entire client 
> stuck indefinitely since we take some global locks. For instance, 
> _getKeepAliveMasterService()_ takes {_}masterLock{_}, hence if getting the 
> active master from _masterAddressZNode_ gets stuck, we can block any admin 
> operation that needs {_}getKeepAliveMasterService(){_}.
>  
> Sample stacktrace that blocked all client operations that required table 
> descriptor from Admin:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.access$?
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStubNoRetries
> org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub
> org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService
> org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster
> org.apache.hadoop.hbase.client.MasterCallable.prepare
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable
> org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor
> org.apache.hadoop.hbase.client.HTable.getDescriptororg.apache.phoenix.query.ConnectionQueryServicesImpl.getTableDescriptor
> org.apache.phoenix.query.DelegateConnectionQueryServices.getTableDescriptor
> org.apache.phoenix.util.IndexUtil.isGlobalIndexCheckerEnabled
> org.apache.phoenix.execute.MutationState.filterIndexCheckerMutations
> org.apache.phoenix.execute.MutationState.sendBatch
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.send
> org.apache.phoenix.execute.MutationState.commit
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.jdbc.PhoenixConnection$?.call
> org.apache.phoenix.call.CallRunner.run
> org.apache.phoenix.jdbc.PhoenixConnection.commit {code}
> Another similar incident is captured on PHOENIX-7233. In this case, 
> retrieving clusterId from ZNode got stuck and that blocked the client from 
> creating any more HBase Connections. Stacktrace for reference:
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
> org.apache.hadoop.hbase.client.ConnectionImplementation.
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
> java.lang.reflect.Constructor.newInstance
> org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
> org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
> java.security.AccessController.doPrivileged
> javax.security.auth.Subject.doAs
> org.apache.hadoop.security.UserGroupInformation.doAs
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnectionorg.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$?
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.util.PhoenixContextExecutor.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init
> 

[jira] [Created] (HBASE-28428) ConnectionRegistry APIs should have timeout

2024-03-07 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28428:


 Summary: ConnectionRegistry APIs should have timeout
 Key: HBASE-28428
 URL: https://issues.apache.org/jira/browse/HBASE-28428
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.5.8, 3.0.0-beta-1, 2.4.17
Reporter: Viraj Jasani


Came across a couple of instances where active master failover happens around 
the same time as Zookeeper leader failover, leading to a stuck HBase client if 
one of the threads is blocked on one of the ConnectionRegistry rpc calls. 
ConnectionRegistry APIs are wrapped with CompletableFuture. However, their 
usages do not have any timeouts, which can potentially leave the entire client 
stuck indefinitely since we take some global locks. For instance, 
_getKeepAliveMasterService()_ takes {_}masterLock{_}, hence if getting the 
active master from _masterAddressZNode_ gets stuck, we can block any admin 
operation that needs {_}getKeepAliveMasterService(){_}.
 
Sample stacktrace that blocked all client operations that required table 
descriptor from Admin:
{code:java}
jdk.internal.misc.Unsafe.park
java.util.concurrent.locks.LockSupport.park
java.util.concurrent.CompletableFuture$Signaller.block
java.util.concurrent.ForkJoinPool.managedBlock
java.util.concurrent.CompletableFuture.waitingGet
java.util.concurrent.CompletableFuture.get
org.apache.hadoop.hbase.client.ConnectionImplementation.get
org.apache.hadoop.hbase.client.ConnectionImplementation.access$?
org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStubNoRetries
org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub
org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService
org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster
org.apache.hadoop.hbase.client.MasterCallable.prepare
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries
org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable
org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor
org.apache.hadoop.hbase.client.HTable.getDescriptororg.apache.phoenix.query.ConnectionQueryServicesImpl.getTableDescriptor
org.apache.phoenix.query.DelegateConnectionQueryServices.getTableDescriptor
org.apache.phoenix.util.IndexUtil.isGlobalIndexCheckerEnabled
org.apache.phoenix.execute.MutationState.filterIndexCheckerMutations
org.apache.phoenix.execute.MutationState.sendBatch
org.apache.phoenix.execute.MutationState.send
org.apache.phoenix.execute.MutationState.send
org.apache.phoenix.execute.MutationState.commit
org.apache.phoenix.jdbc.PhoenixConnection$?.call
org.apache.phoenix.jdbc.PhoenixConnection$?.call
org.apache.phoenix.call.CallRunner.run
org.apache.phoenix.jdbc.PhoenixConnection.commit {code}
Another similar incident is captured on PHOENIX-7233. In this case, retrieving 
clusterId from ZNode got stuck and that blocked the client from creating any 
more HBase Connections. Stacktrace for reference:
{code:java}
jdk.internal.misc.Unsafe.park
java.util.concurrent.locks.LockSupport.park
java.util.concurrent.CompletableFuture$Signaller.block
java.util.concurrent.ForkJoinPool.managedBlock
java.util.concurrent.CompletableFuture.waitingGet
java.util.concurrent.CompletableFuture.get
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
org.apache.hadoop.hbase.client.ConnectionImplementation.
jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
java.lang.reflect.Constructor.newInstance
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
java.security.AccessController.doPrivileged
javax.security.auth.Subject.doAs
org.apache.hadoop.security.UserGroupInformation.doAs
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
org.apache.hadoop.hbase.client.ConnectionFactory.createConnectionorg.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection
org.apache.phoenix.query.ConnectionQueryServicesImpl.access$?
org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
org.apache.phoenix.util.PhoenixContextExecutor.call
org.apache.phoenix.query.ConnectionQueryServicesImpl.init
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices
org.apache.phoenix.jdbc.HighAvailabilityGroup.connectToOneCluster
org.apache.phoenix.jdbc.ParallelPhoenixConnection.getConnection
org.apache.phoenix.jdbc.ParallelPhoenixConnection.lambda$new$?
org.apache.phoenix.jdbc.ParallelPhoenixConnection$$Lambda$?.get
org.apache.phoenix.jdbc.ParallelPhoenixContext.lambda$chainOnConnClusterContext$?

[jira] [Assigned] (HBASE-28424) Set correct Result to RegionActionResult for successful Put/Delete mutations

2024-03-06 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28424:


Assignee: Jing Yu

> Set correct Result to RegionActionResult for successful Put/Delete mutations
> 
>
> Key: HBASE-28424
> URL: https://issues.apache.org/jira/browse/HBASE-28424
> Project: HBase
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
>
> While returning the response of multi(), RSRpcServices builds the 
> RegionActionResult with a Result or Exception (ClientProtos.ResultOrException). 
> It sets the Exception on this class in all cases where the operation fails 
> with a corresponding exception type, e.g. NoSuchColumnFamilyException or 
> FailedSanityCheckException.
> In case of the atomic mutations Increment and Append, we add the Result object 
> to ClientProtos.ResultOrException, which is used by the client to retrieve the 
> result from the batch API: {_}Table#batch(List actions, Object[] 
> results){_}.
> Phoenix performs the atomic mutation for Put using the _preBatchMutate()_ 
> endpoint. Hence, returning the Result object with ResultOrException is 
> important for returning the result back to the client as part of the atomic 
> operation. Even if Phoenix returns the OperationStatus (with Result) to 
> MiniBatchOperationInProgress, since HBase uses the empty Result for the 
> Success case, the client would not be able to get the expected result.
> {code:java}
> case SUCCESS:
>   builder.addResultOrException(
> getResultOrException(ClientProtos.Result.getDefaultInstance(), index));
>   break; {code}
> If the OperationStatus returned by _Region#batchMutate_ has a valid Result 
> object, it should be used by RSRpcServices while returning the response.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28424) Set correct Result to RegionActionResult for successful Put/Delete mutations

2024-03-06 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28424:


 Summary: Set correct Result to RegionActionResult for successful 
Put/Delete mutations
 Key: HBASE-28424
 URL: https://issues.apache.org/jira/browse/HBASE-28424
 Project: HBase
  Issue Type: Improvement
Reporter: Viraj Jasani


While returning the response of multi(), RSRpcServices builds the 
RegionActionResult with a Result or Exception (ClientProtos.ResultOrException). 
It sets the Exception on this class in all cases where the operation fails with 
a corresponding exception type, e.g. NoSuchColumnFamilyException or 
FailedSanityCheckException.

In case of the atomic mutations Increment and Append, we add the Result object 
to ClientProtos.ResultOrException, which is used by the client to retrieve the 
result from the batch API: {_}Table#batch(List actions, Object[] results){_}.

Phoenix performs the atomic mutation for Put using the _preBatchMutate()_ 
endpoint. Hence, returning the Result object with ResultOrException is important 
for returning the result back to the client as part of the atomic operation. 
Even if Phoenix returns the OperationStatus (with Result) to 
MiniBatchOperationInProgress, since HBase uses the empty Result for the Success 
case, the client would not be able to get the expected result.
{code:java}
case SUCCESS:
  builder.addResultOrException(
getResultOrException(ClientProtos.Result.getDefaultInstance(), index));
  break; {code}
If the OperationStatus returned by _Region#batchMutate_ has a valid Result 
object, it should be used by RSRpcServices while returning the response.
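
A minimal sketch of what the SUCCESS branch could look like (the local names 
codes, index and builder are placeholders for the surrounding batch-processing 
loop in RSRpcServices, and the getResult() accessor on OperationStatus is 
assumed here rather than verified):
{code:java}
// Sketch only, not the committed patch: prefer the Result carried by the
// OperationStatus (accessor assumed to be getResult()) over the empty default.
case SUCCESS:
  Result opResult = codes[index].getResult();
  ClientProtos.Result pbResult = (opResult == null || opResult.isEmpty())
    ? ClientProtos.Result.getDefaultInstance()   // old behavior: empty Result
    : ProtobufUtil.toResult(opResult);           // propagate the batchMutate Result
  builder.addResultOrException(getResultOrException(pbResult, index));
  break;
{code}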



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28422) SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target RegionServer indefinitely

2024-03-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823811#comment-17823811
 ] 

Viraj Jasani commented on HBASE-28422:
--

This might also be a good opportunity to refactor _isSaslError()_ into a global 
static utility that is available for anyone to use.
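
Something along these lines (the class name and location are placeholders for 
the sketch; it just walks the cause chain looking for a SaslException):
{code:java}
import javax.security.sasl.SaslException;

// Hypothetical shared utility: reports whether a SaslException appears anywhere
// in the cause chain of a remote call failure.
public final class SaslErrorUtil {

  private SaslErrorUtil() {
  }

  public static boolean isSaslError(Throwable error) {
    for (Throwable cause = error; cause != null; cause = cause.getCause()) {
      if (cause instanceof SaslException) {
        return true;
      }
    }
    return false;
  }
}
{code}
SplitWALRemoteProcedure (and any other remote procedure) could then reuse the 
same check to decide whether to pick a new target server instead of retrying 
the same one forever.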

> SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target 
> RegionServer indefinitely
> ---
>
> Key: HBASE-28422
> URL: https://issues.apache.org/jira/browse/HBASE-28422
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2, wal
>Affects Versions: 2.5.5
>Reporter: David Manning
>Priority: Minor
>
> Similar to HBASE-28050. If HMaster selects a RegionServer for 
> SplitWalRemoteProcedure, it will retry this server as long as the server is 
> alive. I believe this is because even though 
> {{RSProcedureDispatcher.ExecuteProceduresRemoteCall.run}} calls 
> {{{}remoteCallFailed{}}}, there is no logic after this to select a new target 
> server. For {{TransitRegionStateProcedure}} there is logic to select a new 
> server for opening a region, using {{{}forceNewPlan{}}}. But 
> SplitWalRemoteProcedure only has logic to try another server if we receive a 
> {{DoNotRetryIOException}} in SplitWALRemoteProcedure#complete: 
> [https://github.com/apache/hbase/blob/780ff56b3f23e7041ef1b705b7d3d0a53fdd05ae/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALRemoteProcedure.java#L104-L110]
> If we receive any other IOException, we will just retry the target server 
> forever. Just like in HBASE-28050, if there is a SaslException, this will 
> never lead to retrying a SplitWalRemoteProcedure on a new server, which can 
> lead to ServerCrashProcedure never finishing until the target server for 
> SplitWalRemoteProcedure is restarted. The following log is seen repeatedly, 
> always sending to the same host.
> {code:java}
> 2024-01-31 15:59:43,616 WARN  [RSProcedureDispatcher-pool-72846] 
> procedure.SplitWALRemoteProcedure - Failed split of 
> hdfs:///hbase/WALs/,1704984571464-splitting/1704984571464.1706710908543,
>  retry...
> java.io.IOException: Call to address= failed on local exception: 
> java.io.IOException: Can not send request because relogin is in progress.
>   at sun.reflect.GeneratedConstructorAccessor363.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:239)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:420)
>   at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:114)
>   at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:129)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:365)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: java.io.IOException: Can not send request because relogin is in 
> progress.
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.sendRequest0(NettyRpcConnection.java:321)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:363)
>   ... 8 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28422) SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target RegionServer indefinitely

2024-03-05 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28422:


Assignee: Viraj Jasani

> SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target 
> RegionServer indefinitely
> ---
>
> Key: HBASE-28422
> URL: https://issues.apache.org/jira/browse/HBASE-28422
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2, wal
>Affects Versions: 2.5.5
>Reporter: David Manning
>Assignee: Viraj Jasani
>Priority: Minor
>
> Similar to HBASE-28050. If HMaster selects a RegionServer for 
> SplitWalRemoteProcedure, it will retry this server as long as the server is 
> alive. I believe this is because even though 
> {{RSProcedureDispatcher.ExecuteProceduresRemoteCall.run}} calls 
> {{{}remoteCallFailed{}}}, there is no logic after this to select a new target 
> server. For {{TransitRegionStateProcedure}} there is logic to select a new 
> server for opening a region, using {{{}forceNewPlan{}}}. But 
> SplitWalRemoteProcedure only has logic to try another server if we receive a 
> {{DoNotRetryIOException}} in SplitWALRemoteProcedure#complete: 
> [https://github.com/apache/hbase/blob/780ff56b3f23e7041ef1b705b7d3d0a53fdd05ae/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALRemoteProcedure.java#L104-L110]
> If we receive any other IOException, we will just retry the target server 
> forever. Just like in HBASE-28050, if there is a SaslException, this will 
> never lead to retrying a SplitWalRemoteProcedure on a new server, which can 
> lead to ServerCrashProcedure never finishing until the target server for 
> SplitWalRemoteProcedure is restarted. The following log is seen repeatedly, 
> always sending to the same host.
> {code:java}
> 2024-01-31 15:59:43,616 WARN  [RSProcedureDispatcher-pool-72846] 
> procedure.SplitWALRemoteProcedure - Failed split of 
> hdfs:///hbase/WALs/,1704984571464-splitting/1704984571464.1706710908543,
>  retry...
> java.io.IOException: Call to address= failed on local exception: 
> java.io.IOException: Can not send request because relogin is in progress.
>   at sun.reflect.GeneratedConstructorAccessor363.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:239)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:420)
>   at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:114)
>   at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:129)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:365)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: java.io.IOException: Can not send request because relogin is in 
> progress.
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.sendRequest0(NettyRpcConnection.java:321)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:363)
>   ... 8 more
> {code}
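
A minimal sketch of the missing behavior described above, assuming a hypothetical SplitWorkerRetryPolicy helper (none of these names exist in HBase): it treats SASL-style failures (like the SaslException case from HBASE-28050) and repeated attempts against the same worker as reasons to hand the split task to a different RegionServer.

{code:java}
import java.io.IOException;
import javax.security.sasl.SaslException;

// Hypothetical helper, for illustration only: decide whether the master should
// resubmit the WAL-splitting task against a different RegionServer instead of
// retrying the same one forever.
public final class SplitWorkerRetryPolicy {

  private final int maxAttemptsPerWorker;

  public SplitWorkerRetryPolicy(int maxAttemptsPerWorker) {
    this.maxAttemptsPerWorker = maxAttemptsPerWorker;
  }

  /** True if the split task should be resubmitted against a different worker. */
  public boolean shouldPickNewWorker(IOException error, int attemptsOnThisWorker) {
    // SASL failures rarely resolve by hammering the same host, so switch
    // workers immediately in that case.
    if (hasCause(error, SaslException.class)) {
      return true;
    }
    // Otherwise switch once a bounded number of attempts has been exhausted.
    return attemptsOnThisWorker >= maxAttemptsPerWorker;
  }

  private static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (type.isInstance(cur)) {
        return true;
      }
    }
    return false;
  }
}
{code}

In the real procedure this decision would still have to feed back into worker selection, which is exactly the gap the description points out.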



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-28422) SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target RegionServer indefinitely

2024-03-05 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28422:


Assignee: (was: Viraj Jasani)

> SplitWalProcedure will attempt SplitWalRemoteProcedure on the same target 
> RegionServer indefinitely
> ---
>
> Key: HBASE-28422
> URL: https://issues.apache.org/jira/browse/HBASE-28422
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2, wal
>Affects Versions: 2.5.5
>Reporter: David Manning
>Priority: Minor
>
> Similar to HBASE-28050. If HMaster selects a RegionServer for 
> SplitWalRemoteProcedure, it will retry this server as long as the server is 
> alive. I believe this is because even though 
> {{RSProcedureDispatcher.ExecuteProceduresRemoteCall.run}} calls 
> {{{}remoteCallFailed{}}}, there is no logic after this to select a new target 
> server. For {{TransitRegionStateProcedure}} there is logic to select a new 
> server for opening a region, using {{{}forceNewPlan{}}}. But 
> SplitWalRemoteProcedure only has logic to try another server if we receive a 
> {{DoNotRetryIOException}} in SplitWALRemoteProcedure#complete: 
> [https://github.com/apache/hbase/blob/780ff56b3f23e7041ef1b705b7d3d0a53fdd05ae/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALRemoteProcedure.java#L104-L110]
> If we receive any other IOException, we will just retry the target server 
> forever. Just like in HBASE-28050, if there is a SaslException, this will 
> never lead to retrying a SplitWalRemoteProcedure on a new server, which can 
> lead to ServerCrashProcedure never finishing until the target server for 
> SplitWalRemoteProcedure is restarted. The following log is seen repeatedly, 
> always sending to the same host.
> {code:java}
> 2024-01-31 15:59:43,616 WARN  [RSProcedureDispatcher-pool-72846] 
> procedure.SplitWALRemoteProcedure - Failed split of 
> hdfs:///hbase/WALs/,1704984571464-splitting/1704984571464.1706710908543,
>  retry...
> java.io.IOException: Call to address= failed on local exception: 
> java.io.IOException: Can not send request because relogin is in progress.
>   at sun.reflect.GeneratedConstructorAccessor363.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:239)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:425)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:420)
>   at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:114)
>   at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:129)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:365)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
>   at 
> org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: java.io.IOException: Can not send request because relogin is in 
> progress.
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.sendRequest0(NettyRpcConnection.java:321)
>   at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.lambda$sendRequest$4(NettyRpcConnection.java:363)
>   ... 8 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28048) RSProcedureDispatcher to abort executing request after configurable retries

2024-03-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823735#comment-17823735
 ] 

Viraj Jasani commented on HBASE-28048:
--

Indeed, that's a good idea. This is somewhat similar to HBASE-28366: if the 
AssignmentManager accepts an old regionserver report instead of rejecting it, we 
get into trouble. If we implement Nick's idea, we might first want to take care 
of this, i.e. we will need consistency between ServerManager and AssignmentManager.

> RSProcedureDispatcher to abort executing request after configurable retries
> ---
>
> Key: HBASE-28048
> URL: https://issues.apache.org/jira/browse/HBASE-28048
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> In a recent incident, we observed that RSProcedureDispatcher continues 
> executing region open/close procedures with unbounded retries even in the 
> presence of known failures like GSS initiate failure:
>  
> {code:java}
> 2023-08-25 02:21:02,821 WARN [ispatcher-pool-40777] 
> procedure.RSProcedureDispatcher - request to rs1,61020,1692930044498 failed 
> due to java.io.IOException: Call to address=rs1:61020 failed on local 
> exception: java.io.IOException: 
> org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: 
> org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS 
> initiate failed, try=0, retrying... {code}
>  
>  
> If the remote execution results in IOException, the dispatcher attempts to 
> schedule the procedure for further retries:
>  
> {code:java}
>     private boolean scheduleForRetry(IOException e) {
>       LOG.debug("Request to {} failed, try={}", serverName, 
> numberOfAttemptsSoFar, e);
>       // Should we wait a little before retrying? If the server is starting 
> it's yes.
>       ...
>       ...
>       ...
>       numberOfAttemptsSoFar++;
>       // Add some backoff here as the attempts rise otherwise if a stuck 
> condition, will fill logs
>       // with failed attempts. None of our backoff classes -- RetryCounter or 
> ClientBackoffPolicy
>       // -- fit here nicely so just do something simple; increment by 
> rsRpcRetryInterval millis *
>       // retry^2 on each try
>       // up to max of 10 seconds (don't want to back off too much in case of 
> situation change).
>       submitTask(this,
>         Math.min(rsRpcRetryInterval * (this.numberOfAttemptsSoFar * 
> this.numberOfAttemptsSoFar),
>           10 * 1000),
>         TimeUnit.MILLISECONDS);
>       return true;
>     }
>  {code}
>  
>  
> Even though we try to provide backoff while retrying, max wait time is 10s:
>  
> {code:java}
> submitTask(this,
>   Math.min(rsRpcRetryInterval * (this.numberOfAttemptsSoFar * 
> this.numberOfAttemptsSoFar),
> 10 * 1000),
>   TimeUnit.MILLISECONDS); {code}
>  
>  
> This results in an endless loop of retries, until either the underlying issue 
> is fixed (e.g. the krb issue in this case) or the regionserver is killed and 
> the ongoing open/close region procedure (and perhaps the entire SCP) for the 
> affected regionserver is sidelined manually.
> {code:java}
> 2023-08-25 03:04:18,918 WARN  [ispatcher-pool-41274] 
> procedure.RSProcedureDispatcher - request to rs1,61020,1692930044498 failed 
> due to java.io.IOException: Call to address=rs1:61020 failed on local 
> exception: java.io.IOException: 
> org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: 
> org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS 
> initiate failed, try=217, retrying...
> 2023-08-25 03:04:18,916 WARN  [ispatcher-pool-41280] 
> procedure.RSProcedureDispatcher - request to rs1,61020,1692930044498 failed 
> due to java.io.IOException: Call to address=rs1:61020 failed on local 
> exception: java.io.IOException: 
> org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: 
> org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS 
> initiate failed, try=193, retrying...
> 2023-08-25 03:04:28,968 WARN  [ispatcher-pool-41315] 
> procedure.RSProcedureDispatcher - request to rs1,61020,1692930044498 failed 
> due to java.io.IOException: Call to address=rs1:61020 failed on local 
> exception: java.io.IOException: 
> org.apache.hbase.thirdparty.io.netty.handler.codec.DecoderException: 
> org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS 
> initiate failed, try=266, retrying...
> 2023-08-25 03:04:28,969 WARN  [ispatcher-pool-41240] 
> procedure.RSProcedureDispatcher - request to rs1,61020,1692930044498 failed 
> due to java.io.IOException: Call to address=rs1:61020 failed on local 
> exception: java.io.IOException: 
> 
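
For illustration, a minimal sketch of a bounded variant of the scheduleForRetry() backoff quoted above, under the assumption of a configurable attempt cap; the class and parameter names are made up and this is not the actual RSProcedureDispatcher code.

{code:java}
import java.util.OptionalLong;
import java.util.concurrent.TimeUnit;

// Hypothetical policy: keep the quoted quadratic backoff but stop after a
// configurable number of attempts so a permanently failing server does not
// keep the procedure retrying forever.
public final class BoundedRetryPolicy {

  private final long rsRpcRetryIntervalMillis; // base interval, e.g. 100 ms
  private final int maxAttempts;               // configurable cap, e.g. 30

  public BoundedRetryPolicy(long rsRpcRetryIntervalMillis, int maxAttempts) {
    this.rsRpcRetryIntervalMillis = rsRpcRetryIntervalMillis;
    this.maxAttempts = maxAttempts;
  }

  /** Delay before the next attempt, or empty once retries are exhausted. */
  public OptionalLong nextDelayMillis(int numberOfAttemptsSoFar) {
    if (numberOfAttemptsSoFar >= maxAttempts) {
      // Caller would abort the remote call / fail the procedure instead of
      // resubmitting it.
      return OptionalLong.empty();
    }
    int next = numberOfAttemptsSoFar + 1;
    // Same shape as the quoted code: interval * attempts^2, capped at 10 seconds.
    long delay = Math.min(rsRpcRetryIntervalMillis * next * next,
      TimeUnit.SECONDS.toMillis(10));
    return OptionalLong.of(delay);
  }
}
{code}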

[jira] [Commented] (HBASE-27312) Update create-release to work with maven-gpg-plugin-3.0.1 and gnupg >= 2.1.x

2024-02-26 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820920#comment-17820920
 ] 

Viraj Jasani commented on HBASE-27312:
--

[~ndimiduk] Thank you for this Jira! Do you recall whether the error looked 
something like this:
{code:java}
gpg: setting pinentry mode 'error' failed: Forbidden
gpg: keydb_search failed: Forbidden
gpg: skipped "0x1012D134": Forbidden
gpg: signing failed: Forbidden {code}
I am facing a similar issue with the Phoenix 5.2.0 release during the 
publish-release phase.

 

cc [~stoty] 

> Update create-release to work with maven-gpg-plugin-3.0.1 and gnupg >= 2.1.x
> 
>
> Key: HBASE-27312
> URL: https://issues.apache.org/jira/browse/HBASE-27312
> Project: HBase
>  Issue Type: Task
>  Components: build
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
>
> I've run into some trouble with `create-release`. The issue seems to come 
> down to changes in maven-gpg-plugin via MGPG-79 and working with the 
> `agent-extra-socket`. After MGPG-79, maven-gpg-plugin uses 
> `--pinentry-mode error` in a non-interactive setting, which is not 
> permitted given the restricted permissions exposed via `agent-extra-socket`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28376) Column family ns does not exist in region during upgrade to 3.0.0-beta-2

2024-02-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818625#comment-17818625
 ] 

Viraj Jasani commented on HBASE-28376:
--

This is a somewhat similar case to HBASE-25902.

> Column family ns does not exist in region during upgrade to 3.0.0-beta-2
> 
>
> Key: HBASE-28376
> URL: https://issues.apache.org/jira/browse/HBASE-28376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta-1
>Reporter: Bryan Beaudreault
>Priority: Blocker
>
> Upgrading from 2.5.x to 3.0.0-alpha-2, migrateNamespaceTable kicks in to copy 
> data from the namespace table to an "ns" family of the meta table. If you 
> don't have an "ns" family, the migration fails and the hmaster will crash 
> loop. You then can't rollback, because the briefly alive upgraded hmaster 
> created a procedure that can't be deserialized by 2.x (I don't have this log 
> handy unfortunately). I tried pushing code to create the ns family on 
> startup, but it doesnt work becuase the migration happens while the hmaster 
> is still initializing.
> So it seems imperative that you create the ns family before upgrading. We 
> should handle this more gracefully.
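
A hedged sketch of the kind of pre-upgrade check described above: verify that hbase:meta already has the "ns" family and add it otherwise. Whether altering hbase:meta is permitted, and under which configuration, depends on the 2.5.x deployment, so treat this as an illustration of the idea rather than a verified upgrade step.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

// Illustration only: ensure the "ns" family exists on hbase:meta before the
// upgrade, so migrateNamespaceTable has somewhere to copy the namespace rows.
public final class EnsureNsFamilyBeforeUpgrade {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      byte[] ns = Bytes.toBytes("ns");
      if (!admin.getDescriptor(TableName.META_TABLE_NAME).hasColumnFamily(ns)) {
        admin.addColumnFamily(TableName.META_TABLE_NAME,
          ColumnFamilyDescriptorBuilder.of("ns"));
        System.out.println("Added 'ns' family to hbase:meta");
      } else {
        System.out.println("'ns' family already present on hbase:meta");
      }
    }
  }
}
{code}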



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-25749) Improved logging when interrupting active RPC handlers holding the region close lock (HBASE-25212 hbase.regionserver.close.wait.abort)

2024-02-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818311#comment-17818311
 ] 

Viraj Jasani commented on HBASE-25749:
--

[~umesh9414] I have granted you contributor access; you can assign any Jira 
to yourself going forward.

> Improved logging when interrupting active RPC handlers holding the region 
> close lock (HBASE-25212 hbase.regionserver.close.wait.abort)
> --
>
> Key: HBASE-25749
> URL: https://issues.apache.org/jira/browse/HBASE-25749
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, rpc
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: David Manning
>Assignee: Umesh Kumar Kumawat
>Priority: Minor
> Fix For: 3.0.0-beta-2
>
>
> HBASE-25212 adds an optional improvement to Close Region, for interrupting 
> active RPC handlers holding the region close lock. If, after the timeout is 
> reached, the close lock can still not be acquired, the regionserver may 
> abort. It would be helpful to add logging for which threads or components are 
> holding the region close lock at this time.
> Depending on the size of regionLockHolders, or use of any stack traces, log 
> output may need to be truncated. The interrupt code is in 
> HRegion#interruptRegionOperations.
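
A minimal sketch of the requested logging, using a stand-in map and operation type; the real regionLockHolders structure and the internals of HRegion#interruptRegionOperations may differ, so this only illustrates the shape of the change.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: before (or while) interrupting the handlers that hold the
// region close lock, log who they are and what they were doing, truncating the
// output if the map is large.
final class RegionCloseLockLogging {

  enum Operation { GET, SCAN, MUTATE, OTHER }

  // Thread currently inside a region operation -> what it is doing.
  private final Map<Thread, Operation> regionLockHolders = new ConcurrentHashMap<>();

  /** Log the current lock holders, then interrupt them. */
  void logAndInterruptRegionOperations(int maxEntriesToLog) {
    int seen = 0;
    for (Map.Entry<Thread, Operation> e : regionLockHolders.entrySet()) {
      if (seen++ < maxEntriesToLog) {
        System.out.printf("Interrupting %s (operation=%s, state=%s)%n",
          e.getKey().getName(), e.getValue(), e.getKey().getState());
      }
      e.getKey().interrupt();
    }
    if (seen > maxEntriesToLog) {
      System.out.printf("... and %d more holders not logged%n", seen - maxEntriesToLog);
    }
  }
}
{code}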



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-25749) Improved logging when interrupting active RPC handlers holding the region close lock (HBASE-25212 hbase.regionserver.close.wait.abort)

2024-02-18 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-25749:


Assignee: Umesh Kumar Kumawat

> Improved logging when interrupting active RPC handlers holding the region 
> close lock (HBASE-25212 hbase.regionserver.close.wait.abort)
> --
>
> Key: HBASE-25749
> URL: https://issues.apache.org/jira/browse/HBASE-25749
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, rpc
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: David Manning
>Assignee: Umesh Kumar Kumawat
>Priority: Minor
> Fix For: 3.0.0-beta-2
>
>
> HBASE-25212 adds an optional improvement to Close Region, for interrupting 
> active RPC handlers holding the region close lock. If, after the timeout is 
> reached, the close lock can still not be acquired, the regionserver may 
> abort. It would be helpful to add logging for which threads or components are 
> holding the region close lock at this time.
> Depending on the size of regionLockHolders, or use of any stack traces, log 
> output may need to be truncated. The interrupt code is in 
> HRegion#interruptRegionOperations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818310#comment-17818310
 ] 

Viraj Jasani commented on HBASE-28366:
--

{quote}And for regionServerReport, we used to make use of the information to 
fix inconsistency but it introduced bunch of new inconsistency so IIRC finally 
we choose to only log the inconsistency, instead of fixing it in 
regionServerReport, or at least we will not fix it immediately
{quote}
I see, that makes sense. As part of this Jira, we could still reject the rs 
report from the old server, i.e. make that check more stringent by not only 
relying on the dead server map but also comparing against the online server map: 
if we find a new online server with the same host + port but a higher startcode, 
we reject the rs report from the old server immediately. As of today, we accept 
it after logging a warning, and that results in inconsistencies.
{quote}I think, there could be race that master think a region server is dead, 
but the region server is still alive.
{quote}
That is correct for this case. As per the logs, at 16:50:33, the master 
scheduled the SCP due to removal of the ephemeral rs znode.

From the master side, the SCP was completed at 16:53:01:
{code:java}
2024-01-29 16:53:01,640 INFO  [PEWorker-39] procedure2.ProcedureExecutor - 
Finished pid=9812440, state=SUCCESS; ServerCrashProcedure 
hbase2a-dnds1-114-ia6.ops.sfdc.net,61020,1706541866103, splitWal=true, 
meta=false in 2 mins, 27.679 sec{code}
From the regionserver side, at 16:54:27, we saw the first occurrence of:

 
{code:java}
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/master{code}
 
{quote}But after SCP, the region server can not accept any write requests any 
more, although it could still serve read requests.
{quote}
That's because of the WAL splitting done by the master, correct?

 

 

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818310#comment-17818310
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/18/24 8:21 PM:
---

{quote}And for regionServerReport, we used to make use of the information to 
fix inconsistency but it introduced bunch of new inconsistency so IIRC finally 
we choose to only log the inconsistency, instead of fixing it in 
regionServerReport, or at least we will not fix it immediately
{quote}
I see, that makes sense. As part of this Jira, we could still reject the rs 
report from the old server, i.e. make that check more stringent by not only 
relying on the dead server map but also comparing against the online server map: 
if we find a new online server with the same host + port but a higher startcode, 
we reject the rs report from the old server immediately. As of today, we accept 
it after logging a warning, and that results in inconsistencies.
{quote}I think, there could be race that master think a region server is dead, 
but the region server is still alive.
{quote}
That is correct for this case. As per the logs, at 16:50:33, the master 
scheduled the SCP due to removal of the ephemeral rs znode.

From the master side, the SCP was completed at 16:53:01:
{code:java}
2024-01-29 16:53:01,640 INFO  [PEWorker-39] procedure2.ProcedureExecutor - 
Finished pid=9812440, state=SUCCESS; ServerCrashProcedure 
hbase2a-dnds1-114-ia6.ops.sfdc.net,61020,1706541866103, splitWal=true, 
meta=false in 2 mins, 27.679 sec{code}
From the regionserver side, at 16:54:27, we saw the first occurrence of:
{code:java}
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/master{code}
{quote}But after SCP, the region server can not accept any write requests any 
more, although it could still serve read requests.
{quote}
That's because of the WAL splitting done by the master, correct?


was (Author: vjasani):
{quote}And for regionServerReport, we used to make use of the information to 
fix inconsistency but it introduced bunch of new inconsistency so IIRC finally 
we choose to only log the inconsistency, instead of fixing it in 
regionServerReport, or at least we will not fix it immediately
{quote}
I see, that makes sense. As part of this Jira, we could still reject serving rs 
report from old server, i.e. make that check more stringent by not only relying 
on dead server map but also compare with online server map and if we find that 
we have new online server with same host + port but with higher startcode, we 
reject rs report from old server immediately. As of today, we accept it after 
logging a warning, and that results into inconsistencies.
{quote}I think, there could be race that master think a region server is dead, 
but the region server is still alive.
{quote}
That is correct for this case. As per the logs, at 16:50:33, master scheduled 
SCP due to removal of ephemeral rs znode.

From master side, the SCP was completed at 16:53:01:
{code:java}
2024-01-29 16:53:01,640 INFO  [PEWorker-39] procedure2.ProcedureExecutor - 
Finished pid=9812440, state=SUCCESS; ServerCrashProcedure 
hbase2a-dnds1-114-ia6.ops.sfdc.net,61020,1706541866103, splitWal=true, 
meta=false in 2 mins, 27.679 sec{code}
From regionserver side, at 16:54:27, we saw the first occurrence of:

 
{code:java}
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/master{code}
 
{quote}But after SCP, the region server can not accept any write requests any 
more, although it could still serve read requests.
{quote}
That's because of the WAL splitting done by the master, correct?

 

 

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818096#comment-17818096
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/17/24 6:12 PM:
---

That's a good question; my guess is that the report server API was handled with 
a delay on the master side (unless the logs themselves were written with a 
delay?). However, what is even more interesting is that even though the master 
had scheduled the SCP for the old server ~10 min earlier, while serving the 
report rpc from the old server ~10 min later it did not find the record in the 
DeadServer map, did not throw YouAreDeadException, and moved on with updating 
AssignmentManager's in-memory records.

No logs were seen with "{} {} came back up, removed it from the dead servers 
list" or "Server what rejected; currently processing serverName as dead server".

In the meantime, the master had not crashed, so the DeadServer map should not 
have been refreshed. Scheduling an SCP is definitely supposed to add the server 
to the dead server map.

Our check should be more stringent, i.e. even if the server with the old host + 
port tries to report back, given that we have a new server entry with the same 
host + port, we should throw YouAreDeadException and not move forward with 
updating the AssignmentManager record.


was (Author: vjasani):
That's a good question, my guess is that the report server API was handled with 
delay at master side (unless the logs themselves are written with delay?). 
However, what is even more interesting is that even though master did schedule 
SCP for old server ~10 min back, while serving report rpc from the old server 
after ~10 min, it did not find the record in DeadServer map and did not throw 
YouAreDeadException, and moved on with updating AssignmentManager's in-memory 
records.

In the meantime, master had not crashed so DeadServer map should not have been 
refreshed. Scheduling SCP is definitely supposed to enter the server into the 
dead server map.

Our check should be more stringent i.e. even if the server with old host + port 
tries to report back, given that we have new server with the same host + port 
entry, we should throw YouAreDeadException and not move forward with updating 
AssignmentManager record.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818096#comment-17818096
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/16/24 10:08 PM:


That's a good question; my guess is that the report server API was handled with 
a delay on the master side (unless the logs themselves were written with a 
delay?). However, what is even more interesting is that even though the master 
had scheduled the SCP for the old server ~10 min earlier, while serving the 
report rpc from the old server ~10 min later it did not find the record in the 
DeadServer map, did not throw YouAreDeadException, and moved on with updating 
AssignmentManager's in-memory records.

In the meantime, the master had not crashed, so the DeadServer map should not 
have been refreshed. Scheduling an SCP is definitely supposed to add the server 
to the dead server map.

Our check should be more stringent, i.e. even if the server with the old host + 
port tries to report back, given that we have a new server entry with the same 
host + port, we should throw YouAreDeadException and not move forward with 
updating the AssignmentManager record.


was (Author: vjasani):
That's a good question, my guess is that the report server API was handled with 
delay at master side. However, what is even more interesting is that even 
though master did schedule SCP for old server ~10 min back, while serving 
report rpc from the old server after ~10 min, it did not find the record in 
DeadServer map and did not throw YouAreDeadException, and moved on with 
updating AssignmentManager's in-memory records.

In the meantime, master had not crashed so DeadServer map should not have been 
refreshed. Scheduling SCP is definitely supposed to enter the server into the 
dead server map.

Our check should be more stringent i.e. even if the server with old host + port 
tries to report back, given that we have new server with the same host + port 
entry, we should throw YouAreDeadException and not move forward with updating 
AssignmentManager record.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818096#comment-17818096
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/16/24 8:45 PM:
---

That's a good question; my guess is that the report server API was handled with 
a delay on the master side. However, what is even more interesting is that even 
though the master had scheduled the SCP for the old server ~10 min earlier, 
while serving the report rpc from the old server ~10 min later it did not find 
the record in the DeadServer map, did not throw YouAreDeadException, and moved 
on with updating AssignmentManager's in-memory records.

In the meantime, the master had not crashed, so the DeadServer map should not 
have been refreshed. Scheduling an SCP is definitely supposed to add the server 
to the dead server map.

Our check should be more stringent, i.e. even if the server with the old host + 
port tries to report back, given that we have a new server entry with the same 
host + port, we should throw YouAreDeadException and not move forward with 
updating the AssignmentManager record.


was (Author: vjasani):
That's a good question, my guess is that the report server API was handled with 
delay at master side. However, what is even more interesting is that even 
though master did schedule SCP for old server ~10 min back, while serving 
report rpc from the old server after ~10 min, it did not find the record in 
DeadServer map and did not throw YouAreDeadException, and moved on with 
updating AssignmentManager's in-memory records.

In the meantime, master had not crashed so DeadServer map should not have been 
refreshed. Scheduling SCP is definitely supposed to enter the server into the 
dead server map.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = 

[jira] [Commented] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818098#comment-17818098
 ] 

Viraj Jasani commented on HBASE-28366:
--

Btw, we have seen several similar inconsistencies in the past as well, ever 
since moving to HBase 2.4 (and now with 2.5 too); however, getting to the bottom 
of them was not possible because the logs had expired.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
>     at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
>     at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
>  {code}
>  
> Several region transition failure report logs:
> {code:java}
> 2024-01-29 16:55:13,029 INFO  [_REGION-regionserver/server1-114:61020-0] 
> regionserver.HRegionServer - Failed report transition server { host_name: 
> "server1-114.xyz" port: 61020 start_code: 1706541866103 } transition { 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818096#comment-17818096
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/16/24 8:43 PM:
---

That's a good question; my guess is that the report server API was handled with 
a delay on the master side. However, what is even more interesting is that even 
though the master had scheduled the SCP for the old server ~10 min earlier, 
while serving the report rpc from the old server ~10 min later it did not find 
the record in the DeadServer map, did not throw YouAreDeadException, and moved 
on with updating AssignmentManager's in-memory records.

In the meantime, the master had not crashed, so the DeadServer map should not 
have been refreshed. Scheduling an SCP is definitely supposed to add the server 
to the dead server map.


was (Author: vjasani):
That's a good question, my guess is that the report server API was handled with 
delay at master side. However, what is even more interesting is that even 
though master did schedule SCP for old server ~10 min back, while serving 
report rpc from the old server after ~10 min, it did not find the record in 
DeadServer map and did not throw YouAreDeadException, and moved on with 
updating AssignmentManager's in-memory records.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
>     at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
>     at 
> 

[jira] [Commented] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818096#comment-17818096
 ] 

Viraj Jasani commented on HBASE-28366:
--

That's a good question; my guess is that the report server API was handled with 
a delay on the master side. However, what is even more interesting is that even 
though the master had scheduled the SCP for the old server ~10 min earlier, 
while serving the report rpc from the old server ~10 min later it did not find 
the record in the DeadServer map, did not throw YouAreDeadException, and moved 
on with updating AssignmentManager's in-memory records.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its rs ephemeral node gets deleted in 
> zookeeper due to a network issue, the active master schedules the SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the active 
> master. In the case where the SCP assigns regions to other regionservers that 
> were previously hosted on the old regionserver (which is still alive), the old 
> rs can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because the region is alive 
> on two regionservers at the same time (though it's a temporary state because 
> the rs will be aborted soon). While the old regionserver can have zookeeper 
> connectivity issues, it can still make rpc calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
>     at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
>     at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
>  {code}
>  
> Several region transition failure report logs:
> 

[jira] [Comment Edited] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817535#comment-17817535
 ] 

Viraj Jasani edited comment on HBASE-28366 at 2/14/24 9:45 PM:
---

The fact that we see the logs mentioned in the above comment clearly shows that 
even for the old server, we were not able to throw YouAreDeadException. If we 
had, we would not have seen these logs.
{code:java}
public void regionServerReport(ServerName sn, ServerMetrics sl) throws 
YouAreDeadException {
  checkIsDead(sn, "REPORT");
  if (null == this.onlineServers.replace(sn, sl)) {
// Already have this host+port combo and its just different start code?
// Just let the server in. Presume master joining a running cluster.
// recordNewServer is what happens at the end of reportServerStartup.
// The only thing we are skipping is passing back to the regionserver
// the ServerName to use. Here we presume a master has already done
// that so we'll press on with whatever it gave us for ServerName.
if (!checkAndRecordNewServer(sn, sl)) {
  LOG.info("RegionServerReport ignored, could not record the server: " + 
sn);
  return; // Not recorded, so no need to move on
}
  }
  updateLastFlushedSequenceIds(sn, sl);
} {code}
Here, only checkIsDead() throws YouAreDeadException.

 

So we have a problem: for the same host+port combination with a different start 
code, we log that the report is ignored and no longer update the last flushed 
sequence id for each region, but we still proceed with recording the reported 
regions in AssignmentManager:
{code:java}
ServerMetrics newLoad =
  ServerMetricsBuilder.toServerMetrics(serverName, versionNumber, version, sl);
server.getServerManager().regionServerReport(serverName, newLoad);
server.getAssignmentManager().reportOnlineRegions(serverName,
  newLoad.getRegionMetrics().keySet());{code}
We can easily create inconsistencies if we just return gracefully from 
regionServerReport() even though the server was not successfully recorded. We 
should throw YouAreDeadException here as well, because the start code is old and 
we already have a record of the new online server with the newer start code.
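
A minimal sketch of that idea, reusing the existing checkAndRecordNewServer() 
flow (the exception message is illustrative, not a proposed final wording):
{code:java}
public void regionServerReport(ServerName sn, ServerMetrics sl) throws YouAreDeadException {
  checkIsDead(sn, "REPORT");
  if (null == this.onlineServers.replace(sn, sl)) {
    if (!checkAndRecordNewServer(sn, sl)) {
      // The same host+port is already registered with a newer start code, so
      // reject the stale report instead of silently returning.
      LOG.info("RegionServerReport rejected, could not record the server: " + sn);
      throw new YouAreDeadException(
        "Stale report from " + sn + "; a newer instance of this server is already online");
    }
  }
  updateLastFlushedSequenceIds(sn, sl);
} {code}
This would keep the behavior consistent with checkIsDead(): a report that we 
refuse to record is treated the same way as a report from a dead server, so the 
master never proceeds to reportOnlineRegions() for the stale start code.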


was (Author: vjasani):
The fact that we see the logs mentioned in the above comment clearly shows that 
even for the old server, we were not able to throw YouAreDeadException. If we 
had, we would not have seen these logs.

 
{code:java}
public void regionServerReport(ServerName sn, ServerMetrics sl) throws 
YouAreDeadException {
  checkIsDead(sn, "REPORT");
  if (null == this.onlineServers.replace(sn, sl)) {
// Already have this host+port combo and its just different start code?
// Just let the server in. Presume master joining a running cluster.
// recordNewServer is what happens at the end of reportServerStartup.
// The only thing we are skipping is passing back to the regionserver
// the ServerName to use. Here we presume a master has already done
// that so we'll press on with whatever it gave us for ServerName.
if (!checkAndRecordNewServer(sn, sl)) {
  LOG.info("RegionServerReport ignored, could not record the server: " + 
sn);
  return; // Not recorded, so no need to move on
}
  }
  updateLastFlushedSequenceIds(sn, sl);
} {code}
Here, only checkIsDead() throws YouAreDeadException.

 

So we have a problem: for the same host+port combination with a different start 
code, we log that the report is ignored but we still proceed with recording the 
report:

 
{code:java}
ServerMetrics newLoad =
  ServerMetricsBuilder.toServerMetrics(serverName, versionNumber, version, sl);
server.getServerManager().regionServerReport(serverName, newLoad);
server.getAssignmentManager().reportOnlineRegions(serverName,
  newLoad.getRegionMetrics().keySet());{code}
 

We can easily create inconsistencies if we just return gracefully from 
regionServerReport() even though the server was not successfully recorded. We 
should throw YouAreDeadException here as well, because the start code is old and 
we already have a record of the new online server with the newer start code.

 

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its RS ephemeral node gets deleted in 
> ZooKeeper due to a network issue, the active master schedules an SCP. However, 
> if the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions to other regionservers 
> that were previously hosted on the old regionserver (which is 

[jira] [Commented] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817535#comment-17817535
 ] 

Viraj Jasani commented on HBASE-28366:
--

The fact that we see the logs mentioned in the above comment clearly shows that 
even for the old server, we were not able to throw YouAreDeadException. If we 
had, we would not have seen these logs.

 
{code:java}
public void regionServerReport(ServerName sn, ServerMetrics sl) throws 
YouAreDeadException {
  checkIsDead(sn, "REPORT");
  if (null == this.onlineServers.replace(sn, sl)) {
// Already have this host+port combo and its just different start code?
// Just let the server in. Presume master joining a running cluster.
// recordNewServer is what happens at the end of reportServerStartup.
// The only thing we are skipping is passing back to the regionserver
// the ServerName to use. Here we presume a master has already done
// that so we'll press on with whatever it gave us for ServerName.
if (!checkAndRecordNewServer(sn, sl)) {
  LOG.info("RegionServerReport ignored, could not record the server: " + 
sn);
  return; // Not recorded, so no need to move on
}
  }
  updateLastFlushedSequenceIds(sn, sl);
} {code}
Here, only checkIsDead() throws YouAreDeadException.

 

So we have a problem: for the same host+port combination with a different start 
code, we log that the report is ignored but we still proceed with recording the 
report:

 
{code:java}
ServerMetrics newLoad =
  ServerMetricsBuilder.toServerMetrics(serverName, versionNumber, version, sl);
server.getServerManager().regionServerReport(serverName, newLoad);
server.getAssignmentManager().reportOnlineRegions(serverName,
  newLoad.getRegionMetrics().keySet());{code}
 

We can easily create inconsistencies if we just return gracefully from 
regionServerReport() even though the server was not successfully recorded. We 
should throw YouAreDeadException here as well, because the start code is old and 
we already have a record of the new online server with the newer start code.

 

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its RS ephemeral node gets deleted in 
> ZooKeeper due to a network issue, the active master schedules an SCP. However, 
> since the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions that were previously 
> hosted on the old regionserver (which is still alive) to other regionservers, 
> the old RS can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because a region is alive on 
> two regionservers at the same time (though it is a temporary state, since the 
> RS will be aborted soon). While the old regionserver may have ZooKeeper 
> connectivity issues, it can still make RPC calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 

[jira] [Commented] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817531#comment-17817531
 ] 

Viraj Jasani commented on HBASE-28366:
--

{quote}I’ve seen many times that if the znode goes away (delete or expire), the 
master will throw YouAreDeadException the next time the RS checks in
{quote}
Yes, you are correct; however, it seems we have the possibility of an edge case 
here. Without TRACE logging, regionServerReport RPC calls are hard to track, but 
it appears that beyond the dead-server check (which throws YouAreDeadException), 
the region report went ahead. The SCP then removed the server entry from 
AssignmentManager's rsReports map, but by that time the regionServerReport() 
validation had already completed, so the master went ahead with the 
reportOnlineRegions() call, which eventually re-added the rsReports map entry.

One more surprising observation is that the old server's (start code: 
1706541866103) regionServerReport() was served by the master again when the new 
server on the same host (start code: 1706547696987) was already online and 
registered with the master:

 
{code:java}
2024-01-29 17:02:17,315 INFO 
[iority.RWQ.Fifo.write.handler=2,queue=0,port=61000] master.ServerManager - 
Server serverName=server1-114.xyz,61020,1706541866103 rejected; we already have 
server1-114.xyz,61020,1706547696987 registered with same hostname and port

2024-01-29 17:02:17,315 INFO 
[iority.RWQ.Fifo.write.handler=2,queue=0,port=61000] master.ServerManager - 
RegionServerReport ignored, could not record the server: 
server1-114.xyz,61020,1706541866103 {code}
 

The new server was brought online at 17:01:35. Perhaps the master handler 
served the old server's regionServerReport() RPC call with some delay.

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its RS ephemeral node gets deleted in 
> ZooKeeper due to a network issue, the active master schedules an SCP. However, 
> since the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions that were previously 
> hosted on the old regionserver (which is still alive) to other regionservers, 
> the old RS can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because a region is alive on 
> two regionservers at the same time (though it is a temporary state, since the 
> RS will be aborted soon). While the old regionserver may have ZooKeeper 
> connectivity issues, it can still make RPC calls to the active master.
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
> rs abort, after ~5 min:
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: 

[jira] [Updated] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-13 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28366:
-
Description: 
If the regionserver is online but its RS ephemeral node gets deleted in 
ZooKeeper due to a network issue, the active master schedules an SCP. However, 
since the regionserver is alive, it can still send regionServerReport to the 
active master. In the case where the SCP assigns regions that were previously 
hosted on the old regionserver (which is still alive) to other regionservers, 
the old RS can continue to send regionServerReport to the active master.

Eventually this results in region inconsistencies because a region is alive on 
two regionservers at the same time (though it is a temporary state, since the 
RS will be aborted soon). While the old regionserver may have ZooKeeper 
connectivity issues, it can still make RPC calls to the active master.

Logs:

SCP:
{code:java}
2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
 locks = 1, Read locks = 0], oldState=ONLINE.

2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
procedure2.ProcedureExecutor - Stored pid=9812440, 
state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
server1-114.xyz,61020,1706541866103, splitWal=true, meta=false

2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
splitWal=true, meta=false, isMeta: false
 {code}
As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another server:

 
{code:java}
2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler - 
Took xlock for pid=9818494, ppid=9812440, 
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
TransitRegionStateProcedure 
table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN

2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
regionLocation=server1-65.xyz,61020,1706165574050
 {code}
 

rs abort, after ~5 min:
{code:java}
2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
regionserver.HRegionServer - * ABORTING region server 
server1-114.xyz,61020,1706541866103: Unexpected exception handling getData *
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
    at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
    at 
org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
    at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
 {code}
 

Several region transition failure report logs:
{code:java}
2024-01-29 16:55:13,029 INFO  [_REGION-regionserver/server1-114:61020-0] 
regionserver.HRegionServer - Failed report transition server { host_name: 
"server1-114.xyz" port: 61020 start_code: 1706541866103 } transition { 
transition_code: CLOSED region_info { region_id: 1671555604277 table_name { 
namespace: "default" qualifier: "TABLE1" } start_key: "abc" end_key: "xyz" 
offline: false split: false replica_id: 0 } proc_id: -1 }; retry (#0) 
immediately.
java.net.UnknownHostException: Call to address=master-server1.xyz:61000 failed 
on local exception: java.net.UnknownHostException: master-server1.xyz:61000 
could not be resolved
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at 

[jira] [Created] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-13 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28366:


 Summary: Mis-order of SCP and regionServerReport results into 
region inconsistencies
 Key: HBASE-28366
 URL: https://issues.apache.org/jira/browse/HBASE-28366
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.7, 3.0.0-beta-1, 2.4.17
Reporter: Viraj Jasani


If the regionserver is online but its RS ephemeral node gets deleted in 
ZooKeeper due to a network issue, the active master schedules an SCP. However, 
since the regionserver is alive, it can still send regionServerReport to the 
active master. In the case where the SCP assigns regions that were previously 
hosted on the old regionserver (which is still alive) to other regionservers, 
the old RS can continue to send regionServerReport to the active master.

Eventually this results in region inconsistencies because a region is alive on 
two regionservers at the same time. While the old regionserver may have 
ZooKeeper connectivity issues, it can still make RPC calls to the active master.

 

Logs:

SCP:
{code:java}
2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
 locks = 1, Read locks = 0], oldState=ONLINE.

2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
procedure2.ProcedureExecutor - Stored pid=9812440, 
state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
server1-114.xyz,61020,1706541866103, splitWal=true, meta=false

2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
splitWal=true, meta=false, isMeta: false
 {code}
As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another server:

 
{code:java}
2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler - 
Took xlock for pid=9818494, ppid=9812440, 
state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
TransitRegionStateProcedure 
table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN

2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
regionLocation=server1-65.xyz,61020,1706165574050
 {code}
 

 

rs abort, after ~5 min:

 
{code:java}
2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
regionserver.HRegionServer - * ABORTING region server 
server1-114.xyz,61020,1706541866103: Unexpected exception handling getData *
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
    at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
    at 
org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
    at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
    at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
 {code}
 

 

Several region transition failure report logs:

 
{code:java}
2024-01-29 16:55:13,029 INFO  [_REGION-regionserver/server1-114:61020-0] 
regionserver.HRegionServer - Failed report transition server { host_name: 
"server1-114.xyz" port: 61020 start_code: 1706541866103 } transition { 
transition_code: CLOSED region_info { region_id: 1671555604277 table_name { 
namespace: "default" qualifier: "TABLE1" } start_key: "abc" end_key: "xyz" 
offline: false split: false replica_id: 0 } proc_id: -1 }; retry (#0) 
immediately.
java.net.UnknownHostException: Call to address=master-server1.xyz:61000 failed 
on local exception: java.net.UnknownHostException: master-server1.xyz:61000 
could not be resolved
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at 

[jira] [Assigned] (HBASE-28366) Mis-order of SCP and regionServerReport results into region inconsistencies

2024-02-13 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-28366:


Assignee: Viraj Jasani

> Mis-order of SCP and regionServerReport results into region inconsistencies
> ---
>
> Key: HBASE-28366
> URL: https://issues.apache.org/jira/browse/HBASE-28366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.4.17, 3.0.0-beta-1, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If the regionserver is online but its RS ephemeral node gets deleted in 
> ZooKeeper due to a network issue, the active master schedules an SCP. However, 
> since the regionserver is alive, it can still send regionServerReport to the 
> active master. In the case where the SCP assigns regions that were previously 
> hosted on the old regionserver (which is still alive) to other regionservers, 
> the old RS can continue to send regionServerReport to the active master.
> Eventually this results in region inconsistencies because a region is alive 
> on two regionservers at the same time. While the old regionserver may have 
> ZooKeeper connectivity issues, it can still make RPC calls to the active master.
>  
> Logs:
> SCP:
> {code:java}
> 2024-01-29 16:50:33,956 INFO [RegionServerTracker-0] 
> assignment.AssignmentManager - Scheduled ServerCrashProcedure pid=9812440 for 
> server1-114.xyz,61020,1706541866103 (carryingMeta=false) 
> server1-114.xyz,61020,1706541866103/CRASHED/regionCount=364/lock=java.util.concurrent.locks.ReentrantReadWriteLock@5d5fc31[Write
>  locks = 1, Read locks = 0], oldState=ONLINE.
> 2024-01-29 16:50:33,956 DEBUG [RegionServerTracker-0] 
> procedure2.ProcedureExecutor - Stored pid=9812440, 
> state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
> server1-114.xyz,61020,1706541866103, splitWal=true, meta=false
> 2024-01-29 16:50:33,973 INFO [PEWorker-36] procedure.ServerCrashProcedure - 
> Splitting WALs pid=9812440, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, 
> locked=true; ServerCrashProcedure server1-114.xyz,61020,1706541866103, 
> splitWal=true, meta=false, isMeta: false
>  {code}
> As part of SCP, d743ace9f70d55f55ba1ecc6dc49a5cb was assigned to another 
> server:
>  
> {code:java}
> 2024-01-29 16:50:42,656 INFO [PEWorker-24] procedure.MasterProcedureScheduler 
> - Took xlock for pid=9818494, ppid=9812440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure 
> table=PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA, 
> region=d743ace9f70d55f55ba1ecc6dc49a5cb, ASSIGN
> 2024-01-29 16:50:43,106 INFO [PEWorker-23] assignment.RegionStateStore - 
> pid=9818494 updating hbase:meta row=d743ace9f70d55f55ba1ecc6dc49a5cb, 
> regionState=OPEN, repBarrier=12867482, openSeqNum=12867482, 
> regionLocation=server1-65.xyz,61020,1706165574050
>  {code}
>  
>  
> rs abort, after ~5 min:
>  
> {code:java}
> 2024-01-29 16:54:27,235 ERROR [regionserver/server1-114:61020] 
> regionserver.HRegionServer - * ABORTING region server 
> server1-114.xyz,61020,1706541866103: Unexpected exception handling getData 
> *
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1229)
>     at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:414)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:403)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:367)
>     at 
> org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.getData(ZKNodeTracker.java:180)
>     at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:152)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionServerStatusStub(HRegionServer.java:2892)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1352)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1142)
>  {code}
>  
>  
> Several region transition failure report logs:
>  
> {code:java}
> 2024-01-29 16:55:13,029 INFO  [_REGION-regionserver/server1-114:61020-0] 
> regionserver.HRegionServer - Failed report transition server { host_name: 
> "server1-114.xyz" port: 61020 start_code: 1706541866103 } transition { 
> transition_code: CLOSED region_info { region_id: 1671555604277 table_name { 
> namespace: "default" qualifier: "TABLE1" } start_key: "abc" end_key: "xyz" 
> offline: false split: false replica_id: 0 } proc_id: -1 }; retry (#0) 
> immediately.

[jira] [Resolved] (HBASE-26352) Provide HBase upgrade guidelines from 1.6 to 2.4+ versions

2024-02-13 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-26352.
--
Resolution: Implemented

> Provide HBase upgrade guidelines from 1.6 to 2.4+ versions
> --
>
> Key: HBASE-26352
> URL: https://issues.apache.org/jira/browse/HBASE-26352
> Project: HBase
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Provide some ref guide under section: 
> [https://hbase.apache.org/book.html#upgrade2.0.rolling.upgrades] 
> This should include experience of performing in-place rolling upgrade 
> (without downtime) from 1.6/1.7 to 2.4+ release versions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-26352) Provide HBase upgrade guidelines from 1.6 to 2.4+ versions

2024-02-13 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-26352:
-
Fix Version/s: (was: 3.0.0-beta-2)

> Provide HBase upgrade guidelines from 1.6 to 2.4+ versions
> --
>
> Key: HBASE-26352
> URL: https://issues.apache.org/jira/browse/HBASE-26352
> Project: HBase
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Provide some ref guide under section: 
> [https://hbase.apache.org/book.html#upgrade2.0.rolling.upgrades] 
> This should include experience of performing in-place rolling upgrade 
> (without downtime) from 1.6/1.7 to 2.4+ release versions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28356) RegionServer Canary can should use Scan just like Region Canary with option to enable Raw Scan

2024-02-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28356.
--
Fix Version/s: 2.6.0
   2.4.18
   2.5.8
   3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

> RegionServer Canary can should use Scan just like Region Canary with option 
> to enable Raw Scan
> --
>
> Key: HBASE-28356
> URL: https://issues.apache.org/jira/browse/HBASE-28356
> Project: HBase
>  Issue Type: Improvement
>  Components: canary
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 2.5.8, 3.0.0-beta-2
>
>
> While working on HBASE-28204 to improve the Region Canary, it came to our 
> notice that the RegionServer Canary uses the same code as the Region Canary to 
> check whether a region is accessible. In addition, the RegionServer Canary 
> doesn't have Raw Scan enabled.
>  
> This JIRA aims to enable the Raw Scan option for the RegionServer Canary and 
> use Scan only, just like the Region Canary.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28357) MoveWithAck#isSuccessfulScan for Region movement should use Region End Key for limiting scan to one region only.

2024-02-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28357.
--
Fix Version/s: 2.6.0
   2.4.18
   2.5.8
   3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

> MoveWithAck#isSuccessfulScan for Region movement should use Region End Key 
> for limiting scan to one region only.
> 
>
> Key: HBASE-28357
> URL: https://issues.apache.org/jira/browse/HBASE-28357
> Project: HBase
>  Issue Type: Improvement
>  Components: Region Assignment
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 2.5.8, 3.0.0-beta-2
>
>
> Based on recent learnings and improvements in HBase Canary in HBASE-28204 and 
> HBASE-28356, I noticed that MoveWithAck.java class also uses similar code to 
> check that Region is online after region move.
>  
> {code:java}
>   private void isSuccessfulScan(RegionInfo region) throws IOException {
>     Scan scan = new 
> Scan().withStartRow(region.getStartKey()).setRaw(true).setOneRowLimit()
>       .setMaxResultSize(1L).setCaching(1).setFilter(new 
> FirstKeyOnlyFilter()).setCacheBlocks(false); {code}
> If the region that was moved is empty, then MoveWithAck#isSuccessfulScan() 
> will end up scanning the next region's key space, which is not the intent. If 
> multiple regions in sequence are empty, this could create too many unnecessary 
> scans. By setting withStopRow(endKeyOfRegion, false) on the scan object, the 
> scan can be bounded to a single region.
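
A minimal sketch of the bounded scan described above, mirroring the existing 
isSuccessfulScan() snippet (the RegionInfo handle and variable names are 
illustrative):
{code:java}
// Bound the verification scan to the moved region only: start at the region's
// start key and stop (exclusive) at its end key, so an empty region does not
// spill the scan into the next region's key space.
// Note: the last region of a table has an empty end key; callers may want to
// skip setting the stop row in that case.
Scan scan = new Scan()
  .withStartRow(region.getStartKey())
  .withStopRow(region.getEndKey(), false)
  .setRaw(true)
  .setOneRowLimit()
  .setMaxResultSize(1L)
  .setCaching(1)
  .setFilter(new FirstKeyOnlyFilter())
  .setCacheBlocks(false); {code}
With setOneRowLimit() already in place, the stop row only matters for empty 
regions, where it prevents the scan from touching the neighbouring region.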



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28204) Region Canary can take lot more time If any region (except the first region) starts with delete markers

2024-02-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28204.
--
Fix Version/s: 2.6.0
   (was: 2.7.0)
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Region Canary can take lot more time If any region (except the first region) 
> starts with delete markers
> ---
>
> Key: HBASE-28204
> URL: https://issues.apache.org/jira/browse/HBASE-28204
> Project: HBase
>  Issue Type: Bug
>  Components: canary
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 2.5.8, 3.0.0-beta-2
>
>
> In CanaryTool.java, Canary reads only the first row of the region using 
> [Get|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L520C33-L520C33]
>  for any region of the table. Canary uses [Scan with FirstRowKeyFilter for 
> table 
> scan|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L530]
>  If the said region has empty start key (This will only happen when region is 
> the first region for a table)
> With -[HBASE-16091|https://issues.apache.org/jira/browse/HBASE-16091]- 
> RawScan was 
> [implemented|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519-L534]
>  to improve performance for regions that can have a high number of delete 
> markers. Based on the current implementation, [RawScan is only 
> enabled|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519]
>  if the region has an empty start key (i.e. the region is the first region of 
> the table).
> RawScan doesn't work for the rest of the regions in the table. Also, if the 
> region has all or the majority of its rows covered by delete markers, the Get 
> operation can take a lot of time. This can cause timeouts for CanaryTool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28221) Introduce regionserver metric for delayed flushes

2024-02-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816340#comment-17816340
 ] 

Viraj Jasani commented on HBASE-28221:
--

{quote}So there could still be some value in adding this type of metric (but 
consider whether alerting on client impact, i.e. {{RegionTooBusyException}} and 
{{blockedRequestsCount}} would be sufficient first.)
{quote}
That sounds quite reasonable. Unless a new Jira is opened to provide a metric 
for client-impact related exceptions, we can repurpose this Jira to provide 
that metric.

> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 2.4.18, 2.7.0, 2.5.8, 3.0.0-beta-2, 2.6.1
>
>
> If compaction is disabled temporarily to allow the HDFS load to stabilize, we 
> can forget to re-enable compaction. This can result in flushes getting delayed 
> for the "hbase.hstore.blockingWaitTime" period (90s by default). While flushes 
> do happen eventually after waiting for the max blocking time, it is important 
> to realize that no cluster can function well with compaction disabled for a 
> significant amount of time.
>  
> We would also block any write requests until the region is flushed (90+ 
> seconds by default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
>     at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) 
> {code}
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28344) Flush journal logs are missing from 2.x

2024-02-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816173#comment-17816173
 ] 

Viraj Jasani commented on HBASE-28344:
--

Applicable to master branch also?

> Flush journal logs are missing from 2.x 
> 
>
> Key: HBASE-28344
> URL: https://issues.apache.org/jira/browse/HBASE-28344
> Project: HBase
>  Issue Type: Improvement
>Reporter: Prathyusha
>Assignee: Prathyusha
>Priority: Minor
>
> After refactoring of TaskMonitor from branch 
> [1|https://git.soma.salesforce.com/bigdata-packaging/hbase/blob/efc1e0bb2bdfe46d07b1ae38692d616c02efe85d/hbase-server/src/main/java/org/apache/hadoop/hbase/monitoring/TaskMonitor.java#L87]
>  to 
> [2|https://git.soma.salesforce.com/bigdata-packaging/hbase/blob/efc1e0bb2bdfe46d07b1ae38692d616c02efe85d/hbase-server/src/main/java/org/apache/hadoop/hbase/monitoring/TaskMonitor.java#L87]
>  Flush journal logs are missing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28339) HBaseReplicationEndpoint creates new ZooKeeper client every time it tries to reconnect

2024-02-02 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813841#comment-17813841
 ] 

Viraj Jasani commented on HBASE-28339:
--

Thank you [~andor]!!

> HBaseReplicationEndpoint creates new ZooKeeper client every time it tries to 
> reconnect
> --
>
> Key: HBASE-28339
> URL: https://issues.apache.org/jira/browse/HBASE-28339
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.7, 2.7.0
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
>
> Abstract base class {{HBaseReplicationEndpoint}}, and therefore 
> {{HBaseInterClusterReplicationEndpoint}}, creates a new ZooKeeper client 
> instance every time an error occurs in communication and it tries to 
> reconnect. This was not a problem with ZooKeeper 3.4.x versions, because the 
> TGT Login thread was a static reference and only created once for all clients 
> in the same JVM. With the upgrade to ZooKeeper 3.5.x the login thread is 
> dedicated to the client instance, hence we have a new login thread every time 
> the replication endpoint reconnects.
> {code:java}
> /**
>  * A private method used to re-establish a zookeeper session with a peer 
> cluster.
>  */
> protected void reconnect(KeeperException ke) {
>   if (
> ke instanceof ConnectionLossException || ke instanceof 
> SessionExpiredException
>   || ke instanceof AuthFailedException
>   ) {
> String clusterKey = ctx.getPeerConfig().getClusterKey();
> LOG.warn("Lost the ZooKeeper connection for peer " + clusterKey, ke);
> try {
>   reloadZkWatcher();
> } catch (IOException io) {
>   LOG.warn("Creation of ZookeeperWatcher failed for peer " + clusterKey, 
> io);
> }
>   }
> }{code}
> {code:java}
> /**
>  * Closes the current ZKW (if not null) and creates a new one
>  * @throws IOException If anything goes wrong connecting
>  */
> synchronized void reloadZkWatcher() throws IOException {
>   if (zkw != null) zkw.close();
>   zkw = new ZKWatcher(ctx.getConfiguration(), "connection to cluster: " + 
> ctx.getPeerId(), this);
>   getZkw().registerListener(new PeerRegionServerListener(this));
> } {code}
> If the target cluster of replication is unavailable for some reason, the 
> replication endpoint keeps trying to reconnect to ZooKeeper destroying and 
> creating new Login threads constantly which will carpet bomb the KDC host 
> with login requests.
>  
> I'm not sure how to fix this yet, trying to create a unit test first.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28151) hbck -o should not allow bypassing pre transit check by default

2024-01-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-28151.
--
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Incompatible change,Reviewed
   Resolution: Fixed

> hbck -o should not allow bypassing pre transit check by default
> ---
>
> Key: HBASE-28151
> URL: https://issues.apache.org/jira/browse/HBASE-28151
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.5
>Reporter: Viraj Jasani
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 3.0.0-beta-2
>
>
> When operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions with "-o", 
> the primary purpose should still be to only unattach existing procedure from 
> RegionStateNode so that newly scheduled assign proc can take exclusive region 
> level lock.
> We should restrict bypassing preTransitCheck by only providing it as site 
> config.
> If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
> allowed to bypass this check, otherwise by default they should go through the 
> check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't explicitly deteriorate it further e.g. if a region was successfully 
> split and now if operator performs "hbck assigns \{region} -o" and if it 
> bypasses the transit check, master would bring the region online and it could 
> compact store files and archive the store file which is referenced by 
> daughter region. This would not allow daughter region to come online.
> Let's introduce an hbase-site config to allow bypassing preTransitCheck; it 
> should not be doable by the operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode, it should not mean forcefully skip any region transition 
> validation checks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28151) hbck -o should not allow bypassing pre transit check by default

2024-01-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28151:
-
Release Note: 
Added a new param "force" to the AssignsRequest and UnassignsRequest protos. 
Kept it enabled by default to maintain compatibility.
This can help hbck introduce a new flag "-f" or "--force" to allow bypassing the 
pre-transit check, and it also helps document the consequences of bypassing the 
pre-transit checks, allowing users to use it only after careful evaluation.

> hbck -o should not allow bypassing pre transit check by default
> ---
>
> Key: HBASE-28151
> URL: https://issues.apache.org/jira/browse/HBASE-28151
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.5
>Reporter: Viraj Jasani
>Assignee: Rahul Kumar
>Priority: Major
>
> When operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions with "-o", 
> the primary purpose should still be to only unattach existing procedure from 
> RegionStateNode so that newly scheduled assign proc can take exclusive region 
> level lock.
> We should restrict bypassing preTransitCheck by only providing it as site 
> config.
> If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
> allowed to bypass this check, otherwise by default they should go through the 
> check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't explicitly deteriorate it further e.g. if a region was successfully 
> split and now if operator performs "hbck assigns \{region} -o" and if it 
> bypasses the transit check, master would bring the region online and it could 
> compact store files and archive the store file which is referenced by 
> daughter region. This would not allow daughter region to come online.
> Let's introduce an hbase-site config to allow bypassing preTransitCheck; it 
> should not be doable by the operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode, it should not mean forcefully skip any region transition 
> validation checks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28328) Add an option to count different types of Delete Markers in RowCounter

2024-01-24 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810692#comment-17810692
 ] 

Viraj Jasani commented on HBASE-28328:
--

In addition to the total number of delete marker cells, shall we also output the 
total number of rows that have delete markers? Since this is a row counter, it 
might be good to output rows that have delete markers (vs. rows that don't have 
any delete markers).

Either way, a total count of DELETE, DELETE_COLUMN, DELETE_FAMILY and 
DELETE_FAMILY_VERSION cells would be great.
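
A rough sketch of the per-type counting described in the issue, as it might look 
inside RowCounter's mapper once raw scanning is enabled (the counter group and 
counter names here are illustrative assumptions, not the final option or metric 
names):
{code:java}
@Override
public void map(ImmutableBytesWritable row, Result values, Context context) {
  boolean rowHasDeleteMarker = false;
  for (Cell cell : values.rawCells()) {
    switch (cell.getType()) {
      case Delete:
      case DeleteColumn:
      case DeleteFamily:
      case DeleteFamilyVersion:
        rowHasDeleteMarker = true;
        // One job counter per delete-marker type, e.g. "DELETE_MARKERS:DeleteFamily".
        context.getCounter("DELETE_MARKERS", cell.getType().name()).increment(1);
        break;
      default:
        break;
    }
  }
  if (rowHasDeleteMarker) {
    // Rows containing at least one delete marker, as suggested above.
    context.getCounter("DELETE_MARKERS", "ROWS_WITH_DELETE_MARKER").increment(1);
  }
} {code}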

> Add an option to count different types of Delete Markers in RowCounter
> --
>
> Key: HBASE-28328
> URL: https://issues.apache.org/jira/browse/HBASE-28328
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Himanshu Gwalani
>Assignee: Himanshu Gwalani
>Priority: Minor
>
> Add an option (count-delete-markers) to the 
> [RowCounter|https://github.com/apache/hbase/blob/8a9ad0736621fa1b00b5ae90529ca6065f88c67f/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java#L240C62-L240C75]
>  tool to count the number of Delete Markers of all types, i.e. (DELETE, 
> DELETE_COLUMN, DELETE_FAMILY,DELETE_FAMILY_VERSION)
> We already have such a feature within our internal implementation of 
> RowCounter and it's very useful.
> Implementation Ideas:
> 1. If the option is passed, initialize the empty job counters for all 4 types 
> of deletes.
> 2. Within mapper, increase the respective delete counts while processing each 
> row.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28271:
-
Fix Version/s: 2.6.0
   (was: 2.4.18)
   (was: 2.6.1)

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.5.8, 3.0.0-beta-2
>
> Attachments: image.png
>
>
> When a region is stuck in transition for a significant time, any attempt to 
> take a snapshot of the table keeps a master handler thread waiting forever. As 
> part of creating a snapshot on an enabled or disabled table, LockProcedure is 
> executed to acquire the table-level lock, but if any region of the table is in 
> transition, LockProcedure cannot complete, so the snapshot handler waits 
> forever until the region transition is completed and the table-level lock can 
> be acquired.
> In cases where a region stays in RIT for a considerable time, enough client 
> attempts to create snapshots of the table can easily exhaust all handler 
> threads, leading to a potentially unresponsive master. Attached is a sample 
> thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot take 
> the table-level lock; it should fail fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28271:
-
Hadoop Flags: Reviewed
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.5.8, 3.0.0-beta-2
>
> Attachments: image.png
>
>
> When a region is stuck in transition for a significant time, any attempt to 
> take a snapshot of the table keeps a master handler thread waiting forever. As 
> part of creating a snapshot on an enabled or disabled table, LockProcedure is 
> executed to acquire the table-level lock, but if any region of the table is in 
> transition, LockProcedure cannot complete, so the snapshot handler waits 
> forever until the region transition is completed and the table-level lock can 
> be acquired.
> In cases where a region stays in RIT for a considerable time, enough client 
> attempts to create snapshots of the table can easily exhaust all handler 
> threads, leading to a potentially unresponsive master. Attached is a sample 
> thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot take 
> the table-level lock; it should fail fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28221) Introduce regionserver metric for delayed flushes

2024-01-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17808820#comment-17808820
 ] 

Viraj Jasani commented on HBASE-28221:
--

{quote}While going through _MemStoreFlusher_ I realised, we do have 
_flushQueueLength_ 
[metrics|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java#L238]
 exposed already which tells us 
[flushQueueSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L504]
 per RS. Isn't this the one we are looking for to figure out delayed flush 
counts ?
{quote}
Good catch, I think I missed this one. Looks like this should be good for us to 
alert on.

> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 2.4.18, 2.7.0, 2.5.8, 3.0.0-beta-2, 2.6.1
>
>
> If compaction is disabled temporarily to allow the HDFS load to stabilize, we 
> can forget to re-enable compaction. This can result in flushes getting delayed 
> for the "hbase.hstore.blockingWaitTime" period (90s by default). While flushes 
> do happen eventually after waiting for the max blocking time, it is important 
> to realize that no cluster can function well with compaction disabled for a 
> significant amount of time.
>  
> We would also block any write requests until the region is flushed (90+ 
> seconds by default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
>     at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) 
> {code}
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17807447#comment-17807447
 ] 

Viraj Jasani commented on HBASE-28271:
--

I am working on some comments from Duo; let me try to finish this today. Even 
if we can't get it into 2.6.0, it's fine IMO. Thanks Bryan.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.4.18, 2.5.8, 3.0.0-beta-2, 2.6.1
>
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-28293) Add metric for GetClusterStatus request count.

2024-01-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803724#comment-17803724
 ] 

Viraj Jasani edited comment on HBASE-28293 at 1/5/24 11:18 PM:
---

+1, maybe for this Jira we can focus on getClusterStatus as it is a heavy one, 
and in follow-up Jiras we can extend this to other RPCs served by the master.


was (Author: vjasani):
+1

> Add metric for GetClusterStatus request count.
> --
>
> Key: HBASE-28293
> URL: https://issues.apache.org/jira/browse/HBASE-28293
> Project: HBase
>  Issue Type: Bug
>Reporter: Rushabh Shah
>Priority: Major
>
> We have been bitten multiple times by GetClusterStatus request overwhelming 
> HMaster's memory usage. It would be good to add a metric for the total 
> GetClusterStatus requests count.
> In almost all of our production incidents involving GetClusterStatus request, 
> HMaster were running out of memory with many clients call this RPC in 
> parallel and the response size is very big.
> In hbase2 we have 
> [ClusterMetrics.Option|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterMetrics.java#L164-L224]
>  which can reduce the size of the response.
> It would be nice to add another metric to indicate if the response size of 
> GetClusterStatus is greater than some threshold (like 5MB)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28293) Add metric for GetClusterStatus request count.

2024-01-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803724#comment-17803724
 ] 

Viraj Jasani commented on HBASE-28293:
--

+1

> Add metric for GetClusterStatus request count.
> --
>
> Key: HBASE-28293
> URL: https://issues.apache.org/jira/browse/HBASE-28293
> Project: HBase
>  Issue Type: Bug
>Reporter: Rushabh Shah
>Priority: Major
>
> We have been bitten multiple times by GetClusterStatus request overwhelming 
> HMaster's memory usage. It would be good to add a metric for the total 
> GetClusterStatus requests count.
> In almost all of our production incidents involving GetClusterStatus request, 
> HMaster were running out of memory with many clients call this RPC in 
> parallel and the response size is very big.
> In hbase2 we have 
> [ClusterMetrics.Option|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterMetrics.java#L164-L224]
>  which can reduce the size of the response.
> It would be nice to add another metric to indicate if the response size of 
> GetClusterStatus is greater than some threshold (like 5MB)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28293) Add metric for GetClusterStatus request count.

2024-01-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803718#comment-17803718
 ] 

Viraj Jasani commented on HBASE-28293:
--

We can have two metrics: response size and request count
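
A minimal sketch of what that could look like on the master side (hypothetical names, 
plain counters instead of the actual metrics-source wiring):
{code:java}
import java.util.concurrent.atomic.LongAdder;

public class ClusterStatusMetrics {
  // Example threshold for "large" responses; 5MB as mentioned in the description.
  private static final long LARGE_RESPONSE_THRESHOLD_BYTES = 5L * 1024 * 1024;

  private final LongAdder requestCount = new LongAdder();
  private final LongAdder largeResponseCount = new LongAdder();

  /** Record one getClusterStatus call along with its serialized response size. */
  public void record(long responseSizeBytes) {
    requestCount.increment();
    if (responseSizeBytes > LARGE_RESPONSE_THRESHOLD_BYTES) {
      largeResponseCount.increment();
    }
  }

  public long getRequestCount() { return requestCount.sum(); }
  public long getLargeResponseCount() { return largeResponseCount.sum(); }
}
{code}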

> Add metric for GetClusterStatus request count.
> --
>
> Key: HBASE-28293
> URL: https://issues.apache.org/jira/browse/HBASE-28293
> Project: HBase
>  Issue Type: Bug
>Reporter: Rushabh Shah
>Priority: Major
>
> We have been bitten multiple times by GetClusterStatus request overwhelming 
> HMaster's memory usage. It would be good to add a metric for the total 
> GetClusterStatus requests count.
> In almost all of our production incidents involving GetClusterStatus request, 
> HMaster were running out of memory with many clients call this RPC in 
> parallel and the response size is very big.
> In hbase2 we have 
> [ClusterMetrics.Option|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterMetrics.java#L164-L224]
>  which can reduce the size of the response.
> It would be nice to add another metric to indicate if the response size of 
> GetClusterStatus is greater than some threshold (like 5MB)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28271:
-
Fix Version/s: 2.6.0
   2.4.18
   2.5.8
   3.0.0-beta-2
   Status: Patch Available  (was: In Progress)

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.5.7, 2.4.17, 3.0.0-alpha-4
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 2.5.8, 3.0.0-beta-2
>
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-28271 started by Viraj Jasani.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2024-01-03 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802345#comment-17802345
 ] 

Viraj Jasani commented on HBASE-28271:
--

Thanks for pointing that out [~dmanning], yes, it is actually worse than what I 
thought earlier.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-26192) Master UI hbck should provide a JSON formatted output option

2023-12-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HBASE-26192:


Assignee: Mihir Monani

> Master UI hbck should provide a JSON formatted output option
> 
>
> Key: HBASE-26192
> URL: https://issues.apache.org/jira/browse/HBASE-26192
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Kyle Purtell
>Assignee: Mihir Monani
>Priority: Minor
> Fix For: 2.6.0, 3.0.0-beta-2
>
> Attachments: Screen Shot 2022-05-31 at 5.18.15 PM.png
>
>
> It used to be possible to get hbck's verdict of cluster status from the 
> command line, especially useful for headless deployments, i.e. without 
> requiring a browser with sufficient connectivity to load a UI, or scrape 
> information out of raw HTML, or write regex to comb over log4j output. The 
> hbck tool's output wasn't particularly convenient to parse but it was 
> straightforward to extract the desired information with a handful of regular 
> expressions. 
> HBCK2 has a different design philosophy than the old hbck, which is to serve 
> as a collection of small and discrete recovery and repair functions, rather 
> than attempt to be a universal repair tool. This makes a lot of sense and 
> isn't the issue at hand. Unfortunately the old hbck's utility for reporting 
> the current cluster health assessment has not been replaced either in whole 
> or in part. Instead:
> {quote}
> HBCK2 is for fixes. For listings of inconsistencies or blockages in the 
> running cluster, you go elsewhere, to the logs and UI of the running cluster 
> Master. Once an issue has been identified, you use the HBCK2 tool to ask the 
> Master to effect fixes or to skip-over bad state. Asking the Master to make 
> the fixes rather than try and effect the repair locally in a fix-it tool's 
> context is another important difference between HBCK2 and hbck1. 
> {quote}
> Developing custom tooling to mine logs and scrape UI simply to gain a top 
> level assessment of system health is unsatisfying. There should be a 
> convenient means for querying the system if issues that rise to the level of 
> _inconsistency_, in the hbck parlance, are believed to be present. It would 
> be relatively simple to bring back the experience of invoking a command line 
> tool to deliver a verdict. This could be added to the hbck2 tool itself but 
> given that hbase-operator-tools is a separate project an intrinsic solution 
> is desirable. 
> An option that immediately comes to mind is modification of the Master's 
> hbck.jsp page to provide a JSON formatted output option if the HTTP Accept 
> header asks for text/json. However, looking at the source of hbck.jsp, it 
> makes more sense to leave it as is and implement a convenient machine 
> parseable output format elsewhere. This can be trivially accomplished with a 
> new servlet. Like hbck.jsp the servlet implementation would get a reference 
> to HbckChore and present the information this class makes available via its 
> various getters.  
> The machine parseable output is sufficient to enable headless hbck status 
> checking but it still would be nice if we could provide operators a command 
> line tool that formats the information for convenient viewing in a terminal. 
> That part could be implemented in the hbck2 tool after this proposal is 
> implemented.
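
A bare-bones sketch of the servlet approach described above, for reference only: it 
assumes the javax.servlet API is available, and the bridge to HbckChore shown here is 
a hypothetical placeholder rather than an actual HBase class.
{code:java}
import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HbckJsonServlet extends HttpServlet {
  /** Hypothetical abstraction over the HbckChore getters. */
  public interface HbckReportSource {
    String toJson();
  }

  private final transient HbckReportSource reportSource;

  public HbckJsonServlet(HbckReportSource reportSource) {
    this.reportSource = reportSource;
  }

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    resp.setContentType("application/json");
    // Serialize whatever the chore exposes (orphan regions, inconsistencies, etc.).
    resp.getWriter().write(reportSource.toJson());
  }
}
{code}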



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799124#comment-17799124
 ] 

Viraj Jasani commented on HBASE-28271:
--

Thank you [~frostruan]! I also wonder if setting 
"hbase.snapshot.zk.coordinated" in any test would even make any difference 
since we no longer use that config?

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798791#comment-17798791
 ] 

Viraj Jasani commented on HBASE-28271:
--

[~frostruan] After HBASE-26323, do we have any test for snapshot creation 
without specifying nonce group and nonce?
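
For reference, a snapshot call of that kind, with no explicit nonce group or nonce 
supplied, looks roughly like the sketch below (illustration only; it assumes the usual 
connection/mini-cluster scaffolding around it):
{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;

public class SnapshotWithoutNonceExample {
  static void takeSnapshot(Connection connection, TableName table) throws Exception {
    try (Admin admin = connection.getAdmin()) {
      // No nonce group / nonce is passed here; the client defaults apply.
      admin.snapshot("snapshot_" + table.getNameAsString(), table);
    }
  }
}
{code}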

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798696#comment-17798696
 ] 

Viraj Jasani edited comment on HBASE-28271 at 12/19/23 6:19 PM:


LockProcedure implementation at high level:

Like any procedure, it first tries to acquire its lock => lock acquired (here the 
lock is LockProcedure's own lock implementation, i.e. exclusive/shared locks at the 
table/namespace/region level).

Only if the lock is acquired does the execution begin, as per the generic logic:
{code:java}
LockState lockState = acquireLock(proc);
switch (lockState) {
  case LOCK_ACQUIRED:
    execProcedure(procStack, proc);
    break;
  case LOCK_YIELD_WAIT:
    LOG.info(lockState + " " + proc);
    scheduler.yield(proc);
    break;
  case LOCK_EVENT_WAIT:
    // Someone will wake us up when the lock is available
    LOG.debug(lockState + " " + proc);
    break;
  default:
    throw new UnsupportedOperationException();
} {code}
For LockProcedure, the latch is accessed only when the procedure is executed. This is 
how the snapshot handler ensures that the table-level lock has already been acquired 
and it can move forward with creating the snapshot.
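
To illustrate the handshake in plain Java (a sketch only, not the actual LockProcedure 
or snapshot handler code): the caller blocks on a latch that the procedure counts down 
only once it actually executes, i.e. only after its lock was acquired, which is exactly 
why a region stuck in transition leaves the handler parked forever.
{code:java}
import java.util.concurrent.CountDownLatch;

public class LatchHandshake {
  public static void main(String[] args) throws InterruptedException {
    CountDownLatch lockAcquiredLatch = new CountDownLatch(1);

    Thread procedureExecutor = new Thread(() -> {
      // ... acquireLock() succeeded, so the procedure reaches execution ...
      lockAcquiredLatch.countDown(); // signals the waiting snapshot handler
    });
    procedureExecutor.start();

    // Snapshot handler side: waits with no timeout today, hence the hang when the
    // procedure never executes (e.g. a region stuck in transition).
    lockAcquiredLatch.await();
    System.out.println("Table lock acquired; snapshot can proceed");
  }
}
{code}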


was (Author: vjasani):
LockProcedure implementation at high level:

Just like any procedure, first it tries to acquire lock => lock acquired.

Only if lock is acquired, the execution begins as per the generic logic:
{code:java}
LockState lockState = acquireLock(proc);
switch (lockState) {
  case LOCK_ACQUIRED:
execProcedure(procStack, proc);
break;
  case LOCK_YIELD_WAIT:
LOG.info(lockState + " " + proc);
scheduler.yield(proc);
break;
  case LOCK_EVENT_WAIT:
// Someone will wake us up when the lock is available
LOG.debug(lockState + " " + proc);
break;
  default:
throw new UnsupportedOperationException();
} {code}
For LockProc, only when it is executed, the latch is accessed. This is the way 
snapshot ensures that the lock at the table level is already acquired and it 
can move forward with creating snapshot.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798696#comment-17798696
 ] 

Viraj Jasani commented on HBASE-28271:
--

LockProcedure implementation at high level:

Like any procedure, it first tries to acquire its lock => lock acquired.

Only if the lock is acquired does the execution begin, as per the generic logic:
{code:java}
LockState lockState = acquireLock(proc);
switch (lockState) {
  case LOCK_ACQUIRED:
    execProcedure(procStack, proc);
    break;
  case LOCK_YIELD_WAIT:
    LOG.info(lockState + " " + proc);
    scheduler.yield(proc);
    break;
  case LOCK_EVENT_WAIT:
    // Someone will wake us up when the lock is available
    LOG.debug(lockState + " " + proc);
    break;
  default:
    throw new UnsupportedOperationException();
} {code}
For LockProcedure, the latch is accessed only when the procedure is executed. This is 
how the snapshot handler ensures that the table-level lock has already been acquired 
and it can move forward with creating the snapshot.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798694#comment-17798694
 ] 

Viraj Jasani commented on HBASE-28271:
--

Snapshot is the only consumer of the lock procedure that provides a countdown latch 
to the procedure and waits until the latch is accessed by the procedure; no other 
consumer of the lock procedure provides a non-null latch to implement any wait 
strategy.

So, yes, the plan is to make it generic enough, but the only consumers we have are 
snapshots.

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798470#comment-17798470
 ] 

Viraj Jasani commented on HBASE-28271:
--

I think in general we can keep the default timeout much lower (5-10 min?) and make 
it throw SnapshotCreationException sooner so that we don't keep master handlers 
occupied. But otherwise, no problem with taking the rpc timeout into the equation 
too.
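
A minimal sketch of the fail-fast idea, using a plain CountDownLatch as a stand-in 
for the latch the snapshot handler waits on; the names, timeout handling, and 
exception wiring are assumptions, not the final patch:
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SnapshotLockWait {
  /** Waits for the table-lock latch, failing fast instead of blocking forever. */
  static void awaitTableLock(CountDownLatch lockAcquiredLatch, long timeoutMs)
      throws InterruptedException, TimeoutException {
    if (!lockAcquiredLatch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
      // Instead of parking a master handler thread indefinitely, give up and let the
      // caller surface an error to the client (e.g. a SnapshotCreationException).
      throw new TimeoutException("Could not acquire table lock within " + timeoutMs
        + " ms; a region is likely stuck in transition");
    }
  }
}
{code}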

> Infinite waiting on lock acquisition by snapshot can result in unresponsive 
> master
> --
>
> Key: HBASE-28271
> URL: https://issues.apache.org/jira/browse/HBASE-28271
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.7
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: image.png
>
>
> When a region is stuck in transition for significant time, any attempt to 
> take snapshot on the table would keep master handler thread in forever 
> waiting state. As part of the creating snapshot on enabled or disabled table, 
> in order to get the table level lock, LockProcedure is executed but if any 
> region of the table is in transition, LockProcedure could not be executed by 
> the snapshot handler, resulting in forever waiting until the region 
> transition is completed, allowing the table level lock to be acquired by the 
> snapshot handler.
> In cases where a region stays in RIT for considerable time, if enough 
> attempts are made by the client to create snapshots on the table, it can 
> easily exhaust all handler threads, leading to potentially unresponsive 
> master. Attached a sample thread dump.
> Proposal: The snapshot handler should not stay stuck forever if it cannot 
> take table level lock, it should fail-fast.
> !image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28271) Infinite waiting on lock acquisition by snapshot can result in unresponsive master

2023-12-18 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28271:


 Summary: Infinite waiting on lock acquisition by snapshot can 
result in unresponsive master
 Key: HBASE-28271
 URL: https://issues.apache.org/jira/browse/HBASE-28271
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.5.7, 2.4.17, 3.0.0-alpha-4
Reporter: Viraj Jasani
Assignee: Viraj Jasani
 Attachments: image.png

When a region is stuck in transition for significant time, any attempt to take 
snapshot on the table would keep master handler thread in forever waiting 
state. As part of the creating snapshot on enabled or disabled table, in order 
to get the table level lock, LockProcedure is executed but if any region of the 
table is in transition, LockProcedure could not be executed by the snapshot 
handler, resulting in forever waiting until the region transition is completed, 
allowing the table level lock to be acquired by the snapshot handler.

In cases where a region stays in RIT for considerable time, if enough attempts 
are made by the client to create snapshots on the table, it can easily exhaust 
all handler threads, leading to potentially unresponsive master. Attached a 
sample thread dump.

Proposal: The snapshot handler should not stay stuck forever if it cannot take 
table level lock, it should fail-fast.

!image.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-26192) Master UI hbck should provide a JSON formatted output option

2023-12-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798293#comment-17798293
 ] 

Viraj Jasani commented on HBASE-26192:
--

Some folks are interested in picking this up; I will update the assignee in some 
time. Thanks

> Master UI hbck should provide a JSON formatted output option
> 
>
> Key: HBASE-26192
> URL: https://issues.apache.org/jira/browse/HBASE-26192
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Kyle Purtell
>Priority: Minor
> Fix For: 2.6.0, 3.0.0-beta-2
>
> Attachments: Screen Shot 2022-05-31 at 5.18.15 PM.png
>
>
> It used to be possible to get hbck's verdict of cluster status from the 
> command line, especially useful for headless deployments, i.e. without 
> requiring a browser with sufficient connectivity to load a UI, or scrape 
> information out of raw HTML, or write regex to comb over log4j output. The 
> hbck tool's output wasn't particularly convenient to parse but it was 
> straightforward to extract the desired information with a handful of regular 
> expressions. 
> HBCK2 has a different design philosophy than the old hbck, which is to serve 
> as a collection of small and discrete recovery and repair functions, rather 
> than attempt to be a universal repair tool. This makes a lot of sense and 
> isn't the issue at hand. Unfortunately the old hbck's utility for reporting 
> the current cluster health assessment has not been replaced either in whole 
> or in part. Instead:
> {quote}
> HBCK2 is for fixes. For listings of inconsistencies or blockages in the 
> running cluster, you go elsewhere, to the logs and UI of the running cluster 
> Master. Once an issue has been identified, you use the HBCK2 tool to ask the 
> Master to effect fixes or to skip-over bad state. Asking the Master to make 
> the fixes rather than try and effect the repair locally in a fix-it tool's 
> context is another important difference between HBCK2 and hbck1. 
> {quote}
> Developing custom tooling to mine logs and scrape UI simply to gain a top 
> level assessment of system health is unsatisfying. There should be a 
> convenient means for querying the system if issues that rise to the level of 
> _inconsistency_, in the hbck parlance, are believed to be present. It would 
> be relatively simple to bring back the experience of invoking a command line 
> tool to deliver a verdict. This could be added to the hbck2 tool itself but 
> given that hbase-operator-tools is a separate project an intrinsic solution 
> is desirable. 
> An option that immediately comes to mind is modification of the Master's 
> hbck.jsp page to provide a JSON formatted output option if the HTTP Accept 
> header asks for text/json. However, looking at the source of hbck.jsp, it 
> makes more sense to leave it as is and implement a convenient machine 
> parseable output format elsewhere. This can be trivially accomplished with a 
> new servlet. Like hbck.jsp the servlet implementation would get a reference 
> to HbckChore and present the information this class makes available via its 
> various getters.  
> The machine parseable output is sufficient to enable headless hbck status 
> checking but it still would be nice if we could provide operators a command 
> line tool that formats the information for convenient viewing in a terminal. 
> That part could be implemented in the hbck2 tool after this proposal is 
> implemented.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-12-03 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792703#comment-17792703
 ] 

Viraj Jasani commented on HBASE-28221:
--

Yeah, I was thinking about that earlier, but flushes can be delayed even if 
compaction is extremely slow or inefficient, hence I thought this would be a 
better metric at the MetricsRegionServerSource level.
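
A rough sketch of such a counter (hypothetical names; the real change would publish 
it through MetricsRegionServerSource), incremented roughly where the existing delay 
warning is logged:
{code:java}
import java.util.concurrent.atomic.LongAdder;

public class DelayedFlushCounter {
  private final LongAdder delayedFlushCount = new LongAdder();

  /** Call from the flusher when a flush is postponed because of too many store files. */
  public void onFlushDelayed(String encodedRegionName, int storeFileCount, long blockingWaitTimeMs) {
    delayedFlushCount.increment();
    // In the actual change this value would be exported as a region server metric.
  }

  public long getDelayedFlushCount() { return delayedFlushCount.sum(); }
}
{code}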

> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
> forget re-enabling the compaction. This can result into flushes getting 
> delayed for "hbase.hstore.blockingWaitTime" time (90s). While flushes do 
> happen eventually after waiting for max blocking time, it is important to 
> realize that any cluster cannot function well with compaction disabled for 
> significant amount of time.
>  
> We would also block any write requests until region is flushed (90+ sec, by 
> default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
>     at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) 
> {code}
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28204) Canary can take lot more time If any region (except the first region) starts with delete markers

2023-12-01 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28204:
-
Fix Version/s: 2.7.0

> Canary can take lot more time If any region (except the first region) starts 
> with delete markers
> 
>
> Key: HBASE-28204
> URL: https://issues.apache.org/jira/browse/HBASE-28204
> Project: HBase
>  Issue Type: Bug
>  Components: canary
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7, 2.7.0
>
>
> In CanaryTool.java, Canary reads only the first row of the region using 
> [Get|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L520C33-L520C33]
>  for any region of the table. Canary uses [Scan with FirstRowKeyFilter for 
> table 
> scan|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L530]
>  If the said region has empty start key (This will only happen when region is 
> the first region for a table)
> With -[HBASE-16091|https://issues.apache.org/jira/browse/HBASE-16091]- 
> RawScan was 
> [implemented|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519-L534]
>  to improve performance for regions which can have high number of delete 
> markers. Based on currently implementation, [RawScan is only 
> enabled|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519]
>  if region has empty start-key (or region is first region for the table).
> RawScan doesn't work for rest of the regions in the table except first 
> region. Also If the region has all the rows or majority of the rows with 
> delete markers, Get Operation can take a lot of time. This is can cause 
> timeouts for CanaryTool.
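
For context, the kind of probe being discussed looks roughly like the sketch below 
(not the CanaryTool code itself): a raw, first-key-only, single-row scan that stays 
cheap even when a region is full of delete markers.
{code:java}
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

public class RegionProbe {
  /** Probes one region; success simply means the scan returned without error/timeout. */
  static boolean probe(Table table, byte[] regionStartKey) throws Exception {
    Scan scan = new Scan()
      .withStartRow(regionStartKey)
      .setRaw(true)                        // skip delete-marker resolution
      .setFilter(new FirstKeyOnlyFilter()) // only the first cell of the first row
      .setCaching(1)
      .setLimit(1);                        // one row is enough for a health probe
    try (ResultScanner scanner = table.getScanner(scan)) {
      scanner.next(); // may be null for an empty region; reaching here means it responded
      return true;
    }
  }
}
{code}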



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28204) Canary can take lot more time If any region (except the first region) starts with delete markers

2023-11-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791900#comment-17791900
 ] 

Viraj Jasani commented on HBASE-28204:
--

FYI [~bbeaudreault], we have seen a bit of a perf regression, so we need to revert 
the commit. Just wanted to keep you in the loop in case you have started preparing 
RC0 already.

> Canary can take lot more time If any region (except the first region) starts 
> with delete markers
> 
>
> Key: HBASE-28204
> URL: https://issues.apache.org/jira/browse/HBASE-28204
> Project: HBase
>  Issue Type: Bug
>  Components: canary
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> In CanaryTool.java, Canary reads only the first row of the region using 
> [Get|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L520C33-L520C33]
>  for any region of the table. Canary uses [Scan with FirstRowKeyFilter for 
> table 
> scan|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L530]
>  If the said region has empty start key (This will only happen when region is 
> the first region for a table)
> With -[HBASE-16091|https://issues.apache.org/jira/browse/HBASE-16091]- 
> RawScan was 
> [implemented|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519-L534]
>  to improve performance for regions which can have high number of delete 
> markers. Based on currently implementation, [RawScan is only 
> enabled|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519]
>  if region has empty start-key (or region is first region for the table).
> RawScan doesn't work for rest of the regions in the table except first 
> region. Also If the region has all the rows or majority of the rows with 
> delete markers, Get Operation can take a lot of time. This is can cause 
> timeouts for CanaryTool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (HBASE-28204) Canary can take lot more time If any region (except the first region) starts with delete markers

2023-11-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HBASE-28204:
--

Reopening for revert.

> Canary can take lot more time If any region (except the first region) starts 
> with delete markers
> 
>
> Key: HBASE-28204
> URL: https://issues.apache.org/jira/browse/HBASE-28204
> Project: HBase
>  Issue Type: Bug
>  Components: canary
>Reporter: Mihir Monani
>Assignee: Mihir Monani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> In CanaryTool.java, Canary reads only the first row of the region using 
> [Get|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L520C33-L520C33]
>  for any region of the table. Canary uses [Scan with FirstRowKeyFilter for 
> table 
> scan|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L530]
>  If the said region has empty start key (This will only happen when region is 
> the first region for a table)
> With -[HBASE-16091|https://issues.apache.org/jira/browse/HBASE-16091]- 
> RawScan was 
> [implemented|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519-L534]
>  to improve performance for regions which can have high number of delete 
> markers. Based on currently implementation, [RawScan is only 
> enabled|https://github.com/apache/hbase/blob/23c41560d58cc1353b8a466deacd02dfee9e6743/hbase-server/src/main/java/org/apache/hadoop/hbase/tool/CanaryTool.java#L519]
>  if region has empty start-key (or region is first region for the table).
> RawScan doesn't work for rest of the regions in the table except first 
> region. Also If the region has all the rows or majority of the rows with 
> delete markers, Get Operation can take a lot of time. This is can cause 
> timeouts for CanaryTool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-25714) Offload the compaction job to independent Compaction Server

2023-11-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791724#comment-17791724
 ] 

Viraj Jasani commented on HBASE-25714:
--

[~niuyulin] looks like the feature branch has not had much activity for some time. 
Do you have plans to move this forward?

> Offload the compaction job to independent Compaction Server
> ---
>
> Key: HBASE-25714
> URL: https://issues.apache.org/jira/browse/HBASE-25714
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Yulin Niu
>Assignee: Yulin Niu
>Priority: Major
> Attachments: CoprocessorSupport1.png, CoprocessorSupport2.png
>
>
> The basic idea is add a role "CompactionServer" to take the Compaction job. 
> HMaster is responsible for scheduling the compaction job to different 
> CompactionServer.
> [design 
> doc|https://docs.google.com/document/d/1exmhQpQArAgnryLaV78K3260rKm64BHBNzZE4VdTz0c/edit?usp=sharing]
> Suggestions are welcomed. Thanks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-11-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28221:
-
Fix Version/s: 2.4.18
   2.5.7

> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
> forget re-enabling the compaction. This can result into flushes getting 
> delayed for "hbase.hstore.blockingWaitTime" time (90s). While flushes do 
> happen eventually after waiting for max blocking time, it is important to 
> realize that any cluster cannot function well with compaction disabled for 
> significant amount of time.
>  
> We would also block any write requests until region is flushed (90+ sec, by 
> default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
>     at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) 
> {code}
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-11-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28221:
-
Affects Version/s: 2.5.6
   2.4.17

> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
> forget re-enabling the compaction. This can result into flushes getting 
> delayed for "hbase.hstore.blockingWaitTime" time (90s). While flushes do 
> happen eventually after waiting for max blocking time, it is important to 
> realize that any cluster cannot function well with compaction disabled for 
> significant amount of time.
>  
> We would also block any write requests until region is flushed (90+ sec, by 
> default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
>     at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
>     at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) 
> {code}
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-11-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28221:
-
Description: 
If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
forget re-enabling the compaction. This can result into flushes getting delayed 
for "hbase.hstore.blockingWaitTime" time (90s). While flushes do happen 
eventually after waiting for max blocking time, it is important to realize that 
any cluster cannot function well with compaction disabled for significant 
amount of time.

 

We would also block any write requests until region is flushed (90+ sec, by 
default):
{code:java}
2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
Region is too busy due to exceeding memstore size limit.
org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
server=server-1,60020,1699421714454, memstoreSize=1073820928, 
blockingMemStoreSize=1073741824
    at 
org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
    at 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
    at 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:895)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2524)
    at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2432)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311)
    at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291) {code}
 

Delayed flush logs:
{code:java}
LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
  region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
  this.blockingWaitTime); {code}
Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
num of flushes getting delayed due to too many store files.

  was:
If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
forget re-enabling the compaction. This can result into flushes getting delayed 
for "hbase.hstore.blockingWaitTime" time (90s). While flushes do happen 
eventually after waiting for max blocking time, it is important to realize that 
any cluster cannot function well with compaction disabled for significant 
amount of time as we block any write requests until region memstore stays at 
full capacity.

 

Delayed flush logs:
{code:java}
LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
  region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
  this.blockingWaitTime); {code}
Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
num of flushes getting delayed due to too many store files.


> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
> forget re-enabling the compaction. This can result into flushes getting 
> delayed for "hbase.hstore.blockingWaitTime" time (90s). While flushes do 
> happen eventually after waiting for max blocking time, it is important to 
> realize that any cluster cannot function well with compaction disabled for 
> significant amount of time.
>  
> We would also block any write requests until region is flushed (90+ sec, by 
> default):
> {code:java}
> 2023-11-27 20:40:52,124 WARN  [,queue=18,port=60020] regionserver.HRegion - 
> Region is too busy due to exceeding memstore size limit.
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, 
> regionName=table1,1699923733811.4fd5e52e2133df1e347f32c646f23ab4., 
> server=server-1,60020,1699421714454, memstoreSize=1073820928, 
> blockingMemStoreSize=1073741824
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:4200)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3264)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3215)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:967)
>     at 
> 

[jira] [Updated] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-11-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28221:
-
Description: 
If compaction is disabled temporarily to allow the hdfs load to stabilize, we can 
forget to re-enable it. This can result in flushes getting delayed for the 
"hbase.hstore.blockingWaitTime" period (90s by default). While flushes do happen 
eventually after waiting for the max blocking time, it is important to realize 
that no cluster can function well with compaction disabled for a significant 
amount of time, since we block any write requests while the region memstore 
stays at full capacity.

 

Delayed flush logs:
{code:java}
LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
  region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
  this.blockingWaitTime); {code}
Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
num of flushes getting delayed due to too many store files.

  was:
If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
forget re-enabling the compaction. This can result into flushes getting delayed 
for "hbase.hstore.blockingWaitTime" time (90s). While flushes do happen 
eventually after waiting for max blocking time, it is important to realize that 
any cluster cannot function well with compaction disabled for significant 
amount of time.

 

Delayed flush logs:
{code:java}
LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
  region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
  this.blockingWaitTime); {code}
Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
num of flushes getting delayed due to too many store files.


> Introduce regionserver metric for delayed flushes
> -
>
> Key: HBASE-28221
> URL: https://issues.apache.org/jira/browse/HBASE-28221
> Project: HBase
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
> forget re-enabling the compaction. This can result into flushes getting 
> delayed for "hbase.hstore.blockingWaitTime" time (90s). While flushes do 
> happen eventually after waiting for max blocking time, it is important to 
> realize that any cluster cannot function well with compaction disabled for 
> significant amount of time as we block any write requests until region 
> memstore stays at full capacity.
>  
> Delayed flush logs:
> {code:java}
> LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
>   region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
>   this.blockingWaitTime); {code}
> Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
> num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28221) Introduce regionserver metric for delayed flushes

2023-11-27 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28221:


 Summary: Introduce regionserver metric for delayed flushes
 Key: HBASE-28221
 URL: https://issues.apache.org/jira/browse/HBASE-28221
 Project: HBase
  Issue Type: Improvement
Reporter: Viraj Jasani
 Fix For: 2.6.0, 3.0.0-beta-1


If compaction is disabled temporarily to allow stabilizing hdfs load, we can 
forget re-enabling the compaction. This can result into flushes getting delayed 
for "hbase.hstore.blockingWaitTime" time (90s). While flushes do happen 
eventually after waiting for max blocking time, it is important to realize that 
any cluster cannot function well with compaction disabled for significant 
amount of time.

 

Delayed flush logs:
{code:java}
LOG.warn("{} has too many store files({}); delaying flush up to {} ms",
  region.getRegionInfo().getEncodedName(), getStoreFileCount(region),
  this.blockingWaitTime); {code}
Suggestion: Introduce regionserver metric (MetricsRegionServerSource) for the 
num of flushes getting delayed due to too many store files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-21785) master reports open regions as RITs and also messes up rit age metric

2023-11-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788526#comment-17788526
 ] 

Viraj Jasani commented on HBASE-21785:
--

Agree, this deserves to be rolled out with the upcoming 2.x releases (patch or 
minor).

> master reports open regions as RITs and also messes up rit age metric
> -
>
> Key: HBASE-21785
> URL: https://issues.apache.org/jira/browse/HBASE-21785
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha-1, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0
>
> Attachments: HBASE-21785.01.patch, HBASE-21785.patch
>
>
> {noformat}
> RegionState   RIT time (ms)   Retries
> dba183f0dadfcc9dc8ae0a6dd59c84e6  dba183f0dadfcc9dc8ae0a6dd59c84e6. 
> state=OPEN, ts=Wed Dec 31 16:00:00 PST 1969 (1548453918s ago), 
> server=server,17020,1548452922054  1548453918735   0
> {noformat}
> RIT age metric also gets set to a bogus value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784677#comment-17784677
 ] 

Viraj Jasani commented on HBASE-28192:
--

{quote}Actually, the most dangerous thing is always that, people think they can 
fix something without knowing the root cause and then they just make thing 
worse...
{quote}
I agree, but this case is quite particular. I am not suggesting we schedule 
recovery for every inconsistent state of meta; I just meant that if meta is 
already online as per AssignmentManager but the server it is online on is not 
even live, we already have a problem that we will likely not recover from unless 
that dead server's SCP is processed. The only way out in this case is for the 
operator to schedule recovery of the old server, and the longer it takes the 
operator to understand the current state of the cluster, the higher the chances 
of client request failures in that window and the more stuck procedures 
accumulate.

If the meta state is not online, we don't need any change in the current logic.
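
To make that concrete, a rough sketch (not the actual fix) of the kind of check 
that could run while active master init waits on meta; the placement of the 
method and the use of ServerManager#expireServer to queue the SCP are 
assumptions:
{code:java}
// Sketch only: detect that the loaded meta location points at a server that is
// not live, and trigger its crash processing instead of looping forever.
private void recoverMetaIfLocationIsDead(AssignmentManager am, ServerManager serverManager) {
  RegionStateNode metaNode =
    am.getRegionStates().getRegionStateNode(RegionInfoBuilder.FIRST_META_REGIONINFO);
  if (metaNode == null || !metaNode.isInState(RegionState.State.OPEN)) {
    return; // meta is not marked online: keep the existing logic unchanged
  }
  ServerName loc = metaNode.getRegionLocation();
  // Meta claims to be OPEN on a server that is not live: schedule its SCP so meta
  // gets reassigned, rather than waiting for an operator to run hbck recoveries.
  if (loc != null && !serverManager.isServerOnline(loc)) {
    serverManager.expireServer(loc); // queues a ServerCrashProcedure for the dead location
  }
}
{code}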

 
{quote}So here, meta is already online on server3-1,61020,1699456864765, but 
after server1 becomes active, the loaded meta location is 
server3-1,61020,1698687384632, which is a dead server?
{quote}
Correct.
{quote}And this happens on a rolling upgrading from 2.4 to 2.5? What is the 
version for server1 and server4? Server4 is 2.4.x and server and server1 is 
2.5.x?
{quote}
Yes, so far we have observed this only during the 2.4 to 2.5 upgrade. Let me get 
back with the version details of the masters (server4 and server1) in some time.

> Master should recover if meta region state is inconsistent
> --
>
> Key: HBASE-28192
> URL: https://issues.apache.org/jira/browse/HBASE-28192
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> During active master initialization, before we set master as active (i.e. 
> {_}setInitialized(true){_}), we need both meta and namespace regions online. 
> If the region state of meta or namespace is inconsistent, active master can 
> get stuck in the initialization step:
> {code:java}
> private boolean isRegionOnline(RegionInfo ri) {
>   RetryCounter rc = null;
>   while (!isStopped()) {
> ...
> ...
> ...
> // Check once-a-minute.
> if (rc == null) {
>   rc = new RetryCounterFactory(Integer.MAX_VALUE, 1000, 60_000).create();
> }
> Threads.sleep(rc.getBackoffTimeAndIncrementAttempts());
>   }
>   return false;
> }
>  {code}
> In one of the recent outage, we observed that meta was online on a server, 
> which was correctly reflected in meta znode, but the server starttime was 
> different. This means that as per the latest transition record, meta was 
> marked online on old server (same server with old start time). This kept 
> active master initialization waiting forever and some SCPs got stuck in 
> initial stage where they need to access meta table before getting candidate 
> for region moves.
> The only way out of this outage is for operator to schedule recoveries using 
> hbck for old server, which triggers SCP for old server address of meta. Since 
> many SCPs were stuck, the processing of new SCP too was taking some time and 
> manual restart of active master triggered failover, and new master was able 
> to complete SCP for old meta server, correcting the meta assignment details, 
> which eventually marked master as active and only after this, we were able to 
> see real large num of RITs that were hidden so far.
> We need to let master recover from this state to avoid manual intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784659#comment-17784659
 ] 

Viraj Jasani edited comment on HBASE-28192 at 11/10/23 3:15 AM:


Let me add some logs:

regionserver where meta is online:
{code:java}
2023-11-08 18:10:31,079 INFO  [MemStoreFlusher.1] regionserver.HStore - Added 
hdfs://{cluster}/hbase/data/hbase/meta/1588230740/rep_barrier/3e5faf652f1e4c6db1c4ba1ae676c3ee,
 entries=1630, sequenceid=94325525, filesize=362.1 K {code}
master server 4 which thought it was active:
{code:java}
2023-11-08 18:14:34,563 DEBUG [0:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1699456864765, table=hbase:meta, region=1588230740

2023-11-08 18:14:34,609 INFO  [0:becomeActiveMaster] master.ServerManager - 
Registering regionserver=server3-1,61020,1699456864765 {code}
master server 1 which thought it was active:
{code:java}
2023-11-08 18:15:50,350 DEBUG [aster/server1:61000:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1698687384632, table=hbase:meta, region=1588230740

2023-11-08 18:15:50,399 INFO  [aster/server1:61000:becomeActiveMaster] 
master.ServerManager - Registering regionserver=server3-1,61020,1699456864765 
{code}
master server 4 gave up:
{code:java}
2023-11-08 18:16:22,776 INFO  [aster/server4:61000:becomeActiveMaster] 
master.ActiveMasterManager - Another master is the active master, 
server1,61000,1699467212235; waiting to become the next active master {code}
 

When server 4 was trying to become the active master and loaded meta, it 
retrieved the correct meta location, i.e. server3-1,61020,1699456864765

However, when server 1 (the eventual active master) loaded meta, it retrieved an 
incorrect location, i.e. server3-1,61020,1698687384632

 

For hbase 2.5, I see that with HBASE-26193 we no longer rely on zookeeper and 
instead rely on scanning the master local region:
{code:java}
  // Start the Assignment Thread
  startAssignmentThread();
  // load meta region states.
  // here we are still in the early steps of active master startup. There is 
only one thread(us)
  // can access AssignmentManager and create region node, so here we do not 
need to lock the
  // region node.
  try (ResultScanner scanner =
masterRegion.getScanner(new Scan().addFamily(HConstants.CATALOG_FAMILY))) {
for (;;) {
  Result result = scanner.next();
  if (result == null) {
break;
  }
  RegionStateStore
.visitMetaEntry((r, regionInfo, state, regionLocation, lastHost, 
openSeqNum) -> {
  RegionStateNode regionNode = 
regionStates.getOrCreateRegionStateNode(regionInfo);
  regionNode.setState(state);
  regionNode.setLastHost(lastHost);
  regionNode.setRegionLocation(regionLocation);
  regionNode.setOpenSeqNum(openSeqNum);
  if (regionNode.getProcedure() != null) {
regionNode.getProcedure().stateLoaded(this, regionNode);
  }
  if (regionLocation != null) {
regionStates.addRegionToServer(regionNode);
  }
  if (RegionReplicaUtil.isDefaultReplica(regionInfo.getReplicaId())) {
setMetaAssigned(regionInfo, state == State.OPEN);
  }
  LOG.debug("Loaded hbase:meta {}", regionNode);
}, result);
}
  }
  mirrorMetaLocations();
}
 {code}
 

Maybe this incident was a one-off case that perhaps only happens during the 
hbase 2.4 to 2.5 upgrade. Once the meta location is read only from the master 
local region, there should not be any inconsistency, I think.


was (Author: vjasani):
Let me add some logs:

regionserver where meta is online:
{code:java}
2023-11-08 18:10:31,079 INFO  [MemStoreFlusher.1] regionserver.HStore - Added 
hdfs://{cluster}/hbase/data/hbase/meta/1588230740/rep_barrier/3e5faf652f1e4c6db1c4ba1ae676c3ee,
 entries=1630, sequenceid=94325525, filesize=362.1 K {code}
master server 4 which thought it was active:
{code:java}
2023-11-08 18:14:34,563 DEBUG [0:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1699456864765, table=hbase:meta, region=1588230740

2023-11-08 18:14:34,609 INFO  [0:becomeActiveMaster] master.ServerManager - 
Registering regionserver=server3-1,61020,1699456864765 {code}
master server 1 which thought it was active:
{code:java}
2023-11-08 18:15:50,350 DEBUG [aster/server1:61000:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1698687384632, table=hbase:meta, region=1588230740

2023-11-08 18:15:50,399 INFO  [aster/server1:61000:becomeActiveMaster] 
master.ServerManager - Registering regionserver=server3-1,61020,1699456864765 
{code}
master server 4 gave up:
{code:java}
2023-11-08 18:16:22,776 INFO  [aster/server4:61000:becomeActiveMaster] 
master.ActiveMasterManager - Another master is the active master, 

[jira] [Comment Edited] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784659#comment-17784659
 ] 

Viraj Jasani edited comment on HBASE-28192 at 11/10/23 3:15 AM:


Let me add some logs:

regionserver where meta is online:
{code:java}
2023-11-08 18:10:31,079 INFO  [MemStoreFlusher.1] regionserver.HStore - Added 
hdfs://{cluster}/hbase/data/hbase/meta/1588230740/rep_barrier/3e5faf652f1e4c6db1c4ba1ae676c3ee,
 entries=1630, sequenceid=94325525, filesize=362.1 K {code}
master server 4 which thought it was active:
{code:java}
2023-11-08 18:14:34,563 DEBUG [0:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1699456864765, table=hbase:meta, region=1588230740

2023-11-08 18:14:34,609 INFO  [0:becomeActiveMaster] master.ServerManager - 
Registering regionserver=server3-1,61020,1699456864765 {code}
master server 1 which thought it was active:
{code:java}
2023-11-08 18:15:50,350 DEBUG [aster/server1:61000:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1698687384632, table=hbase:meta, region=1588230740

2023-11-08 18:15:50,399 INFO  [aster/server1:61000:becomeActiveMaster] 
master.ServerManager - Registering regionserver=server3-1,61020,1699456864765 
{code}
master server 4 gave up:
{code:java}
2023-11-08 18:16:22,776 INFO  [aster/server4:61000:becomeActiveMaster] 
master.ActiveMasterManager - Another master is the active master, 
server1,61000,1699467212235; waiting to become the next active master {code}
 

When server 4 was trying to be active master and loaded meta, it retrieved the 
correct location of meta i.e. server3-1,61020,1699456864765

However, when server 1 (eventual active master) loaded meta, it retrieved 
incorrect location i.e. server3-1,61020,1698687384632

 

For hbase 2.5, i see that HBASE-26193 no longer relies on zookeeper and rather 
relies on scanning master region:
{code:java}
  // Start the Assignment Thread
  startAssignmentThread();
  // load meta region states.
  // here we are still in the early steps of active master startup. There is 
only one thread(us)
  // can access AssignmentManager and create region node, so here we do not 
need to lock the
  // region node.
  try (ResultScanner scanner =
masterRegion.getScanner(new Scan().addFamily(HConstants.CATALOG_FAMILY))) {
for (;;) {
  Result result = scanner.next();
  if (result == null) {
break;
  }
  RegionStateStore
.visitMetaEntry((r, regionInfo, state, regionLocation, lastHost, 
openSeqNum) -> {
  RegionStateNode regionNode = 
regionStates.getOrCreateRegionStateNode(regionInfo);
  regionNode.setState(state);
  regionNode.setLastHost(lastHost);
  regionNode.setRegionLocation(regionLocation);
  regionNode.setOpenSeqNum(openSeqNum);
  if (regionNode.getProcedure() != null) {
regionNode.getProcedure().stateLoaded(this, regionNode);
  }
  if (regionLocation != null) {
regionStates.addRegionToServer(regionNode);
  }
  if (RegionReplicaUtil.isDefaultReplica(regionInfo.getReplicaId())) {
setMetaAssigned(regionInfo, state == State.OPEN);
  }
  LOG.debug("Loaded hbase:meta {}", regionNode);
}, result);
}
  }
  mirrorMetaLocations();
}
 {code}
 

Maybe this incident was one-off case, maybe only happens during hbase 2.4 to 
2.5 upgrade. Once meta location is only read from master region (for 2.5+ 
releases), there should not be any inconsistency I think.


was (Author: vjasani):
Let me add some logs:

regionserver where meta is online:
{code:java}
2023-11-08 18:10:31,079 INFO  [MemStoreFlusher.1] regionserver.HStore - Added 
hdfs://{cluster}/hbase/data/hbase/meta/1588230740/rep_barrier/3e5faf652f1e4c6db1c4ba1ae676c3ee,
 entries=1630, sequenceid=94325525, filesize=362.1 K {code}
master server 4 which thought it was active:
{code:java}
2023-11-08 18:14:34,563 DEBUG [0:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1699456864765, table=hbase:meta, region=1588230740

2023-11-08 18:14:34,609 INFO  [0:becomeActiveMaster] master.ServerManager - 
Registering regionserver=server3-1,61020,1699456864765 {code}
master server 1 which thought it was active:
{code:java}
2023-11-08 18:15:50,350 DEBUG [aster/server1:61000:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1698687384632, table=hbase:meta, region=1588230740

2023-11-08 18:15:50,399 INFO  [aster/server1:61000:becomeActiveMaster] 
master.ServerManager - Registering regionserver=server3-1,61020,1699456864765 
{code}
master server 4 gave up:
{code:java}
2023-11-08 18:16:22,776 INFO  [aster/server4:61000:becomeActiveMaster] 
master.ActiveMasterManager - Another 

[jira] [Commented] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784659#comment-17784659
 ] 

Viraj Jasani commented on HBASE-28192:
--

Let me add some logs:

regionserver where meta is online:
{code:java}
2023-11-08 18:10:31,079 INFO  [MemStoreFlusher.1] regionserver.HStore - Added 
hdfs://{cluster}/hbase/data/hbase/meta/1588230740/rep_barrier/3e5faf652f1e4c6db1c4ba1ae676c3ee,
 entries=1630, sequenceid=94325525, filesize=362.1 K {code}
master server 4 which thought it was active:
{code:java}
2023-11-08 18:14:34,563 DEBUG [0:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1699456864765, table=hbase:meta, region=1588230740

2023-11-08 18:14:34,609 INFO  [0:becomeActiveMaster] master.ServerManager - 
Registering regionserver=server3-1,61020,1699456864765 {code}
master server 1 which thought it was active:
{code:java}
2023-11-08 18:15:50,350 DEBUG [aster/server1:61000:becomeActiveMaster] 
assignment.AssignmentManager - Loaded hbase:meta state=OPEN, 
location=server3-1,61020,1698687384632, table=hbase:meta, region=1588230740

2023-11-08 18:15:50,399 INFO  [aster/server1:61000:becomeActiveMaster] 
master.ServerManager - Registering regionserver=server3-1,61020,1699456864765 
{code}
master server 4 gave up:
{code:java}
2023-11-08 18:16:22,776 INFO  [aster/server4:61000:becomeActiveMaster] 
master.ActiveMasterManager - Another master is the active master, 
server1,61000,1699467212235; waiting to become the next active master {code}
 

When server 4 was trying to be active master and loaded meta, it retrieved the 
correct location of meta i.e. server3-1,61020,1699456864765

However, when server 1 (eventual active master) loaded meta, it retrieved 
incorrect location i.e. server3-1,61020,1698687384632

 

For hbase 2.5, i see that HBASE-26193 no longer relies on zookeeper and rather 
relies on scanning master region:
{code:java}
  // Start the Assignment Thread
  startAssignmentThread();
  // load meta region states.
  // here we are still in the early steps of active master startup. There is 
only one thread(us)
  // can access AssignmentManager and create region node, so here we do not 
need to lock the
  // region node.
  try (ResultScanner scanner =
masterRegion.getScanner(new Scan().addFamily(HConstants.CATALOG_FAMILY))) {
for (;;) {
  Result result = scanner.next();
  if (result == null) {
break;
  }
  RegionStateStore
.visitMetaEntry((r, regionInfo, state, regionLocation, lastHost, 
openSeqNum) -> {
  RegionStateNode regionNode = 
regionStates.getOrCreateRegionStateNode(regionInfo);
  regionNode.setState(state);
  regionNode.setLastHost(lastHost);
  regionNode.setRegionLocation(regionLocation);
  regionNode.setOpenSeqNum(openSeqNum);
  if (regionNode.getProcedure() != null) {
regionNode.getProcedure().stateLoaded(this, regionNode);
  }
  if (regionLocation != null) {
regionStates.addRegionToServer(regionNode);
  }
  if (RegionReplicaUtil.isDefaultReplica(regionInfo.getReplicaId())) {
setMetaAssigned(regionInfo, state == State.OPEN);
  }
  LOG.debug("Loaded hbase:meta {}", regionNode);
}, result);
}
  }
  mirrorMetaLocations();
}
 {code}

> Master should recover if meta region state is inconsistent
> --
>
> Key: HBASE-28192
> URL: https://issues.apache.org/jira/browse/HBASE-28192
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> During active master initialization, before we set master as active (i.e. 
> {_}setInitialized(true){_}), we need both meta and namespace regions online. 
> If the region state of meta or namespace is inconsistent, active master can 
> get stuck in the initialization step:
> {code:java}
> private boolean isRegionOnline(RegionInfo ri) {
>   RetryCounter rc = null;
>   while (!isStopped()) {
> ...
> ...
> ...
> // Check once-a-minute.
> if (rc == null) {
>   rc = new RetryCounterFactory(Integer.MAX_VALUE, 1000, 60_000).create();
> }
> Threads.sleep(rc.getBackoffTimeAndIncrementAttempts());
>   }
>   return false;
> }
>  {code}
> In one of the recent outage, we observed that meta was online on a server, 
> which was correctly reflected in meta znode, but the server starttime was 
> different. This means that as per the latest transition record, meta was 
> marked online on old server (same server with old start time). This kept 
> active master initialization waiting forever and some SCPs got stuck in 
> initial stage where they 

[jira] [Comment Edited] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784650#comment-17784650
 ] 

Viraj Jasani edited comment on HBASE-28192 at 11/10/23 2:41 AM:


[~zhangduo] I am not aware of the exact root cause, but this was an hbase 2.4 to 
2.5 upgrade and HBASE-26193 might be a suspect; I am not sure and need to dig in. 
But even if we get to know the cause and resolve it, there could be something 
else tomorrow that makes active master init stuck in the loop, maybe during an 
upgrade or maybe during usual restarts, and that is not good anyway, right?

If meta is online but not on a live server, master should be able to recover. Any 
root cause should be handled separately too, but right now we let master get 
stuck in an infinite loop for this edge case, which is also not reliable IMO. At 
the very least, we should not expect the operator to perform hbck recovery for 
meta and/or namespace regions while master stays stuck forever in the loop.


was (Author: vjasani):
[~zhangduo] i am not aware of the exact root cause but this was hbase 2.4 to 
2.5 upgrade and HBASE-26193 might be suspect, i am not sure, need to dig in, 
but let's say we do know the reason and there could be something else tomorrow 
that can make active master init stuck in the loop, it's not good anyway right? 
If meta is online but not on live server, master should be able to recover. Any 
cause should be handled separately too, but right now we let master stuck in 
infinite loop for this edge case, which is also not reliable IMO.

> Master should recover if meta region state is inconsistent
> --
>
> Key: HBASE-28192
> URL: https://issues.apache.org/jira/browse/HBASE-28192
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> During active master initialization, before we set master as active (i.e. 
> {_}setInitialized(true){_}), we need both meta and namespace regions online. 
> If the region state of meta or namespace is inconsistent, active master can 
> get stuck in the initialization step:
> {code:java}
> private boolean isRegionOnline(RegionInfo ri) {
>   RetryCounter rc = null;
>   while (!isStopped()) {
> ...
> ...
> ...
> // Check once-a-minute.
> if (rc == null) {
>   rc = new RetryCounterFactory(Integer.MAX_VALUE, 1000, 60_000).create();
> }
> Threads.sleep(rc.getBackoffTimeAndIncrementAttempts());
>   }
>   return false;
> }
>  {code}
> In one of the recent outage, we observed that meta was online on a server, 
> which was correctly reflected in meta znode, but the server starttime was 
> different. This means that as per the latest transition record, meta was 
> marked online on old server (same server with old start time). This kept 
> active master initialization waiting forever and some SCPs got stuck in 
> initial stage where they need to access meta table before getting candidate 
> for region moves.
> The only way out of this outage is for operator to schedule recoveries using 
> hbck for old server, which triggers SCP for old server address of meta. Since 
> many SCPs were stuck, the processing of new SCP too was taking some time and 
> manual restart of active master triggered failover, and new master was able 
> to complete SCP for old meta server, correcting the meta assignment details, 
> which eventually marked master as active and only after this, we were able to 
> see real large num of RITs that were hidden so far.
> We need to let master recover from this state to avoid manual intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784650#comment-17784650
 ] 

Viraj Jasani commented on HBASE-28192:
--

[~zhangduo] i am not aware of the exact root cause but this was hbase 2.4 to 
2.5 upgrade and HBASE-26193 might be suspect, i am not sure, need to dig in, 
but let's say we do know the reason and there could be something else tomorrow 
that can make active master init stuck in the loop, it's not good anyway right? 
If meta is online but not on live server, master should be able to recover. Any 
cause should be handled separately too, but right now we let master stuck in 
infinite loop for this edge case, which is also not reliable IMO.

> Master should recover if meta region state is inconsistent
> --
>
> Key: HBASE-28192
> URL: https://issues.apache.org/jira/browse/HBASE-28192
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> During active master initialization, before we set master as active (i.e. 
> {_}setInitialized(true){_}), we need both meta and namespace regions online. 
> If the region state of meta or namespace is inconsistent, active master can 
> get stuck in the initialization step:
> {code:java}
> private boolean isRegionOnline(RegionInfo ri) {
>   RetryCounter rc = null;
>   while (!isStopped()) {
> ...
> ...
> ...
> // Check once-a-minute.
> if (rc == null) {
>   rc = new RetryCounterFactory(Integer.MAX_VALUE, 1000, 60_000).create();
> }
> Threads.sleep(rc.getBackoffTimeAndIncrementAttempts());
>   }
>   return false;
> }
>  {code}
> In one of the recent outage, we observed that meta was online on a server, 
> which was correctly reflected in meta znode, but the server starttime was 
> different. This means that as per the latest transition record, meta was 
> marked online on old server (same server with old start time). This kept 
> active master initialization waiting forever and some SCPs got stuck in 
> initial stage where they need to access meta table before getting candidate 
> for region moves.
> The only way out of this outage is for operator to schedule recoveries using 
> hbck for old server, which triggers SCP for old server address of meta. Since 
> many SCPs were stuck, the processing of new SCP too was taking some time and 
> manual restart of active master triggered failover, and new master was able 
> to complete SCP for old meta server, correcting the meta assignment details, 
> which eventually marked master as active and only after this, we were able to 
> see real large num of RITs that were hidden so far.
> We need to let master recover from this state to avoid manual intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28192) Master should recover if meta region state is inconsistent

2023-11-09 Thread Viraj Jasani (Jira)
Viraj Jasani created HBASE-28192:


 Summary: Master should recover if meta region state is inconsistent
 Key: HBASE-28192
 URL: https://issues.apache.org/jira/browse/HBASE-28192
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.5.6, 2.4.17
Reporter: Viraj Jasani
Assignee: Viraj Jasani
 Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7


During active master initialization, before we set master as active (i.e. 
{_}setInitialized(true){_}), we need both meta and namespace regions online. If 
the region state of meta or namespace is inconsistent, active master can get 
stuck in the initialization step:
{code:java}
private boolean isRegionOnline(RegionInfo ri) {
  RetryCounter rc = null;
  while (!isStopped()) {
...
...
...
// Check once-a-minute.
if (rc == null) {
  rc = new RetryCounterFactory(Integer.MAX_VALUE, 1000, 60_000).create();
}
Threads.sleep(rc.getBackoffTimeAndIncrementAttempts());
  }
  return false;
}
 {code}
In one of the recent outage, we observed that meta was online on a server, 
which was correctly reflected in meta znode, but the server starttime was 
different. This means that as per the latest transition record, meta was marked 
online on old server (same server with old start time). This kept active master 
initialization waiting forever and some SCPs got stuck in initial stage where 
they need to access meta table before getting candidate for region moves.

The only way out of this outage is for operator to schedule recoveries using 
hbck for old server, which triggers SCP for old server address of meta. Since 
many SCPs were stuck, the processing of new SCP too was taking some time and 
manual restart of active master triggered failover, and new master was able to 
complete SCP for old meta server, correcting the meta assignment details, which 
eventually marked master as active and only after this, we were able to see 
real large num of RITs that were hidden so far.

We need to let master recover from this state to avoid manual intervention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region

2023-11-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783471#comment-17783471
 ] 

Viraj Jasani commented on HBASE-20881:
--

I have run out of examples, but when I see another incident of ABNORMALLY_CLOSED, 
I will be happy to share the logs.

In the meantime, I was curious: what is the best resolution for a region stuck in 
this state? Is running "hbck assigns -o" the only resolution?

> Introduce a region transition procedure to handle all the state transition 
> for a region
> ---
>
> Key: HBASE-20881
> URL: https://issues.apache.org/jira/browse/HBASE-20881
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0
>
> Attachments: HBASE-20881-branch-2-v1.patch, 
> HBASE-20881-branch-2-v2.patch, HBASE-20881-branch-2.patch, 
> HBASE-20881-v1.patch, HBASE-20881-v10.patch, HBASE-20881-v11.patch, 
> HBASE-20881-v12.patch, HBASE-20881-v13.patch, HBASE-20881-v13.patch, 
> HBASE-20881-v14.patch, HBASE-20881-v14.patch, HBASE-20881-v15.patch, 
> HBASE-20881-v16.patch, HBASE-20881-v2.patch, HBASE-20881-v3.patch, 
> HBASE-20881-v4.patch, HBASE-20881-v4.patch, HBASE-20881-v5.patch, 
> HBASE-20881-v6.patch, HBASE-20881-v7.patch, HBASE-20881-v7.patch, 
> HBASE-20881-v8.patch, HBASE-20881-v9.patch, HBASE-20881.patch
>
>
> Now have an AssignProcedure, an UnssignProcedure, and also a 
> MoveRegionProcedure which schedules an AssignProcedure and an 
> UnssignProcedure to move a region. This makes the logic a bit complicated, as 
> MRP is not a RIT, so when SCP can not interrupt it directly...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region

2023-10-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778758#comment-17778758
 ] 

Viraj Jasani edited comment on HBASE-20881 at 10/23/23 5:09 PM:


[~zhangduo] IIUC, the only reason we had to introduce the ABNORMALLY_CLOSED 
state is that when a region is already in RIT and the target server it is 
assigned to, getting assigned to, or getting moved from (and therefore being 
closed on) crashes, SCP has to interrupt the old TRSP. SCP anyway creates new 
TRSPs to take care of assigning all regions that were previously hosted by the 
target server, but any region that was already in transition might require 
manual intervention, because SCP cannot be certain at which step of the previous 
TRSP the region was stuck while it was in RIT.

For SCP, any RIT on a dead server is a complex state to deal with, because it 
cannot know for certain whether the region was stuck in a coproc hook on the 
host or stuck while making an RPC call to a remote server, what the outcome of 
that RPC call was, and so on.

 

Does this seem correct? We were thinking of digging a bit more into the details 
to see whether there are cases where we can convert the region state to CLOSED 
rather than ABNORMALLY_CLOSED and therefore avoid operator intervention, but I 
fear we might introduce double assignment of regions if this is not done 
carefully.


was (Author: vjasani):
[~zhangduo] IIUC, the only reason why we had to introduce ABNORMALLY_CLOSED 
state is because when a region is already in RIT, and the target server where 
it is assigned or getting assigned to crashes, SCP has to interrupt old TRSP 
and create new TRSPs to take care of assigning all regions that were previously 
hosted by the target server, but any region already in transition might require 
manual intervention because SCP cannot be certain what step of the previous 
TRSP, the region was stuck while it was in RIT.

For SCP, any RIT on dead server is a complex state to deal with because it 
cannot know for certain whether the region was stuck in any coproc hook on the 
host or it was stuck while making RPC call to remote server and what was the 
outcome of the RPC call etc.

 

Does this seem correct? We were thinking of digging a bit more in detail to see 
if there are any cases for which we can convert region state to CLOSED rather 
than ABNORMALLY_CLOSED and therefore avoid any operator intervention, but i 
fear we might introduce double assignment of regions if this is not done 
carefully.

> Introduce a region transition procedure to handle all the state transition 
> for a region
> ---
>
> Key: HBASE-20881
> URL: https://issues.apache.org/jira/browse/HBASE-20881
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0
>
> Attachments: HBASE-20881-branch-2-v1.patch, 
> HBASE-20881-branch-2-v2.patch, HBASE-20881-branch-2.patch, 
> HBASE-20881-v1.patch, HBASE-20881-v10.patch, HBASE-20881-v11.patch, 
> HBASE-20881-v12.patch, HBASE-20881-v13.patch, HBASE-20881-v13.patch, 
> HBASE-20881-v14.patch, HBASE-20881-v14.patch, HBASE-20881-v15.patch, 
> HBASE-20881-v16.patch, HBASE-20881-v2.patch, HBASE-20881-v3.patch, 
> HBASE-20881-v4.patch, HBASE-20881-v4.patch, HBASE-20881-v5.patch, 
> HBASE-20881-v6.patch, HBASE-20881-v7.patch, HBASE-20881-v7.patch, 
> HBASE-20881-v8.patch, HBASE-20881-v9.patch, HBASE-20881.patch
>
>
> Now have an AssignProcedure, an UnssignProcedure, and also a 
> MoveRegionProcedure which schedules an AssignProcedure and an 
> UnssignProcedure to move a region. This makes the logic a bit complicated, as 
> MRP is not a RIT, so when SCP can not interrupt it directly...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-20881) Introduce a region transition procedure to handle all the state transition for a region

2023-10-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778758#comment-17778758
 ] 

Viraj Jasani commented on HBASE-20881:
--

[~zhangduo] IIUC, the only reason why we had to introduce ABNORMALLY_CLOSED 
state is because when a region is already in RIT, and the target server where 
it is assigned or getting assigned to crashes, SCP has to interrupt old TRSP 
and create new TRSPs to take care of assigning all regions that were previously 
hosted by the target server, but any region already in transition might require 
manual intervention because SCP cannot be certain what step of the previous 
TRSP, the region was stuck while it was in RIT.

For SCP, any RIT on dead server is a complex state to deal with because it 
cannot know for certain whether the region was stuck in any coproc hook on the 
host or it was stuck while making RPC call to remote server and what was the 
outcome of the RPC call etc.

 

Does this seem correct? We were thinking of digging a bit more in detail to see 
if there are any cases for which we can convert region state to CLOSED rather 
than ABNORMALLY_CLOSED and therefore avoid any operator intervention, but i 
fear we might introduce double assignment of regions if this is not done 
carefully.

> Introduce a region transition procedure to handle all the state transition 
> for a region
> ---
>
> Key: HBASE-20881
> URL: https://issues.apache.org/jira/browse/HBASE-20881
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2, proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0
>
> Attachments: HBASE-20881-branch-2-v1.patch, 
> HBASE-20881-branch-2-v2.patch, HBASE-20881-branch-2.patch, 
> HBASE-20881-v1.patch, HBASE-20881-v10.patch, HBASE-20881-v11.patch, 
> HBASE-20881-v12.patch, HBASE-20881-v13.patch, HBASE-20881-v13.patch, 
> HBASE-20881-v14.patch, HBASE-20881-v14.patch, HBASE-20881-v15.patch, 
> HBASE-20881-v16.patch, HBASE-20881-v2.patch, HBASE-20881-v3.patch, 
> HBASE-20881-v4.patch, HBASE-20881-v4.patch, HBASE-20881-v5.patch, 
> HBASE-20881-v6.patch, HBASE-20881-v7.patch, HBASE-20881-v7.patch, 
> HBASE-20881-v8.patch, HBASE-20881-v9.patch, HBASE-20881.patch
>
>
> Now have an AssignProcedure, an UnssignProcedure, and also a 
> MoveRegionProcedure which schedules an AssignProcedure and an 
> UnssignProcedure to move a region. This makes the logic a bit complicated, as 
> MRP is not a RIT, so when SCP can not interrupt it directly...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28151) hbck -o should not allow bypassing pre transit check by default

2023-10-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28151:
-
Description: 
When the operator uses hbck assigns or unassigns with "-o", the override will 
also skip the pre transit checks. While this is one of the intentions of "-o", 
the primary purpose should still be only to detach the existing procedure from 
the RegionStateNode so that the newly scheduled assign proc can take the 
exclusive region level lock.

We should restrict bypassing preTransitCheck by providing it only as a site 
config.

Only if bypassing preTransitCheck is configured should any hbck "-o" be allowed 
to bypass this check; otherwise, by default, it should go through the check.

 

It is important to keep "unsetting the procedure from the RegionStateNode" and 
"bypassing preTransitCheck" separate, so that when the cluster state is bad, we 
don't explicitly deteriorate it further. For example, if a region was 
successfully split and the operator then performs "hbck assigns \{region} -o" 
and it bypasses the transit check, master would bring the region online and 
could compact the store files and archive a store file that is referenced by a 
daughter region. This would not allow the daughter region to come online.

Let's introduce an hbase site config to allow bypassing preTransitCheck; it 
should not be doable by the operator using hbck alone (see the sketch after this 
description).

 

"-o" should mean "override" the procedure that is attached to the 
RegionStateNode; it should not mean forcefully skipping the region transition 
validation checks.
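
A sketch of the proposed gating, with an invented config key; the real property 
name and the exact call site in the assigns/unassigns path are assumptions:
{code:java}
// Hypothetical helper: "-o" always detaches the stale procedure from the
// RegionStateNode, but preTransitCheck may be skipped only if the site config
// explicitly allows it (default false, so "-o" alone never skips the check).
public final class HbckOverridePolicy {

  // Assumed site property name, for illustration only.
  public static final String SKIP_PRE_TRANSIT_CHECK_KEY =
    "hbase.master.hbck.override.skip.pre.transit.check";

  private HbckOverridePolicy() {
  }

  static boolean shouldSkipPreTransitCheck(org.apache.hadoop.conf.Configuration conf,
    boolean overrideRequested) {
    return overrideRequested && conf.getBoolean(SKIP_PRE_TRANSIT_CHECK_KEY, false);
  }
}
{code}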

  was:
When operator uses hbck assigns or unassigns with "-o", the override will also 
skip pre transit checks. While this is one of the intentions with "-o", the 
primary purpose should still be to only unattach existing procedure from 
RegionStateNode so that newly scheduled assign proc can take exclusive region 
level lock.

We should restrict bypassing preTransitCheck by only providing it as site 
config.

If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
allowed to bypass this check, otherwise by default they should go through the 
check.

 

It is important to keep "unset of the procedure from RegionStateNode" and 
"bypassing preTransitCheck" separate so that when the cluster state is bad, we 
don't explicitly deteriorate it further e.g. if a region was successfully split 
and now if operator performs "hbck assigns \{region} -o" and if it bypasses the 
transit check, master would bring the region online and it could compact store 
files and archive the store file which is referenced by daughter region. This 
would not allow daughter region to come online.

Let's introduce hbase site config to allow bypassing preTransitCheck, it should 
not be doable only by operator using hbck alone.

 

"-o" should mean "override" the procedure that is attached to the 
RegionStateNode, it should not mean forcefully skip any region transition 
validation checks and perform the region assignments.


> hbck -o should not allow bypassing pre transit check by default
> ---
>
> Key: HBASE-28151
> URL: https://issues.apache.org/jira/browse/HBASE-28151
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.5
>Reporter: Viraj Jasani
>Priority: Major
>
> When operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions with "-o", 
> the primary purpose should still be to only unattach existing procedure from 
> RegionStateNode so that newly scheduled assign proc can take exclusive region 
> level lock.
> We should restrict bypassing preTransitCheck by only providing it as site 
> config.
> If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
> allowed to bypass this check, otherwise by default they should go through the 
> check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't explicitly deteriorate it further e.g. if a region was successfully 
> split and now if operator performs "hbck assigns \{region} -o" and if it 
> bypasses the transit check, master would bring the region online and it could 
> compact store files and archive the store file which is referenced by 
> daughter region. This would not allow daughter region to come online.
> Let's introduce hbase site config to allow bypassing preTransitCheck, it 
> should not be doable only by operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode, it should not mean forcefully skip any region transition 
> validation checks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-28151) hbck -o should not allow bypassing pre transit check by default

2023-10-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-28151:
-
Description: 
When operator uses hbck assigns or unassigns with "-o", the override will also 
skip pre transit checks. While this is one of the intentions with "-o", the 
primary purpose should still be to only unattach existing procedure from 
RegionStateNode so that newly scheduled assign proc can take exclusive region 
level lock.

We should restrict bypassing preTransitCheck by only providing it as site 
config.

If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
allowed to bypass this check, otherwise by default they should go through the 
check.

 

It is important to keep "unset of the procedure from RegionStateNode" and 
"bypassing preTransitCheck" separate so that when the cluster state is bad, we 
don't explicitly deteriorate it further e.g. if a region was successfully split 
and now if operator performs "hbck assigns \{region} -o" and if it bypasses the 
transit check, master would bring the region online and it could compact store 
files and archive the store file which is referenced by daughter region. This 
would not allow daughter region to come online.

Let's introduce hbase site config to allow bypassing preTransitCheck, it should 
not be doable only by operator using hbck alone.

 

"-o" should mean "override" the procedure that is attached to the 
RegionStateNode, it should not mean forcefully skip any region transition 
validation checks and perform the region assignments.

  was:
When operator uses hbck assigns or unassigns with "-o", the override will also 
skip pre transit checks. While this is one of the intentions with "-o", the 
primary purpose should still be to only unattach existing procedure from 
RegionStateNode so that newly scheduled assign proc can take exclusive region 
level lock.

We should restrict bypassing preTransitCheck by only providing it as site 
config.

If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
allowed to bypass this check, otherwise by default they should go through the 
check.

 

It is important to keep "unset of the procedure from RegionStateNode" and 
"bypassing preTransitCheck" separate so that when the cluster state is bad, we 
don't explicitly deteriorate it further e.g. if a region was successfully split 
and now if operator performs "hbck assigns \{region} -o" and if it bypasses the 
transit check, master would bring the region online and it could compact store 
files and archive the store file which is referenced by daughter region. This 
would not allow daughter region to come online.

Let's introduce hbase site config to allow bypassing preTransitCheck, it should 
not be doable only by operator using hbck alone.


> hbck -o should not allow bypassing pre transit check by default
> ---
>
> Key: HBASE-28151
> URL: https://issues.apache.org/jira/browse/HBASE-28151
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.4.17, 2.5.5
>Reporter: Viraj Jasani
>Priority: Major
>
> When operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions with "-o", 
> the primary purpose should still be to only unattach existing procedure from 
> RegionStateNode so that newly scheduled assign proc can take exclusive region 
> level lock.
> We should restrict bypassing preTransitCheck by only providing it as site 
> config.
> If bypassing preTransitCheck is configured, only then any hbck "-o" should be 
> allowed to bypass this check, otherwise by default they should go through the 
> check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't explicitly deteriorate it further e.g. if a region was successfully 
> split and now if operator performs "hbck assigns \{region} -o" and if it 
> bypasses the transit check, master would bring the region online and it could 
> compact store files and archive the store file which is referenced by 
> daughter region. This would not allow daughter region to come online.
> Let's introduce hbase site config to allow bypassing preTransitCheck, it 
> should not be doable only by operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode, it should not mean forcefully skip any region transition 
> validation checks and perform the region assignments.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

