[
https://issues.apache.org/jira/browse/HBASE-22143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809712#comment-16809712
]
Wellington Chevreuil edited comment on HBASE-22143 at 4/4/19 10:42 AM:
-----------------------------------------------------------------------
Thanks for the comments, [~elserj]! All good points; I will send a patch
addressing most of them soon. Meanwhile, some comments below:
bq. Why the choice to have the user pass in a region path in the filesystem
instead of just the encoded name? Seems like you might have started out
accepting an encoded region name given the comment:
Yep, that was a bit hacky. It was the first and easiest way I thought of to
manipulate the region in meta (getting the region info from the region dir, then
using MetaTableAccessor, instead of having to go through a client scan filter).
I am rewriting this to not rely on "audience private" interfaces/classes; it
will just use the client API scan/put interfaces instead.
bq. How does this fail if I give you a region name that is bogus? Could you add
a test for that?
Good point, this will be addressed in the new patch. Same for an invalid state.
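A minimal sketch of what rejecting an invalid state argument could look like. The `RegionStateArg` class and its `State` enum are hypothetical and only list the states named in this thread; the real patch would presumably validate against HBase's own region state enum rather than a local copy.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of HBase or hbase-operator-tools.
final class RegionStateArg {
    // Illustrative subset: only the states mentioned in this thread.
    enum State { OPENING, OPEN, CLOSING, CLOSED, SPLITTING, SPLIT, MERGING }

    // Returns the matching state, or empty if the argument is null or unknown.
    static Optional<State> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String name = raw.trim().toUpperCase(Locale.ROOT);
        return Arrays.stream(State.values())
                .filter(s -> s.name().equals(name))
                .findFirst();
    }

    public static void main(String[] args) {
        for (String arg : new String[] {"CLOSED", "closed", "BOGUS"}) {
            String result = parse(arg)
                    .map(State::name)
                    .orElse("rejected: unknown state");
            System.out.println(arg + " -> " + result);
        }
    }
}
```

Returning an empty `Optional` instead of throwing lets the command print a usage message and exit before touching meta.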
bq. Finally, can you think of a situation where we'd want to ever move a region
to a state that isn't "CLOSED"? Do we want to give the operator the ability to
push to any state? I would be worried any of the "in-transit" states (e.g.
opening, closing) are just ways folks can shoot themselves in the foot.
Yes. The original situation that motivated this was that we had a region in
OPENING state, then tried to run unassigns, and the related proc got stuck
because it expects regions to be in state SPLITTING, SPLIT, MERGING, OPEN, or
CLOSING (see the error message in the initial jira description). Sure, we could
have just moved the state to CLOSED and then bypassed the stuck proc, but I
thought it may be worth giving the flexibility to force any state. After all,
manually setting region states, even to CLOSED, is already risky and requires
operators to really know what they are doing. What do you think? If you still
find it worth locking down the options, I can change that too.
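For reference, the check that the stuck proc trips on can be modelled as a simple set membership test. This is a toy sketch based only on the exception message in the jira description, not the actual `RegionStateNode.transitionState` code; the class, enum, and method names are made up for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model of the "expected states" check; names here are hypothetical.
final class ClosingTransitionCheck {
    // Illustrative subset: only the states mentioned in this thread.
    enum State { OPENING, OPEN, CLOSING, CLOSED, SPLITTING, SPLIT, MERGING }

    // States from which, per the exception message, a region may move to CLOSING.
    static final Set<State> EXPECTED_FOR_CLOSING =
            EnumSet.of(State.SPLITTING, State.SPLIT, State.MERGING, State.OPEN, State.CLOSING);

    static boolean canMoveToClosing(State current) {
        return EXPECTED_FOR_CLOSING.contains(current);
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> can move to CLOSING: " + canMoveToClosing(s));
        }
    }
}
```

A region recorded as OPENING fails this check, which is why the unassign kept being suspended until the state in meta was changed by hand.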
> HBCK2 setRegionState command
> ----------------------------
>
> Key: HBASE-22143
> URL: https://issues.apache.org/jira/browse/HBASE-22143
> Project: HBase
> Issue Type: New Feature
> Components: hbase-operator-tools, hbck2
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Minor
> Attachments: HBASE-22143.master.0001.patch,
> HBASE-22143.master.0002.patch
>
>
> Among some of the current AMv2 issues, we faced a situation where some regions
> had state OPENING in meta, with an RS startcode that was no longer valid.
> There was no AP running; the region stays permanently logged as
> IN-Transition in the master logs, yet no procedure is really trying to bring
> it online. The current hbck2 unassigns/assigns commands didn't work either
> because, as per the exception shown, the procedure expects regions to be in
> state SPLITTING, SPLIT, MERGING, OPEN, or CLOSING:
> {noformat}
> WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure:
> Failed transition, suspend 1secs pid=7093,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; UnassignProcedure
> table=rc_accounts, region=db85127b77fa56f7ad44e2c988e53925,
> server=server1.example.com,16020,1552682193324; rit=OPENING,
> location=server1.example.com,16020,1552682193324; waiting on rectified
> condition fixed by other Procedure or operator intervention
> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected
> [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but
> current state=OPENING
> at
> org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:166)
> at
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1479)
> at
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:212)
> at
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
> at
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:957)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1835)
> at
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1595){noformat}
> In this specific case, since we know the region is not actually being
> operated on by any proc and is not really open anywhere, it's ok to manually
> set its state to one that assigns/unassigns can operate on, so this jira
> proposes a new hbck2 command that allows arbitrarily setting a region to a
> given state.