[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754246#comment-16754246 ]

Sergey Shelukhin commented on HBASE-21743:

OK, I have less time for this now because I have to debug all the issues ;)
However, split and merge are not covered by my "smaller" proposal, where master (if configured) will ignore only recovery-related procedures.
During failure, master should already be able to handle the state of some procedure not being persisted (because by definition the cluster is much more likely to be in a bad state), so it should also be able to abandon old recovery procedures (SCP & RIT and their children) as if they were not saved, and create new ones during startup.
I will keep this JIRA for the larger feature (and later move the discussion to dev@ when there's more time :)), and file a separate JIRA for the recovery part...

> stateless assignment
>
> Key: HBASE-21743
> URL: https://issues.apache.org/jira/browse/HBASE-21743
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Major
>
> Running HBase for only a few weeks we found dozen(s?) of bugs with assignment
> that all seem to have the same nature - split brain between 2 procedures; or
> between procedure and master startup (meta replica bugs); or procedure and
> master shutdown (HBASE-21742); or procedure and something else (when SCP had
> incorrect region list persisted, don't recall the bug#).
> To me, it starts to look like a pattern where, like in AMv1 where concurrent
> interactions were unclear and hard to reason about, despite the cleaner
> individual pieces in AMv2 the problem of unclear concurrent interactions has
> been preserved and in fact increased because of the operation state
> persistence and isolation.
> Procedures are great for multi-step operations that need rollback and stuff
> like that, e.g. creating a table or snapshot, or even region splitting.
> However I'm not so sure about assignment.
> We have the persisted information - region state in meta (incl transition
> states like opening, or closing), server list as WAL directory list.
> Procedure state is not any more reliable than those (we can argue that meta
> update can fail, but so can procv2 WAL flush, so we have to handle cases of
> out of date information regardless). So, we don't need any extra state to
> decide on assignment, whether for recovery or balancing. In fact, as
> mentioned in some bugs, deleting procv2 WAL is often the best way to recover
> the cluster, because master can already figure out what to do without
> additional state.
> I think there should be an option for stateless assignment that does that.
> It can either be as a separate pluggable assignment procedure; or an option
> that will not recover SCP, RITs etc from WAL but always derive recovery
> procedures from the existing cluster state.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
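The "smaller" proposal in the comment above (abandon persisted recovery procedures and their children at startup, keep everything else) can be sketched roughly as follows. This is only an illustration under stated assumptions: `LoadedProc`, `dropRecoveryProcs`, and the type strings "SCP" and "RIT" are hypothetical names invented here, not HBase classes or the procedure store's real record format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are HBase classes.
final class RecoveryProcFilterSketch {

    /** Minimal stand-in for a procedure record loaded from the procedure WAL. */
    static final class LoadedProc {
        final long id;
        final long parentId; // -1 means "no parent" (assumption of this sketch)
        final String type;

        LoadedProc(long id, long parentId, String type) {
            this.id = id;
            this.parentId = parentId;
            this.type = type;
        }
    }

    /**
     * Returns the loaded procedures minus every recovery procedure and,
     * transitively, every descendant of a recovery procedure.
     */
    static List<LoadedProc> dropRecoveryProcs(List<LoadedProc> loaded, Set<String> recoveryTypes) {
        Set<Long> dropped = new HashSet<>();
        boolean changed = true;
        // Repeat until no new procedure is dropped, so the result does not
        // depend on whether children appear before or after their parents.
        while (changed) {
            changed = false;
            for (LoadedProc p : loaded) {
                boolean drop = recoveryTypes.contains(p.type) || dropped.contains(p.parentId);
                if (drop && dropped.add(p.id)) {
                    changed = true;
                }
            }
        }
        List<LoadedProc> kept = new ArrayList<>();
        for (LoadedProc p : loaded) {
            if (!dropped.contains(p.id)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<LoadedProc> loaded = Arrays.asList(
            new LoadedProc(1, -1, "SCP"),
            new LoadedProc(2, -1, "CreateTable"),
            new LoadedProc(3, 1, "OpenRegion"),
            new LoadedProc(4, 2, "OpenRegion"));
        Set<String> recoveryTypes = new HashSet<>(Arrays.asList("SCP", "RIT"));
        for (LoadedProc p : dropRecoveryProcs(loaded, recoveryTypes)) {
            System.out.println("kept " + p.id + " " + p.type);
        }
    }
}
```

The sketch keeps non-recovery procedures (table creation, and by extension split and merge) untouched, which matches the comment's point that split and merge are outside the smaller proposal. After the filter, new recovery procedures would have to be derived from meta and the server list.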
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750582#comment-16750582 ]

Duo Zhang commented on HBASE-21743:

To avoid wasting time, could you please share how you want to deal with split and merge?
For assignment, I think the state in meta is almost enough. I can imagine how we would deal with the state transitions, because for assignment it is simple: if you see OPEN then there is no problem, if you see CLOSED and the table is not disabled then just try to open it, etc.
But split and merge are another story...
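The state-driven rule sketched in the comment above (OPEN needs nothing, CLOSED on an enabled table should be opened) could look roughly like this. It is a minimal illustration, not HBase code: the `State` and `Action` enums and the `decide` method are hypothetical, and states the comment does not cover are deliberately reported as needing more context rather than guessed.

```java
// Hypothetical sketch: the enums and method below are illustrative, not HBase APIs.
final class MetaStateDecisionSketch {

    enum State { OPEN, OPENING, CLOSING, CLOSED }

    enum Action { NOTHING, TRY_OPEN, NEEDS_MORE_CONTEXT }

    /** Decides what to do for one region using only its state in meta and the table state. */
    static Action decide(State stateInMeta, boolean tableDisabled) {
        switch (stateInMeta) {
            case OPEN:
                return Action.NOTHING;
            case CLOSED:
                // A closed region of a disabled table is where it should be.
                return tableDisabled ? Action.NOTHING : Action.TRY_OPEN;
            default:
                // Transition states, and split/merge, are not covered by the rule above.
                return Action.NEEDS_MORE_CONTEXT;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(State.OPEN, false));
        System.out.println(decide(State.CLOSED, false));
        System.out.println(decide(State.CLOSED, true));
        System.out.println(decide(State.CLOSING, false));
    }
}
```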
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750528#comment-16750528 ]

stack commented on HBASE-21743:

Thanks for starting the discussion. Yeah, per Sean, let's do it up on the list.

Agree 'concurrent interactions' is still hard in AMv2. I would dispute that it is worse than AMv1, given that state is in fewer places and is manipulated by a single writer only -- both courtesy of AMv2 -- but this is not the important point here.

Stateless would be sweet if possible. I went back and started reading from the top.

* How would we do assignment without state, given it is the building block out of which most Procedures are made (table create, region move/split)?
* How would we do region locking and enforce the locking hierarchy (region is of table) if stateless?
* How would the signaling between SCP and assign be done -- you can't assign until after WALs have been split.

Looking forward to the discussion.

Would also be interested in what Master version you were up on. Things were bad there for a good while but got better.
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750358#comment-16750358 ]

Sergey Shelukhin commented on HBASE-21743:

AMv1's chief problems, in my view, were: too many persistent places to store state (meta and ZK, in addition to implicit cluster state), so it was not stateless :) Multi-threaded bugs, with lots of custom code partying on that state. And the fact that the code was not well isolated, so you needed to know all of it to reason about it.
Looks like procv2 solved the last problem, and also improved on the state issue by removing ZK (although there can still be issues with persistent proc state), but the MT issue was not addressed.
Isolated code does make it easier to investigate! But lots of races are still there, and it seems to be a systematic issue to me... when procedures don't coordinate there are bugs, but if they coordinate too much in the future there's a risk of devolving back into a tightly coupled model.

For the race condition problem, a long time ago I proposed something like a hybrid actor model, with a non-blocking, single-threaded, amnesiac AM farming out async operations, updating state and (more or less) forgetting about them, but at this point it would be rather radical... making a pluggable AM might be a way to allow for experimentation like this in the future.
For now, given that master startup already acts almost like an "actor" (it's one thread doing steps in sequence), we can also add an option to make it "amnesiac". That should make it more resilient to the consequences of race condition bugs and of manual intervention, and also reduce bugs.

Let me think about this and get feedback; I'll try to start a thread in a couple of days.
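The "amnesiac" single-threaded loop described in the comment above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `tick`, the region-to-server map standing in for meta, and the round-robin target choice are hypothetical, and the sketch ignores real constraints raised elsewhere in the thread (WAL splitting before assignment, region locking).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;

// Hypothetical sketch of an "amnesiac" assigner: it has no fields, so every
// pass starts from the observed state only. Not an HBase API.
final class AmnesiacAssignerSketch {

    /**
     * One pass of the loop: look at where regions are recorded (a stand-in for meta),
     * fire an asynchronous fix for each region recorded on a non-live server,
     * and remember nothing between passes.
     */
    static List<CompletableFuture<Void>> tick(ConcurrentMap<String, String> regionToServer,
                                              Set<String> liveServers,
                                              Executor slowWorkers) {
        List<CompletableFuture<Void>> fired = new ArrayList<>();
        List<String> targets = new ArrayList<>(new TreeSet<>(liveServers));
        if (targets.isEmpty()) {
            return fired; // nowhere to assign; the next pass will look again
        }
        int next = 0;
        for (Map.Entry<String, String> e : new TreeMap<>(regionToServer).entrySet()) {
            String region = e.getKey();
            String observedOwner = e.getValue();
            if (liveServers.contains(observedOwner)) {
                continue;
            }
            String target = targets.get(next++ % targets.size());
            // Idempotent: the update applies only if the owner is still the one
            // observed, so a duplicate or late action becomes a no-op.
            fired.add(CompletableFuture.runAsync(
                () -> regionToServer.replace(region, observedOwner, target), slowWorkers));
        }
        return fired;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> meta = new ConcurrentHashMap<>();
        meta.put("region-1", "rs-dead");
        meta.put("region-2", "rs-a");
        Set<String> live = new TreeSet<>();
        live.add("rs-a");
        live.add("rs-b");
        // A same-thread executor keeps the demo deterministic.
        for (CompletableFuture<Void> f : tick(meta, live, Runnable::run)) {
            f.join();
        }
        System.out.println(new TreeMap<>(meta));
    }
}
```

Because the loop keeps no state of its own, a second pass over an already-repaired map fires nothing, which is the resilience property the comment is after.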
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749872#comment-16749872 ]

Sean Busbey commented on HBASE-21743:

This should probably be a DISCUSS thread on the dev@hbase mailing list. It'll get more eyes that way rather than here on JIRA.
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749459#comment-16749459 ]

Duo Zhang commented on HBASE-21743:

You can look at AMv1: this is what we did when we did not have proc-v2, and maybe it is something like your stateless assignment? But obviously, even after years of stabilizing, it became unmaintainable and still had lots of bugs, like double assign, failed split or merge, etc... That's why we chose proc-v2 to implement AMv2, full of blood and tears...
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749406#comment-16749406 ]

Sergey Shelukhin commented on HBASE-21743:

We've been running a master snapshot. Indeed, we found that sometimes procv2 deletion can lead to additional issues; however, sometimes it's also the only way forward.

[~stack] there's no way to find out about old dead servers on restart other than WAL directories (or inferring from stale region assignments stored in meta), because the servers are not stored anywhere else (and the ZK node is gone for a dead server, as intended).

The basic idea is to look at the list of regions (meta), look at live and dead servers - both of which master already does - and schedule procedures from scratch as required, instead of relying on the procedure WAL.
Personally (as we've discussed years ago ;)) I would prefer to have something like an actor model where a central fast actor does this in a loop and fires off idempotent slow actions asynchronously, but within the current paradigm I think reducing state (optionally, i.e. with a config) would provide some benefit.

Right now, for every bug I file (and all those I don't file that result from subtly-incorrect/too-aggressive manual interventions needed to address other bugs), if master was looking at cluster state it would be trivial to resolve, but because of the split brain problem every part of the system is waiting for some other part with incorrect assumptions. So the whole thing is very fragile w.r.t. both bugs and the manual interventions that, as we know, are often necessary despite best intentions (hence hbck/offlinerepair/etc).

For example, the above bug with the incorrect SCP for the meta server resulted because master init is waiting for SCP to fix meta, but SCP doesn't know it needs to fix meta because of some bug. OFC if persistent SCP didn't exist it wouldn't have the bug in the first place, but abstractly, if one actor was looking at this, it would just see meta assigned to a dead server, and recover it just like that. No state needed other than where meta is and the list of servers.

Then, to resolve this we had to nuke the proc WAL to get rid of the bad SCP.
Some more SCPs for some servers got lost in the nuke, and we had some regions CLOSING on dead servers that have neither SCP nor WAL directory. Again, looking from a unified perspective we can see - woops, region closing on the server, server has no WALs to split - just count it as closed.
Whereas now the close region procedure is not responsible for this; it just waits for SCP to deal with the server. But there's no SCP because there's no WAL directory. So nobody looks at these two together... so after this manual intervention (or, for example, imagine there was an HDFS issue and the WAL write did not succeed) the cluster is broken and I have to go and fix those regions.

Now I go to meta and set the regions to CLOSED (pretend I'm actually hbck2). If assignment was stateless, master would see closed regions and assign them.
Whereas now the confirm-close retry loop is well-isolated, so it doesn't care about anything in the world and just blindly resets them back to CLOSING, so I have to additionally kill -9 the master to make sure the stupid RITs go away and on restart master actually recovers the region.
Luckily, when recovered RIT procedures in this case see a CLOSED region with an empty server, they just silently go away (which might technically be a bug but it works for me ;)); I've seen other cases where a procedure that sees a region in an unexpected state (due to a race condition) either fails master (as with meta replicas) or updates the region to some other state, resulting in a strange state.

This is just one example. And on all 3.5 steps the persistent procedure is 100% unnecessary, because master has all the information to make correct decisions. As long as it's done in a sane way, like with a hybrid actor model without its own persistent state...
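The "unified perspective" in the comment above, where one decision point looks at region state, server liveness, and WAL directories together, might be sketched as below. This is an illustration under assumptions, not the author's design or HBase code: the enums, the `decide` method, and the set of servers with WAL directories are hypothetical, and the sketch assumes the region's table is enabled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: decisions derived from observed cluster state only.
final class ClusterStateRecoverySketch {

    enum State { OPEN, OPENING, CLOSING, CLOSED }

    enum Decision { LEAVE_ALONE, WAIT_FOR_SERVER, SPLIT_WALS_FIRST, MARK_CLOSED, ASSIGN }

    /**
     * Decides what to do with one region from three observations: its state and
     * server as recorded in meta, the live server set, and which servers still
     * have a WAL directory. Assumes the region's table is enabled.
     */
    static Decision decide(State state, String server,
                           Set<String> liveServers, Set<String> serversWithWalDir) {
        if (state == State.CLOSED) {
            return Decision.ASSIGN; // closed region of an enabled table: just assign it
        }
        if (liveServers.contains(server)) {
            // A live server either hosts the region or will report the transition.
            return state == State.OPEN ? Decision.LEAVE_ALONE : Decision.WAIT_FOR_SERVER;
        }
        if (serversWithWalDir.contains(server)) {
            return Decision.SPLIT_WALS_FIRST; // dead server with WALs: recover edits first
        }
        // Dead server and nothing to split: a CLOSING region can be counted as closed,
        // anything else can simply be assigned elsewhere.
        return state == State.CLOSING ? Decision.MARK_CLOSED : Decision.ASSIGN;
    }

    public static void main(String[] args) {
        Set<String> live = new HashSet<>(Arrays.asList("rs-a"));
        Set<String> noWals = Collections.<String>emptySet();
        System.out.println(decide(State.CLOSING, "rs-dead", live, noWals));
        System.out.println(decide(State.OPEN, "rs-dead", live, new HashSet<>(Arrays.asList("rs-dead"))));
    }
}
```

The CLOSING-on-a-dead-server-without-WALs case is the one the comment describes as falling between the close procedure and SCP; here it is handled in the same place as every other case.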
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747643#comment-16747643 ]

Allan Yang commented on HBASE-21743:

{quote}
deleting procv2 WAL is often the best way to recover the cluster, because master can already figure out what to do without additional state.
{quote}
What version are you using? IIRC, deleting the proc-v2 WAL can leave the cluster in an unrecoverable state (HBCK2 is needed).
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747418#comment-16747418 ]

Duo Zhang commented on HBASE-21743:

The read replica feature is a bit broken, especially meta replicas. If you enable meta replicas the cluster will easily go into a strange state...
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747320#comment-16747320 ]

stack commented on HBASE-21743:

[~sershe] Thanks for the interesting issue.

bq. Running HBase...

What version of hbase?

bq. ...split brain between 2 procedures; or between procedure and master startup (meta replica bugs)

Say more. On the split brain, 2.2 has TRSP, so one Procedure does compound assign stuff like move. On meta replica bugs, I've not followed too closely. Read replicas need love generally.

bq. ...in AMv2 the problem of unclear concurrent interactions has been preserved and in fact increased because of the operation state persistence and isolation

My experience has been that AMv2 makes assignment more tractable; it's possible to reason about what is happening up on AMv2, which was less the case in AMv1. I think the variety of concurrent interactions amongst procedures keeps throwing up 'surprises'. The number of issues is in diminution?

bq. ...server list as WAL directory list

Let's remove this. Listing dirs to find the server list is an indirection. Region state in meta is an AMv2 attribute.

bq. ...deleting procv2 WAL is often the best way to recover the cluster...

This has been a last resort -- an extreme -- and the recovery is not guaranteed to be complete.

A stateless assignment would be super-cool. Say more on how it might work [~sershe]

Thanks.
[jira] [Commented] (HBASE-21743) stateless assignment
[ https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746888#comment-16746888 ]

Duo Zhang commented on HBASE-21743:

I think the current HBCK2 can handle the assignment problems?