[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-28 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754246#comment-16754246
 ] 

Sergey Shelukhin commented on HBASE-21743:
--

OK, I have less time for that now because I have to debug all the issues ;)
However, split and merge are not covered by my "smaller" proposal, where the
master (if configured) will ignore only recovery-related procedures.
During a failure, the master should already be able to handle the state of
some procedure not having been persisted (because by definition the cluster is
much more likely to be in a bad state), so it should also be able to abandon
old recovery procedures (SCP and RIT, and their children) as if they had never
been saved, and create new ones during startup.
I will keep this JIRA for the larger feature (and later move the discussion to
dev@ when there's more time :)), and file a separate JIRA for the recovery
part...
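
A minimal sketch of the "smaller" proposal, under stated assumptions: persisted
procedures are represented here only by a type tag, and the class, method and
flag names are hypothetical, not HBase APIs or config keys. Children of
recovery procedures are not modelled. It only shows the idea of dropping
recovery procedures at load time so that startup can create fresh ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in HBase. */
final class RecoveryProcedureFilter {
  // Type tags treated as recovery-related; "SCP" and "RIT" follow the comment above.
  static final Set<String> RECOVERY_TYPES = new HashSet<>(Arrays.asList("SCP", "RIT"));

  /** Returns the loaded procedures to resume; recovery ones are dropped when the option is on. */
  static List<String> proceduresToResume(List<String> loadedTypes, boolean ignoreRecovery) {
    List<String> kept = new ArrayList<>();
    for (String type : loadedTypes) {
      if (ignoreRecovery && RECOVERY_TYPES.contains(type)) {
        continue; // abandoned as if never saved; startup would create new ones
      }
      kept.add(type);
    }
    return kept;
  }
}
```

With the option off, the loaded list is returned unchanged, so split, merge and
other non-recovery procedures are unaffected either way.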

> stateless assignment
> 
>
> Key: HBASE-21743
> URL: https://issues.apache.org/jira/browse/HBASE-21743
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Major
>
> Running HBase for only a few weeks we found dozen(s?) of bugs with assignment 
> that all seem to have the same nature - split brain between 2 procedures; or 
> between procedure and master startup (meta replica bugs); or procedure and 
> master shutdown (HBASE-21742); or procedure and something else (when SCP had 
> incorrect region list persisted, don't recall the bug#). 
> To me, it starts to look like a pattern: as in AMv1, where concurrent 
> interactions were unclear and hard to reason about, the problem of unclear 
> concurrent interactions has been preserved in AMv2 despite its cleaner 
> individual pieces, and has in fact increased because of operation state 
> persistence and isolation.
> Procedures are great for multi-step operations that need rollback and stuff 
> like that, e.g. creating a table or snapshot, or even region splitting. 
> However I'm not so sure about assignment. 
> We have the persisted information - region state in meta (incl transition 
> states like opening, or closing), server list as WAL directory list. 
> Procedure state is not any more reliable than those (we can argue that a meta 
> update can fail, but so can a procv2 WAL flush, so we have to handle cases of 
> out-of-date information regardless). So, we don't need any extra state to 
> decide on assignment, whether for recovery or balancing. In fact, as 
> mentioned in some bugs, deleting procv2 WAL is often the best way to recover 
> the cluster, because master can already figure out what to do without 
> additional state.
> I think there should be an option for stateless assignment that does that.
> It can be either a separate pluggable assignment procedure, or an option 
> that will not recover SCP, RITs etc. from the WAL but always derive recovery 
> procedures from the existing cluster state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-23 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750582#comment-16750582
 ] 

Duo Zhang commented on HBASE-21743:
---

To avoid wasting time, could you please share how you want to deal with 
split and merge? For assignment, I think the state in meta is almost enough, 
and I can imagine how we would deal with the state transitions, as for 
assignment it is simple: if you see OPEN then no problem, if you see CLOSED and 
the table is not disabled then just try to open it, etc. But split and merge 
are another story...
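
The state-driven rule described above can be sketched roughly as follows. This
is only an illustration: RegionState, Action and decide are hypothetical names,
not HBase APIs. Only the OPEN and CLOSED cases come from the comment; leaving a
CLOSED region of a disabled table alone is an assumption, and everything else
(including split and merge) is deliberately left undecided.

```java
/** Hypothetical sketch; none of these names are HBase APIs. */
final class MetaStateRule {
  enum RegionState { OPEN, CLOSED, OPENING, CLOSING }
  enum Action { NONE, ASSIGN, UNDECIDED }

  /** Decide what to do with a region using only the state recorded in meta. */
  static Action decide(RegionState state, boolean tableDisabled) {
    switch (state) {
      case OPEN:
        return Action.NONE; // already serving, nothing to do
      case CLOSED:
        // A closed region of an enabled table should be opened somewhere;
        // for a disabled table it is assumed to be intentionally closed.
        return tableDisabled ? Action.NONE : Action.ASSIGN;
      default:
        return Action.UNDECIDED; // transition states need more context
    }
  }
}
```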



[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-23 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750528#comment-16750528
 ] 

stack commented on HBASE-21743:
---

Thanks for starting the discussion. Yeah, per Sean, let's do it up on the list. 
Agree 'concurrent interactions' is still hard in AMv2. Would dispute that it 
is worse than AMv1, given state is in fewer places and is manipulated by a 
single writer only -- both courtesy of AMv2 -- but this is not the important 
point here. Stateless would be sweet if possible.

I went back and started reading from the top.

 * How would we do assignment without state, given it is the building block out 
of which most Procedures are made (table create, region move/split)?
 * How would we do region locking and enforce locking hierarchy (region is of 
table) if stateless?
 * How would the signaling between SCP and assign be done -- you can't assign 
until after WALs have been split.

Looking forward to the discussion.

Would also be interested in what Master version you were up on. Things were bad 
there for a good while but got better.




[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750358#comment-16750358
 ] 

Sergey Shelukhin commented on HBASE-21743:
--

AMv1's chief problems, in my view, were too many persistent places to store 
state (meta, ZK, in addition to implicit cluster state), so it was not 
stateless :) And multi-threaded bugs with lots of custom code partying on that 
state. And the fact that code was not well isolated, so you needed to know it 
all to reason about it.
Looks like procv2 solved the last problem, and also improved on the state issue 
by removing ZK (although there can still be issues with persistent proc state), 
but the MT issue was not addressed. Isolated code does make it easier to 
investigate! But lots of races are still there and it seems to be a systematic 
issue to me... when procedures don't coordinate there are bugs, but if they 
coordinate too much in the future there's a risk of devolving back into a 
tightly coupled model.

For the race condition problem, a long time ago I proposed something like a 
hybrid actor model, with a non-blocking, single-threaded, amnesiac AM farming 
out async operations, updating state and (more or less) forgetting about them, 
but at this point that would be rather radical... making the AM pluggable might 
be a way to allow for experimentation like this in the future.
For now, given that master startup already acts almost like an "actor" (it's 
one thread doing steps in sequence), we can also add an option to make it 
"amnesiac". That should make it more resilient to the consequences of race 
condition bugs and of manual intervention, and also reduce bugs.
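
To make the "amnesiac actor" idea a bit more concrete, here is a rough sketch
under heavy assumptions: cluster state is reduced to an in-memory map from
region name to hosting server ("" meaning unassigned), and all names are
hypothetical, not HBase code. One pass reads the state, fires idempotent async
actions for unassigned regions, and keeps no per-operation state of its own;
the futures are returned only so that a caller can wait in a demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of a single-threaded "amnesiac" pass; not HBase code. */
final class AmnesiacLoopSketch {
  // Stand-in for meta: region name -> hosting server, "" if unassigned.
  final Map<String, String> regionToServer = new ConcurrentHashMap<>();
  private final ExecutorService workers = Executors.newFixedThreadPool(2);

  /** One pass of the "actor": read state, fire async work, remember nothing. */
  List<CompletableFuture<Void>> runOnce(String targetServer) {
    List<CompletableFuture<Void>> fired = new ArrayList<>();
    // Iterate over a snapshot so the pass sees one consistent view.
    for (Map.Entry<String, String> e : new TreeMap<>(regionToServer).entrySet()) {
      if (e.getValue().isEmpty()) {
        String region = e.getKey();
        // Idempotent: only sets a server if the region is still unassigned.
        fired.add(CompletableFuture.runAsync(
            () -> regionToServer.replace(region, "", targetServer), workers));
      }
    }
    return fired;
  }

  void shutdown() {
    workers.shutdown();
  }
}
```

Because the action is a compare-and-set on the current state, running the pass
again after a crash or a manual fix does nothing for regions that are already
assigned, which is the property the "amnesiac" option relies on.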

Let me think about this and get feedback, I'll try to start a thread in a 
couple days.





[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-23 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749872#comment-16749872
 ] 

Sean Busbey commented on HBASE-21743:
-

This should probably be a DISCUSS thread on the dev@hbase mailing list. It'll 
get more eyes that way rather than here on JIRA.



[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-22 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749459#comment-16749459
 ] 

Duo Zhang commented on HBASE-21743:
---

You can look at AMv1: this is what we did when we did not have proc-v2, and 
maybe it is something like your stateless assignment? But obviously, even after 
years of stabilizing, it just became unmaintainable, and still had lots of 
bugs, like double assign, failed split or merge, etc...

That's why we chose proc-v2 to implement AMv2, full of blood and tears...



[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-22 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749406#comment-16749406
 ] 

Sergey Shelukhin commented on HBASE-21743:
--

We've been running a master snapshot. Indeed, we found that sometimes procv2 
deletion can lead to additional issues, however sometimes it's also the only 
way forward. 
[~stack] there's no way to find out about old dead servers on restart other 
than WAL directories (or inferring from stale region assignments stored in 
meta), because the servers are not stored anywhere else (and ZK node is gone 
for a dead server, as intended).

The basic idea is to look at list of regions (meta), look at live and dead 
servers - both of which master already does - and schedule procedures from 
scratch as required, instead of relying on procedure WAL. 
Personally (as we've discussed years ago ;)) I would prefer to have something 
like an actor model where a central fast actor does this in a loop and fires 
off idempotent slow actions asynchronously, but within the current paradigm I 
think reducing state (optionally, i.e. with a config) would provide some 
benefit. Right now, for every bug I file (and all those I don't file that 
result from subtly incorrect or too-aggressive manual interventions needed to 
address other bugs), if the master was looking at cluster state it would be 
trivial to resolve, but because of the split brain problem every part of the 
system is waiting for some other part with incorrect assumptions; so the whole 
thing is very fragile w.r.t. both bugs and the manual interventions that, as we 
know, are often necessary despite best intentions (hence 
hbck/offlinerepair/etc).
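
As a rough sketch of the "basic idea" above (an illustration only, with
hypothetical names, not HBase code): given the servers that still have WAL
directories, the region-to-server mapping from meta, and the live server set,
the servers that need recovery can be derived from scratch without consulting
the procedure WAL. Server names are assumed to be non-null strings.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; inputs stand in for WAL dir listing, meta and live servers. */
final class DeadServerScan {
  /**
   * Servers that need recovery: any server that still has a WAL directory or is
   * referenced by a region in meta, but is not in the live set.
   */
  static Set<String> serversNeedingRecovery(
      Set<String> walDirServers, Map<String, String> regionToServer, Set<String> liveServers) {
    Set<String> candidates = new TreeSet<>(walDirServers);
    candidates.addAll(regionToServer.values()); // stale assignments also reveal dead servers
    candidates.removeAll(liveServers);
    return candidates;
  }
}
```

Each server in the result would then get a freshly scheduled recovery
procedure, instead of one replayed from persisted procedure state.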

For example, the above bug with the incorrect SCP for the meta server resulted 
because master init is waiting for SCP to fix meta, but SCP doesn't know it 
needs to fix meta because of some bug. Of course, if the persistent SCP didn't 
exist it wouldn't have the bug in the first place, but abstractly, if one actor 
was looking at this it would just see meta assigned to a dead server, and 
recover it just like that. No state needed other than where meta is and the 
list of servers.

Then, to resolve this we had to nuke the proc WAL to get rid of the bad SCP. 
Some more SCPs for some servers got lost in the nuke, and we had some regions 
CLOSING on dead servers that have neither SCP nor WAL directory. Again, looking 
from a unified perspective we can see - woops, region closing on the server, 
server has no WALs to split - just count it as closed. Whereas now close region 
procedure is not responsible for this, it just waits for SCP to deal with the 
server. But there's no SCP because there's no WAL directory. So, nobody looks 
at these two together... so after this manual intervention (or for example 
imagine there was an HDFS issue, and the WAL write did not succeed) cluster is 
broken and I have to go and fix those regions.
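
The "unified perspective" for a region stuck in CLOSING can be written down as
a small decision rule. This is a sketch with hypothetical names, not HBase
code; the WAIT_FOR_WAL_SPLIT case reflects the constraint raised elsewhere in
this thread that a region cannot be assigned until the server's WALs have been
split.

```java
/** Hypothetical sketch of resolving a CLOSING region from cluster state alone. */
final class ClosingOnDeadServerRule {
  enum Resolution { WAIT_FOR_CLOSE, WAIT_FOR_WAL_SPLIT, TREAT_AS_CLOSED }

  static Resolution resolveClosing(boolean serverAlive, boolean serverHasWalDir) {
    if (serverAlive) {
      return Resolution.WAIT_FOR_CLOSE; // the server can still confirm the close
    }
    if (serverHasWalDir) {
      return Resolution.WAIT_FOR_WAL_SPLIT; // dead server, WALs must be split first
    }
    // Dead server with no WALs to split: nothing left to wait for.
    return Resolution.TREAT_AS_CLOSED;
  }
}
```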

Now I go to meta and set regions to CLOSED (pretend I'm actually hbck2). If 
assignment was stateless, master would see closed regions and assign them. 
Whereas now confirm-close retry loop is well-isolated so it doesn't care about 
anything in the world and just blindly resets them back to CLOSING, so I have 
to additionally kill -9 the master to make sure that stupid RITs go away and on 
restart master actually recovers the region.

Luckily, when recovered RIT procedures in this case see a CLOSED region with an 
empty server, they just silently go away (which might technically be a bug but 
it works for me ;)); I've seen other cases where a procedure that sees a region 
in an unexpected state (due to a race condition) either fails the master (as 
with meta replicas) or updates the region to some other state, resulting in a 
strange state.

This is just one example. And on all 3.5 steps the persistent procedure is 100% 
unnecessary, because master has all the information to make correct decisions. 
As long as it's done in a sane way like with hybrid actor model without its own 
persistent state...




[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-20 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747643#comment-16747643
 ] 

Allan Yang commented on HBASE-21743:


{quote}
deleting procv2 WAL is often the best way to recover the cluster, because 
master can already figure out what to do without additional state.
{quote}
What version are you using? IIRC, deleting the procv2 WAL can leave the 
cluster in an unrecoverable state (HBCK2 is needed).



[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-20 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747418#comment-16747418
 ] 

Duo Zhang commented on HBASE-21743:
---

The read replica feature is a bit broken, especially meta replicas. If you 
enable meta replicas the cluster will easily go into a strange state...



[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747320#comment-16747320
 ] 

stack commented on HBASE-21743:
---

[~sershe] Thanks for the interesting issue.

bq. Running HBase...

What version of hbase?

bq. ...split brain between 2 procedures; or between procedure and master 
startup (meta replica bugs)

Say more. On the split brain, 2.2 has TRSP so one Procedure does compound 
assign stuff like move.  On meta replica bugs, I've not followed too closely. 
Read replicas need love generally.

bq. ...in AMv2 the problem of unclear concurrent interactions has been 
preserved and in fact increased because of the operation state persistence and 
isolation

My experience has been that AMv2 makes assignment more tractable; it's possible 
to reason about what is happening in AMv2, which was less the case in AMv1. 
I think the variety of concurrent interactions amongst procedures keeps 
throwing up 'surprises'. The number of issues is in diminution?

bq. ...server list as WAL directory list

Let's remove this. Listing dirs to find the server list is an indirection.

Region state in meta is an AMv2 attribute.

bq. ...deleting procv2 WAL is often the best way to recover the cluster...

This has been a last resort -- an extreme -- and the recovery is not guaranteed 
to be complete.

A stateless assignment would be super-cool. Say more on how it might work 
[~sershe] Thanks.






[jira] [Commented] (HBASE-21743) stateless assignment

2019-01-18 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746888#comment-16746888
 ] 

Duo Zhang commented on HBASE-21743:
---

I think the current HBCK2 can handle the assignment problems?
