[ https://issues.apache.org/jira/browse/IGNITE-18171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635806#comment-17635806 ]
Andrey Mashenkov commented on IGNITE-18171:
-------------------------------------------

We can brute-force node starts and stops to generate all possible grid configurations and transitions.
Also, we have to check quorum restart separately, because the transitions {{[C] -> [C, M] -> [C, M, D1]}} and {{[C, M, D1] -> [C, M] -> [C, M, D1]}} are different: the first one is a startup sequence, while the second one is a restart sequence.
However, {{[C, M] -> [C, M, D1]}} and {{[C, M] -> [C, M, D2]}} are equivalent.

> Describe nodes start/stop scenarios
> -----------------------------------
>
>                 Key: IGNITE-18171
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18171
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>            Reporter: Andrey Mashenkov
>            Assignee: Andrey Mashenkov
>            Priority: Major
>              Labels: ignite-3
>
> h2. Definitions.
> We can distinguish the following cluster node groups. Each node may be part
> of one or more groups.
> * Cluster Management Group (CMG), which controls the process of new nodes joining.
> * MetaStorage group (MSG), which hosts the meta storage.
> * Data node group (DNG), which just hosts table partitions.
> The components (CMG, meta storage, table components) depend on each
> other, but may reside on different (even disjoint) node subsets. So, some
> components may become temporarily unavailable, and dependent components must be
> aware of such issues and handle them (wait, retry, throw an exception, or
> whatever) in an expected way, which also has to be documented.
> [See IEP for
> details|https://cwiki.apache.org/confluence/display/IGNITE/IEP-77%3A+Node+Join+Protocol+and+Initialization+for+Ignite+3]
> h2. Motivation.
> As of now, the correct way to start the grid (after it was stopped) is: start
> CMG nodes, then Meta Storage nodes, then Data nodes. For a correct stop, use
> the reverse order. Other scenarios are not tested and may lead to unexpected
> behaviour.
> Let's describe all possible scenarios, expected behaviour for each of them
> and extend test coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
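
The brute-force enumeration suggested in the comment can be sketched roughly as follows. This is only an illustration, not Ignite code. The node names ({{C}}, {{M}}, {{D1}}, {{D2}}), the single-role-per-node simplification, and every function name are assumptions made for the example. The sketch lists every set of running nodes, every single-node start or stop between them, and then collapses transitions that differ only in which data node is involved.

```python
from itertools import combinations

# Hypothetical node names (assumption): C hosts the CMG, M hosts the
# MetaStorage, D1 and D2 are plain data nodes. The issue allows a node to
# belong to several groups; one role per node is a simplification.
NODES = ("C", "M", "D1", "D2")


def role(node):
    """Map a node name to its role; all data nodes share the role 'D'."""
    return "D" if node.startswith("D") else node


def configurations(nodes=NODES):
    """Yield every subset of running nodes, from the empty grid to the full one."""
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            yield frozenset(subset)


def transitions(nodes=NODES):
    """Yield (before, after) pairs that differ by starting or stopping one node."""
    for before in configurations(nodes):
        for node in nodes:
            # Symmetric difference toggles the node: start it if stopped,
            # stop it if running.
            yield before, before ^ {node}


def signature(config):
    """Role-level view of a configuration, with data nodes interchangeable."""
    return tuple(sorted(role(n) for n in config))


def distinct_transitions(nodes=NODES):
    """Collapse transitions that differ only in which data node is involved."""
    return {(signature(a), signature(b)) for a, b in transitions(nodes)}
```

With the four hypothetical nodes above, {{[C, M] -> [C, M, D1]}} and {{[C, M] -> [C, M, D2]}} map to the same role-level transition, matching the equivalence noted in the comment. The sketch keeps no node history, so it cannot tell a startup sequence from a restart sequence; that case would still need separate handling.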