[
https://issues.apache.org/jira/browse/HELIX-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080926#comment-16080926
]
Jiajun Wang edited comment on HELIX-659 at 7/10/17 7:25 PM:
------------------------------------------------------------
h2. Design Details
h3. Register Secondary States Model / Factory
Note that if a secondary state model is a dynamic state model,
defaultTransitionHandler must be implemented.
*State Model Factory*
public abstract class DynamicStateModelFactory extends StateModelFactory<DynamicStateModel> {
  ...
}

public abstract class DynamicStateModel extends StateModel {
  static final String DEFAULT_INITIAL_STATE = "UNKNOWN";
  protected String _currentState = DEFAULT_INITIAL_STATE;

  public String getCurrentState() {
    return _currentState;
  }

  // !!!!!!!!!!! Changed part !!!!!!!!!!!! //
  @Transition(from = "from", to = "to")
  public void defaultTransitionHandler(Message message, NotificationContext context) {
    logger.error("Default transition handler. The idea is to invoke this if no "
        + "transition method is found. To be implemented");
  }

  public boolean updateState(String newState) {
    _currentState = newState;
    return true;
  }

  public void rollbackOnError(Message message, NotificationContext context,
      StateTransitionError error) {
    logger.error("Default rollback method invoked on error. Error Code: " + error.getCode());
  }

  public void reset() {
    logger.warn("Default reset method invoked. Either because the process no longer "
        + "owns this resource or the session timed out");
  }

  // !!!!!!!!!! Internal states such as ERROR will still exist and be supported !!!!!!!!!! //
  @Transition(to = "DROPPED", from = "ERROR")
  public void onBecomeDroppedFromError(Message message, NotificationContext context)
      throws Exception {
    logger.info("Default ERROR->DROPPED transition invoked.");
  }
}
h2. Resource Configuration
Secondary states are conceptually map values.
Besides the state itself, each state model may also have a different factory
name, so there are two mappings: <StateModel, Factory> and <StateModel, State>.
We keep the existing design: 1. state configurations are at the partition
level; 2. state model factory configurations are at the resource level.
To allow multiple states to be configured, we propose representing them as a
JSON string. Note that the state model name is used as the key, so a state
model cannot appear more than once in one partition.
*Resource config with secondary state VERSION*
{
"id":"Test_Resource"
,"simpleFields":{
"SECONDARY_STATE_MODEL_DEF" : "{VERSION : VersionStateModelFactory}"
}
,"mapFields":{
"partition_1" : "{VERSION : 1.0.1}"
,"partition_2" : "{VERSION : 1.0.2}"
}
}
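The per-partition values above, such as "{VERSION : 1.0.1}", use an unquoted shorthand rather than strict JSON. Below is a minimal sketch of how such a string could be parsed into a <StateModel, State> map while enforcing the "no duplicate model" rule. SecondaryStateCodec is a hypothetical helper, not a Helix class, and the comma separator for multiple entries is an assumption, since the examples only show a single entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for the "{MODEL : STATE}" shorthand; not part of Helix. */
final class SecondaryStateCodec {
  private SecondaryStateCodec() {}

  /** Parses e.g. "{VERSION : 1.0.1}" into an ordered model-to-state map. */
  static Map<String, String> parse(String encoded) {
    String body = encoded.trim();
    if (!body.startsWith("{") || !body.endsWith("}")) {
      throw new IllegalArgumentException("Expected {MODEL : STATE} but got: " + encoded);
    }
    body = body.substring(1, body.length() - 1).trim();
    Map<String, String> states = new LinkedHashMap<>();
    if (body.isEmpty()) {
      return states;
    }
    // Assumption: multiple entries are separated by commas.
    for (String entry : body.split(",")) {
      String[] kv = entry.split(":", 2);
      if (kv.length != 2 || kv[0].trim().isEmpty() || kv[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Malformed entry: " + entry);
      }
      // The state model name is the key, so a repeated model is rejected.
      if (states.put(kv[0].trim(), kv[1].trim()) != null) {
        throw new IllegalArgumentException("Duplicate state model: " + kv[0].trim());
      }
    }
    return states;
  }

  /** Formats a model-to-state map back into the shorthand. */
  static String format(Map<String, String> states) {
    StringBuilder sb = new StringBuilder("{");
    boolean first = true;
    for (Map.Entry<String, String> e : states.entrySet()) {
      if (!first) {
        sb.append(", ");
      }
      sb.append(e.getKey()).append(" : ").append(e.getValue());
      first = false;
    }
    return sb.append("}").toString();
  }
}
```

For example, SecondaryStateCodec.parse("{VERSION : 1.0.1}") yields a map with the single entry VERSION -> 1.0.1, and format turns that map back into the same string.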
*Additional APIs to configure secondary states*
/**
* Set configuration values
* @param scope
* @param listProperties
*/
void setConfig(HelixConfigScope scope, Map<String, List<String>> listProperties);
/**
* Get configuration values
* @param scope
* @param keys
* @return configuration values ordered by the provided keys
*/
Map<String, List<String>> getConfig(HelixConfigScope scope, List<String> keys);
h3. Partitions with the Secondary States shown in Current State and External View
The current state shows both the secondary state models and the states, in the
same format as the resource configuration.
*Current States*
{
"id":"example_resource"
,"simpleFields":{
"STATE_MODEL_DEF":"MasterSlave"
,"STATE_MODEL_FACTORY_NAME":"DEFAULT"
,"BUCKET_SIZE":"0"
,"SESSION_ID":"25b2ce5dfbde0fa"
,"SECONDARY_STATE_MODEL_DEF" : "{VERSION : VersionStateModelFactory}"
}
,"listFields":{
}
,"mapFields":{
"partition_1":{
"CURRENT_STATE":"MASTER"
,"SECONDARY_STATES":"{VERSION : 1.0.1}"
,"INFO":""
}
,"partition_2":{
"CURRENT_STATE":"SLAVE"
,"SECONDARY_STATES":"{VERSION : 1.0.1}"
,"INFO":""
}
}
}
As for the external view, there are 2 options for showing secondary states.
1. Compressing all states into one string by combining the main state with the
secondary states. The states are separated by ":".
*Secondary state in External View*
{
"id":"example_resource"
,"simpleFields":{
"STATE_MODEL_DEF_REF":"MasterSlave"
,"ASSOCIATE_STATE_MODEL_DEF_REFS" : "{VERSION : VersionStateModelFactory}"
}
,"listFields":{
}
,"mapFields":{
"example_resource_0":{
"app0004.stg.com_11900":"{MasterSlave : MASTER} : {VERSION : 1.0.1}"
,"app0048.stg.com_11900":"{MasterSlave : SLAVE} : {VERSION : 1.0.0}"
}
}
}
2. Adding new fields for showing secondary states separately.
*Secondary state in External View*
{
"id":"example_resource"
,"simpleFields":{
"STATE_MODEL_DEF_REF":"MasterSlave"
,"ASSOCIATE_STATE_MODEL_DEF_REFS" : "{VERSION : VersionStateModelFactory}"
}
,"listFields":{
}
,"mapFields":{
"example_resource_0":{
"app0004.stg.com_11900":"MASTER"
,"app0048.stg.com_11900":"SLAVE"
,"app0004.stg.com_11900_SECONDARY_STATE":"{VERSION : 1.0.1}"
,"app0048.stg.com_11900_SECONDARY_STATE":"{VERSION : 1.0.0}"
}
}
}
Both options have backward-compatibility issues. The first design changes the
state string, so legacy clients won't be able to interpret it. The second
design adds items to the map fields, so applications that read this map will
find additional entries with incorrect names.
Comparing these 2 options, the first one fits our long-term goals much better,
so it is our choice for phase one.
To address the backward-compatibility issue, we plan to create an additional
external view ZK node that holds the new format. The old external view node
will be kept the same.
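As an illustration of the first option, here is a minimal sketch of how the combined string could be encoded and decoded, and how the plain main state kept in the old external view node could be recovered from it. CombinedStateCodec and its method names are hypothetical; only the "{Model : STATE} : {Model : STATE}" layout is taken from the example above, and the sketch assumes the main state model always comes first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical codec for the combined external view value; not part of Helix. */
final class CombinedStateCodec {
  private CombinedStateCodec() {}

  /** Builds e.g. "{MasterSlave : MASTER} : {VERSION : 1.0.1}". */
  static String encode(String mainModel, String mainState, Map<String, String> secondary) {
    List<String> groups = new ArrayList<>();
    groups.add("{" + mainModel + " : " + mainState + "}");
    for (Map.Entry<String, String> e : secondary.entrySet()) {
      groups.add("{" + e.getKey() + " : " + e.getValue() + "}");
    }
    return String.join(" : ", groups);
  }

  /** Returns model-to-state pairs in encoded order; the main model is first. */
  static Map<String, String> decode(String combined) {
    String trimmed = combined.trim();
    if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
      throw new IllegalArgumentException("Not a combined state: " + combined);
    }
    String inner = trimmed.substring(1, trimmed.length() - 1);
    Map<String, String> states = new LinkedHashMap<>();
    // Groups are delimited by "} : {" once the outer braces are removed.
    for (String group : inner.split("\\}\\s*:\\s*\\{")) {
      String[] kv = group.split(":", 2);
      if (kv.length != 2) {
        throw new IllegalArgumentException("Malformed group: " + group);
      }
      states.put(kv[0].trim(), kv[1].trim());
    }
    return states;
  }

  /** The value a legacy client would expect: the main state only. */
  static String legacyState(String combined) {
    return decode(combined).values().iterator().next();
  }
}
```

With this layout, a controller could write the encoded value to the new node and legacyState(...) to the old one, so legacy clients keep reading "MASTER" or "SLAVE" unchanged.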
h3. State Transition Message
When multiple states change, the messages are sent in order of priority. There
won't be parallel state transitions on one partition.
h3. Helix Controller Updates
When the resource configuration is changed:
* Fill ClusterDataCache with secondary states and state models/factories.
* Compare against the current states to find the delta and compose messages accordingly. Order the messages according to state model priority.
* Send the highest-priority message to the participant.
One optimization opportunity is to allow parallel state transition messages
when there is no conflict.
When a participant's current state is changed:
* Read the secondary states and fill the new external view ZK node with the encoded complete status information.
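The controller steps for a configuration change can be sketched as follows: compute the delta between target and current secondary states for one partition, order the resulting transitions by state model priority, and send only the first one. All names here (SecondaryTransitionPlanner, PendingTransition) are hypothetical, and representing priority as an ordered list of model names is an assumption, since the design does not define how priority is configured.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a secondary state transition message. */
final class PendingTransition {
  final String partition;
  final String model;
  final String fromState;
  final String toState;

  PendingTransition(String partition, String model, String fromState, String toState) {
    this.partition = partition;
    this.model = model;
    this.fromState = fromState;
    this.toState = toState;
  }
}

/** Hypothetical planner; not part of Helix. */
final class SecondaryTransitionPlanner {
  private SecondaryTransitionPlanner() {}

  /**
   * @param target   model to target state, from the resource configuration
   * @param current  model to state, as reported in the current state
   * @param priority model names, highest priority first (assumed representation)
   * @return pending transitions for this partition, highest priority first
   */
  static List<PendingTransition> plan(String partition, Map<String, String> target,
      Map<String, String> current, List<String> priority) {
    List<PendingTransition> pending = new ArrayList<>();
    for (Map.Entry<String, String> e : target.entrySet()) {
      // A model with no reported state starts from the default initial state.
      String from = current.getOrDefault(e.getKey(), "UNKNOWN");
      if (!from.equals(e.getValue())) {
        pending.add(new PendingTransition(partition, e.getKey(), from, e.getValue()));
      }
    }
    pending.sort(Comparator.comparingInt((PendingTransition t) -> rank(priority, t.model)));
    return pending;
  }

  /** Only one transition is in flight per partition: the highest-priority one, or null. */
  static PendingTransition next(String partition, Map<String, String> target,
      Map<String, String> current, List<String> priority) {
    List<PendingTransition> pending = plan(partition, target, current, priority);
    return pending.isEmpty() ? null : pending.get(0);
  }

  // Models missing from the priority list sort last.
  private static int rank(List<String> priority, String model) {
    int index = priority.indexOf(model);
    return index < 0 ? Integer.MAX_VALUE : index;
  }
}
```

Re-running next(...) after each current state update sends the remaining transitions one at a time, which matches the rule that there are no parallel transitions on one partition.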
h3. Helix Participant Updates
On receiving a state transition message:
* Check whether the message is for a registered state model. If so, trigger the state transition.
* If any state transition fails, set an error state and stop processing. The user should fix the problem and reset to the initial state.
* If the state transition succeeds, update the current state.
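The participant-side handling listed above could look roughly like the sketch below. SecondaryStateParticipant is a hypothetical stand-in that replaces Helix messages and state model factories with a plain handler per model name, so it only illustrates the control flow: ignore unregistered models, move to ERROR on failure, refuse further transitions until reset, and update the current state on success.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical participant-side dispatcher; not part of Helix. */
final class SecondaryStateParticipant {
  static final String INITIAL_STATE = "UNKNOWN";
  static final String ERROR_STATE = "ERROR";

  // Model name to handler(fromState, toState); stands in for a registered state model.
  private final Map<String, BiConsumer<String, String>> handlers = new HashMap<>();
  private final Map<String, String> currentStates = new HashMap<>();

  void register(String model, BiConsumer<String, String> handler) {
    handlers.put(model, handler);
    currentStates.put(model, INITIAL_STATE);
  }

  /** Returns true only if the transition ran and the current state was updated. */
  boolean onTransition(String model, String toState) {
    BiConsumer<String, String> handler = handlers.get(model);
    if (handler == null) {
      return false; // not a registered state model
    }
    String fromState = currentStates.get(model);
    if (ERROR_STATE.equals(fromState)) {
      return false; // stop processing until the user resets the state
    }
    try {
      handler.accept(fromState, toState);
    } catch (RuntimeException e) {
      currentStates.put(model, ERROR_STATE);
      return false;
    }
    currentStates.put(model, toState);
    return true;
  }

  /** Puts a registered model back into the initial state after the problem is fixed. */
  void reset(String model) {
    if (handlers.containsKey(model)) {
      currentStates.put(model, INITIAL_STATE);
    }
  }

  String currentState(String model) {
    return currentStates.get(model);
  }
}
```

In this sketch a failed transition leaves the model in ERROR, and every later message for that model is rejected until reset(...) is called.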
h2. Alternative Options for Supporting Additional States
h3. Introducing special state for additional status change
Add a new internal state UPGRADING (or other special states) for status
changes.
Any additional status change then happens when a partition transitions "to"
or "from" the UPGRADING state.
Note that the application is free to define whether UPGRADING is a special
online status or not. This decouples the main state from the additional
"states".
For the Pinot case, upgrading partitions (even before they are back to ONLINE)
might be active partitions.
The problem with this new state is that it only works well for a single
additional state model.
Once there is more than one state model to take care of, and they change
separately, the UPGRADING state is not enough.
h3. Rely on resetting partition to load new "states"
Whenever new states are going to be set, the application updates the resource
configuration and then resets all partitions.
During the state transition from offline to online, participants read the new
states from the configuration and apply them to the related partitions.
The problem is that a change in the additional states affects the main state:
the partition will be offline for a while.
h3. Application registers additional message handler for customized transition message
In this method, the application owns the logic. Helix just dispatches a
customized state transition message to trigger the operation. In the message
handler, the application reads and writes the information of the additional
state to the property store.
Considering that additional states are a generic requirement, letting multiple
applications implement similar logic separately does not make sense.
> Extend Helix to Support Resource with Multiple States
> -----------------------------------------------------
>
> Key: HELIX-659
> URL: https://issues.apache.org/jira/browse/HELIX-659
> Project: Apache Helix
> Issue Type: New Feature
> Components: helix-core
> Affects Versions: 0.6.x
> Reporter: Jiajun Wang
>
> h1. Problem Statement
> h2. Single State Model vs. Multiple State Models
> Currently, each Helix resource is associated with a single state model, and
> each replica of a partition can be in only one of the states defined in
> that state model at any time. Helix manages state transitions based on this
> single state model.
> !https://documents.lucidchart.com/documents/e19ab04e-aa06-4ab3-9e57-cfe273554fa1/pages/0_0?a=2416&x=-11&y=71&w=517&h=198&store=1&accept=image%2F*&auth=LCA%20313ced8fb855e8fc1a7043f7fe91cdfa15fffb6b-ts%3D1498857664!
> However, in many scenarios, a resource is too complicated to be modeled
> by a single state model.
> As an example, partitions of a resource could be described in different
> dimensions: Slave/Master state, Read or Write state, and version. They
> represent different dimensions of the overall resource status. States in
> each dimension are based on different state models. Note that the state
> machines are simplified in this document.
> !https://documents.lucidchart.com/documents/e19ab04e-aa06-4ab3-9e57-cfe273554fa1/pages/0_0?a=2416&x=-71&y=66&w=1822&h=308&store=1&accept=image%2F*&auth=LCA%2041fa743ba130f41786dee3527de6206cebdd4534-ts%3D1498857664!
> The basic idea is that states in these 3 dimensions are in parallel and can
> be changed independently. For instance, R/W state may be changed without
> updating slave/master state.
> h2. Finite State Machine vs. Dynamic State Model
> In addition, Helix employs a finite state machine to define a state model.
> However, some state models cannot easily be modeled by a finite state machine
> with fixed states, for example, versions. We call such a state model a
> dynamic state model. It is read, set, and understood by the application.
> We will need to extend Helix to support such dynamic state models. Note that
> Helix should not and will not be able to calculate the best possible dynamic
> states.
> The version of a piece of software is one of the best examples for
> understanding dynamic state.
> Let's consider an application that is deployed on multiple nodes, which work
> together as a cluster. The green node works as the master, and all dark blue
> nodes are slaves. When admins upgrade the service from 1.0.0 to 1.1.0, they
> need to upgrade all nodes to the new version before declaring the upgrade
> done. After the upgrade process, it is important to ensure that all
> software versions are consistent.
> If the Helix framework is used to support upgrading the cluster, it helps
> simplify the application logic and ensure consistency. For instance, the
> service (cluster) itself is regarded as the resource, and each node is mapped
> to a partition. Upgrading is then simply a state transition. Admins can check
> the external view to ensure consistency.
> Note that during this version upgrade, the master node is still the master
> node, and slave nodes are still slave nodes. So the version state is parallel
> to the other states.
> !https://documents.lucidchart.com/documents/e19ab04e-aa06-4ab3-9e57-cfe273554fa1/pages/0_0?a=2066&x=1466&y=922&w=560&h=455&store=1&accept=image%2F*&auth=LCA%20fa3d8fc0d113a82f4e94b127161cf91818a2fe64-ts%3D1497894598!