In a roaming SC cluster, when both the active and standby SCs go down
and the SC Absence feature is enabled, the cluster nominates
another SC to become active. Amfnd on the nominated SC has
led_state set to true; therefore, amfd on the nominated SC
proceeds with the restoration regardless of the sync window.
: 876c9eb232f19de8ef8c5ddd1b3dcce4b1b4a8b3
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files
: 854a8e03042d6a53a45b903262f5197a52a87525
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts
This ticket revisits the wait for peer info and
fixes the problem of out-of-order peer_up and peer info messages
in commit d1593b03b3c9bec292b14dde65264c261760bf46
---
src/rde/rded/rde_main.cc | 1 +
src/rde/rded/role.cc | 63 +++-
src/rde/rded/role.h | 7
RDE sends a peer info message to each peer it detects through a peer up message.
In a roaming SC cluster, when all SCs rejoin after a network split, all RDEs
are active. Duplicate active detection relies on the peer info
message, which can be seen as one-on-one detection. The mechanism
may cause the last SC not
RDE detects the peer_up message and assumes the peer_info message
will follow. However, in a roaming SC cluster, when all SCs rejoin
after a network split, the last active SC may miss the peer
info message because the other SCs have already rebooted.
The patch adds a timeout to the wait for peer info.
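The bounded wait described above could look roughly like the sketch below. It is not taken from rded: the class and method names are hypothetical, and it assumes the peer info message is reported from another thread, which may not match how rded is structured.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper (not rded code): waits a bounded time for a peer info
// message that another thread reports through NotifyPeerInfo().
class PeerInfoWaiter {
 public:
  // Called when the peer info message arrives.
  void NotifyPeerInfo() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      received_ = true;
    }
    cond_.notify_all();
  }

  // Returns true if peer info arrived before the timeout expired, false if
  // the caller should stop waiting and carry on without it.
  bool WaitForPeerInfo(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, timeout, [this] { return received_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool received_ = false;
};
```

With this shape, an SC that never receives peer info gets control back after the timeout instead of waiting indefinitely.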
: f938c0c375bbd77c4343d4bf3bed57abd45b58aa
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files
RDE sends a peer info message to each peer it detects through a peer up message.
In a roaming SC cluster, when all SCs rejoin after a network split, all RDEs
are active. Duplicate active detection relies on the peer info
message, which can be seen as one-on-one detection. The mechanism
may cause the last SC not
RDE detects the peer_up message and assumes the peer_info message
will follow. However, in a roaming SC cluster, when all SCs rejoin
after a network split, the last active SC may miss the peer
info message because the other SCs have already rebooted.
The patch adds a timeout to the wait for peer info.
: f938c0c375bbd77c4343d4bf3bed57abd45b58aa
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
---
src/imm/agent/imma_oi_api.cc | 2 +-
src/imm/agent/imma_om_api.cc | 2 +-
src/imm/agent/imma_proc.cc | 5 +++--
src/imm/immd/immd_cb.h | 1 -
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/imm/agent/imma_oi_api.cc b/src/imm/agent/imma_oi_api.cc
index
Immnd allows IMMSV_FEVS_MAX_PENDING to be set from an environment
variable and uses the default value (16) otherwise.
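A minimal sketch of the "environment variable or default" behaviour follows. The helper name and the fallback policy for malformed values are assumptions of this sketch, not the immnd implementation; only the variable name and the default of 16 come from the commit message.

```cpp
#include <cerrno>
#include <cstdlib>

// Default named in the commit message above.
constexpr long kDefaultFevsMaxPending = 16;

// Hypothetical helper: reads a positive integer from the environment
// variable `name`. Falls back to `fallback` if the variable is unset, empty,
// not a number, or not positive (the fallback policy is an assumption).
long ReadPositiveEnv(const char* name, long fallback) {
  const char* text = std::getenv(name);
  if (text == nullptr || *text == '\0') return fallback;
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(text, &end, 10);
  if (errno != 0 || *end != '\0' || value <= 0) return fallback;
  return value;
}
```

A caller would use `ReadPositiveEnv("IMMSV_FEVS_MAX_PENDING", kDefaultFevsMaxPending)` once at start-up and keep the result.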
---
src/imm/common/immsv_api.h | 4
src/imm/immloadd/imm_loader.cc | 2 +-
src/imm/immnd/ImmModel.cc | 2 +-
src/imm/immnd/immnd.conf | 4
: 8259816e20e83658f04f0264af19cafa0cdd2755
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Immnd allows IMMSV_FEVS_MAX_PENDING to be set from an environment
variable and uses the default value (16) otherwise.
---
src/imm/common/immsv_api.h | 4
src/imm/immnd/ImmModel.cc | 2 +-
src/imm/immnd/immnd_cb.h | 1 +
src/imm/immnd/immnd_evt.c | 41 ++
---
src/imm/agent/imma_oi_api.cc | 2 +-
src/imm/agent/imma_om_api.cc | 2 +-
src/imm/agent/imma_proc.cc | 5 +++--
src/imm/immd/immd_cb.h | 1 -
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/imm/agent/imma_oi_api.cc b/src/imm/agent/imma_oi_api.cc
index
: 8259816e20e83658f04f0264af19cafa0cdd2755
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
---
src/imm/agent/imma_oi_api.cc | 2 +-
src/imm/agent/imma_om_api.cc | 2 +-
src/imm/agent/imma_proc.cc | 5 +++--
src/imm/immd/immd_cb.h | 1 -
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/imm/agent/imma_oi_api.cc b/src/imm/agent/imma_oi_api.cc
index
Immnd allows IMMSV_FEVS_MAX_PENDING to be set from an environment
variable and uses the default value (16) otherwise.
---
src/imm/common/immsv_api.h | 4
src/imm/immnd/ImmModel.cc | 2 +-
src/imm/immnd/immnd_cb.h | 1 +
src/imm/immnd/immnd_evt.c | 41 ++
: 8259816e20e83658f04f0264af19cafa0cdd2755
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
: 2c13c9ea579dc064b1d6adcce98d62efb3d0032d
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
When network partitioning results in a new etcd leader,
the 'get' API in the bigger partition is unavailable for a
few seconds. Therefore, the SC in the bigger partition cannot promote itself
and self-fences instead.
This patch adds etcd_tolerance_timeout so the SC in the bigger partition
In the scenario where amfnd terminates a huge number of components
at once (around 800 components), amfnd catches the SIGCHLD signal
from the components' processes in its signal handler and calls write() to
notify amfnd's threads to proceed with component termination. As a
result, multiple blocking
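The signal-to-thread notification described above is commonly built as a "self-pipe". The snippet is cut off, so what the patch itself changes is not shown here; the sketch below only illustrates the generic pattern with both pipe ends set non-blocking, so that a burst of SIGCHLD signals cannot stall inside the handler. All names are hypothetical and none of this is amfnd code.

```cpp
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical wakeup pipe: [0] is read by a worker thread, [1] is written
// by the signal handler.
static int g_wakeup_fds[2] = {-1, -1};

// Creates the pipe and makes both ends non-blocking.
bool InitWakeupPipe() {
  if (pipe(g_wakeup_fds) != 0) return false;
  for (int fd : g_wakeup_fds) {
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return false;
  }
  return true;
}

// SIGCHLD handler: uses only async-signal-safe calls and preserves errno.
extern "C" void OnSigchld(int) {
  const int saved_errno = errno;
  const char byte = 1;
  // If the pipe is full, write() fails with EAGAIN instead of blocking;
  // that is acceptable because a wakeup is already pending.
  const ssize_t ignored = write(g_wakeup_fds[1], &byte, 1);
  (void)ignored;
  errno = saved_errno;
}

// Installs the handler for SIGCHLD.
bool InstallSigchldHandler() {
  struct sigaction sa = {};
  sa.sa_handler = OnSigchld;
  sa.sa_flags = SA_RESTART;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGCHLD, &sa, nullptr) == 0;
}

// Drains pending wakeup bytes; returns how many were read.
int DrainWakeupPipe() {
  char buf[256];
  int total = 0;
  ssize_t n;
  while ((n = read(g_wakeup_fds[0], buf, sizeof(buf))) > 0) {
    total += static_cast<int>(n);
  }
  return total;
}
```

A worker thread would poll `g_wakeup_fds[0]`, call `DrainWakeupPipe()` when it becomes readable, and then reap every exited child in a loop, since several signals may collapse into one wakeup.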
: 17038f9f9bbbde98b68fccb5b65413e14fe46418
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
: 17038f9f9bbbde98b68fccb5b65413e14fe46418
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
In the scenario where amfnd terminates a huge number of components
at once (around 800 components), amfnd catches the SIGCHLD signal
from the components' processes in its signal handler and calls write() to
notify amfnd's threads to proceed with component termination. As a
result, multiple blocking
In the scenario where amfnd terminates a huge number of components
at once (around 800 components), amfnd catches the SIGCHLD signal
from the components' processes in its signal handler and calls write() to
notify amfnd's threads to proceed with component termination. As a
result, multiple blocking
: fa78173f280133ceb47224bfbaf9e83b96873fc5
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
: f03fe23c17bd4e4e32dd4a1304d2ac8f247d05e7
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files
If an SC is separated from the cluster, fmd calls opensaf_quick_reboot().
The reboot script returns before the node has actually gone down.
In the code after opensaf_quick_reboot(), fmd tells rde to promote itself
to active. Hence, there is a short period with two active SCs.
This patch makes fmd stop
: 740100f2ebfb5458a8052dea29b5583b3dc8df5a
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
---
src/rde/rded/role.cc | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/rde/rded/role.cc b/src/rde/rded/role.cc
index b890117..9446ccb 100644
--- a/src/rde/rded/role.cc
+++ b/src/rde/rded/role.cc
@@ -107,6 +107,7 @@ void Role::PromoteNode(const uint64_t cluster_size,
rc =
Correct indent and reduce code lines (<80 chars) for
mds_mdtm_send_tipc() and mdtm_frag_and_send()
---
src/mds/mds_dt_tipc.c | 490 ++
1 file changed, 256 insertions(+), 234 deletions(-)
diff --git a/src/mds/mds_dt_tipc.c b/src/mds/mds_dt_tipc.c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
SAF
The patch avoids message reallocation if the message is in the
retransmission queue.
---
src/mds/mds_dt_tipc.c| 68 +++-
src/mds/mds_tipc_fctrl_intf.cc | 6 ++--
src/mds/mds_tipc_fctrl_intf.h| 4 +--
src/mds/mds_tipc_fctrl_msg.cc| 2 +-
: b61bee5c8accd79e573ef726d40b945afc7c7b3e
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
The patch avoids message reallocation if the message is in the
retransmission queue.
---
src/mds/mds_dt_tipc.c| 42 +++-
src/mds/mds_tipc_fctrl_intf.cc | 6 --
src/mds/mds_tipc_fctrl_intf.h| 4 ++--
src/mds/mds_tipc_fctrl_msg.cc| 2 +-
Correct indent and reduce code lines (<80 chars) for
mds_mdtm_send_tipc() and mdtm_frag_and_send()
---
src/mds/mds_dt_tipc.c | 484 ++
1 file changed, 254 insertions(+), 230 deletions(-)
diff --git a/src/mds/mds_dt_tipc.c b/src/mds/mds_dt_tipc.c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
SAF
The patch avoids message reallocation if
MDS_TIPC_FCTRL_ENABLED is set.
---
src/mds/mds_dt_tipc.c| 27 ---
src/mds/mds_tipc_fctrl_msg.cc| 2 +-
src/mds/mds_tipc_fctrl_portid.cc | 9 +++--
3 files changed, 24 insertions(+), 14 deletions(-)
diff --git
repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
SAF services
Broadcast/multicast messages are currently logged at NOTIFY
level because mds does not support broadcast/multicast messages,
so the logging can be helpful in some cases. However,
mds.log may be located on an NFS file system, and this logging
may cause a high rate of traffic towards that file system.
This
: ddb9d7065376df7757716013779755864d53ebe5
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts
---
src/mds/apitest/mdstipc_api.c | 83 ---
1 file changed, 78 insertions(+), 5 deletions(-)
diff --git a/src/mds/apitest/mdstipc_api.c b/src/mds/apitest/mdstipc_api.c
index 5c0e28a..651365e 100644
--- a/src/mds/apitest/mdstipc_api.c
+++
The legacy mds encodes the protocol version only in a non-fragmented
message or in the first fragment. Hence, mds cannot determine the
protocol version from the fragments that follow the first one.
The patch keeps the encoding of lengthcheck the same as in the legacy
mds version. Also,
Now that TipcPortId::ChangeState() has been added, the patch refactors
logging to shorten the code.
---
src/mds/mds_tipc_fctrl_portid.cc | 71
1 file changed, 21 insertions(+), 50 deletions(-)
diff --git a/src/mds/mds_tipc_fctrl_portid.cc
Patch unsets MDS_TIPC_FCTRL_ENABLED, MDS_TIPC_FCTRL_ACKTIMEOUT,
and MDS_TIPC_FCTRL_ACKSIZE to prevent child process inheritance.
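Preventing inheritance amounts to removing the variables from the process environment once their values have been read. The sketch below assumes a hypothetical helper name and is not the mds_dt_tipc.c code; only the three variable names come from the commit message.

```cpp
#include <cstdlib>

// Variable names listed in the commit message above.
static const char* const kFctrlEnvVars[] = {
    "MDS_TIPC_FCTRL_ENABLED",
    "MDS_TIPC_FCTRL_ACKTIMEOUT",
    "MDS_TIPC_FCTRL_ACKSIZE",
};

// Hypothetical helper: removes the flow control variables from this
// process's environment so that processes created later by fork()/exec() do
// not inherit them. Returns true if every unsetenv() call succeeded.
bool UnsetFctrlEnv() {
  bool ok = true;
  for (const char* name : kFctrlEnvVars) {
    if (unsetenv(name) != 0) ok = false;
  }
  return ok;
}
```

The values have to be read and stored before this call, because `getenv()` returns a null pointer for each variable afterwards.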
---
src/mds/mds_dt_tipc.c | 39 +--
1 file changed, 29 insertions(+), 10 deletions(-)
diff --git a/src/mds/mds_dt_tipc.c
: e685bdfb16dad852372751f80aa2ec49948db05c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
Patch unsets MDS_TIPC_FCTRL_ENABLED, MDS_TIPC_FCTRL_ACKTIMEOUT,
and MDS_TIPC_FCTRL_ACKSIZE to prevent child process inheritance.
---
src/mds/mds_dt_tipc.c | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/mds/mds_dt_tipc.c b/src/mds/mds_dt_tipc.c
index e7a7b48..12b275d 100644
: e685bdfb16dad852372751f80aa2ec49948db05c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
The mds flow control has been disabled for unfragmented broadcast/multicast
messages if tipc multicast is enabled. This patch revisits that change and
extends it to fragmented messages.
---
src/mds/mds_tipc_fctrl_intf.cc | 47
src/mds/mds_tipc_fctrl_msg.h | 11
: 95228b1a2a53e3b74c9a54f65e8b2345b8603582
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
: 95228b1a2a53e3b74c9a54f65e8b2345b8603582
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
According to RFC1982: "Addition of a value outside the range
[0 .. (2^(SERIAL_BITS - 1) - 1)] is undefined.". Mds uses 16
bits for mds flow control, thus the maximum allowed range of
window size is 2^15 - 1 = 32767.
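The bound above can be written down as a compile-time check. This is a sketch with hypothetical names, not the mds code; it only encodes the RFC 1982 rule quoted above for 16-bit serial numbers.

```cpp
#include <cstdint>

constexpr unsigned kSerialBits = 16;

// Largest increment for which RFC 1982 defines serial addition with 16-bit
// serial numbers: 2^(16 - 1) - 1.
constexpr std::uint16_t kMaxSerialIncrement =
    (1u << (kSerialBits - 1)) - 1;
static_assert(kMaxSerialIncrement == 32767, "2^15 - 1 = 32767");

// Returns true if a window of `window_size` messages stays within the range
// for which serial addition is defined.
constexpr bool IsValidWindowSize(std::uint32_t window_size) {
  return window_size <= kMaxSerialIncrement;
}

// Serial addition: wraps modulo 2^16. The result is only meaningful when
// IsValidWindowSize(n) holds.
constexpr std::uint16_t SerialAdd(std::uint16_t s, std::uint16_t n) {
  return static_cast<std::uint16_t>(
      (static_cast<std::uint32_t>(s) + n) & 0xFFFFu);
}
```

For example, `SerialAdd(65530, 10)` wraps around to 4, while a window of 32768 is rejected by `IsValidWindowSize`.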
The 'mdstest 27 8' has randomly hit this limitation with the
counter errors that
mds relies on a data message sent from the peer to determine
whether MDS_TIPC_FCTRL_ENABLED is set. The data message
may not be sent right after the TIPC_PUBLISHED event, which can
cause the tx probation timer to time out.
This patch adds an Intro message, which is sent right after the
TIPC_PUBLISHED to
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
SAF
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup scripts n
SAF
mds currently uses the 4-byte MDS_PROT_FCTRL_ID value (0x00AC13F5)
in octet11 to octet14 to identify a flow control message,
e.g. a chunkack message. When a big message is fragmented,
the data of the second fragment onwards starts at octet11,
which may hold an arbitrary value and cause mds to
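The identification check described above can be pictured with the sketch below. It is not the mds code: the function name is hypothetical, and it assumes that "octet11" is a zero-based offset and that the value is stored in network byte order, neither of which the text states.

```cpp
#include <cstddef>
#include <cstdint>

// MDS_PROT_FCTRL_ID value quoted in the text.
constexpr std::uint32_t kFctrlId = 0x00AC13F5;
// Assumption of this sketch: "octet11" is a zero-based byte offset.
constexpr std::size_t kFctrlIdOffset = 11;

// Hypothetical check: returns true if the four bytes at octet11..octet14,
// read in network byte order (an assumption), equal the flow control id.
bool LooksLikeFctrlMessage(const std::uint8_t* buf, std::size_t len) {
  if (buf == nullptr || len < kFctrlIdOffset + 4) return false;
  const std::uint8_t* p = buf + kFctrlIdOffset;
  const std::uint32_t id = (static_cast<std::uint32_t>(p[0]) << 24) |
                           (static_cast<std::uint32_t>(p[1]) << 16) |
                           (static_cast<std::uint32_t>(p[2]) << 8) |
                           static_cast<std::uint32_t>(p[3]);
  return id == kFctrlId;
}
```

A check of this shape cannot tell a real flow control message from a later data fragment whose bytes at the same position happen to be 00 AC 13 F5.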
: e699c22ddc1ca8530318b0dc0bde46794a224bd9
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
This commit, part of #3095, updates the error strings to the
pattern "FCTRL:*Error[*]" to make it easier to grep for
errors in the mds debug log.
---
src/mds/mds_tipc_fctrl_intf.cc | 59 +---
src/mds/mds_tipc_fctrl_portid.cc | 10 ---
2 files changed, 43
During recovery from split-brain, both
active director services may suffer mds message loss due
to a tipc link that has lost contact. If MDS_TIPC_FCTRL_ENABLED is
set, out-of-order messages are dropped, and there
is no mechanism to trigger retransmission from the receiver
side at
: 46e9e0f310a6c21dbc89a9ffd8bee26829342c0c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
During recovery from split-brain, both
active director services may suffer mds message loss due
to a tipc link that has lost contact. If MDS_TIPC_FCTRL_ENABLED is
set, out-of-order messages are dropped, and there
is no mechanism to trigger retransmission from the receiver
side at
This patch implements the kRcvBuffOverflow state machine as
described in README file.
---
src/mds/mds_tipc_fctrl_intf.cc | 6 +-
src/mds/mds_tipc_fctrl_msg.h | 1 +
src/mds/mds_tipc_fctrl_portid.cc | 137 ++-
src/mds/mds_tipc_fctrl_portid.h | 5 +-
(Sending on behalf of Thuan)
This patch solves the linking issue if mds_dt.h or mds_core.h
is included in c++ sources.
---
src/mds/mds_core.h| 74 +++
src/mds/mds_dt.h | 4 +--
src/mds/mds_dt2c.h| 67
This patch adds state machine to support tx probation timer.
---
src/mds/mds_tipc_fctrl_intf.cc | 47 +++--
src/mds/mds_tipc_fctrl_msg.h | 1 +
src/mds/mds_tipc_fctrl_portid.cc | 109 +++
src/mds/mds_tipc_fctrl_portid.h | 22
4
---
src/mds/README | 221 +
1 file changed, 221 insertions(+)
create mode 100644 src/mds/README
diff --git a/src/mds/README b/src/mds/README
new file mode 100644
index 000..1b94632
--- /dev/null
+++ b/src/mds/README
@@ -0,0 +1,221 @@
(Sending on behalf of Thuan)
---
src/mds/apitest/mdstest.c | 5 +-
src/mds/apitest/mdstipc.h | 6 +-
src/mds/apitest/mdstipc_api.c | 237 +
src/mds/apitest/mdstipc_conf.c | 19 ++--
4 files changed, 253 insertions(+), 14 deletions(-)
diff
This patch makes the solution of TIPC buffer overflow configurable,
as well as the ack timeout/ack size.
For example:
The service config file can export the following environment variables
export MDS_TIPC_FCTRL_ENABLED=1
export MDS_TIPC_FCTRL_ACKTIMEOUT=1000
export MDS_TIPC_FCTRL_ACKSIZE=1
If
This is a collaborative patch by two participants: Thuan, Minh.
Main changes:
- Add mds_tipc_fctrl_intf.h, mds_tipc_fctrl_intf.cc: These two files
introduce new functions which are called in mds_dt_tipc.c if the flow
control is enabled
- Add mds_tipc_fctrl_portid.h, mds_tipc_fctrl_portid.cc: These
: 2d85d5d9264c6a7d1c6601b900fb810facbee3ac
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
If the ack size is configured to be greater than 1, the receiver end needs a
timeout after which it sends the ack message back to the sender.
The ack message timeout uses the poll timeout in the flow control thread
to keep mds lightweight (in contrast to additional timer threads).
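Reusing the poll timeout as the ack timer could look roughly like the sketch below. It is not the mds flow control thread: the struct and function names are hypothetical, each byte read from the descriptor stands in for one received message, and "sending" an ack is reduced to bumping a counter.

```cpp
#include <poll.h>
#include <unistd.h>

// Hypothetical receiver state for one sender.
struct AckState {
  int unacked = 0;    // messages received since the last ack
  int acks_sent = 0;  // acks "sent" so far (only counted in this sketch)
};

// One iteration of a flow control loop. Waits up to `ack_timeout_ms` for
// activity on `fd` and returns poll()'s result. An ack is sent either when
// `ack_size` messages have accumulated, or when poll() times out while some
// messages are still unacknowledged.
int PollOnce(int fd, int ack_timeout_ms, int ack_size, AckState* st) {
  struct pollfd pfd = {fd, POLLIN, 0};
  const int rc = poll(&pfd, 1, ack_timeout_ms);
  if (rc > 0 && (pfd.revents & POLLIN)) {
    char byte;
    if (read(fd, &byte, 1) == 1 && ++st->unacked >= ack_size) {
      ++st->acks_sent;
      st->unacked = 0;
    }
  } else if (rc == 0 && st->unacked > 0) {
    // Timeout: flush the pending ack without any separate timer thread.
    ++st->acks_sent;
    st->unacked = 0;
  }
  return rc;
}
```

Because the timeout restarts on every call, a steady trickle of messages below the ack size delays the timeout-driven ack; a real implementation might track a deadline instead.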
---
This patch applies serial number arithmetic to the flow control
sequence number, following RFC1982.
This is only a temporary patch; a proper one could be made in /base
as a template that also supports other types, e.g. uint32. mds would then
reuse it from /base.
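A templated version along the lines suggested above might look like this. It is a sketch only, not the /base implementation the text proposes; the class name is hypothetical and it covers just the RFC 1982 ordering for unsigned types.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical serial number wrapper for any unsigned integer type.
template <typename T>
class SerialNumber {
  static_assert(std::is_unsigned<T>::value, "T must be an unsigned type");

 public:
  constexpr explicit SerialNumber(T value) : value_(value) {}
  constexpr T value() const { return value_; }

  // RFC 1982 ordering: a < b when b is ahead of a by 1 .. 2^(bits-1) - 1,
  // computed modulo 2^bits. A distance of exactly 2^(bits-1) is undefined
  // in the RFC; here neither value compares less than the other.
  friend constexpr bool operator<(SerialNumber a, SerialNumber b) {
    return a.value_ != b.value_ &&
           static_cast<T>(b.value_ - a.value_) < Half();
  }
  friend constexpr bool operator==(SerialNumber a, SerialNumber b) {
    return a.value_ == b.value_;
  }

 private:
  // 2^(bits-1) for the underlying type.
  static constexpr T Half() {
    return static_cast<T>(T(1) << (std::numeric_limits<T>::digits - 1));
  }
  T value_;
};
```

With `SerialNumber<std::uint16_t>`, 65535 compares less than 0 because 0 is one step ahead after wrap-around; the same template works unchanged for `std::uint32_t`.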
---
src/mds/mds_tipc_fctrl_portid.cc | 53
: 2d85d5d9264c6a7d1c6601b900fb810facbee3ac
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
This is a collaborative patch of two participants:
- Tran Thuan
- Minh Chau
Main changes:
- Add mds_tipc_fctrl_intf.h, mds_tipc_fctrl_intf.cc: These two files
introduce new functions which are called in mds_dt_tipc.c if the flow
control is enabled
- Add mds_tipc_fctrl_portid.h
This patch adds state machine to support tx probation timer.
---
src/mds/mds_tipc_fctrl_intf.cc | 47 +++--
src/mds/mds_tipc_fctrl_msg.h | 1 +
src/mds/mds_tipc_fctrl_portid.cc | 109 +++
src/mds/mds_tipc_fctrl_portid.h | 22
4
---
src/mds/README | 221 +
1 file changed, 221 insertions(+)
create mode 100644 src/mds/README
diff --git a/src/mds/README b/src/mds/README
new file mode 100644
index 000..1b94632
--- /dev/null
+++ b/src/mds/README
@@ -0,0 +1,221 @@
From: Tran Thuan
This patch solves the linking issue if mds_dt.h or mds_core.h
is included in c++ sources.
---
src/mds/mds_core.h| 74 +++
src/mds/mds_dt.h | 4 +--
src/mds/mds_dt2c.h| 67
From: Tran Thuan
---
src/mds/apitest/mdstest.c | 5 +-
src/mds/apitest/mdstipc.h | 6 +-
src/mds/apitest/mdstipc_api.c | 237 +
src/mds/apitest/mdstipc_conf.c | 19 ++--
4 files changed, 253 insertions(+), 14 deletions(-)
diff --git
This patch implements the kRcvBuffOverflow state machine as
described in README file.
---
src/mds/mds_tipc_fctrl_intf.cc | 6 +-
src/mds/mds_tipc_fctrl_msg.h | 1 +
src/mds/mds_tipc_fctrl_portid.cc | 137 ++-
src/mds/mds_tipc_fctrl_portid.h | 5 +-
This patch makes the solution of TIPC buffer overflow configurable,
as well as the ack timeout/ack size.
For example:
The service config file can export the following environment variables
export MDS_TIPC_FCTRL_ENABLED=1
export MDS_TIPC_FCTRL_ACKTIMEOUT=1000
export MDS_TIPC_FCTRL_ACKSIZE=1
If
This patch applies serial number arithmetic to the flow control
sequence number, following RFC1982.
This is only a temporary patch; a proper one could be made in /base
as a template that also supports other types, e.g. uint32. mds would then
reuse it from /base.
---
src/mds/mds_tipc_fctrl_portid.cc | 53
If the ack size is configured to be greater than 1, the receiver end needs a
timeout after which it sends the ack message back to the sender.
The ack message timeout uses the poll timeout in the flow control thread
to keep mds lightweight (in contrast to additional timer threads).
---
Hi Gary,
Ack for code review. A few other places that call
opensaf_quick_reboot can still be revisited later.
Thanks,
Minh
> Summary: fmd: improve failover response time V2 [#3008]
> Review request for Ticket(s): 3008
> Peer Reviewer(s): Hans, Minh
> Pull request to: *** LIST THE PERSON WITH PUSH
: 497b55530e1562b88522e3e2a6f4c5dd21fb4f50
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
---
src/fm/fmd/fm_rda.cc | 4 ++--
src/rde/rded/rde_main.cc | 8 +++-
src/rde/rded/role.cc | 8
3 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/src/fm/fmd/fm_rda.cc b/src/fm/fmd/fm_rda.cc
index 028bfa3..0aa5a3d 100644
--- a/src/fm/fmd/fm_rda.cc
+++
---
src/amf/amfd/sg_2n_fsm.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/amf/amfd/sg_2n_fsm.cc b/src/amf/amfd/sg_2n_fsm.cc
index a218786..91ffc63 100644
--- a/src/amf/amfd/sg_2n_fsm.cc
+++ b/src/amf/amfd/sg_2n_fsm.cc
@@ -1784,7 +1784,8 @@ uint32_t
: 9a730d22b0580e6e3c54fd3a4fd5bb4cf82c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
---
src/amf/amfd/imm.cc | 54 ++--
src/amf/amfd/imm.h | 2 --
src/amf/amfd/role.cc | 2 +-
3 files changed, 19 insertions(+), 39 deletions(-)
diff --git a/src/amf/amfd/imm.cc b/src/amf/amfd/imm.cc
index 82d2b13..d917b0d 100644
---
: 9a730d22b0580e6e3c54fd3a4fd5bb4cf82c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration files n
Startup
revision: 9a730d22b0580e6e3c54fd3a4fd5bb4cf82c
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
AMF performs headless recovery by syncing the assignments from the AMFND(s) and
re-creating them in AMFD's db and IMM. As the next step, AMFD compares the assignment
objects from IMM and from the AMFND(s) to find the ongoing assignments
that were left over before headless and fails them over, the
: c43ae9d97d169cc4a3b57da14ed9191dca8dfba5
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
---
src/amf/amfd/sg_2n_fsm.cc | 2 +-
src/amf/amfd/siass.cc | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/amf/amfd/sg_2n_fsm.cc b/src/amf/amfd/sg_2n_fsm.cc
index 72edf9d..a218786 100644
--- a/src/amf/amfd/sg_2n_fsm.cc
+++ b/src/amf/amfd/sg_2n_fsm.cc
@@ -127,7
Hi, Ack (code review). Thanks/Minh
> ---
> src/amf/amfd/node_state_machine.cc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amf/amfd/node_state_machine.cc
> b/src/amf/amfd/node_state_machine.cc
> index 478ad2a48..c5d86d33c 100644
> ---
If the SU is unlocked-in/unlocked before the node joins the cluster, the SU is
instantiated and in unlocked state. However, when the node completes joining
the cluster, amfd assumes that all applications' SUs are uninstantiated and
starts the instantiation, thus amfd forgets to give the instantiated/unlocked
SU the
revision: 5bb2174a323a97f626ce354d553a1dc4d1673899
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Configuration
If split brain happens and the network merges back, a few mds events
arrive at the payloads at that point: SVC UP
from the other controller, and SVC DOWN from services on both controllers
due to the reboot triggered by split brain detection.
In the ticket description, the first partition
The first part of #2929, which introduced the EXCESSIVE susi fsm state,
also handles duplicated 2N assignments so that the node that has
duplicated assignments is rebooted.
This patch removes the sending of node reboot in avd_sg_2n_act_susi(); otherwise
amfd will send multiple node reboots to the
-2929
Base revision: 9442e2bfc9c883a10bca1a88816da9ff6fda2921
Personal repository: git://git.code.sf.net/u/minh-chau/review
Impacted area Impact y/n
Docs n
Build system n
RPM/packaging n
Once split brain happens, there are multiple partitions, and AMF continues
assigning to the spare SUs in each partition. When the network merges, these
partitions join into one cluster and the SU assignments become excessive.
This patch adds a new susi fsm EXCESSIVE state, which is marked