Great investigative work. Please merge at your earliest convenience and I'll release a corosync 1.2.2.
Regards
-steve

On Mon, 2010-04-12 at 20:17 +1000, Angus Salkeld wrote:
> Hi
>
> This patch fixes crashes found by repeated pacemaker CTS SimulStart
> tests. When you bring up the nodes together it can cause a lot of
> configuration changes, and sync gets started and aborted
> lots of times.
>
> When abort is called the ring_id is not changed, which means that any
> sync packets that arrive from that point on will be accepted as valid.
> I have seen old barrier messages causing the processing index to
> increment, later causing an array out of bounds.
>
> This patch memsets the ring_id to 0, thus causing the ring_id in the
> packet and my_ring_id not to match.
>
> Regards
> Angus
>
>
> Signed-off-by: Angus Salkeld <[email protected]>
> ---
>  exec/syncv2.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/exec/syncv2.c b/exec/syncv2.c
> index 57b501b..559e199 100644
> --- a/exec/syncv2.c
> +++ b/exec/syncv2.c
> @@ -665,6 +665,11 @@ void sync_v2_abort (void)
>  		schedwrk_destroy (my_schedwrk_handle);
>  		my_service_list[my_processing_idx].sync_abort ();
>  	}
> +
> +	/* this will cause any "old" barrier messages from causing
> +	 * problems.
> +	 */
> +	memset (&my_ring_id, 0, sizeof (struct memb_ring_id));
>  }
>  
>  void sync_v2_memb_list_determine (const struct memb_ring_id *ring_id)
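
For anyone reading along in the archive, the mechanism described above can be sketched in isolation. This is a minimal illustration, not the code in exec/syncv2.c: `struct ring_id`, `sync_start`, `sync_abort`, `barrier_deliver`, and `SERVICE_COUNT` are hypothetical stand-ins, and the sketch assumes that no valid ring ever has an all-zero id. It compares the id field by field rather than with memcmp to avoid depending on struct padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for corosync's struct memb_ring_id. */
struct ring_id {
	unsigned int rep;        /* representative node id */
	unsigned long long seq;  /* ring sequence number */
};

#define SERVICE_COUNT 2      /* assumed number of services to sync */

struct ring_id my_ring_id;
unsigned int my_processing_idx;

/* Begin a sync round for the given ring. */
void sync_start (const struct ring_id *ring_id)
{
	assert (ring_id != NULL);
	my_ring_id = *ring_id;
	my_processing_idx = 0;
}

/*
 * Abort the current round. Zeroing the ring id means that barrier
 * messages still in flight from the aborted round can no longer match.
 */
void sync_abort (void)
{
	memset (&my_ring_id, 0, sizeof (struct ring_id));
}

/*
 * Deliver a barrier message. Returns 1 if it was accepted and the
 * processing index advanced, 0 if it was dropped as stale.
 */
int barrier_deliver (const struct ring_id *msg_ring_id)
{
	if (msg_ring_id->rep != my_ring_id.rep ||
	    msg_ring_id->seq != my_ring_id.seq) {
		return 0;        /* ring id mismatch: ignore the message */
	}
	if (my_processing_idx >= SERVICE_COUNT) {
		return 0;        /* never step past the service list */
	}
	my_processing_idx += 1;
	return 1;
}
```

In this sketch, a barrier for ring {1, 4} is accepted while that round is active, but the same message delivered after sync_abort () is dropped and the processing index stays where it was. Without the memset in sync_abort (), the late barrier would still match and advance the index.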
