Great investigative work.

Please merge at your earliest convenience and I'll release a corosync
1.2.2.

Regards
-steve

On Mon, 2010-04-12 at 20:17 +1000, Angus Salkeld wrote:
> Hi
> 
> This patch fixes crashes found by repeated pacemaker CTS SimulStart
> tests. When you bring up the nodes together it can cause a lot of
> configuration changes, so sync gets started and aborted
> many times.
> 
> When abort is called the ring_id is not changed, which means that any
> sync packets that arrive from that point on will be accepted as valid.
> I have seen old barrier messages cause the processing index to increment,
> later causing an array out-of-bounds access.
> 
> This patch memsets the ring_id to 0, thus causing the ring_id in the packet
> and my_ring_id not to match.
> 
> Regards
> Angus
> 
> 
> Signed-off-by: Angus Salkeld <[email protected]>
> ---
>  exec/syncv2.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/exec/syncv2.c b/exec/syncv2.c
> index 57b501b..559e199 100644
> --- a/exec/syncv2.c
> +++ b/exec/syncv2.c
> @@ -665,6 +665,11 @@ void sync_v2_abort (void)
>               schedwrk_destroy (my_schedwrk_handle);
>               my_service_list[my_processing_idx].sync_abort ();
>       }
> +
> +     /* this will prevent any "old" barrier messages from causing
> +      * problems.
> +      */
> +     memset (&my_ring_id, 0, sizeof (struct memb_ring_id));
>  }
>  
>  void sync_v2_memb_list_determine (const struct memb_ring_id *ring_id)
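
For readers following the thread, here is a minimal, standalone sketch of the
idea behind the patch. It is not the corosync code: the struct, the sketch_*
function names, and SERVICE_COUNT are all hypothetical simplifications, and it
assumes that a real ring never has an all-zero id. It only illustrates why
zeroing the stored ring id on abort makes late barrier messages fail the
ring id comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for corosync's struct memb_ring_id. */
struct ring_id_sketch {
    uint64_t rep;   /* representative node (simplified to an integer) */
    uint64_t seq;   /* ring sequence number */
};

#define SERVICE_COUNT 2   /* assumed number of services being synced */

static struct ring_id_sketch my_ring_id;
static int my_processing_idx;

/* Begin a sync round for the given ring. */
static void sketch_sync_start (const struct ring_id_sketch *ring_id)
{
    my_ring_id = *ring_id;
    my_processing_idx = 0;
}

/* Abort the round and forget the ring id, as the patch does with memset. */
static void sketch_sync_abort (void)
{
    memset (&my_ring_id, 0, sizeof (my_ring_id));
}

/*
 * Deliver a barrier message. Returns 1 if it was accepted, 0 if it was
 * dropped because its ring id does not match the current one.
 */
static int sketch_barrier_deliver (const struct ring_id_sketch *msg_ring_id)
{
    if (msg_ring_id->rep != my_ring_id.rep ||
        msg_ring_id->seq != my_ring_id.seq) {
        return 0;   /* stale message from an aborted or older round */
    }
    if (my_processing_idx < SERVICE_COUNT) {
        my_processing_idx++;
    }
    return 1;
}

/* Walk through the scenario and return the final processing index. */
int sketch_demo (void)
{
    const struct ring_id_sketch ring = { .rep = 1, .seq = 4 };

    sketch_sync_start (&ring);
    assert (sketch_barrier_deliver (&ring) == 1);   /* live barrier accepted */

    sketch_sync_abort ();
    assert (sketch_barrier_deliver (&ring) == 0);   /* old barrier dropped */

    return my_processing_idx;
}
```

Without the memset in sketch_sync_abort, the second delivery would still match
the stored ring id and advance the index again, which is the kind of stale
increment described in the quoted report.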

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
