On 07/06/2011 05:24 PM, Tim Beale wrote:
> Hi,
>
> We've hit a problem in the recovery code and I'm struggling to understand why
> we do the following:
>
> /*
>  * The recovery sort queue now becomes the regular
>  * sort queue. It is necessary to copy the state
>  * into the regular sort queue.
>  */
> sq_copy (&instance->regular_sort_queue, &instance->recovery_sort_queue);
>
> The problem we're seeing is sometimes we get an encapsulated message from the
> recovery queue copied onto the regular queue, and corosync then crashes trying
> to process the message. (When it strips off the totemsrp header it gets another
> totemsrp header rather than the totempg header it expects).
>
> The problem seems to happen when we only do the sq_items_release() for a subset
> of the recovery messages, e.g. there are 12 messages on the recovery queue and
> we only free/release 5 of them. The remaining encapsulated recovery messages
> get left on the regular queue and corosync crashes trying to deliver them.
>
> It looks to me like deliver_messages_from_recovery_to_regular() handles the
> encapsulation correctly, stripping the extra header and adding the recovery
> messages to the regular queue. But then the sq_copy() just seems to overwrite
> the regular queue.
>
> We've avoided the crash in the past by just reiniting both queues, but I don't
> think this is the best solution.
>
I would expect this solution would lead to message loss or lockup of the
protocol.

> Any advice would be appreciated.
>
> Thanks,
> Tim

A proper fix should be in commit
master: 7d5e588931e4393c06790995a995ea69e6724c54
flatiron-1.3: 8603ff6e9a270ecec194f4e13780927ebeb9f5b2

A new flatiron-1.3 release is in the works. There are other totem bugs
you may wish to backport in the meantime.

Let us know if that commit fixes the problem you encountered.

Regards
-steve

> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
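
For readers following the thread, the failure mode Tim describes can be
illustrated with a toy model. This is only a sketch and not corosync code:
every name below (toy_msg, toy_queue, toy_queue_copy, and so on) is
hypothetical, and a message is reduced to a stack of header tags. It assumes,
as the report says, that delivery strips one totemsrp-style header and expects
a totempg-style header underneath, and that recovery messages carry one extra
outer header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header tags; these are not corosync's wire format. */
enum toy_hdr { TOY_HDR_NONE = 0, TOY_HDR_SRP, TOY_HDR_PG };

#define TOY_MAX_HDRS  3
#define TOY_QUEUE_MAX 16

/* A toy message: just a stack of header tags, outermost first. */
struct toy_msg {
    enum toy_hdr hdr[TOY_MAX_HDRS];
    size_t depth;
};

struct toy_queue {
    struct toy_msg items[TOY_QUEUE_MAX];
    size_t count;
};

/* A regular message: one SRP header wrapping a PG header. */
struct toy_msg toy_regular_msg(void)
{
    struct toy_msg m = { { TOY_HDR_SRP, TOY_HDR_PG, TOY_HDR_NONE }, 2 };
    return m;
}

/* Wrap a message in one extra SRP header, as recovery is assumed to do. */
struct toy_msg toy_encapsulate(struct toy_msg inner)
{
    struct toy_msg m = { { TOY_HDR_SRP, TOY_HDR_NONE, TOY_HDR_NONE }, 1 };
    assert(inner.depth < TOY_MAX_HDRS);
    for (size_t i = 0; i < inner.depth; i++)
        m.hdr[m.depth++] = inner.hdr[i];
    return m;
}

/* Strip the outermost header and return what is left. */
struct toy_msg toy_unwrap(struct toy_msg outer)
{
    struct toy_msg m = { { TOY_HDR_NONE, TOY_HDR_NONE, TOY_HDR_NONE }, 0 };
    assert(outer.depth >= 1);
    for (size_t i = 1; i < outer.depth; i++)
        m.hdr[m.depth++] = outer.hdr[i];
    return m;
}

/* Delivery expects exactly SRP followed by PG. */
int toy_deliverable(const struct toy_msg *m)
{
    return m->depth == 2 && m->hdr[0] == TOY_HDR_SRP && m->hdr[1] == TOY_HDR_PG;
}

size_t toy_count_undeliverable(const struct toy_queue *q)
{
    size_t bad = 0;
    for (size_t i = 0; i < q->count; i++)
        if (!toy_deliverable(&q->items[i]))
            bad++;
    return bad;
}

/* Wholesale copy: whatever is in src, encapsulated or not, lands in dst. */
void toy_queue_copy(struct toy_queue *dst, const struct toy_queue *src)
{
    *dst = *src;
}

/* Alternative: strip the outer header from each item while moving it. */
void toy_queue_unwrap_into(struct toy_queue *dst, const struct toy_queue *src)
{
    dst->count = 0;
    for (size_t i = 0; i < src->count; i++)
        dst->items[dst->count++] = toy_unwrap(src->items[i]);
}

/* Build a recovery queue holding n encapsulated messages. */
struct toy_queue toy_make_recovery_queue(size_t n)
{
    struct toy_queue q;
    memset(&q, 0, sizeof(q));
    for (size_t i = 0; i < n && i < TOY_QUEUE_MAX; i++)
        q.items[q.count++] = toy_encapsulate(toy_regular_msg());
    return q;
}
```

In this model, copying a recovery queue wholesale into the regular queue
leaves every item with a second SRP-style header where delivery expects the
PG-style one, which is the shape of the crash in the report. Unwrapping each
item on the way across leaves nothing undeliverable. The actual fix is in the
commits Steve lists and may well take a different approach.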
