Dear Corosync authors,

Because of libqb license issues I am working with Corosync 1.4.6, but the code
in question appears to be the same in 2.x.

I seem to have stumbled on a few issues related to fragmentation in combination 
with config changes. 

The main issue is this:
Sometimes the first totem message delivered during the transitional
configuration is the continuation of a message that was delivered before.
Similarly, the last message delivered during the transitional configuration can
be fragmented into the next message.

In both these cases, reassembly fails since the reassembly context is changed 
during the transitional configuration (per the patch signed off by Jan Friesse 
on 11/8/2012).

I am not sure which part is a bug: that messages can continue each other across 
a transitional configuration boundary, or that the reassembly context gets 
changed, but the two things cannot work together.
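
To make the failure concrete, here is a toy model in Python. It is not the
Corosync code: the Reassembler class, the boolean continuation/fragmented
flags and the deliver() helper are all hypothetical names made up for this
sketch. It only shows why swapping the reassembly context between two pieces
of the same application message loses that message.

```python
# Toy model only; NOT Corosync's totempg code. All names are hypothetical.

class Reassembler:
    """Holds the partially assembled application message for one sender."""

    def __init__(self):
        self.partial = None  # None means no message is in progress

    def feed(self, data, continuation, fragmented):
        """Feed one piece; return a complete message or None.

        continuation: this piece continues a message started earlier.
        fragmented:   this piece is continued in the next totem message.
        """
        if continuation and self.partial is None:
            return None  # nothing to stitch onto, so the piece is lost
        if not continuation:
            self.partial = b""  # start a fresh message
        self.partial += data
        if fragmented:
            return None  # wait for the rest
        complete, self.partial = self.partial, None
        return complete


def deliver(pieces, swap_context_at=None):
    """Deliver pieces in order, optionally replacing the context at an index
    (this models the context change at the transitional configuration)."""
    ctx = Reassembler()
    out = []
    for i, (data, continuation, fragmented) in enumerate(pieces):
        if i == swap_context_at:
            ctx = Reassembler()
        msg = ctx.feed(data, continuation, fragmented)
        if msg is not None:
            out.append(msg)
    return out


pieces = [
    (b"hello ", False, True),   # first half, continued in the next piece
    (b"world", True, False),    # second half
    (b"next", False, False),    # unrelated whole message
]
same_context = deliver(pieces)                  # [b"hello world", b"next"]
swapped = deliver(pieces, swap_context_at=1)    # [b"next"]
```

With a single context the fragmented message is reassembled; when the context
is replaced between its two halves, the second half has nothing to attach to
and the whole message is lost.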

A couple of side issues:

1 - The fragmentation code resets the next fragment number to 1 whenever it can
fit a message in the send buffer, even if the buffer is currently accumulating
data for fragment 2, 3, or later. That confuses the reassembly code.

2 - Whenever the reassembly code hits a fragment that does not stitch, it
starts discarding everything until a first fragment shows up (although I am not
sure it always achieves that; see point 1). I believe the intent was to drop
only the one or two application message pieces that can't be stitched. I have
an alternate, much simpler version of totempg_deliver_fn that does just that,
but we can talk about it later. I suspect that fragments that don't connect are
not supposed to happen at all, and that I see them only because of the main
issue I described above. Is that suspicion correct?
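
As a rough illustration of the policy described in point 2 (drop only the
pieces that cannot be stitched, deliver everything else), here is a toy model
in Python. It is not totempg_deliver_fn and not the alternate version
mentioned above; the tuple layout (continuation, fragmented, pieces) and the
reassemble() function are assumptions made purely for this sketch, and
fragment numbers are left out entirely.

```python
# Toy model only; NOT Corosync code. Each "totem message" is a hypothetical
# tuple (continuation, fragmented, pieces):
#   continuation: the first piece continues the previous totem message
#   fragmented:   the last piece is continued in the next totem message
#   pieces:       list of byte strings packed into this totem message

def reassemble(totem_messages):
    delivered = []
    partial = None  # tail of an application message awaiting its continuation
    for continuation, fragmented, pieces in totem_messages:
        pieces = list(pieces)
        if continuation and pieces:
            head = pieces.pop(0)
            if partial is not None:
                pieces.insert(0, partial + head)  # stitch onto pending tail
            # otherwise: head cannot be stitched, so drop only that piece
        # A pending tail that was not continued is dropped, nothing else.
        partial = None
        if fragmented and pieces:
            partial = pieces.pop()  # keep the incomplete tail for later
        delivered.extend(pieces)    # whole messages are always delivered
    return delivered


stream = [
    (True, False, [b"x2", b"C", b"D"]),   # x2 has no predecessor: dropped
    (False, True, [b"E", b"F1"]),         # F1 is continued below
    (True, False, [b"F2", b"G"]),
]
delivered = reassemble(stream)
# delivered == [b"C", b"D", b"E", b"F1F2", b"G"]
```

Only the orphaned piece x2 is lost here; the whole messages packed behind it
in the same totem message are still delivered, rather than being discarded
until the next first fragment shows up.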

If you have an idea about how to deal with fragmentation across transitional
configuration boundaries, I will be more than happy to try things out for you.
I have a test program that can produce these problems at will (I don't want to
get into how I do that just yet).

Thanks a lot for reading thus far.

J-C





