At Wed, 14 Apr 2010 13:18:54 +0900,
FUJITA Tomonori wrote:
>
> On Wed, 14 Apr 2010 13:15:19 +0900
> FUJITA Tomonori <[email protected]> wrote:
>
> > fix a bug that a SIMPLE work wrongly passes blocked ORDERED works.
> >
> > 1. a SIMPLE work is on the pending_list
> > 2. when a new ORDERED work comes, it is added to the blocked_list.
> > 3. then a new SIMPLE work comes, it's wrongly added to the blocked_list (it
> > should be delayed until the above ORDERED work finishes).
> >
>
> Should have been:
>
> 3. then a new SIMPLE work comes, it's wrongly added to the
> pending_list. It will be executed wrongly before the above ORDERED.
>
> It should be delayed until the above ORDERED work finishes.
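
For reference, the queueing rule that the quoted description asks for can be sketched roughly as below. This is only a toy model under my own assumptions, not the actual collie code: struct wq_model, its counters, the WORK_SIMPLE/WORK_ORDERED enum, must_block() and enqueue() are all hypothetical names made up for illustration. Only pending_list, blocked_list, SIMPLE and ORDERED come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute names; the real flags may differ. */
enum work_attr { WORK_SIMPLE, WORK_ORDERED };

/* Toy model: counters stand in for pending_list and blocked_list. */
struct wq_model {
	size_t nr_pending;     /* works already handed to worker threads */
	size_t nr_blocked;     /* works waiting on the blocked list */
	bool ordered_running;  /* an ORDERED work is among the pending ones */
};

/* Return true if a new work with this attribute has to wait. */
static bool must_block(const struct wq_model *q, enum work_attr attr)
{
	/* Something is already blocked: never overtake it. */
	if (q->nr_blocked > 0)
		return true;

	/* An ORDERED work waits until every pending work has finished. */
	if (attr == WORK_ORDERED)
		return q->nr_pending > 0;

	/* A SIMPLE work only waits for a running ORDERED work. */
	return q->ordered_running;
}

/* Account for a new work; return true if it went to the blocked list. */
static bool enqueue(struct wq_model *q, enum work_attr attr)
{
	assert(q != NULL);

	bool blocked = must_block(q, attr);

	if (blocked) {
		q->nr_blocked++;
	} else {
		q->nr_pending++;
		if (attr == WORK_ORDERED)
			q->ordered_running = true;
	}
	return blocked;
}
```

Feeding this model SIMPLE, ORDERED, SIMPLE in that order leaves one work pending and two blocked, which is the behaviour the corrected step 3 describes: the second SIMPLE work queues behind the blocked ORDERED work instead of passing it.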

If nodes send requests to each other just before the sheepdog node
membership changes, this patch seems to cause problems. Here is an
example scenario:

 1) There are two nodes, A and B, in the sheepdog cluster, and a VM is
    running on each node.
 2) Each VM sends a write request to its local collie at the same time.
 3) Node A forwards the request to node B, and vice versa, at the same
    time.
 4) Node C joins the cluster.
 5) sd_confch adds an ORDERED work to the work_queue on each node. This
    work is blocked because the previous SIMPLE work has not finished.
 6) On each node, collie receives the forwarded write request, but
    blocks it because the previous ORDERED work has not finished.
 7) Neither node can accept any more requests.

I guess that, if a worker thread calls __sd_confch with an ORDERED
attribute, we must assume that collie can process SIMPLE work before
its previous ORDERED work finishes.

Regards,

Kazutaka Morita
-- 
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
