Consider 5 sheepdog nodes running with the object cache enabled, with more than 10 VMs running on each node.
I mount a tmpfs on the /cache directory and start sheep with:

    sheep -l level=debug -n \
        /home/admin/sheepdogmetadata,/disk1/sheepdogstoredata,/disk2/sheepdogstoredata,/disk3/sheepdogstoredata,/disk4/sheepdogstoredata,/disk5/sheepdogstoredata,/disk7/sheepdogstoredata,/disk8/sheepdogstoredata,/disk9/sheepdogstoredata \
        -w size=20G dir=/cache -b 0.0.0.0 -y **.**.**.** \
        -c zookeeper:**.**.**.**:2181

In this configuration it is possible for every object push thread to be running do_background_push work while no thread is running do_push_object work. In my test environment this actually occurs. The cache information on one node:

    [1] 13:09:30 [SUCCESS] vmsecdomainhost1
    Name                    Tag     Total   Dirty   Clean
    win7_type4_node8.img            4.7 GB  4.7 GB  4.0 MB
    standard.img            images  0.0 MB  0.0 MB  0.0 MB
    win7_type4_node1.img            4.8 GB  4.8 GB  28 MB
    win7_type4_node10.img           5.0 GB  4.9 GB  32 MB
    win7_type4_node2.img            4.7 GB  4.6 GB  68 MB
    win7_type4_node3.img            4.7 GB  4.7 GB  4.0 MB
    win7_type4_node6.img            4.8 GB  4.7 GB  40 MB
    win7_type4_node4.img            4.8 GB  4.7 GB  20 MB
    win7_type4_node7.img            4.8 GB  4.8 GB  24 MB
    win7_type4_node9.img            4.7 GB  4.7 GB  32 MB
    win7_type4_node5.img            4.2 GB  4.2 GB  8.0 MB

    Cache size 20 GB, used 47 GB, non-directio

I found that 7 object push threads are serving the work queue "oc_push", and their call stacks are:

    Thread 37 (Thread 0x7f3c2a1fc700 (LWP 116747)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 36 (Thread 0x7f3c2abfd700 (LWP 116775)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 35 (Thread 0x7f3b5d7fb700 (LWP 116889)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 34 (Thread 0x7f3b4ffff700 (LWP 116891)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 33 (Thread 0x7f3ac8dfa700 (LWP 117040)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 32 (Thread 0x7f3ac83f9700 (LWP 117041)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

    Thread 31 (Thread 0x7f3ac65f6700 (LWP 117044)):
    #0 0x0000003916eda37d in read () from /lib64/libc.so.6
    #1 0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
    #2 0x000000000042a89d in eventfd_xread ()
    #3 0x0000000000419acb in object_cache_push ()
    *#4 0x0000000000419b83 in do_background_push ()*
    #5 0x000000000042e56a in worker_routine ()
    #6 0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
    #7 0x0000003916ee767d in clone () from /lib64/libc.so.6

No thread is pushing objects, so no object_cache_push call ever finishes. In gdb, we can see the state of each object cache inside object_cache_push:

    vid = 9627038,  push_count = 26, dirty_count = 150,  total_count = 154
    vid = 3508964,  push_count = 22, dirty_count = 1456, total_count = 1464
    vid = 360229,   push_count = 18, dirty_count = 1437, total_count = 1444
    vid = 9678955,  push_count = 34, dirty_count = 1462, total_count = 1470
    vid = 9008538,  push_count = 17, dirty_count = 1490, total_count = 1493
    vid = 2383510,  push_count = 28, dirty_count = 1494, total_count = 1498
    vid = 16192623, push_count = 19, dirty_count = 1447, total_count = 1451

push_count is far less than dirty_count but still greater than zero, and no thread is doing do_push_object work. Every thread is blocked in eventfd_xread() inside object_cache_push(), so the wakeup in do_push_object(struct work *work)

    if (uatomic_sub_return(&oc->push_count, 1) == 0)
            eventfd_xwrite(oc->push_efd, 1);

will never be triggered.

The work queue cannot grow either. In

    static bool wq_need_grow(struct wq_info *wi)
    {
            if (wi->nr_threads < uatomic_read(&wi->nr_queued_work) &&
                wi->nr_threads * 2 <= wq_get_roof(wi)) {
                    wi->tm_end_of_protection = get_msec_time() +
                            WQ_PROTECTION_PERIOD;
                    return true;
            }

            return false;
    }

nr_threads is 7 and wq_get_roof(wi) returns 10 (2 * 5 nodes). Since 7 * 2 = 14 is greater than 10, the check fails, so no more threads will be created, and all existing threads keep waiting for do_push_object work to finish.

I hope the above information is clear to everyone. Let's discuss the solution now.
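As a quick check of the wq_need_grow arithmetic above, here is a minimal sketch that models only the two-part condition. struct wq_model and both function names are made up for illustration and are not sheepdog code. The nr_queued_work value is an assumption (the sum of the seven push_count values reported by gdb); the report does not give the real queue length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for sheepdog's struct wq_info (illustration only). */
struct wq_model {
	size_t nr_threads;     /* worker threads currently alive */
	size_t nr_queued_work; /* work items waiting in the queue */
	size_t roof;           /* value wq_get_roof() would return */
};

/* Same two-part test as wq_need_grow(), without the protection timer. */
bool model_need_grow(const struct wq_model *wi)
{
	assert(wi != NULL);
	return wi->nr_threads < wi->nr_queued_work &&
	       wi->nr_threads * 2 <= wi->roof;
}

/* The reported situation: 7 threads, 5 nodes, WQ_DYNAMIC roof. */
bool reported_case_can_grow(void)
{
	struct wq_model wi = {
		.nr_threads = 7,
		.nr_queued_work = 164, /* assumption: sum of push_count values */
		.roof = 2 * 5,         /* 2 * nr_nodes */
	};

	return model_need_grow(&wi);
}
```

With these numbers the first half of the condition holds (7 < 164) but the second does not (14 > 10), so the model reports that the queue may not grow, which matches what I observe.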
The oc_push work queue is created with WQ_DYNAMIC:

    sys->oc_push_wqueue = create_work_queue("oc_push", WQ_DYNAMIC);

So the roof of the thread count is:

    case WQ_DYNAMIC:
            /* FIXME: 2 * nr_nodes threads. No rationale yet. */
            nr = nr_nodes * 2;
            break;

Other work queues are also created with WQ_DYNAMIC:

    wq = create_work_queue("vdi check", WQ_DYNAMIC);
    sys->http_wqueue = create_work_queue("http", WQ_DYNAMIC);

Creating oc_push with WQ_UNLIMITED is not reasonable either.

*I think the roof of nr_threads for oc_push should be (2 * number of object caches), not (2 * nr_nodes), to ensure that there are always enough threads doing do_push_object work.*

With your advice, I would like to submit patches to solve this problem.

Thanks.

--
Xu Fang
Beijing, P.R.China
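P.S. To illustrate the proposal, here is a rough sketch comparing the current roof with the proposed one. All three helper names are hypothetical and do not exist in sheepdog. It assumes one object cache per cached VDI, so the 11 images in the cache listing give 11 object caches; whether 2 * caches is always enough also depends on how many do_background_push works can be in flight per cache, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Current WQ_DYNAMIC rule: 2 * nr_nodes. */
size_t roof_by_nodes(size_t nr_nodes)
{
	return nr_nodes * 2;
}

/* Proposed rule (hypothetical helper): 2 * number of object caches. */
size_t roof_by_caches(size_t nr_object_caches)
{
	return nr_object_caches * 2;
}

/* Second half of the wq_need_grow() condition: is growth still allowed? */
bool roof_allows_growth(size_t nr_threads, size_t roof)
{
	assert(nr_threads > 0);
	return nr_threads * 2 <= roof;
}
```

For the reported node, roof_by_nodes(5) is 10 and blocks growth at 7 threads, while roof_by_caches(11) is 22 and would still allow it.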
--
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog