If mc-crusher engages all threads I'd suspect a memaslap problem. That
util has never worked very well.

I've been validating releases on dual-socket 8-core machines (16 cores
total, plus HT) and have gotten all of the threads to engage just fine.

You could also write a small test program in a language of your choice
which connects, runs a set and a get, then disconnects, a few hundred
times over. Then dump the per-thread stats structures to see whether
all of the threads got work.
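
Something like this, say. A rough sketch in C (hypothetical test code,
not from the memcached tree; assumes memcached is on 127.0.0.1:11211,
adjust to taste):

    /* Open and close a few hundred connections, doing one set and one
     * get on each, so the round-robin dispatch should hand every
     * worker thread some work. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int i;
        for (i = 0; i < 400; i++) {
            struct sockaddr_in sa;
            char buf[256];
            int fd = socket(AF_INET, SOCK_STREAM, 0);
            memset(&sa, 0, sizeof(sa));
            sa.sin_family = AF_INET;
            sa.sin_port = htons(11211);
            inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
            if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0) {
                perror("connect");
                return 1;
            }
            /* one set and one get per connection */
            write(fd, "set foo 0 0 3\r\nbar\r\n", 20);
            read(fd, buf, sizeof(buf));   /* expect "STORED\r\n" */
            write(fd, "get foo\r\n", 9);
            read(fd, buf, sizeof(buf));   /* expect VALUE ... END */
            close(fd);
        }
        return 0;
    }

If every worker shows nonzero counters after that, the server side is
fine and it's down to how memaslap drives its connections.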

On Mon, 9 Feb 2015, Saman Barghi wrote:

> Btw, I already considered avoiding hyperthreads and running all server
> threads on the same socket (to avoid cross-socket latencies and get
> better cache usage), but it does not seem to be a hardware-related
> problem.
>
> On Mon, Feb 9, 2015 at 4:29 PM, Saman Barghi <sama...@gmail.com> wrote:
>       Thanks for your response; find my replies below:
>
>       On Mon, Feb 9, 2015 at 1:39 PM, dormando <dorma...@rydia.net> wrote:
>             > I am running some tests using memcached 1.4.22 on an
>             > Intel Xeon E5 (4 sockets with 8 cores each, 2
>             > hyperthreads per core, and 4 NUMA nodes) running Ubuntu
>             > trusty. I compiled memcached with gcc 4.8.2 with the
>             > default CFLAGS and configuration options.
>             >
>             > The problem is that whenever I start memcached with an
>             > odd number of server threads (3, 5, 7, 9, 11, ...),
>             > everything is ok: all threads engage in processing
>             > requests, and the status of every thread is "Running".
>             > However, if I start the server with an even number of
>             > threads (2, 4, 6, 8, ...), half of the threads are
>             > always asleep and never service clients. This is
>             > specific to memcached; memaslap, for example, shows no
>             > such pattern. I ran the exact same test on an AMD
>             > Opteron and things are fine with memcached there. So my
>             > question is: is there any specific tuning required for
>             > Intel machines? Is there any specific flag, or some
>             > part of the code, that might cause worker threads to
>             > not engage?
>             >
>             >
>             > Thanks,
>             > Saman
>
>             That is pretty weird. I've not run it on a quad socket,
>             but plenty of Intel machines without problem. Modern ones
>             too.
>
>
> I see. I am not sure why it happens, since everything is very
> straightforward with memcached.
>
>
>       How many clients are you telling memaslap to use? Can you try
>       https://github.com/dormando/mc-crusher quickly? (Run loadconf or
>       similar to load some values, then a different config to hammer
>       it.)
>
>
> I fire up memaslap with the same number of threads as memcached, and
> with a concurrency of 20 per thread, so enough to keep the server
> threads busy.
>  
> Using mc-crusher, it seems all threads are engaged, both when loading
> and with mget_test. So does that mean there is something fishy with
> memaslap?
>
>
>       Connections are dispersed via thread.c:dispatch_conn_new()
>
>           int tid = (last_thread + 1) % settings.num_threads;
>
>           LIBEVENT_THREAD *thread = threads + tid;
>
>           last_thread = tid;
>
>       which is pretty simple at the base.
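
To be explicit, that selection is a strict rotation. A standalone
sketch of the same math (hypothetical code, not from the memcached
tree):

    /* Simulates the worker selection above in isolation, assuming
     * connections are accepted and dispatched one at a time. Even and
     * odd thread counts both get a perfect rotation. */
    #include <stdio.h>

    int main(void) {
        int num_threads = 8;    /* e.g. memcached -t 8 */
        int last_thread = 0;
        int conn;
        for (conn = 0; conn < 16; conn++) {
            int tid = (last_thread + 1) % num_threads;
            last_thread = tid;
            printf("conn %2d -> worker %d\n", conn, tid);
        }
        return 0;
    }

So the dispatch on its own shouldn't be able to starve half the
workers; if half of them show empty stats, half of the accepted
connections are never sending any traffic.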
>
>
> Right, I printed out 'last_thread' to make sure nothing funny was
> happening, but it's a perfect round robin.
>
>
>       If you can attach gdb, can you dump the per-thread stats
>       structures? That will show definitively whether those threads
>       ever get work.
>
>
>
> I ran memcached with -t 8, and the client side is
>
> memaslap -s localhost -T 8 -c 160 -t 1m
>
> and below find the gdb output for the per-thread stats structures. It
> seems that every other thread is not doing anything!? Also, I can
> confirm that all memaslap threads are consuming 100% of the cores
> they are running on, and again, with an odd number of server threads
> this does not happen. You can see the kind of results I get when
> running memcached and increasing the number of threads on that
> machine. It is not consistent at all!
>
> Thanks,
> Saman
>
>
> (gdb) print threads[0].stats
> $21 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 0, get_misses = 0,
>   touch_cmds = 0, touch_misses = 0, delete_misses = 0, incr_misses = 0, 
> decr_misses = 0, cas_misses = 0, bytes_read = 0, bytes_written = 0,
> flush_cmds = 0, conn_yields = 0, auth_cmds = 0, auth_errors = 0, slab_stats = 
> {{set_cmds = 0,
>       get_hits = 0, touch_hits = 0, delete_hits = 0, cas_hits = 0, cas_badval 
> = 0, incr_hits = 0, decr_hits = 0} <repeats 201 times>}}
> (gdb) print threads[1].stats
> $22 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 4466788,
>   get_misses = 3863038, touch_cmds = 0, touch_misses = 0, delete_misses = 0, 
> incr_misses = 0, decr_misses = 0, cas_misses = 0, bytes_read =
> 861116495, bytes_written = 693448306, flush_cmds = 0, conn_yields = 0, 
> auth_cmds = 0,
>   auth_errors = 0, slab_stats = {{set_cmds = 0, get_hits = 0, touch_hits = 0, 
> delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0,
> decr_hits = 0} <repeats 12 times>, {set_cmds = 496327, get_hits = 603750, 
> touch_hits = 0,
>       delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits 
> = 0}, {set_cmds = 0, get_hits = 0, touch_hits = 0, delete_hits
> = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits = 0} <repeats 188 
> times>}}
> (gdb) print threads[2].stats
> $23 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 0, get_misses = 0,
>   touch_cmds = 0, touch_misses = 0, delete_misses = 0, incr_misses = 0, 
> decr_misses = 0, cas_misses = 0, bytes_read = 0, bytes_written = 0,
> flush_cmds = 0, conn_yields = 0, auth_cmds = 0, auth_errors = 0, slab_stats = 
> {{set_cmds = 0,
>       get_hits = 0, touch_hits = 0, delete_hits = 0, cas_hits = 0, cas_badval 
> = 0, incr_hits = 0, decr_hits = 0} <repeats 201 times>}}
> (gdb) print threads[3].stats
> $24 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 4120462,
>   get_misses = 3550471, touch_cmds = 0, touch_misses = 0, delete_misses = 0, 
> incr_misses = 0, decr_misses = 0, cas_misses = 0, bytes_read =
> 794355485, bytes_written = 654105157, flush_cmds = 0, conn_yields = 0, 
> auth_cmds = 0,
>   auth_errors = 0, slab_stats = {{set_cmds = 0, get_hits = 0, touch_hits = 0, 
> delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0,
> decr_hits = 0} <repeats 12 times>, {set_cmds = 457849, get_hits = 569991, 
> touch_hits = 0,
>       delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits 
> = 0}, {set_cmds = 0, get_hits = 0, touch_hits = 0, delete_hits
> = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits = 0} <repeats 188 
> times>}}
> (gdb) print threads[4].stats
> $25 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 0, get_misses = 0,
>   touch_cmds = 0, touch_misses = 0, delete_misses = 0, incr_misses = 0, 
> decr_misses = 0, cas_misses = 0, bytes_read = 0, bytes_written = 0,
> flush_cmds = 0, conn_yields = 0, auth_cmds = 0, auth_errors = 0, slab_stats = 
> {{set_cmds = 0,
>       get_hits = 0, touch_hits = 0, delete_hits = 0, cas_hits = 0, cas_badval 
> = 0, incr_hits = 0, decr_hits = 0} <repeats 201 times>}}
> (gdb) print threads[5].stats
> $26 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 4038230,
>   get_misses = 3493086, touch_cmds = 0, touch_misses = 0, delete_misses = 0, 
> incr_misses = 0, decr_misses = 0, cas_misses = 0, bytes_read =
> 778500650, bytes_written = 626164950, flush_cmds = 0, conn_yields = 0, 
> auth_cmds = 0,
>   auth_errors = 0, slab_stats = {{set_cmds = 0, get_hits = 0, touch_hits = 0, 
> delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0,
> decr_hits = 0} <repeats 12 times>, {set_cmds = 448710, get_hits = 545144, 
> touch_hits = 0,
>       delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits 
> = 0}, {set_cmds = 0, get_hits = 0, touch_hits = 0, delete_hits
> = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits = 0} <repeats 188 
> times>}}
> (gdb) print threads[6].stats
> $27 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 0, get_misses = 0,
>   touch_cmds = 0, touch_misses = 0, delete_misses = 0, incr_misses = 0, 
> decr_misses = 0, cas_misses = 0, bytes_read = 0, bytes_written = 0,
> flush_cmds = 0, conn_yields = 0, auth_cmds = 0, auth_errors = 0, slab_stats = 
> {{set_cmds = 0,
>       get_hits = 0, touch_hits = 0, delete_hits = 0, cas_hits = 0, cas_badval 
> = 0, incr_hits = 0, decr_hits = 0} <repeats 201 times>}}
> (gdb) print threads[7].stats
> $28 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0,
> __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, get_cmds = 
> 4472436,
>   get_misses = 3868324, touch_cmds = 0, touch_misses = 0, delete_misses = 0, 
> incr_misses = 0, decr_misses = 0, cas_misses = 0, bytes_read =
> 862203585, bytes_written = 693881564, flush_cmds = 0, conn_yields = 0, 
> auth_cmds = 0,
>   auth_errors = 0, slab_stats = {{set_cmds = 0, get_hits = 0, touch_hits = 0, 
> delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0,
> decr_hits = 0} <repeats 12 times>, {set_cmds = 496953, get_hits = 604112, 
> touch_hits = 0,
>       delete_hits = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits 
> = 0}, {set_cmds = 0, get_hits = 0, touch_hits = 0, delete_hits
> = 0, cas_hits = 0, cas_badval = 0, incr_hits = 0, decr_hits = 0} <repeats 188 
> times>}}