Re: Non-deterministic number of Memcached children processes other than worker threads

2019-12-15 Thread dormando
Yup :) Looks like you left a system-installed version of memcached running
on the other one.
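
One way to catch this kind of mismatch is to ask the running instance itself which version it is, instead of trusting the binary that was just built. The sketch below uses the text protocol's `version` command, which replies with a single `VERSION <string>` line. The function names and the default port are illustrative assumptions, not something taken from this thread.

```python
import socket


def parse_version_reply(reply: bytes) -> str:
    """Extract the version string from a text-protocol 'VERSION x.y.z' reply."""
    line = reply.decode("ascii", errors="replace").strip()
    prefix = "VERSION "
    if not line.startswith(prefix):
        raise ValueError(f"unexpected reply: {line!r}")
    return line[len(prefix):]


def running_version(host: str, port: int = 11211, timeout: float = 2.0) -> str:
    """Ask a live memcached instance which version it is running.

    Host and port are whatever the instance under test listens on;
    11211 is only the conventional default.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"version\r\n")
        reply = b""
        # The reply is one line terminated by CRLF.
        while not reply.endswith(b"\r\n"):
            chunk = sock.recv(64)
            if not chunk:
                break
            reply += chunk
    return parse_version_reply(reply)
```

Calling `running_version(...)` against each machine and comparing the two strings would show whether the process answering on the port is the self-built binary or a packaged one.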

On Mon, 16 Dec 2019, Alireza Sanaee wrote:

> Hi,
> I made a stupid mistake: the Memcached version on the big machine seems
> to be different. I guess that is the problem.
>
> Thanks,
> Alireza
>
> On Mon, Dec 16, 2019 at 2:16 AM Alireza Sanaee  wrote:
>   Hi,
> I'm investigating the Linux load balancer. While trying to understand what
> is happening in Memcached, I noticed that the number of Memcached threads
> differs between my machines. Linux doesn't give latency-sensitive
> threads/processes such as Memcached worker threads immediate access to the
> CPU, which causes head-of-line (HoL) blocking and eventually long-tail
> latency; that is not a new thing, though.
>
> But before that, I need to know which worker threads to consider. You gave
> me some pointers to the different internal threads of Memcached. Even the
> intermittent activity of some of Memcached's internal threads (rebalancer,
> crawler, ...) might block some requests (I'm not sure whether that's the
> case). The ULE scheduler seems to suffer from the same flaw.
>
> The connection dispatcher is there, but I don't create many connections: I
> have a good number of clients that send requests at some rate until the end
> of the experiment. I'm using mutilate as my workload generator.
>
> Sure, I can check whether those are idle or not. I'm actually recording
> everything from `/proc///stat`, so the status is also available there. It
> is true that the internal threads are idle most of the time, and that is
> somewhat visible in the plot, but I just want to make sure that all the
> mostly idle ones are internal threads and not workers. As you said,
> sometimes the worker threads are not loaded evenly, which makes them more
> difficult to distinguish. I think this doesn't matter now.
>
> The main issue here is that I have 6 threads on my big machine and 10
> threads on my small machine, even though I use the same Memcached
> configuration on both machines. I have attached the numbers for the two
> machines.
>
> Thanks,
> Alireza
>
> On Sun, Dec 15, 2019 at 3:40 PM dormando  wrote:
>   What're you trying to accomplish?
>
>   Can you include the output of "stats" and "stats settings" on both
>   machines?
>
>   Dumb question but you've looked at the output of `ps auxH`? If just
>   using htop you may not see the threads that're idle.
>
>   TCP connections are pinned to a specific worker thread on connection.
>   Trivial benchmarks may not load the worker threads evenly, as the
>   connections are handed to threads evenly via round robin.
>
>   On Sun, 15 Dec 2019, Alireza Sanaee wrote:
>
>   > Hi,
>   > Thank you for the information,
>   >
>   > Sorry for misusing the word there; yes, those are all threads. I'm using
>   > Memcached 1.5.20. I build it myself and then run my experiments
>   > ($MEMCACHED -u root -p 11211 -m $MAXMEM -c 1024 -t $MEMCACHED_THREADS).
>   > I'm checking the number of Memcached threads in the htop output. It
>   > showed me 10 threads (workers included) on one machine and 6 threads
>   > (workers included) on the other one.
>   >
>   > To share some more information, the bigger machine, which creates only 6
>   > threads, has 200GB of memory, and the machine that creates 10 threads has
>   > only 16GB. I'm thinking that maybe, because the smaller machine has less
>   > space and I'm actually filling it up to 15GB, it has more work to do and
>   > creates more threads.
>   >
>   > According to your information, I should expect at least 5 threads other
>   > than the main workers. So 10 threads look OK, but how about the bigger
>   > machine, which spawns only 6 threads?
>   >
>   > I also had difficulty identifying the worker threads that respond to
>   > GET/SET requests in my results. I have attached two pictures: one shows
>   > the actual location of each worker on the various cores, and the second
>   > shows the userspace time spent by each worker. Apparently worker threads
>   > 1, 2, 4 and 5 have spent more time in userspace, so I'm concluding that
>   > 1, 2, 4 and 5 are my actual worker threads, and that 3 and 6 are just
>   > internal threads of Memcached. Does that make sense to you?
>   >
>   > Thanks,
>   > Alireza
>   >
>   >
>   > On Sun, Dec 15, 2019 at 7:19 AM dormando  wrote:
>   >       What version of memcached is on each machine?
>   >
>   >       memcached doesn't use processes, it's multi-threaded. Different
>   >       versions may have a different number of background threads. In the
>   >       latest version there should be at least:
>   >
>   >       - listener thread (main "process")
>   >       - N worker threads
> 

Re: Non-deterministic number of Memcached children processes other than worker threads

2019-12-15 Thread Alireza Sanaee
Hi,

I made a stupid mistake: the Memcached version on the big machine seems to
be different. I guess that is the problem.

Thanks,
Alireza

On Mon, Dec 16, 2019 at 2:16 AM Alireza Sanaee  wrote:

> Hi,
>
> I'm investigating the Linux load balancer. While trying to understand what
> is happening in Memcached, I noticed that the number of Memcached threads
> differs between my machines. Linux doesn't give latency-sensitive
> threads/processes such as Memcached worker threads immediate access to the
> CPU, which causes head-of-line (HoL) blocking and eventually long-tail
> latency; that is not a new thing, though.
>
> But before that, I need to know which worker threads to consider. You gave
> me some pointers to the different internal threads of Memcached. Even the
> intermittent activity of some of Memcached's internal threads (rebalancer,
> crawler, ...) might block some requests (I'm not sure whether that's the
> case). The ULE scheduler seems to suffer from the same flaw.
>
> The connection dispatcher is there, but I don't create many connections: I
> have a good number of clients that send requests at some rate until the end
> of the experiment. I'm using mutilate as my workload generator.
>
> Sure, I can check whether those are idle or not. I'm actually recording
> everything from `/proc///stat`, so the status is also available there. It
> is true that the internal threads are idle most of the time, and that is
> somewhat visible in the plot, but I just want to make sure that all the
> mostly idle ones are internal threads and not workers. As you said,
> sometimes the worker threads are not loaded evenly, which makes them more
> difficult to distinguish. I think this doesn't matter now.
>
> The main issue here is that I have 6 threads on my big machine and 10
> threads on my small machine, even though I use the same Memcached
> configuration on both machines. I have attached the numbers for the two
> machines.
>
> Thanks,
> Alireza
>
> On Sun, Dec 15, 2019 at 3:40 PM dormando  wrote:
>
>> What're you trying to accomplish?
>>
>> Can you include the output of "stats" and "stats settings" on both
>> machines?
>>
>> Dumb question but you've looked at the output of `ps auxH`? If just using
>> htop you may not see the threads that're idle.
>>
>> TCP connections are pinned to a specific worker thread on connection.
>> Trivial benchmarks may not load the worker threads evenly, as the
>> connections are handed to threads evenly via round robin.
>>
>> On Sun, 15 Dec 2019, Alireza Sanaee wrote:
>>
>> > Hi,
>> > Thank you for the information,
>> >
>> > Sorry for misusing the word there; yes, those are all threads. I'm using
>> > Memcached 1.5.20. I build it myself and then run my experiments
>> > ($MEMCACHED -u root -p 11211 -m $MAXMEM -c 1024 -t $MEMCACHED_THREADS).
>> > I'm checking the number of Memcached threads in the htop output. It
>> > showed me 10 threads (workers included) on one machine and 6 threads
>> > (workers included) on the other one.
>> >
>> > To share some more information, the bigger machine, which creates only 6
>> > threads, has 200GB of memory, and the machine that creates 10 threads has
>> > only 16GB. I'm thinking that maybe, because the smaller machine has less
>> > space and I'm actually filling it up to 15GB, it has more work to do and
>> > creates more threads.
>> >
>> > According to your information, I should expect at least 5 threads other
>> > than the main workers. So 10 threads look OK, but how about the bigger
>> > machine, which spawns only 6 threads?
>> >
>> > I also had difficulty identifying the worker threads that respond to
>> > GET/SET requests in my results. I have attached two pictures: one shows
>> > the actual location of each worker on the various cores, and the second
>> > shows the userspace time spent by each worker. Apparently worker threads
>> > 1, 2, 4 and 5 have spent more time in userspace, so I'm concluding that
>> > 1, 2, 4 and 5 are my actual worker threads, and that 3 and 6 are just
>> > internal threads of Memcached. Does that make sense to you?
>> >
>> > Thanks,
>> > Alireza
>> >
>> >
>> > On Sun, Dec 15, 2019 at 7:19 AM dormando  wrote:
>> >   What version of memcached is on each machine?
>> >
>> >   memcached doesn't use processes, it's multi-threaded. Different
>> >   versions may have a different number of background threads. In the
>> >   latest version there should be at least:
>> >
>> >   - listener thread (main "process")
>> >   - N worker threads
>> >   - hash table maintenance thread
>> >   - async log thread (for `watch` commands)
>> >   - LRU maintainer thread
>> >   - LRU crawler thread
>> >   - slab rebalancer thread
>> >
>> >   they're all idle unless they need to do work. LRU maintenance thread
>> >   is probably the most active, since it executes LRU maintenance work
>> >   deferred from the worker threads.

Re: Non-deterministic number of Memcached children processes other than worker threads

2019-12-15 Thread Alireza Sanaee
Hi,

I'm investigating the Linux load balancer. While trying to understand what
is happening in Memcached, I noticed that the number of Memcached threads
differs between my machines. Linux doesn't give latency-sensitive
threads/processes such as Memcached worker threads immediate access to the
CPU, which causes head-of-line (HoL) blocking and eventually long-tail
latency; that is not a new thing, though.

But before that, I need to know which worker threads to consider. You gave
me some pointers to the different internal threads of Memcached. Even the
intermittent activity of some of Memcached's internal threads (rebalancer,
crawler, ...) might block some requests (I'm not sure whether that's the
case). The ULE scheduler seems to suffer from the same flaw.

The connection dispatcher is there, but I don't create many connections: I
have a good number of clients that send requests at some rate until the end
of the experiment. I'm using mutilate as my workload generator.
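
To make dormando's point about round-robin pinning concrete (quoted further down in this message), here is a toy model, not memcached's actual dispatcher code. It assumes each new connection is handed to the next worker in turn and stays there, and that each connection has a steady request rate. The function names and the example rates are made up for illustration.

```python
def assign_round_robin(num_connections: int, num_workers: int) -> list[int]:
    """Return the worker index each connection is pinned to, in arrival order."""
    return [conn % num_workers for conn in range(num_connections)]


def load_per_worker(request_rates: list[float], num_workers: int) -> list[float]:
    """Sum the request rate of every connection pinned to each worker."""
    load = [0.0] * num_workers
    pinned = assign_round_robin(len(request_rates), num_workers)
    for conn, worker in enumerate(pinned):
        load[worker] += request_rates[conn]
    return load


# Eight connections, two of them much busier than the rest (hypothetical rates).
rates = [1000, 10, 10, 10, 1000, 10, 10, 10]
print(load_per_worker(rates, num_workers=4))  # [2000.0, 20.0, 20.0, 20.0]
```

Every worker gets the same number of connections, yet one worker carries nearly all the load, so a lightly loaded worker can look as idle as a background thread in a per-thread CPU plot.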

Sure, I can check whether those are idle or not. I'm actually recording
everything from `/proc///stat`, so the status is also available there. It
is true that the internal threads are idle most of the time, and that is
somewhat visible in the plot, but I just want to make sure that all the
mostly idle ones are internal threads and not workers. As you said,
sometimes the worker threads are not loaded evenly, which makes them more
difficult to distinguish. I think this doesn't matter now.
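
For readers who want to reproduce this kind of per-thread recording, the following is a minimal sketch, not the script used in this thread. It assumes Linux and that the per-thread files live under `/proc/<pid>/task/<tid>/stat`, which is a guess at what the path above refers to. Field positions follow the proc(5) layout, where state, utime and stime are fields 3, 14 and 15. The `ThreadStat` type and both function names are hypothetical.

```python
from pathlib import Path
from typing import NamedTuple


class ThreadStat(NamedTuple):
    tid: int
    comm: str
    state: str        # e.g. "R" running, "S" sleeping
    utime_ticks: int  # userspace time, in clock ticks
    stime_ticks: int  # kernel time, in clock ticks


def parse_stat_line(line: str) -> ThreadStat:
    """Parse one line in the /proc stat format."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = line.rpartition(")")
    tid_text, _, comm = head.partition("(")
    fields = tail.split()  # fields[0] is overall field 3 (state)
    return ThreadStat(
        tid=int(tid_text),
        comm=comm,
        state=fields[0],
        utime_ticks=int(fields[11]),  # overall field 14
        stime_ticks=int(fields[12]),  # overall field 15
    )


def read_thread_stats(pid: int) -> list[ThreadStat]:
    """Snapshot every thread of a process (assumed path: /proc/<pid>/task)."""
    stats = []
    for entry in Path(f"/proc/{pid}/task").iterdir():
        try:
            stats.append(parse_stat_line((entry / "stat").read_text()))
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited between listing and reading
    return stats
```

Sampling `read_thread_stats(pid)` at intervals and differencing `utime_ticks` per thread gives the userspace time per thread that the plots in this thread describe.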

The main issue here is that I have 6 threads on my big machine and 10
threads on my small machine, even though I use the same Memcached
configuration on both machines. I have attached the numbers for the two
machines.
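
As a sanity check on those numbers, the thread list dormando gives at the bottom of this thread can be turned into a baseline count. The sketch below takes that list at face value (he says "at least", so treat the result as a lower bound for recent 1.5.x defaults) and counts live threads by listing `/proc/<pid>/task`, which assumes Linux. The names are illustrative only.

```python
import os

# Background threads named in dormando's list, excluding listener and workers.
BACKGROUND_THREADS = (
    "hash table maintenance",
    "async log",
    "LRU maintainer",
    "LRU crawler",
    "slab rebalancer",
)


def expected_thread_count(workers: int) -> int:
    """Listener + N workers + the background threads from the list."""
    return 1 + workers + len(BACKGROUND_THREADS)


def count_threads(pid: int) -> int:
    """Count a process's threads; each one has a directory under task/."""
    return len(os.listdir(f"/proc/{pid}/task"))


print(expected_thread_count(4))  # 10
```

If `$MEMCACHED_THREADS` were 4 (the thread does not state the value), the baseline is 10, which matches the small machine. A count of 6 falls below that baseline, which points to a build where some of these threads do not start by default.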

Thanks,
Alireza

On Sun, Dec 15, 2019 at 3:40 PM dormando  wrote:

> What're you trying to accomplish?
>
> Can you include the output of "stats" and "stats settings" on both
> machines?
>
> Dumb question but you've looked at the output of `ps auxH`? If just using
> htop you may not see the threads that're idle.
>
> TCP connections are pinned to a specific worker thread on connection.
> Trivial benchmarks may not load the worker threads evenly, as the
> connections are handed to threads evenly via round robin.
>
> On Sun, 15 Dec 2019, Alireza Sanaee wrote:
>
> > Hi,
> > Thank you for the information,
> >
> > Sorry for misusing the word there; yes, those are all threads. I'm using
> > Memcached 1.5.20. I build it myself and then run my experiments
> > ($MEMCACHED -u root -p 11211 -m $MAXMEM -c 1024 -t $MEMCACHED_THREADS).
> > I'm checking the number of Memcached threads in the htop output. It
> > showed me 10 threads (workers included) on one machine and 6 threads
> > (workers included) on the other one.
> >
> > To share some more information, the bigger machine, which creates only 6
> > threads, has 200GB of memory, and the machine that creates 10 threads has
> > only 16GB. I'm thinking that maybe, because the smaller machine has less
> > space and I'm actually filling it up to 15GB, it has more work to do and
> > creates more threads.
> >
> > According to your information, I should expect at least 5 threads other
> > than the main workers. So 10 threads look OK, but how about the bigger
> > machine, which spawns only 6 threads?
> >
> > I also had difficulty identifying the worker threads that respond to
> > GET/SET requests in my results. I have attached two pictures: one shows
> > the actual location of each worker on the various cores, and the second
> > shows the userspace time spent by each worker. Apparently worker threads
> > 1, 2, 4 and 5 have spent more time in userspace, so I'm concluding that
> > 1, 2, 4 and 5 are my actual worker threads, and that 3 and 6 are just
> > internal threads of Memcached. Does that make sense to you?
> >
> > Thanks,
> > Alireza
> >
> >
> > On Sun, Dec 15, 2019 at 7:19 AM dormando  wrote:
> >   What version of memcached is on each machine?
> >
> >   memcached doesn't use processes, it's multi-threaded. Different
> >   versions may have a different number of background threads. In the
> >   latest version there should be at least:
> >
> >   - listener thread (main "process")
> >   - N worker threads
> >   - hash table maintenance thread
> >   - async log thread (for `watch` commands)
> >   - LRU maintainer thread
> >   - LRU crawler thread
> >   - slab rebalancer thread
> >
> >   they're all idle unless they need to do work. LRU maintenance thread
> >   is probably the most active, since it executes LRU maintenance work
> >   deferred from the worker threads. Older versions have some of these
> >   threads, but they were not enabled by default until 1.5.0.
> >
> >   -Dormando
> >
> >   On Sat, 14 Dec 2019, Alireza Sanaee wrote:
> >
> >   > Hello,
> >   > I'm running Memcached on two different machines with different
> specifications. And I specify the number of