Hey,

> This worked! However it seems like TCP and UDP latency now is about the same 
> with my code as well as with a real
> benchmarking tool (memaslap).

I don't use memaslap so I can't speak to it. I use mc-crusher for the
"official" testing, though admittedly it's harder to configure.

> Not sure I understand the scalability point. From my observations, if I do
> a multiget, I get separate packet sequences for each response. So each get
> value could be about 2^16 * 1400 bytes big and still be ok via UDP
> (assuming everything arrives)? One thing that seemed hard is each separate
> sequence has the same requestId, which makes deciding what to do difficult
> in out-of-order arrival scenarios.

Mostly re: kernel/syscall overhead. Especially after the TCP optimizations in
1.6, UDP mode will just be slower at high request rates; it ends up running a
lot more syscalls.
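
Side note on the requestId question: per memcached's protocol.txt, every UDP
datagram starts with an 8-byte frame header of four 16-bit big-endian fields:
request ID, sequence number, total number of datagrams in the message, and a
reserved word. The request ID just ties the datagrams back to the original
request; the sequence number and datagram count are what order the pieces.
Below is a rough, untested Java sketch of reassembling one response from
out-of-order datagrams; the class and method names are made up for
illustration, not taken from any real client.

    import java.net.DatagramPacket;
    import java.nio.ByteBuffer;
    import java.util.TreeMap;

    // Illustrative only: collect the datagrams of a single UDP response and
    // stitch them back together in sequence order, however they arrive.
    class UdpResponseAssembler {
        private final TreeMap<Integer, byte[]> parts = new TreeMap<>();
        private int total = -1;

        // Feed each datagram whose request ID matches the outstanding request.
        // Returns true once every sequence number has been seen.
        boolean offer(DatagramPacket p) {
            ByteBuffer buf = ByteBuffer.wrap(p.getData(), p.getOffset(), p.getLength());
            int requestId = buf.getShort() & 0xffff; // caller matches this to its request
            int seq = buf.getShort() & 0xffff;       // position of this datagram
            total = buf.getShort() & 0xffff;         // datagrams in the whole message
            buf.getShort();                          // reserved, always zero
            byte[] payload = new byte[buf.remaining()];
            buf.get(payload);
            parts.put(seq, payload);
            return parts.size() == total;
        }

        // Concatenate the payloads in sequence order to recover the response.
        byte[] assemble() {
            int len = parts.values().stream().mapToInt(b -> b.length).sum();
            ByteBuffer out = ByteBuffer.allocate(len);
            parts.values().forEach(out::put);
            return out.array();
        }
    }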

> SO_REUSEPORT seems to be supported in the linux kernel in 3.9. But I
> definitely understand the decision to not spend much time optimizing the
> UDP protocol. I did see higher rusage_user and much higher rusage_system
> when using UDP, which maybe corresponds to what you are saying. I tried
> with memaslap and observed the same thing.

Yeah, see above.

> No pressing issue really. We saw this (admittedly old) paper discussing how
> Facebook was able to reduce get latency by 20% by switching to UDP.
> Memcached get latency is a key factor in our overall system latency so we
> thought it would be worth a try, and it would ease some pressure on our
> network infrastructure as well. Do you know if Facebook's changes ever made
> it back into the main memcached distribution?

I wish there were some way I could make that paper stop existing. Those
changes went into memcached 1.2, 13+ years ago. I'm reasonably certain
Facebook doesn't use UDP for memcached and hasn't in a long time. None of
their more recent papers (which also stop around 2014) mention UDP at all.

The best performance you can get is by pipelining multiple requests at once
and running a reasonable number of worker threads (not more than one per
CPU); there's a rough sketch of pipelining below. If you see anything odd or
have questions, please bring up specifics, share server settings, etc.
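
To make the pipelining point concrete, here is a minimal, untested Java
sketch that sends a batch of plain-text "get" commands in one write on a
single TCP connection and only then reads the responses. The host, port, key
name, and batch size are placeholders, and a real client would read the value
data as raw bytes using the <bytes> length from the VALUE line rather than
line by line.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    // Sketch only: pipeline N ASCII "get" requests, then drain N responses.
    public class PipelinedGets {
        public static void main(String[] args) throws Exception {
            int batch = 32;
            try (Socket s = new Socket("127.0.0.1", 11211)) {
                s.setTcpNoDelay(true);
                OutputStream out = s.getOutputStream();
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));

                // One write carrying many requests instead of one round trip each.
                StringBuilder reqs = new StringBuilder();
                for (int i = 0; i < batch; i++) {
                    reqs.append("get somekey\r\n");
                }
                out.write(reqs.toString().getBytes(StandardCharsets.US_ASCII));
                out.flush();

                // Each response is zero or more VALUE blocks followed by "END".
                for (int i = 0; i < batch; i++) {
                    String line;
                    while ((line = in.readLine()) != null && !line.equals("END")) {
                        // "VALUE <key> <flags> <bytes>" lines and the data line(s).
                    }
                }
            }
        }
    }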

> Thanks
> Kireet
>  
>
>       -Dormando
>
>       On Fri, 26 Mar 2021, kmr wrote:
>
>       > We are trying to experiment with using UDP vs TCP for gets to see
>       > what kind of speedup we can achieve. I wrote a very simple benchmark
>       > that just uses a single thread to set a key once and do gets to
>       > retrieve the key over and over. We didn't notice any speedup using
>       > UDP. If anything we saw a slight slowdown which seemed strange.
>       > When checking the stats delta, I noticed a really high value for
>       > lrutail_reflocked. For a test doing 100K gets, this value increased
>       > by 76K. In our production system, memcached processes that have been
>       > running for weeks have a very low value for this stat, less than 100.
>       > Also the latency measured by the benchmark seems to correlate to the
>       > rate at which that value increases.
>       >
>       > I tried to reproduce using the spy java client and I see the same
>       > behavior, so I think it must be something wrong with my benchmark
>       > design rather than a protocol issue. We are using 1.6.9. Here is a
>       > list of all the stats values that changed during a recent run using
>       > TCP:
>       >
>       > stats diff:
>       >   * bytes_read: 10,706,007
>       >   * bytes_written: 426,323,216
>       >   * cmd_get: 101,000
>       >   * get_hits: 101,000
>       >   * lru_maintainer_juggles: 8,826
>       >   * lrutail_reflocked: 76,685
>       >   * moves_to_cold: 76,877
>       >   * moves_to_warm: 76,917
>       >   * moves_within_lru: 450
>       >   * rusage_system: 0.95
>       >   * rusage_user: 0.37
>       >   * time: 6
>       >   * total_connections: 2
>       >   * uptime: 6
