Yes, our production traffic all uses binary protocol, even behind our
on-server proxy that we use. In fact, if you have a way to reduce syscalls
by batching responses, that would solve another huge pain we have that's of
our own doing.


*Scott Mansfield*

Product > Consumer Science Eng > EVCache > Sr. Software Eng
{
  M: 352-514-9452
  E: [email protected]
  K: {M: mobile, E: email, K: key}
}

On Wed, Jan 25, 2017 at 11:33 AM, dormando <[email protected]> wrote:

> Okay, so it's the big rollup that gets delayed. Makes sense.
>
> You're using binary protocol for everything? That's a major focus of my
> performance annoyance right now, since every response packet is sent
> individually. I should have that switched to an option at least pretty
> soon, which should also help with the time it takes to service them.
>
> I'll test both ascii and binprot + the reqs_per_event option to see how
> bad this is measurably.

>
> On Wed, 25 Jan 2017, 'Scott Mansfield' via memcached wrote:
>
> > The client is the EVCache client jar: https://github.com/netflix/evcache
> > When a user calls the batch get function on the client, it will spread
> those batch gets out over many servers because it is hashing keys to
> different servers. Imagine many of
> > these batch gets happening at the same time, though, and each server's
> queue will get a bunch of gets from a bunch of different user-facing batch
> gets. It all gets intermixed.
> > These client-side read queues are rather large (10000) and might end up
> sending a batch of a few hundred keys at a time. These large batch gets are
> sent off to the servers as
> > "one" getq|getq|getq|getq|getq|getq|getq|getq|getq|getq|noop packet
> and read back in that order. We are reading the responses fairly
> efficiently internally, but the batch get
> > call that the user made is waiting on the data from all of these
> separate servers to come back in order to properly respond to the user in a
> synchronous manner.
> >
> > Now on the memcached side, there's many servers all doing this same
> pattern of many large batch gets. Memcached will stop responding to that
> connection after 20 requests on the
> > same event and go serve other connections. If that happens, any
> user-facing batch call that is waiting on any getq command still waiting to
> be serviced on that connection can
> > be delayed. It doesn't normally end up causing timeouts, but it does at
> a low rate.
> >
> > Our timeouts for this app in particular are 5 seconds for a single
> user-facing batch get call. This client app is fine with higher latency for
> higher throughput.
> >
> > At this point we have the reqs_per_event set to a rather high 300 and it
> seems to have solved our problem. I don't think it's causing any more
> consternation (for now), but
> > having a dynamic setting would have lowered the operational complexity
> of the tuning.
> >
> >
> > Scott Mansfield
> > Product > Consumer Science Eng > EVCache > Sr. Software Eng
> > {
> >   M: 352-514-9452
> >   E: [email protected]
> >   K: {M: mobile, E: email, K: key}
> > }
> >
> > On Wed, Jan 25, 2017 at 11:04 AM, dormando <[email protected]> wrote:
> >       I guess when I say dynamic I mostly mean runtime-settable.
> Dynamic is a
> >       little harder so I tend to do those as a second pass.
> >
> >       You're saying your client had head-of-line blocking for unrelated
> >       requests? I'm not 100% sure I follow.
> >
> >       Big multiget comes in, multiget gets processed slightly slower
> than normal
> >       due to other clients making requests, so requests *behind* the
> multiget
> >       time out, or the multiget itself?
> >
> >       How long is your timeout? :P
> >
> >       I'll take a look at it as well and see about raising the limit in
> `-o
> >       modern` after some performance tests. The default is from 2006.
> >
> >       thanks!
> >
> >       On Wed, 25 Jan 2017, 'Scott Mansfield' via memcached wrote:
> >
> >       > The reqs_per_event setting was causing a client that was doing
> large batch-gets (of a few hundred keys) to see some timeouts. Since
> memcached will delay
> >       responding fully until
> >       > other connections are serviced and our client will wait until
> the batch is done, we see some client-side timeouts for the users of our
> client library. Our
> >       solution has been to
> >       > up the setting during startup, but just as a thought experiment
> I was asking if we could have done it dynamically to avoid losing data. At
> the moment there's
> >       quite a lot of
> >       > machinery to change the setting (deploy, copy data over with our
> cache warmer, flip traffic, tear down old boxes) and I would have rather
> left everything as is
> >       and adjusted the
> >       > setting on the fly until our client's problem was resolved.
> >       > I'm interested in patching this specific setting to be settable,
> but having it fully dynamic in nature is not something I'd want to tackle.
> There's a natural
> >       tradeoff of
> >       > latency for other connections / throughput for the one that is
> currently being serviced. I'm not sure it's a good idea to dynamically
> change that. It might cause
> >       unexpected
> >       > behavior if one bad client sends huge requests.
> >       >
> >       >
> >       > Scott Mansfield
> >       > Product > Consumer Science Eng > EVCache > Sr. Software Eng
> >       > {
> >       >   M: 352-514-9452
> >       >   E: [email protected]
> >       >   K: {M: mobile, E: email, K: key}
> >       > }
> >       >
> >       > On Tue, Jan 24, 2017 at 11:53 AM, dormando <[email protected]>
> wrote:
> >       >       Hey,
> >       >
> >       >       Would you mind explaining a bit how you determined the
> setting was causing
> >       >       an issue, and what the impact was? The default there is
> very old and might
> >       >       be worth a revisit (or some kind of auto-tuning) as well.
> >       >
> >       >       I've been trending as much as possible to online
> configuration, including
> >       >       the actual memory limit.. You can turn the lru crawler on
> and off,
> >       >       automoving on and off, manually move slab pages, etc. I'm
> hoping to make
> >       the LRU algorithm itself modifiable at runtime.
> >       >
> >       >       So yeah, I'd take a patch :)
> >       >
> >       >       On Mon, 23 Jan 2017, 'Scott Mansfield' via memcached wrote:
> >       >
> >       >       > There was a single setting my team was looking at today
> and wish we could have changed dynamically: the
> >       >       > reqs_per_event setting. Right now in order to change it
> we need to shut down the process and start it again
> >       >       > with a different -R parameter. I don't see a way to
> change many of the settings, though there are some that
> >       >       > are ad-hoc changeable through some stats commands. I was
> going to see if I could patch memcached to be able
> >       >       > to change the reqs_per_event setting at runtime, but
> before doing so I wanted to check whether that's
> >       >       > something you'd be amenable to. I also didn't want to
> do something specifically for that setting if it was
> >       >       > going to be better to add it as a general feature.
> >       >       > I see some pros and cons:
> >       >       >
> >       >       > One easy pro is that you can easily change things at
> runtime to save performance while not losing all of
> >       >       > your data. If client request patterns change, the
> process can react.
> >       >       >
> >       >       > A con is that the startup parameters won't necessarily
> match what the process is doing, so they are no
> >       >       > longer going to be a useful way to determine the
> settings of memcached. Instead you would need to connect
> >       >       > and issue a stats settings command to read them. It also
> introduces change in places that may have
> >       >       > previously never seen it, e.g. the reqs_per_event
> setting is simply read at the beginning of the
> >       >       > drive_machine loop. It might need some kind of
> synchronization around it now instead. I don't think it
> >       >       > necessarily needs it on x86_64 but it might on other
> platforms which I am not familiar with.
> >       >       >
> >       >       > --
> >       >       >
> >       >       > ---
> >       >       > You received this message because you are subscribed to
> the Google Groups "memcached" group.
> >       >       > To unsubscribe from this group and stop receiving emails
> from it, send an email to
> >       >       > [email protected].
> >       >       > For more options, visit https://groups.google.com/d/
> optout.