1M gets/sec for the whole cluster. It's not a huge amount, but it will
illustrate scaling logstash horizontally for enrichment purposes. It's large
enough that a single host won't handle it on its own.

As for dalli, I don't think it's going to stack requests when run from within
the context of logstash. We'll have to test. I'm not aware of it running in
any kind of event loop, but I could be wrong. I'm pretty sure we can set
TCP_NODELAY at least to avoid waiting on the ACK, but I still think it's
going to block on responses. I'm not aware of any sessions in memcache (I
may be a bit dated here), so I'm not sure how a request could be async
without either its own connection or a request ID/session. With a pool of
connections this is more feasible, but requests will only stack up to the
number of TCP connections. Let me know if my thinking is right/wrong here.
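For what it's worth, setting TCP_NODELAY from Ruby is straightforward. A
minimal sketch (the throwaway local listener below is just scaffolding so the
snippet runs on its own; in practice the socket would be the client's
connection to memcached):

```ruby
require "socket"

# Throwaway local listener so the sketch is self-contained.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

sock = TCPSocket.new("127.0.0.1", port)

# Disable Nagle's algorithm so small requests are written out immediately
# instead of waiting to coalesce with the ACK of the previous send.
sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay = sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).bool
puts nodelay

sock.close
server.close
```

Note this only removes the Nagle delay on writes; it does nothing about
blocking on responses, which is the separate pipelining question above.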

We didn't implement batching, but that's something we could do. Doing a
multi-get against 100 log messages at a time may help a bit.
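The batching could look something like the sketch below. `StubCache` is a
hypothetical in-memory stand-in for the real client so the snippet runs on its
own; dalli's actual `get_multi` would turn each 100-key slice into a single
round trip to memcached.

```ruby
# Hypothetical stand-in for a memcached client. A real client (e.g.
# Dalli::Client#get_multi) would issue one network round trip per call.
class StubCache
  def initialize(data)
    @data = data
  end

  # Mimics get_multi semantics: returns a hash of only the keys found.
  def get_multi(*keys)
    @data.slice(*keys)
  end
end

cache = StubCache.new("10.0.0.1" => "threat", "10.0.0.2" => "asset")

# Keys extracted from a window of log events, looked up 100 at a time so
# each batch costs one round trip instead of one per key.
event_keys = ["10.0.0.1", "10.0.0.2", "10.0.0.9"]
enriched = {}
event_keys.each_slice(100) do |batch|
  enriched.merge!(cache.get_multi(*batch))
end

puts enriched.inspect
```

Misses simply don't appear in the result hash, which fits the "not too
concerned about misses" model below.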

On Sat, Mar 17, 2018 at 3:17 PM, dormando <[email protected]> wrote:

> Got it.
>
> Mind if we talk through this a bit? I have a feeling you'll do okay
> without UDP.
>
> Are you looking at 1M sets + 1M gets/sec per memcached or for the whole
> cluster?
>
> UDP sets can't be multi-packet; it's not implemented at all. Jumbo frames
> may help (I forget if the MTU is discovered internally). It's relatively
> hard to do multi-packet requests since you have to manage timeouts on the
> server and no other code paths do this.
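To illustrate why a request has to fit in one datagram: every memcached UDP
packet carries an 8-byte frame header (request ID, sequence number, total
datagrams, reserved, all 16-bit big-endian per the protocol docs) ahead of the
payload. A rough Ruby sketch wrapping a plain ASCII get (the key is made up):

```ruby
# Build a single-datagram UDP get request. The 8-byte frame header is:
# request id, sequence number, total datagrams, reserved (all n16).
# For requests, sequence is 0 and the datagram count must be 1.
def udp_get_frame(key, request_id:)
  header = [request_id, 0, 1, 0].pack("nnnn")
  header + "get #{key}\r\n"
end

frame = udp_get_frame("ip:10.0.0.1", request_id: 7)
puts frame.bytesize  # 8-byte header + "get ip:10.0.0.1\r\n" (17 bytes) = 25
```

The request ID is what lets a client match multi-packet responses back to the
request; there is no equivalent mechanism for spreading a request across
packets.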
>
> Also, if you're using dalli there should hopefully be a method of stacking
> requests down connections with async callbacks? I've not looked at the API
> in a very long time.
>
> Binprot has "set quiet" and opaques (message IDs) for all queries, which
> means updates can simply be stacked down the pipe without blocking on
> responses. Lookups can also be stacked down as the requests come in, while
> still waiting for a previous response.
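As a rough sketch of what one of those quiet sets looks like on the wire (the
24-byte request header layout follows the binary protocol spec; the key,
value, and opaque below are made up):

```ruby
# Build a binary-protocol "setq" (quiet set, opcode 0x11) request. Quiet ops
# send no response on success, so many can be written down one TCP connection
# back to back; the 4-byte opaque is echoed in any response, letting
# pipelined replies be matched to their requests.
def setq_packet(key, value, opaque:, flags: 0, expiry: 0)
  extras = [flags, expiry].pack("NN")  # set/setq extras: flags + expiry
  header = [
    0x80,                              # magic: request
    0x11,                              # opcode: setq
    key.bytesize,                      # key length (n16)
    extras.bytesize,                   # extras length
    0,                                 # data type
    0,                                 # vbucket id (unused here)
    extras.bytesize + key.bytesize + value.bytesize,  # total body length
    opaque,                            # opaque (message id)
    0,                                 # CAS
  ].pack("CCnCCnNNQ>")
  header + extras + key + value
end

pkt = setq_packet("ip:10.0.0.1", "threat", opaque: 42)
puts pkt.bytesize  # 24-byte header + 8 extras + 11 key + 6 value = 49
```

Stacking is then just writing many of these frames before reading anything
back; only errors produce a response.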
>
> Just pure quiet binprot sets (with significant batching) can easily do
> 500k+ sets/sec with a single TCP connection. It might be as high as 1M;
> I'll have to check again. With less batching you'd need a few more
> connections/processors.
>
> Since the UDP code isn't using much batching nor SO_REUSEPORT, its
> performance ceiling is presently much lower than the TCP protocol's,
> ironically.
>
> On Sat, 17 Mar 2018, Robert Gil wrote:
>
> > We haven't started tuning for UDP. That module was a PoC because all the
> > other memcache modules were _incomplete_ in terms of feature set. We use
> > namespaces for different things like threats, assets, users, containers,
> > etc. The plan is to also use pools to scale the memory and allow for
> > easier distributed updates across all ingestion nodes.
> >
> > I plan to start testing using UDP and dalli, but if dalli doesn't do the
> > trick, we'll re-write with the Java client lib. For the UDP lookups, a
> > single key should be fine in 1400 bytes for common lookups like IPs. We
> > can also consider jumbo frames (9000 MTU) for increased payload.
> > Multi-packet sets may be necessary, especially if we do lookups on
> > commonly exploited URI patterns with long URIs.
> >
> > With regard to testing, I'd like to PoC this with the goal of doing 1M
> > events/s with 1M lookups/s for enrichment.
> >
> > On Sat, Mar 17, 2018 at 2:58 PM, dormando <[email protected]> wrote:
> >       Hey,
> >
> >       That is exactly the use case I would expect out of UDP! Thanks for
> >       responding so quickly.
> >
> >       In your example module I can't quickly figure out how UDP is
> >       configured (it just seems to import dalli and use it?)
> >
> >       Are you using multi-packet responses for UDP lookups? If you're
> >       running sets via UDP they can't be more than 1400 bytes in size
> >       (more like 1300ish). That's generally been okay?
> >
> >       On Sat, 17 Mar 2018, Robert Gil wrote:
> >
> >       > I still have a use case for it. We're looking to make the
> >       > highest-performing enrichment for log ingestion possible.
> >       >
> >       > https://www.elastic.co/blog/elasticsearch-data-enrichment-with-logstash-a-few-security-examples
> >       >
> >       > In the use case described above, we aim to enrich as close to
> >       > real time as possible for log ingestion. Small, light lookups;
> >       > non-blocking updates for threat and other security-related feeds.
> >       >
> >       > We're not too concerned about misses, since we need to
> >       > back-enrich/compare against data that has already been ingested.
> >       >
> >       > Thoughts?
> >       >
> >       > Rob
> >       >
> >       > On Sat, Mar 17, 2018 at 2:22 PM, dormando <[email protected]> wrote:
> >       >       Not sure how active this list is anymore :P
> >       >
> >       >       Are any of you still listening users of the UDP protocol?
> >       >       If so, mind reaching out to me (here or privately) to
> >       >       explain your use case?
> >       >
> >       >       I'm thinking around some long-term options with it; one is
> >       >       to turn it into a compile flag, and another is to keep it
> >       >       (and update it with SO_REUSEPORT and *_mmsg under linux)
> >       >       but restrict responses to a single packet. It can still be
> >       >       useful for fire-and-forget cases (i.e., spraying touches or
> >       >       fetching flag keys asynchronously), but there isn't much
> >       >       use for it otherwise these days.
> >       >
> >       >       Thanks,
> >       >       -Dormando
> >       >
> >       >       --
> >       >
> >       >       ---
> >       >       You received this message because you are subscribed to
> the Google Groups "memcached" group.
> >       >       To unsubscribe from this group and stop receiving emails
> from it, send an email to [email protected].
> >       >       For more options, visit https://groups.google.com/d/optout
> .
> >       >
> >       >
