1M get/s for the whole cluster. It's not a huge amount, but it will illustrate scaling Logstash horizontally for enrichment purposes, and it's large enough that a single host won't handle it on its own.
As for dalli, I don't think it's going to stack requests when run from within the context of Logstash; we'll have to test. I'm not aware of it running in any kind of event loop, but I could be wrong. I'm fairly sure we can set TCP_NODELAY at least, to avoid waiting on the ACK, but I still think it's going to block on responses. I'm not aware of any sessions in memcache (I may be a bit dated here), so I'm not sure how a request could be async without either its own connection or a request ID/session. With a pool of connections this is more feasible, but it will only stack up to the number of TCP connections. Let me know if my thinking is right or wrong here.

We didn't implement batching, but that's something we could do. Doing a multi-get against 100 log messages at a time may help a bit.

On Sat, Mar 17, 2018 at 3:17 PM, dormando <[email protected]> wrote:

> Got it.
>
> Mind if we talk through this a bit? I have a feeling you'll do okay
> without UDP.
>
> Are you looking at 1M sets + 1M gets/sec per memcached or for the whole
> cluster?
>
> UDP sets can't be multi-packet; it's not implemented at all. Jumbo frames
> may help (I forget if the MTU is discovered internally). It's relatively
> hard to do multi-packet requests since you have to manage timeouts on the
> server and no other code paths do this.
>
> Also, if you're using dalli there should hopefully be a method of stacking
> requests down connections with async callbacks? I've not looked at the API
> in a very long time.
>
> Binprot has "set quiet" and opaques (message IDs) for all queries, which
> means updates can simply be stacked down the pipe without blocking on
> responses. Lookups can also be stacked down as the requests come in, while
> still waiting for a previous response.
>
> Just pure quiet binprot sets (with significant batching) can easily do
> 500k+ sets/sec with a single TCP connection. It might be as high as 1M;
> I'll have to check again. With less batching you'd need a few more
> connections/processors.
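[Editor's sketch of the quiet binprot sets described above: a setq request is a 24-byte binary header (magic 0x80, opcode 0x11, and an opaque for matching any error response) followed by 8 bytes of set extras (flags, exptime), the key, and the value. Because successful quiet sets produce no response, such packets can simply be concatenated and written down one TCP connection. Key and value names below are illustrative only.]

```ruby
# Build a memcached binary-protocol "setq" (quiet set) request.
# Quiet sets send no response on success, so many can be pipelined
# down one socket; the opaque lets you match any error reply.
def setq_packet(key, value, opaque, flags: 0, exptime: 0)
  extras = [flags, exptime].pack("NN")   # 8 bytes of set extras
  body   = extras + key + value
  header = [
    0x80,                                # magic: request
    0x11,                                # opcode: setq
    key.bytesize,                        # key length (16-bit)
    extras.bytesize,                     # extras length
    0, 0,                                # data type, vbucket id
    body.bytesize,                       # total body length
    opaque,                              # echoed back in error replies
    0                                    # CAS
  ].pack("CCnCCnNNQ>")                   # 24-byte header, network order
  header + body
end

# A pipelined batch is just the packets concatenated, written in one go:
batch = %w[a b c].each_with_index.map { |k, i| setq_packet(k, "v#{i}", i) }.join
```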
> Since the UDP code isn't using much batching-wise nor REUSEPORT, the
> performance ceiling is presently much lower than the TCP protocol's,
> ironically.
>
> On Sat, 17 Mar 2018, Robert Gil wrote:
>
> > We haven't started tuning for UDP. That module was a PoC because all
> > the other memcache modules were _incomplete_ in terms of feature set.
> > We use namespaces for different things like threats, assets, users,
> > containers, etc. The plan is also to use pools to scale the memory and
> > allow for easier distributed updates across all ingestion nodes.
> >
> > I plan to start testing using UDP and dalli, but if dalli doesn't do
> > the trick, we'll rewrite with the Java client lib. For the UDP lookups,
> > a single key should be fine in 1400 bytes for common lookups like IPs.
> > We can also consider jumbo frames (9000) for an increased payload.
> > Multi-packet sets may be necessary, especially if we do lookups on
> > commonly exploited URI patterns with long URIs.
> >
> > With regard to testing, I'd like to PoC this with the goal of doing 1M
> > events/s with 1M lookups/s for enrichment.
> >
> > On Sat, Mar 17, 2018 at 2:58 PM, dormando <[email protected]> wrote:
> >
> > > Hey,
> > >
> > > That is exactly the use case I would expect out of UDP! Thanks for
> > > responding so quickly.
> > >
> > > In your example module I can't quickly figure out how UDP is
> > > configured (it just seems to import dalli and use it?)
> > >
> > > Are you using multi-packet responses for UDP lookups? If you're
> > > running sets via UDP they can't be more than 1400 bytes in size
> > > (more like 1300ish). That's generally been okay?
> > >
> > > On Sat, 17 Mar 2018, Robert Gil wrote:
> > >
> > > > I still have a use case for it. We're looking to make the highest
> > > > performing enrichment for log ingestion possible.
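[Editor's note on the single-packet constraint discussed above: memcached's UDP framing prepends an 8-byte header (request ID, sequence number, datagram count, reserved) to an ordinary protocol payload, so a lookup datagram might be assembled like this; the key name is illustrative.]

```ruby
# memcached UDP frame header: request id, sequence number, total
# datagrams in this message, reserved (always 0) -- four 16-bit fields
# in network byte order -- followed by the usual ASCII protocol text.
def udp_get_datagram(request_id, key)
  header  = [request_id, 0, 1, 0].pack("n4")   # seq 0, 1 datagram total
  payload = "get #{key}\r\n"
  header + payload
end
```

With a roughly 1400-byte datagram budget, the 8-byte frame header leaves ample room for single-key lookups on short keys like IPs, which is why those fit comfortably in one packet.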
> > > > https://www.elastic.co/blog/elasticsearch-data-enrichment-with-logstash-a-few-security-examples
> > > >
> > > > In the use case described above, we aim to enrich as close to real
> > > > time as possible for log ingestion. Small, light lookups and
> > > > non-blocking updates for threat and other security-related feeds.
> > > >
> > > > We're not too concerned about misses, since we need to
> > > > back-enrich/compare against data that has already been ingested.
> > > >
> > > > Thoughts?
> > > >
> > > > Rob
> > > >
> > > > On Sat, Mar 17, 2018 at 2:22 PM, dormando <[email protected]> wrote:
> > > >
> > > > > Not sure how active this list is anymore :P
> > > > >
> > > > > Are any of you still listening users of the UDP protocol? If so,
> > > > > mind reaching out to me (here or privately) to explain your use
> > > > > case?
> > > > >
> > > > > I'm thinking through some long-term options with it; one is to
> > > > > turn it into a compile flag, and another is to keep it (and
> > > > > update it with SO_REUSEPORT and *_mmsg under Linux) but restrict
> > > > > responses to a single packet. It can still be useful for
> > > > > fire-and-forget cases (i.e. spraying touches or fetching flag
> > > > > keys asynchronously), but there isn't much use for it otherwise
> > > > > these days.
> > > > >
> > > > > Thanks,
> > > > > -Dormando
> > > > >
> > > > > --
> > > > >
> > > > > ---
> > > > > You received this message because you are subscribed to the
> > > > > Google Groups "memcached" group.
> > > > > To unsubscribe from this group and stop receiving emails from
> > > > > it, send an email to [email protected].
> > > > > For more options, visit https://groups.google.com/d/optout.
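[Editor's sketch of the SO_REUSEPORT option mentioned above: under Linux it lets several worker processes bind the same UDP port and have the kernel spread incoming datagrams across them. The port handling below is illustrative.]

```ruby
require "socket"

# Open a UDP socket with SO_REUSEPORT set before binding, so multiple
# processes (or sockets) can share one port; the kernel load-balances
# incoming datagrams between them (Linux 3.9+).
def reuseport_udp_socket(port)
  sock = UDPSocket.new
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEPORT, 1)
  sock.bind("127.0.0.1", port)
  sock
end
```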
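[Editor's sketch of the multi-get batching floated at the top of the thread: one lookup round trip per 100 log events instead of one get per event. `StubCache` stands in for a real `Dalli::Client`, which exposes the same `get_multi` call; the `threat:` key scheme is purely illustrative.]

```ruby
# Enrich events in batches: one multi-get round trip per batch of
# events instead of one get per event.
def enrich_in_batches(events, cache, batch_size: 100)
  events.each_slice(batch_size).flat_map do |batch|
    keys  = batch.map { |e| "threat:#{e[:src_ip]}" }
    found = cache.get_multi(*keys)        # single round trip per batch
    batch.zip(keys).map do |event, key|
      event.merge(threat: found[key])     # nil value on a miss
    end
  end
end

# Stand-in for Dalli::Client, so the sketch runs without a server.
class StubCache
  def initialize(data)
    @data = data
  end

  def get_multi(*keys)
    @data.slice(*keys)                    # hash containing hits only
  end
end
```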
