Hi, see https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html and https://blog.codecentric.de/en/2014/05/elasticsearch-indexing-performance-cheatsheet/ for general Elasticsearch tuning hints.
Your number of processbuffer_processors and outputbuffer_processors is also quite high for the given hardware and should be reduced (keep the defaults if you're unsure).

Cheers,
Jochen

On Tuesday, 27 September 2016 15:01:08 UTC+2, [email protected] wrote:
>
> Hello Jochen,
>
> Ok, I see what you are saying, and I guess even if the stream processing is
> rather faster, having an average spike of 1000 msg/sec is a huge load and
> will cause backlog.
>
> Inbound: 15 minute avg: 1,003.21 events/second
> Outbound: 15 minute avg: 687.23 events/second
> ProcessTime: Mean: 8,585μs
>
> Here is the setup I am currently tweaking, with which I still experience the issue:
>
> 4 vCPU
> 8GB RAM
> 2GB graylog
> 60% RAM Elastic
>
> Path: /var/opt/graylog/data/journal
> Earliest entry: 7 minutes ago
> Maximum size: 5.0GB
> Maximum age: 12 hours 0 minutes
> Flush policy: Every 1,000,000 messages or 1 minutes 0 seconds
>
> processbuffer_processors = 10
> outputbuffer_processors = 10
> ring_size = 65536
> output_batch_size = 1000
> output_flush_interval = 1
>
> Any advice on tuning to improve the handling of the load? I'm thinking of
> raising the CPU number and the processors. Not sure if ring size will
> change anything. And raising the journal from 1GB to 5GB only delayed the
> issue.
>
>
> On Tuesday, 27 September 2016 06:12:36 UTC-4, Jochen Schalanda wrote:
>>
>> Hi,
>>
>> On Monday, 26 September 2016 16:31:21 UTC+2, [email protected] wrote:
>>>
>>> As a 'coincidence', the journal filled up to maximum capacity (and
>>> failed) really quickly during the same period due to spikes in events at
>>> that time (expected), so I adjusted the journal size,
>>> processbuffer_processors and outputbuffer_processors in hopes it will
>>> solve that part.
>>>
>>> However, can both events be related? If so, how? I'm not sure how the
>>> journal issue can lead to the stream processing issue.
>>>
>>
>> Yes, they can be related, but it's usually the other way round: The disk
>> journal fills up because message processing is too slow.
>>
>> Cheers,
>> Jochen
>>
>
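
To make the point about the journal concrete, here is a rough sketch of the arithmetic, using the 15 minute averages and the 5.0GB journal size quoted in the thread. The average size of a journaled message is not given anywhere in the thread, so `AVG_MSG_BYTES` is purely an assumption for illustration, as are the function names and the reading of "5.0GB" as binary gigabytes. The estimate also assumes the rates stay constant, which they will not during real spikes.

```python
def backlog_growth(inbound_per_s: float, outbound_per_s: float) -> float:
    """Messages per second piling up in the journal (0 if output keeps up)."""
    return max(0.0, inbound_per_s - outbound_per_s)


def seconds_until_full(journal_bytes: float, avg_msg_bytes: float,
                       growth_per_s: float) -> float:
    """Time until the journal hits its size limit at a constant growth rate."""
    if growth_per_s <= 0:
        return float("inf")  # output keeps up, the journal never fills
    return journal_bytes / (avg_msg_bytes * growth_per_s)


# Figures quoted in the thread (15 minute averages).
INBOUND = 1003.21   # events/second
OUTBOUND = 687.23   # events/second
JOURNAL_BYTES = 5 * 1024**3  # "Maximum size: 5.0GB", binary GB assumed

# ASSUMPTION: average on-disk size of one journaled message.
# This value is hypothetical and does not come from the thread.
AVG_MSG_BYTES = 1024

growth = backlog_growth(INBOUND, OUTBOUND)
hours = seconds_until_full(JOURNAL_BYTES, AVG_MSG_BYTES, growth) / 3600

print(f"backlog grows by about {growth:.2f} msg/s")
print(f"journal would be full after roughly {hours:.1f} h at that rate")
```

With the quoted averages the backlog grows by about 316 messages per second for as long as the spike lasts. A larger journal only stretches the time until it is full; it does not change the growth rate, which matches the observation that going from 1GB to 5GB only delayed the issue.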
