Thanks for the follow-up, very interesting. I had a lot of trouble going the VM route, mostly due to the slower backend storage I was given, so I have moved our environment onto some old HP G6s we had lying around. I was able to get some funds for SSDs and have seen a remarkable increase in search performance, even under high write load. I simplified to 1 master and 2 data nodes. Granted, I only expect a maximum ingestion rate of around 1,500 to 2,500 logs per second once everything is pumping logs to Graylog; the current peak runs around 800-1,200 logs per second. On occasion the buffers fill up, and when things break loose I can see sustained output of 7,000-9,000 logs per second written to ES, so I know I should have room for growth.
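For capacity planning like this, a quick back-of-the-envelope check is how fast a full buffer drains once ES catches up: the drain rate is ES write throughput minus the incoming rate. A minimal sketch using the figures from this thread (the backlog size below is a made-up illustration, not an actual Graylog journal/buffer setting):

```python
# Back-of-the-envelope drain-time check for a backed-up output buffer.
# Rates are the figures mentioned in the thread; the backlog size is
# an assumed example value.

def drain_seconds(backlog_msgs, ingest_rate, es_write_rate):
    """Seconds to empty a backlog while new messages keep arriving."""
    headroom = es_write_rate - ingest_rate
    if headroom <= 0:
        raise ValueError("ES cannot keep up; the backlog grows without bound")
    return backlog_msgs / headroom

# Worst-case ingest (2,500 msg/s) vs. lowest observed ES output (7,000 msg/s):
print(drain_seconds(65536, 2500, 7000))
```

With 4,500 msg/s of headroom even in the worst case, a backlog of tens of thousands of messages clears in well under a minute, which matches the "room for growth" conclusion.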
The hot-warm model is very intriguing to me in the way you described your setup; it has been one of our discussions where I work. Our current solution for long-term storage is an rsyslog collector: we run everything through rsyslog first and write to storage from that server before shipping to Graylog. We keep a year's worth of raw logs and 45 days available through Graylog. You also got my gears going, because I hadn't even considered using a RAM disk.

This is my little setup:
- HP G6, dual quad-core 2.5GHz, 48GB of RAM
- OS on 15k spinners in a 300GB RAID1
- Data on SSDs in a RAID5 (6x200GB); only SSDs on the actual ES data nodes
- Everything installed from RPM except Graylog, which I install from the tarball to /opt as well

I have made some tuning changes that helped throughput and seem to be fairly stable. I'll be honest, I don't really know what I'm doing; I just try stuff and see if it makes things better or worse ;)

Regards,
Brandon

On Friday, December 9, 2016 at 10:33:33 AM UTC-6, Jason Close wrote:
>
> I must also say that I don't like this setup.
>
> I think that 3-5 full-iron nodes with 64GB of RAM and a TB or three of
> on-board storage would be plenty for what we need.
>
> Even better would be 2-3 boxes with 128GB of RAM and almost no storage.
> Those boxes would just write the indices to RAM disk. Then just 2-3 more
> boxes on the back end with lots of storage and 64GB of RAM for aged indices.
>
> 15-20 boxes should not be necessary for this. But when you are running in
> VMs and trying to use NFS in some capacity, performance takes a
> nose-dive. It has made things much more complicated. I can say that our
> other datacenter was handling 60k eps into ES+Kibana with Logstash on 8
> boxes, each with 64GB of RAM, and we were keeping up just fine.
>
> Our holdup has always been Logstash and preprocessing for ingestion.
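For anyone else whose gears start turning at the RAM-disk idea, a minimal sketch of what that could look like on Linux with tmpfs (mount point, size, and user are illustrative; note that anything on tmpfs vanishes on reboot or power loss, so you would want replica shards on disk-backed nodes or a very fast age-out policy before trying this anywhere real):

```
# Carve out a 24GB tmpfs mount for hot index storage (size is illustrative)
mkdir -p /mnt/es-hot
mount -t tmpfs -o size=24g,uid=elasticsearch,gid=elasticsearch tmpfs /mnt/es-hot

# Then point Elasticsearch at it in elasticsearch.yml:
#   path.data: /mnt/es-hot
```

The trade-off is the same one the thread keeps circling: RAM-speed writes on the hot tier, durability delegated to the warm/cold tiers.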
>
> Jas
>
> On Friday, December 9, 2016 at 10:19:36 AM UTC-6, Jason Close wrote:
>>
>> I can tell you that we are trying to use the following setup:
>> - 2 'hot' nodes with 32GB RAM, 16 cores, and 3TB onboard storage in RAID 1+0
>> - 10 'warm' nodes with 16GB RAM and only around 50GB of onboard storage (VMs)
>> - 5 'cold' nodes with 16GB RAM and nearly limitless NFS-attached storage (VMs)
>> - 1 nginx software-based load balancer (a VM like the others)
>>
>> All are running RHEL7.
>>
>> The goal is to have the 'hot' nodes face the brunt of the ingestion. We
>> are looking at around 20k eps. The warm nodes take the remainder.
>>
>> Hot and warm nodes store indices on their internal disks for up to 2
>> hours, after which they age over to the cold nodes and move onto NFS
>> storage.
>>
>> All of the nodes are configured nearly the same; the only difference is
>> that the 2 'hot' boxes have more RAM assigned to ES and Graylog.
>>
>> As far as mods go, I'm still tinkering. I had to turn off firewalld, but
>> that's fine because it's on an internal network. I originally had my
>> syslog server sending sources directly to the Graylog cluster, but the
>> VMs were getting overwhelmed, so I added the nginx load balancer to
>> spread the UDP traffic, and that helped. It just round-robins the
>> traffic.
>>
>> Other than that, it's pretty self-explanatory. I run everything out of
>> the tarballs (no RPMs) so that I can upgrade easily (RHEL can take a
>> while to publish new supported RPMs). I install everything in /opt and
>> give each service its own log destination in /var/log.
>>
>> Hope this helps some. Ask me more if you are curious about anything.
>>
>> Jas
>>
>> On Friday, December 9, 2016 at 8:49:08 AM UTC-6, BKeep wrote:
>>>
>>> As an interested nerd, would either or both of you be willing to share
>>> some details about your environments and hardware setups?
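For reference, the kind of round-robin UDP balancing Jason describes is done with nginx's stream module, which gained UDP support in nginx 1.9.13. A rough sketch (IPs and ports are placeholders, not Jason's actual config):

```
# /etc/nginx/nginx.conf -- stream {} sits at the top level, beside http {}
stream {
    upstream graylog_syslog {
        # Graylog nodes listening on a syslog UDP input (placeholder addresses)
        server 10.0.0.11:5514;
        server 10.0.0.12:5514;
        server 10.0.0.13:5514;
    }

    server {
        listen 514 udp;
        proxy_pass graylog_syslog;
        # Syslog senders expect no reply, so don't hold sessions open waiting
        proxy_responses 0;
    }
}
```

The default upstream behavior is round-robin, which matches what Jason reports; `proxy_responses 0` matters for fire-and-forget protocols like syslog over UDP.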
I'm always
>>> curious about what other users of Graylog are doing. A couple of things
>>> that would interest me: your hardware specs, the underlying OS, any
>>> OS-level modifications you are making, how many ES/GL nodes, how many
>>> log clients, msg/s, etc.
>>>
>>> Regards,
>>> Brandon
>>>
>>> On Saturday, December 3, 2016 at 9:13:51 AM UTC-6, Dustin Tennill wrote:
>>>>
>>>> All,
>>>>
>>>> We just finished implementing
>>>> https://www.elastic.co/blog/hot-warm-architecture for our
>>>> Graylog environment. After weeks of troubleshooting Elasticsearch
>>>> performance issues with our budget ES nodes, the addition of two small
>>>> SSD nodes REALLY made a difference. Our output buffers had been filling
>>>> up from time to time, and this appears to have resolved that issue.
>>>>
>>>> If anyone is interested, we will post our config information.
>>>>
>>>> Dustin Tennill
>>>> EKU
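For anyone following the hot-warm link above, the mechanism boils down to tagging nodes with a custom attribute and steering indices between them with shard allocation filtering. A rough sketch (the attribute and index names are placeholders, following the ES 2.x-era syntax that blog post targets):

```
# elasticsearch.yml on an SSD-backed node:
#   node.box_type: hot
# elasticsearch.yml on a spinning-disk node:
#   node.box_type: warm

# New indices are pinned to hot nodes (via an index template or directly):
PUT /graylog_123/_settings
{ "index.routing.allocation.require.box_type": "hot" }

# When an index ages out, retag it and ES relocates the shards to warm nodes:
PUT /graylog_123/_settings
{ "index.routing.allocation.require.box_type": "warm" }
```

The retagging step is what a cron job or curator-style tool automates; Graylog itself just keeps writing to its current deflector index.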
