Thanks for the update Tyler!

On Wed, May 7, 2014 at 12:04 AM, Tyler Bell <[email protected]> wrote:
> I think I just found the issue. I thought we had a box big enough to run the
> Graylog2 server, plus Web Interface, but we had a bunch of Streams enabled
> recently. We disabled them to see what would happen and we came back to full
> processing capacity (~1750 msg/s). I'm suggesting that we get a separate box
> for the web interface now.
>
>
> On Tuesday, May 6, 2014 12:53:44 PM UTC-6, Tyler Bell wrote:
>>
>> There are no ES errors. Cluster Health is Green. I see data being added to
>> my /data partition. Is there a way to see what else ES could be doing that
>> would force Graylog to only process 1/3 of the logs it was processing a week
>> ago?
>>
>> {
>>   "cluster_name" : "XXXXXXXXX",
>>   "status" : "green",
>>   "timed_out" : false,
>>   "number_of_nodes" : 3,
>>   "number_of_data_nodes" : 2,
>>   "active_primary_shards" : 320,
>>   "active_shards" : 320,
>>   "relocating_shards" : 0,
>>   "initializing_shards" : 0,
>>   "unassigned_shards" : 0
>> }
>>
>>
>> On Tuesday, May 6, 2014 12:29:53 PM UTC-6, lennart wrote:
>>>
>>> Can you check your ElasticSearch logs for errors? I am pretty sure it
>>> is the reason.
>>>
>>> On Tue, May 6, 2014 at 5:57 PM, Tyler Bell <[email protected]>
>>> wrote:
>>> > I'm having an issue with Graylog continuously falling behind with log
>>> > processing, and the MasterCache filling up til the 10G of Heap Space
>>> > maxes out and crashes. The really weird thing is that a week ago,
>>> > everything was processing fine and I was taking between 1500-2000
>>> > msg/s. Now I barely get over 500-750 msg/s. I don't think
>>> > ElasticSearch is the issue because none of the OutputCache or Buffer
>>> > is increasing.
>>> >
>>> > I'm wondering if it has something to do with this: Number of indices
>>> > (80) higher than limit (20). Running retention for 60 indices. It
>>> > doesn't look like Graylog is properly rotating indexes and running
>>> > this retention instead.
>>> >
>>> > After restarting graylog2 and emptying cache...
>>> > [util][caches][2014-05-06T08:46:04.850-07:00] InputCache size: 5758
>>> > [util][caches][2014-05-06T08:46:04.850-07:00] OutputCache size: 0
>>> > [util][buffers][2014-05-06T08:46:04.850-07:00] OutputBuffer is at 0.0%. [0/2048]
>>> > [util][buffers][2014-05-06T08:46:04.850-07:00] ProcessBuffer is at 33.251953%. [681/2048]
>>> > [util][heap][2014-05-06T08:46:04.850-07:00] Used memory (MB): 1465
>>> > [util][heap][2014-05-06T08:46:04.850-07:00] Free memory (MB): 8330
>>> > [util][heap][2014-05-06T08:46:04.850-07:00] Total memory (MB): 9814
>>> > [util][heap][2014-05-06T08:46:04.850-07:00] Max memory (MB): 9814
>>> > [util][written][2014-05-06T08:46:04.850-07:00] Messages written to all outputs: 1561
>>> >
>>> >
>>> > After MasterCache fills up a bit
>>> > [util][caches][2014-05-06T08:42:18.109-07:00] InputCache size: 2487587
>>> > [util][caches][2014-05-06T08:42:18.109-07:00] OutputCache size: 0
>>> > [util][buffers][2014-05-06T08:42:18.109-07:00] OutputBuffer is at 0.0%. [0/2048]
>>> > [util][buffers][2014-05-06T08:42:18.109-07:00] ProcessBuffer is at 40.429688%. [828/2048]
>>> > [util][heap][2014-05-06T08:42:18.109-07:00] Used memory (MB): 6392
>>> > [util][heap][2014-05-06T08:42:18.109-07:00] Free memory (MB): 3736
>>> > [util][heap][2014-05-06T08:42:18.109-07:00] Total memory (MB): 10129
>>> > [util][heap][2014-05-06T08:42:18.109-07:00] Max memory (MB): 10129
>>> > [util][written][2014-05-06T08:42:18.109-07:00] Messages written to all outputs: 3100
>>> >
>>> >
>>> > ES Node config: (GLNode0 is the Graylog server). I know mlockall is
>>> > false, and is configured to be true, but these are virtualized
>>> > servers and there are some issues there.
>>> >
>>> > {
>>> >   "ok" : true,
>>> >   "cluster_name" : "Graylog2",
>>> >   "nodes" : {
>>> >     "X.X.X.X" : {
>>> >       "name" : "GLNode1",
>>> >       "transport_address" : "inet[/X.X.X.X:9300]",
>>> >       "hostname" : "X.X.X.X",
>>> >       "version" : "0.90.10",
>>> >       "http_address" : "inet[/X.X.X.X:9200]",
>>> >       "attributes" : {
>>> >         "master" : "true"
>>> >       },
>>> >       "process" : {
>>> >         "refresh_interval" : 1000,
>>> >         "id" : 1611,
>>> >         "max_file_descriptors" : 32000,
>>> >         "mlockall" : false
>>> >       }
>>> >     },
>>> >     "X.X.X.X" : {
>>> >       "name" : "GLNode0",
>>> >       "transport_address" : "inet[/X.X.X.X:9350]",
>>> >       "hostname" : "X.X.X.X",
>>> >       "version" : "0.90.10",
>>> >       "attributes" : {
>>> >         "client" : "true",
>>> >         "data" : "false",
>>> >         "master" : "false"
>>> >       },
>>> >       "process" : {
>>> >         "refresh_interval" : 1000,
>>> >         "id" : 28382,
>>> >         "max_file_descriptors" : 4096,
>>> >         "mlockall" : false
>>> >       }
>>> >     },
>>> >     "X.X.X.X" : {
>>> >       "name" : "GLNode2",
>>> >       "transport_address" : "inet[/X.X.X.X:9300]",
>>> >       "hostname" : "X.X.X.X",
>>> >       "version" : "0.90.10",
>>> >       "http_address" : "inet[/X.X.X.X:9200]",
>>> >       "attributes" : {
>>> >         "master" : "false"
>>> >       },
>>> >       "process" : {
>>> >         "refresh_interval" : 1000,
>>> >         "id" : 4508,
>>> >         "max_file_descriptors" : 32000,
>>> >         "mlockall" : false
>>> >       }
>>> >     }
>>> >   }
>>> > }
>>> >
>>> > --
>>> > You received this message because you are subscribed to the Google
>>> > Groups "graylog2" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send
>>> > an email to [email protected].
>>> > For more options, visit https://groups.google.com/d/optout.
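
The reasoning used in this thread (a growing InputCache while OutputCache and OutputBuffer stay empty points away from ElasticSearch and toward message processing) can be checked mechanically against the `[util]` status lines. What follows is only a minimal sketch in Python: the function names, the 100,000-message backlog threshold and the 50% buffer threshold are assumptions made for illustration, not anything Graylog2 itself provides.

```python
import re

# Matches status lines such as
# "[util][caches][2014-05-06T08:42:18.109-07:00] InputCache size: 2487587"
LINE_RE = re.compile(r"\[util\]\[(\w+)\]\[([^\]]+)\]\s+(.*)")
CACHE_RE = re.compile(r"(InputCache|OutputCache) size: (\d+)")
BUFFER_RE = re.compile(r"(\w+Buffer) is at [\d.]+%\. \[(\d+)/(\d+)\]")


def parse_snapshot(lines):
    """Collect cache sizes and buffer fill ratios from [util] status lines."""
    snap = {}
    for line in lines:
        m = LINE_RE.search(line)  # search() tolerates leading "> " quote marks
        if not m:
            continue
        text = m.group(3)
        cache = CACHE_RE.match(text)
        if cache:
            snap[cache.group(1)] = int(cache.group(2))
            continue
        buf = BUFFER_RE.match(text)
        if buf:
            # Store the fill level as a ratio between 0.0 and 1.0.
            snap[buf.group(1)] = int(buf.group(2)) / int(buf.group(3))
    return snap


def likely_bottleneck(snap, backlog_threshold=100_000):
    """Heuristic only; both thresholds are assumptions, not Graylog defaults."""
    output_backlog = (
        snap.get("OutputCache", 0) > 0 or snap.get("OutputBuffer", 0.0) > 0.5
    )
    if output_backlog:
        return "output"      # messages are waiting on the output side
    if snap.get("InputCache", 0) > backlog_threshold:
        return "processing"  # input piles up while the output side is idle
    return "none"


if __name__ == "__main__":
    sample = [
        "[util][caches][2014-05-06T08:42:18.109-07:00] InputCache size: 2487587",
        "[util][caches][2014-05-06T08:42:18.109-07:00] OutputCache size: 0",
        "[util][buffers][2014-05-06T08:42:18.109-07:00] OutputBuffer is at 0.0%. [0/2048]",
        "[util][buffers][2014-05-06T08:42:18.109-07:00] ProcessBuffer is at 40.429688%. [828/2048]",
    ]
    print(likely_bottleneck(parse_snapshot(sample)))
```

Run on the "After MasterCache fills up a bit" snapshot, the heuristic reports the processing side, which is consistent with the throughput recovering once the recently enabled streams were disabled.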
