Hi there

I currently have the default "rotation_strategy = count" (20M documents per 
index, 20 indices) with an incoming syslog feed. So that roughly means that 
when the 400,000,001st syslog record enters the system, the first index is 
deleted (as the 21st index is created).
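Spelled out, the retention arithmetic for those defaults (just a sketch of the numbers above, not anything Graylog-specific):

```python
# Retention under rotation_strategy = count, with the default values
# mentioned above (20M docs per index, 20 indices kept).
max_docs_per_index = 20_000_000
max_indices = 20

# Total documents retained before the oldest index is dropped.
total_retained = max_docs_per_index * max_indices
print(total_retained)  # 400000000 -- record 400,000,001 triggers the rotation
```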

I want to move to the "time" strategy: basically I want to keep 30 days of 
logs around. So I could do "rotation_strategy = time" and 
"elasticsearch_max_time_per_index = 1d" and 
"elasticsearch_max_number_of_indices = 30"
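In other words, something like this in the server config (setting names as in my current config file; I haven't tested this combination yet):

```
rotation_strategy = time
elasticsearch_max_time_per_index = 1d
elasticsearch_max_number_of_indices = 30
```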

All well and good. However, how does performance vary with index size? As 
the default is 20M records per index, that implies to me it was chosen by 
the developers for good reason - so should I try to approximately match 
that with "time" too? eg should I let it run for a few days, get a feel for 
the growth rate (GB or documents per hour), and then choose an 
"elasticsearch_max_time_per_index" value that would create ~20M-record 
indices, then set "elasticsearch_max_number_of_indices" so the total 
multiplies out to 30 days? Or should I instead increase Elasticsearch's 
sharding: eg if the indices are 10x the size of the "count" model, should I 
have 10x more shards per index to keep about the same performance?
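To make the first option concrete, here's the sizing arithmetic I have in mind (a sketch only; the 2.5M docs/hour ingest rate is a made-up example, not my measured rate):

```python
import math

# Target the default index size and a 30-day retention window.
DOCS_PER_INDEX_TARGET = 20_000_000
RETENTION_DAYS = 30

def rotation_period_hours(docs_per_hour):
    """Hours per index so that each index holds ~20M documents."""
    return DOCS_PER_INDEX_TARGET / docs_per_hour

def indices_to_keep(period_hours):
    """Indices needed to cover the full 30-day retention window."""
    return math.ceil(RETENTION_DAYS * 24 / period_hours)

# Example: at 2.5M docs/hour, each index fills in 8 hours,
# so covering 30 days takes 90 indices.
hours = rotation_period_hours(2_500_000)
print(hours)                  # 8.0
print(indices_to_keep(hours)) # 90
```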

eg I just did a search over the past 24 hours and it had to go through 8 
indices - would it have performed as well if there were only one (bigger) 
index?

Sorry these questions are so dumb - hopefully I'm learning fast :-)

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"Graylog Users" group.
