Hi there

I currently have the default "rotation_strategy = count" (20M documents per 
index, 20 indices) with an incoming syslog feed. So that roughly means that 
when the 400,000,001st syslog record enters the system, the first index is 
deleted (as the 21st index is created).
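Spelled out, the retention arithmetic for those defaults (just a sketch of the numbers above, not anything Graylog-specific):

```python
# Retention under rotation_strategy = count, with the default values
# mentioned above (20M docs per index, 20 indices kept).
max_docs_per_index = 20_000_000
max_indices = 20

# Total documents retained before the oldest index is dropped.
total_retained = max_docs_per_index * max_indices
print(total_retained)  # 400000000 -- record 400,000,001 triggers the rotation
```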

I want to move to the "time" strategy: basically I want to keep 30 days of 
logs around. So I could do "rotation_strategy = time" and 
"elasticsearch_max_time_per_index = 1d" and 
"elasticsearch_max_number_of_indices = 30"
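In other words, something like this in the server config (setting names as in my current config file; I haven't tested this combination yet):

```
rotation_strategy = time
elasticsearch_max_time_per_index = 1d
elasticsearch_max_number_of_indices = 30
```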

All well and good. However, how does performance vary with index size? As 
the default is 20M records per index, that implies to me it was chosen by 
the developers for good reason - so should I try to approximately match 
that with "time" too? eg should I let it run for a few days, get a feel for 
the growth rate (GB or documents per hour), and then choose an 
"elasticsearch_max_time_per_index" value that would create ~20M-record 
indices, then set "elasticsearch_max_number_of_indices" so the total 
multiplies out to 30 days? Or should I instead increase Elasticsearch's 
sharding: eg if the indices are 10x the size of the "count" model, should I 
have 10x more shards per index to keep about the same performance?
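To make the first option concrete, here's the sizing arithmetic I have in mind (a sketch only; the 2.5M docs/hour ingest rate is a made-up example, not my measured rate):

```python
import math

# Target the default index size and a 30-day retention window.
DOCS_PER_INDEX_TARGET = 20_000_000
RETENTION_DAYS = 30

def rotation_period_hours(docs_per_hour):
    """Hours per index so that each index holds ~20M documents."""
    return DOCS_PER_INDEX_TARGET / docs_per_hour

def indices_to_keep(period_hours):
    """Indices needed to cover the full 30-day retention window."""
    return math.ceil(RETENTION_DAYS * 24 / period_hours)

# Example: at 2.5M docs/hour, each index fills in 8 hours,
# so covering 30 days takes 90 indices.
hours = rotation_period_hours(2_500_000)
print(hours)                  # 8.0
print(indices_to_keep(hours)) # 90
```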

eg I just did a search over the past 24 hours and it had to go through 8 
indices - would it have performed as well if there were only one (bigger) 
index?

Sorry these questions are so dumb - hopefully I'm learning fast :-)

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"Graylog Users" group.
