I asked this at the Elastic{ON} Tour in Amsterdam.

They told me that the size of an index can be up to 50 GB (the sum of its 
shards).

Although I think there is one thing to note about Graylog and search speed: 
Graylog appears to know which index covers which time range, and uses that 
to speed up searches, I think.
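As a rough sanity check on that 50 GB guideline: if you measure your ingest rate, you can back out a time-per-index that stays under it. A minimal sketch — the 2.5 GB/hour rate is purely an assumed example, not a figure from this thread:

```python
import math

def rotation_settings(gb_per_hour, max_index_gb=50, retention_days=30):
    """Pick a time-per-index (in hours) that keeps each index under
    max_index_gb, and the number of indices needed to cover the
    retention window. gb_per_hour is a measured ingest rate; all
    numbers here are illustrative."""
    # Round down so an index stays under the size guideline.
    hours_per_index = max(1, math.floor(max_index_gb / gb_per_hour))
    # Round up so the retention window is fully covered.
    num_indices = math.ceil(retention_days * 24 / hours_per_index)
    return hours_per_index, num_indices

# E.g. at an assumed 2.5 GB/hour: 20 hours per index, 36 indices for 30 days.
print(rotation_settings(2.5))  # (20, 36)
```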



On Tuesday, 4 August 2015 15:18:34 UTC+2, Jochen Schalanda wrote:
>
> Hi Jason,
>
> the answer to your question depends on multiple factors, like the 
> structure of your log messages, their average size, the available hardware 
> resources for Graylog and Elasticsearch, and the kind of queries you've 
> been running.
>
> In short, modern hardware with decent amounts of memory should easily be 
> able to handle Elasticsearch indices with lots (i.e. millions) of indexed 
> documents. For your specific case, you should simply test whether switching to a 
> time-based rotation strategy helps your use cases or not.
>
>
> Cheers,
> Jochen
>
> On Tuesday, 4 August 2015 00:00:25 UTC+2, Jason Haar wrote:
>>
>> Hi there
>>
>> I currently have the default "rotation_strategy = count" with 20M 
>> documents per index and 20 indices, and have an incoming syslog feed. So 
>> that roughly means when the 400,000,001st syslog record enters the system, 
>> the first index is deleted (as the 21st index is created).
>>
>> I want to move to the "time" strategy: basically I want to keep 30 days 
>> of logs around. So I could do "rotation_strategy = time" and 
>> "elasticsearch_max_time_per_index = 1d" and 
>> "elasticsearch_max_number_of_indices = 30"
>>
>> All well and good. However, how does the performance vary with index 
>> size? As the default is 20M records, that implies to me that it was chosen by 
>> the developers for good reason - so should I try to approximately match 
>> that with "time" too? E.g. should I let it run for a few days, get a feel for 
>> the GB/hour growth rate, and then choose an "elasticsearch_max_time_per_index" 
>> value that would create ~20M-record indices, and then change 
>> "elasticsearch_max_number_of_indices" to multiply out to 30 days? Or should I 
>> instead increase Elasticsearch's sharding: e.g. if the indices are 10x the 
>> size of the "count" model, should I have 10x more shards per index to keep 
>> about the same performance?
>>
>> E.g. I just did a search over the past 24 hours and it had to go through 8 
>> indices - would it have performed as well if there had been only one (bigger) 
>> index?
>>
>> Sorry these questions are so dumb - hopefully I'm learning fast :-)
>>
>> Thanks
>>
>
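The plan in the quoted question — measure growth for a few days, then choose "elasticsearch_max_time_per_index" so each time-rotated index holds roughly the default 20M documents — is straightforward arithmetic. A sketch, where the 1,250,000 docs/hour rate is an assumed example, not a measured figure:

```python
import math

def time_settings(docs_per_hour, target_docs=20_000_000, retention_days=30):
    """Given a measured document ingest rate, choose a rotation period
    that yields roughly target_docs documents per index, and enough
    indices to cover the retention window. docs_per_hour is a
    hypothetical measurement."""
    hours_per_index = max(1, round(target_docs / docs_per_hour))
    # Round up so the retention window is fully covered.
    num_indices = math.ceil(retention_days * 24 / hours_per_index)
    return hours_per_index, num_indices

# E.g. at an assumed 1,250,000 docs/hour: rotate every 16 hours and
# keep 45 indices to retain 30 days of logs.
print(time_settings(1_250_000))  # (16, 45)
```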

-- 
You received this message because you are subscribed to the Google Groups 
"Graylog Users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/graylog2/4dc628da-e8a7-47bb-b3f7-3c4368c58a9f%40googlegroups.com.