[graylog2] Re: Random exceptions on large datasets / lost messages

Jochen Schalanda Fri, 04 Sep 2015 07:50:52 -0700

Hi Marcel,

could you please post those exceptions in full?


Cheers,
Jochen

On Friday, 4 September 2015 15:43:56 UTC+2, Marcel Manz wrote:
>
> Hi all
>
> We have a setup of 2 Graylog servers (1.1.6), both of which are running ES 
> 1.7.1 in a redundant setup behind a load balancer.
>
> When we search over a longer period of time (e.g. a one-month search, 
> which involves approximately 300 million messages), we have several times 
> hit an exception in the web interface which, in the worst case, caused 
> either the Graylog server process or Elasticsearch to fail and required 
> restarting those services.
>
> Yesterday such an exception occurred during a search, after which Graylog 
> could no longer write to ES and started filling up its internal journal. 
> After we restarted ES and ES recovered the indexes, the Graylog journal got 
> flushed to ES. Unfortunately, when we now search and look at the histogram, 
> we don't see any messages for the short period during which the outage 
> happened.
>
> We already tried recalculating the index ranges (completed successfully), 
> but the messages still don't show up. As we could clearly see that messages 
> got queued in GL's journal (> 100K messages during the few-minute window) 
> and then flushed to ES, we believe that the messages actually got stored in 
> ES, but somehow GL is unable to see them.
>
> How can we investigate this? It concerns us that messages could be lost 
> even though GL's journal was in use at the time of the error.
>
> Thanks
>
> Best regards,
> Marcel
>

