We are running the classic ELK stack, but doing that isn't straightforward. Log forwarders seem to assume that log events never contain newlines, yet Java logs with stack traces usually do. You really need to log in a structured format like JSON to get searchable attributes into Elastic. GELF works great, except that a) log forwarders generally want to parse the data and make it structured, which doesn't work well when the data is already structured; b) the GELF spec both defines a format and ties itself to UDP/TCP; and c) getting FileBeat or FluentBit to consume GELF from a file doesn't really work, since they won't allow the event to be terminated by a null value.
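
To illustrate the newline problem, here is a minimal sketch in Java of writing one event per line as JSON, so that the newlines in a stack trace are escaped inside a string value and a forwarder reading a file line by line sees exactly one event per line. The class `JsonLineEncoder` and the field names (`timestamp`, `level`, `message`, `stack_trace`) are hypothetical, chosen only for illustration; they are not the GELF field names and not a Log4j API, and in practice a logging framework's JSON layout would do this work.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.time.Instant;

// Hypothetical helper (not part of Log4j): encodes one log event as one line of JSON.
final class JsonLineEncoder {

    // Escape a string for use inside a JSON string value.
    // Newlines, carriage returns and tabs become two-character escapes,
    // so the encoded event never contains a raw line break.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the six-character hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Render a Throwable's stack trace as the usual multi-line text.
    static String stackTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    // Build a single-line JSON object; field names are illustrative assumptions.
    static String encode(Instant ts, String level, String message, Throwable thrown) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"timestamp\":\"").append(ts).append('"');
        sb.append(",\"level\":\"").append(escape(level)).append('"');
        sb.append(",\"message\":\"").append(escape(message)).append('"');
        if (thrown != null) {
            sb.append(",\"stack_trace\":\"").append(escape(stackTrace(thrown))).append('"');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        String line = encode(Instant.now(), "ERROR", "Order failed",
                new IllegalStateException("boom"));
        // One event, one line: safe to append to a rolling file with a trailing newline.
        System.out.println(line);
    }
}
```

With events written this way, the forwarder only needs to split on newlines and decode JSON, rather than guess where a multi-line stack trace ends.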
We were logging to Logstash from the app using the TCP SocketAppender with the GELF Layout. This worked great until our Elastic ran out of disk space. That caused Logstash to hang which, in turn, caused all the applications to hang, because the socket write didn't time out. As a consequence we are going back to logging to rolling files and using a log forwarder to send the logs to Logstash. I am now in the process of determining what equivalent JSON I can use to get Logstash to do the right thing.

FWIW, I wouldn't consider Kafka to be an acceptable database of logging events. While it will certainly store them, I don't see how you could effectively query them. Cassandra would work, but my experience with it was that it was not simple to manage and required a fair bit of work to get the data in so that it could be queried effectively.

Ralph

> On Aug 3, 2021, at 9:50 AM, Matt Sicker <[email protected]> wrote:
>
> Hey all, I have a somewhat practical question related to logging here.
> For those of you maintaining a structured event log or audit log of
> some sort, what types of event log stores are you using to append them
> to? I feel like solutions like Splunk, ELK, etc., are geared toward
> diagnostic logs which don't necessarily need retention beyond a
> relatively short time period. On the other hand, one of the more
> natural append-only storage solutions I can think of is Kafka, though
> that, too, isn't really geared toward long term storage (even if I can
> theoretically fit the entire audit log on one machine). I've been
> considering potentially using Cassandra here for durability and append
> speed, but even that seems overkill since I don't want or need to be
> able to ever update a log event after it's been stored. I've also
> considered having Kafka as a layer in between, but that just feels
> like overengineering as I don't expect event logs to populate nearly
> as fast as, say, wind turbine sensor data where I last used that
> architectural pattern.
>
> I'm curious if anyone has experience with building their own event log
> storage service or using an existing one along with any advice.
>
