My current setup is essentially a small modification of the EcsLayout event template in JsonTemplateLayout (to ignore certain MDC values), feeding a rolling file appender, with a sidecar container that forwards the log files to Splunk. The event log I'm considering here does not directly involve finances (you could stretch things by saying that using compute and storage resources is similar to spending money, but that's not what we're tracking). Another idea I've considered is dumping the JSON files to S3 or similar and then using standard data lake tooling for querying, though that just feels like Splunk with extra steps.
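For concreteness, a minimal sketch of that kind of setup, assuming Log4j 2.14+ with the log4j-layout-template-json module on the classpath. The file paths and the template name `EventLogLayout.json` are placeholders, not what I actually use:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
  <Appenders>
    <!-- Rolling file appender writing ECS-shaped JSON, one event per line,
         for the Splunk-forwarding sidecar to tail. -->
    <RollingFile name="EventLog"
                 fileName="/var/log/app/events.json"
                 filePattern="/var/log/app/events-%d{yyyy-MM-dd}-%i.json">
      <!-- EventLogLayout.json would be a copy of the built-in EcsLayout.json
           with its "mdc" resolver adjusted to drop the unwanted keys,
           e.g. via the resolver's "pattern" option. -->
      <JsonTemplateLayout eventTemplateUri="classpath:EventLogLayout.json"/>
      <Policies>
        <TimeBasedTriggeringPolicy/>
        <SizeBasedTriggeringPolicy size="100 MB"/>
      </Policies>
    </RollingFile>
  </Appenders>
  <Loggers>
    <Logger name="events" level="INFO" additivity="false">
      <AppenderRef ref="EventLog"/>
    </Logger>
    <Root level="WARN"/>
  </Loggers>
</Configuration>
```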
How effective have you found Redis or other non-durable stores as a buffer between the source and sink of logs?

On Wed, Aug 4, 2021 at 11:34 AM Dominik Psenner <[email protected]> wrote:
>
> Hi
>
> I like JSON as the transport format over MQTT. A few hundred lines of code
> persist events to a PostgreSQL database by calling a stored procedure that
> handles the JSON directly. The stored procedure allows for plugging in
> fancy aggregation logic, distribution across tables, or even retention
> during persistence, applied gradually with very little impact. This has
> the benefit of avoiding hours of downtime to delete millions of records,
> which would otherwise require a rebuild of the table. The stored procedure
> can even be modified at runtime. And when the requirements become too
> tough for the database to handle, the MQTT pub/sub system allows plugging
> in additional processors like aggregators, alerters, ...
>
> Warm regards,
> Dominik
> --
> Sent from my phone. Typos are a kind gift to anyone who happens to find
> them.
>
> On Wed, Aug 4, 2021, 10:04 Volkan Yazıcı <[email protected]> wrote:
> >
> > We have our Redis-shielded ELK stack – Redis acts as a buffer for
> > fluctuating Elasticsearch ingestion performance and downtime. Apps use
> > log4j2-redis-appender <https://github.com/vy/log4j2-redis-appender>,
> > not the official Log4j one. That said, I am lobbying to replace Redis
> > with Google Cloud Pub/Sub – one less managed component to worry about.
> > (Yes, my Google Cloud Pub/Sub appender PR is on the horizon!)
> >
> > I have heavily used Elasticsearch for both "search" (as in "Google
> > search") and log sink purposes, professionally, for 5+ years. IMHO, in
> > both use cases, it is the best tool in the F/OSS market. I did not
> > understand your remark that Elasticsearch is geared toward "relatively
> > short time period" retention. I have seen deployments spanning
> > thousands of nodes with a couple of years of retention. It just works.
> >
> > If you are in the cloud, there are pretty good log sink solutions too,
> > e.g., Google Cloud Logging. All your worries about retention and
> > maintenance will be perfectly addressed there, granted you are willing
> > to pay for it.
> >
> > If you need logging for auditing purposes, e.g., "mark this money
> > transfer as completed", you are doing it wrong, I think. Most of the
> > time, the whole shebang that happens after your log() statement
> > executes asynchronously at many layers, hence failures don't propagate
> > back. For one, all Log4j threads are daemon threads and will be killed
> > upon a JVM exit without flushing their buffers. This is admittedly a
> > controversial subject, and one can possibly engineer a reliable
> > logging infrastructure, yet, again, I think this is the wrong tool for
> > the job.
> >
> > Regarding DBMS log sinks, e.g., PostgreSQL, MySQL, MongoDB, Cassandra:
> > they are good for persistence, scrolling through records, etc., but
> > not for aggregation queries. I see two main issues they fall short of
> > addressing, in my experience: 1) Many users reach for queries combined
> > with aggregations (e.g., show me a histogram of mdc.httpStatusCode in
> > the last month for this long query of mine), and RDBMSes are
> > tremendously slow compared to Elasticsearch/Lucene for such queries.
> > One can argue that this is abusing logging for metrics. Yet, there it
> > is. 2) Certain RDBMSes are darn difficult to scale horizontally,
> > unless it is provided out-of-the-box.
> >
> > On Tue, Aug 3, 2021 at 6:50 PM Matt Sicker <[email protected]> wrote:
> > >
> > > Hey all, I have a somewhat practical question related to logging
> > > here. For those of you maintaining a structured event log or audit
> > > log of some sort, what types of event log stores are you using to
> > > append them to? I feel like solutions like Splunk, ELK, etc., are
> > > geared toward diagnostic logs, which don't necessarily need
> > > retention beyond a relatively short time period. On the other hand,
> > > one of the more natural append-only storage solutions I can think of
> > > is Kafka, though that, too, isn't really geared toward long-term
> > > storage (even if I could theoretically fit the entire audit log on
> > > one machine). I've been considering Cassandra here for durability
> > > and append speed, but even that seems like overkill, since I don't
> > > want or need to be able to update a log event after it has been
> > > stored. I've also considered having Kafka as a layer in between, but
> > > that just feels like overengineering, as I don't expect event logs
> > > to populate nearly as fast as, say, the wind turbine sensor data
> > > where I last used that architectural pattern.
> > >
> > > I'm curious if anyone has experience building their own event log
> > > storage service or using an existing one, along with any advice.
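(As an aside: the daemon-thread caveat Volkan raises above is easy to demonstrate even without Log4j on the classpath. In this self-contained sketch, the sleep stands in for an async appender's buffered write; the class name and timings are made up for illustration.)

```java
// Demonstrates why audit writes must not depend on daemon threads:
// the JVM exits as soon as main() returns, discarding any work a
// daemon thread has buffered but not yet flushed.
public class DaemonFlushDemo {
    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(200); // stand-in for a buffered, delayed write
                sink.append("event persisted");
            } catch (InterruptedException ignored) {
                // a dying JVM may not even interrupt daemon threads
            }
        });
        writer.setDaemon(true);
        writer.start();
        // main() returns immediately; only daemon threads remain, so the
        // JVM exits and the pending append above never happens.
        System.out.println("sink at exit: '" + sink + "'");
        // prints: sink at exit: ''
    }
}
```

This is exactly why a reliable audit trail wants an explicit, synchronous flush (or a durable broker acknowledging the write) rather than fire-and-forget logging.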
