On Fri, 29 Oct 2010, Brian Rogoff wrote:
Hi, I'm considering using rsyslog to replace the custom logging system of a distributed program, and I had some questions about how to do this, and whether my tentative design is reasonable. Assume all machines are running a recent rsyslog (>= 5.6) on Linux. We use logging for system monitoring and performance, not for security. I plan to use the rsyslog time stamps to calculate the time through our system.

The program runs on several clusters, and consists of numerous services/processes. For concreteness, let's just look at one cluster, and give each machine on it a name like cluster_01, cluster_02, etc. I'd like to designate two machines in each cluster as being the log servers; two to provide some redundancy for failures. Let cluster_01 and cluster_02 be the logging servers.

There are about 20 machines per cluster, so two logging servers will collect for about 18 machines, each of which is running about 6 (six) services that we log events from. The services get 50 - 100 requests/second, all have 1Gbps Ethernet links into the same switch and disks that are fast enough to easily deal with the write bandwidth needed.

I'd like the log messages from each service on each machine to go to a log file, something like this:

cluster_01:/var/log/cluster_03
                  --cluster_04
                  . . .
                  --cluster_nn

cluster_01:/var/log/cluster_03/service1/log
                             --service5/log
                             --service7/log

First question, does this make sense? Both as an explanation of what I'm trying to do, and as a reasonable logging architecture?
sure. I might end up doing it somewhat differently (but I don't know all the details), but there's nothing fundamentally wrong with what you are doing.
what is the purpose of having everything broken out into individual files like this? I see a lot of people who start off trying to do something like this and then try to run reports across everything (where it would be much simpler if everything was just in one file)
throughput-wise, there is nothing you have said here that should strain rsyslog, even on pretty modest hardware.
Second question. What should my rsyslog.conf look like on each machine? On the logging servers, I'd like all messages not from the server machine to be stored in the log file with path determined from hostname, service name, and pid, say. I may need more info later to assure that I can trace the path of messages more easily, but this should be sufficient for starters. I'd prefer that the conf files do not hardcode the names of the other machines on themselves. On the cluster machines running the services, I'd like the conf files to all be exactly the same, so they may refer to the logging servers by name but not themselves. It has been very slow going for me trying to figure out the syntax to do all of this. I looked at the example conf files, and I was able to use expression based filters to get some of the way there but I think everything I described should be doable with rsyslog.
everything you are trying to do is possible, but getting it fully set up is quite a bit of work. This request probably isn't intended to sound like 'do my homework for me', but it's sounding pretty close to that. Adiscon does offer a service to work with you to do exactly this, but the level of detail you are asking for seems like it exceeds normal mailing list support (for the record, I do not work for Adiscon).
I think that you should look at dynafiles for hints on how to do what it sounds like you want to do.
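As a starting point for the dynafile approach, here is a minimal, untested sketch in rsyslog's legacy (v5) configuration syntax. The template name PerHostServiceFile, the choice of TCP port 514, and the assumption that each service logs with a distinct syslog tag (so that %programname% identifies the service) are illustrative and not taken from this thread; treat it as a sketch to adapt, not a finished configuration.

```
# ---- on the log servers (cluster_01 and cluster_02) ----
# Accept syslog over TCP (port 514 is an assumption).
$ModLoad imtcp
$InputTCPServerRun 514

# Hypothetical template: builds the file path from the sending host
# and the program name taken from the syslog tag.
$template PerHostServiceFile,"/var/log/%HOSTNAME%/%programname%/log"

# Messages that did not originate locally go to the dynamic file
# ("?" marks a dynafile action), then are discarded so they do not
# also land in the server's own local log files.
:fromhost-ip, !isequal, "127.0.0.1" ?PerHostServiceFile
& ~

# ---- on every other machine in the cluster (identical file) ----
# Forward everything to both log servers over TCP ("@@").
*.* @@cluster_01:514
*.* @@cluster_02:514
```

Because the path is derived from message properties, the server configuration does not need to name any client machine, and the client file names only the two log servers.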
Third, are there any subtle issues I should be thinking about here? For example, since I'd like to use the log messages to calculate the performance of the services, do I need to introduce some extra time stamps in the message flow or are the rsyslog generated timestamps enough?
you really haven't provided enough information for this question. when you say you want to measure the performance of a service, what does that mean? what are you measuring?
rsyslog will log the time that it received the message. If you need to measure something like hits per second, this may be good enough. But if you needed to measure how long it took to service each individual request, the rsyslog timestamp is almost worthless.
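To see what each timestamp actually represents, one option is to write both of them out side by side. The sketch below is untested and assumes legacy (v5) syntax; the template name TimingFormat and the output path /var/log/timing.log are hypothetical. It relies on two rsyslog properties: timegenerated (when rsyslog received the message) and timereported (the timestamp carried in the message itself), both formatted with the date-rfc3339 option for sub-second precision.

```
# Hypothetical template: receive time first, then the sender's
# reported time, then the usual host, tag and message text.
$template TimingFormat,"%timegenerated:::date-rfc3339% %timereported:::date-rfc3339% %HOSTNAME% %syslogtag%%msg%\n"

# Write everything with that template (the path is an assumption).
*.* /var/log/timing.log;TimingFormat
```

Neither value records when a request started or finished inside a service, so per-request durations would most likely need timestamps that the service writes into the message text itself.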
David Lang

