Hi John,

On Tue, Sep 09, 2014 at 02:48:46PM +0300, John Schwarz wrote:
> Hi guys,
> 
> Long story short: We have a need for a new feature for haproxy, which
> allows logging to normal files (as opposed to the current domain
> socket/UDP servers). This will of course require adding such an option
> to the configuration. I am willing to write the whole feature (with
> reviews from the community, of course). I understood from the README at
> your github that it might be best to ask here first: is there a specific
> reason why this wasn't implemented yet?

Yes, haproxy works in multiplexed-I/O mode around a single poll() loop.
You must absolutely not use any blocking I/O there, or you'll pause your
process for a long time. That's the first reason why haproxy does no file
system access once started. The second reason is that it's supposed to be
chrooted and to drop privileges for increased security. However, if the
file is opened before the fork (as is done for logs over unix sockets),
that is not an issue. Still, the performance issue remains. If you have,
say, 4ms access time to a file because the machine is running under moderate
I/O load, the service will become extremely slow as nothing will be done
during these I/O waits. One option could be to implement async I/O, but
that's not portable at all, so probably not really an option. Another option
would be to dedicate a process to logs, but that would basically mean that
we send logs over there using a unix socket and it dumps them to disk.
That's exactly what a syslog daemon does.

(...)
> The problem I'm trying to solve: If there are N haproxy processes in one
> machine (N >= 2), and say the setup isn't stable (so VMs constantly go
> up and down), many log messages will be written to syslog and it will
> become ungainly and complicated to differentiate which log message
> belongs to which haproxy.
> 
> I spent a few days recently to solve this problem by assigning, for each
> haproxy process, a separate domain socket, and having LBaaS-Agent read
> from said sockets. Each time a new log is read, the message is written
> to a specific (different) file. In other words, say I have N haproxy
> processes; these are connected to N domain sockets, and the logs are
> read by a single LBaaS-Agent and finally written to N log files.

I know people who use many haproxy processes (one per application customer,
with many applications running on a server), and they have a dedicated syslog
daemon, one per application user as well. In the end it provides really
good results, because each application comes with its haproxy and syslogd
configs in a very templatable way. The unix socket is always the same, at
a place relative to the config and logs, always with the same name, or
logging just runs over UDP with a port directly related to the application's
or the customer's ID. Also, the benefit of using a regular syslog daemon is
that you can continue to use all the syslog facilities such as log rotation,
filtering/alerting/buffering etc., depending on the syslog daemon.
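As a rough illustration of how such a templated pair could look (all paths and names below are hypothetical, and the rsyslog syntax is just one possible daemon's way of doing it):

```
# haproxy-app1.cfg -- per-application haproxy instance
global
    log /var/lib/app1/log.sock local0
    chroot /var/lib/app1

# rsyslog-app1.conf -- matching per-application syslog instance
module(load="imuxsock")
input(type="imuxsock" Socket="/var/lib/app1/log.sock")
local0.* /var/log/app1/haproxy.log
```

Generating both files from one template keyed on the application name is what makes the N-processes case manageable: each haproxy only ever knows about its own socket, and each log file maps back to exactly one process.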

> This is far from optimal (and difficult to maintain because of the long
> route each log file is now making, and because there is now a single
> point of failure). Adding the ability for each haproxy to dump its logs
> to a specific file will solve the long workaround we made in the code
> (which is currently being reviewed and can be found at [1]).

The method above provides all that without the difficult processing, and
ensures that people skilled with each component can still apply their skills
to each part of the log chain, which is another important point in general.

Don't you think it would solve your trouble?

Regards,
Willy

