Just a quick summary for those who were following this issue, the
real issue came from the fact that two distinct sockets were used
by the script, one to send the command an another one to retrieve
the stats. Since there is no guarantee on the order of delivery on
two parallel sockets, it explains why it worked sometimes only, and
why adding a delay made the process more reliable. Michele switched
to a single socket which fixed the issue. Another solution is to
wait for haproxy to ACK the command on the socket, but it's not
necessarily always worth it.

I understood the single socket was Michele's original design but it
was probably failing due to one bug recently fixed (tracking servers
exiting maintenance mode).

Hoping this will save troubleshooting time for other people.

Regards,
Willy


Reply via email to