Hi Maxim,

On Thu, Apr 04, 2019 at 02:22:59PM +0300, ?????? ????????? wrote:
> Hi, everybody!
> 
> Got multiple incidents of failure with 1.9.6:
> Core was generated by `/usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy'.
> Program terminated with signal SIGFPE, Arithmetic exception.
> #0  0x0000559afb73c533 in fwrr_update_position (grp=0x559afbd9fb68,
> grp=0x559afbd9fb68, s=0x559afcc5f560) at src/lb_fwrr.c:498
> 498 HA_ATOMIC_ADD(&s->npos, (grp->next_weight / s->cur_eweight));
> [Current thread is 1 (Thread 0x7f879677c700 (LWP 776412))]
> (gdb) thread apply all bt

Scary, in theory that's not supposed to be possible:

  /* Computes next position of server <s> in the group. It is mandatory for <s>
   * to have a non-zero, positive eweight.
               ^^^^^^^^^
   *
   * The server's lock and the lbprm's lock must be held.
   */
  static inline void fwrr_update_position(struct fwrr_group *grp, struct server *s)

So either we're doing something wrong somewhere in a caller, or we have
insufficient locking and sometimes this server's weight is put down to
zero between the moment the value is checked and the moment it's used.
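
To make the second hypothesis concrete, here is a minimal sketch of the
check-then-use pattern. It is not the actual haproxy code and not a proposed
fix: the demo_* types and function are made up for illustration, with only the
two fields from the faulting line. It shows that if the weight is read once
into a local and tested, the division can no longer see a zero that another
thread stores afterwards. In real threaded code that single read would also
have to be an atomic load, or be done under the lock.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real structures: only the fields
 * involved in the faulting expression are kept. */
struct demo_group  { unsigned int next_weight; };
struct demo_server { unsigned int cur_eweight; unsigned int npos; };

/* Advances <s>'s position in the group. Returns 0 on success, or -1
 * without touching <s> if its weight is zero. The weight is read once
 * into a local so that the value tested is the value divided by. */
static int demo_update_position(const struct demo_group *grp,
                                struct demo_server *s)
{
    unsigned int w;

    assert(grp != 0 && s != 0);

    w = s->cur_eweight;      /* single read (assumed atomic or locked) */
    if (w == 0)
        return -1;           /* dividing here is what raises SIGFPE on x86 */

    s->npos += grp->next_weight / w;
    return 0;
}
```

With next_weight = 100 and cur_eweight = 4, a server at npos = 10 moves to 35.
With cur_eweight = 0 the function returns -1 and leaves npos alone instead of
faulting.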

I'm having a look at it right now.

Thanks for reporting,
Willy
