Hi Willy, Tim,

I am providing some more details about my setup in case you wish to try to
reproduce the issue.
As I mentioned before, I have 5 HAProxy nodes, all of them listening on
public IPs.
My DNS is set up in round-robin mode on AWS R53, resolving to the individual
IP of one of the HAProxy nodes for each request.
This means that one client will very commonly have multiple connections to
many (or even all) nodes in the cluster. Clients also tend to
connect/disconnect quickly (little keep-alive usage), making this race
condition quite likely to happen.

I suppose the scenario Tim described earlier is accurate:

- Connect to peer A     (A=1, B=0)
- Peer A sends 1 to B   (A=1, B=1)
- Kill connection to A  (A=0, B=1)
- Connect to peer B     (A=0, B=2)
- Peer A sends 0 to B   (A=0, B=0)
- Peer B sends 0/2 to A (A=?, B=0)
- Kill connection to B  (A=?, B=-1)
- Peer B sends -1 to A  (A=-1, B=-1)
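
To make the sequence easier to follow, here is a toy replay of the steps
above in Python. It is only a sketch, not HAProxy code: the Peer class, the
conn_cur attribute name and the "a push overwrites the receiver's value"
rule are my assumptions for illustration. The "A=?" step is resolved by
letting B push its current value (0).

```python
# Toy model of the scenario above; NOT HAProxy code.
# Assumption: each peer keeps its own copy of the counter, and a push
# simply overwrites the receiving peer's copy with the sender's value.

class Peer:
    def __init__(self, name):
        self.name = name
        self.conn_cur = 0  # hypothetical per-peer copy of the counter

    def connect(self):
        self.conn_cur += 1

    def disconnect(self):
        self.conn_cur -= 1

    def push_to(self, other):
        other.conn_cur = self.conn_cur


def replay():
    """Replay the eight steps and return (label, A, B) after each one."""
    a, b = Peer("A"), Peer("B")
    steps = [
        ("Connect to peer A", a.connect),
        ("Peer A sends to B", lambda: a.push_to(b)),
        ("Kill connection to A", a.disconnect),
        ("Connect to peer B", b.connect),
        ("Peer A sends to B", lambda: a.push_to(b)),
        ("Peer B sends to A", lambda: b.push_to(a)),
        ("Kill connection to B", b.disconnect),
        ("Peer B sends to A", lambda: b.push_to(a)),
    ]
    trace = []
    for label, action in steps:
        action()
        trace.append((label, a.conn_cur, b.conn_cur))
    return trace


if __name__ == "__main__":
    for label, va, vb in replay():
        print(f"{label:<22} (A={va}, B={vb})")
```

In this model both peers end at -1 even though only two connections were
opened and closed, because the late push from A wipes out the increment
that B had just recorded.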


Let me know if you wish to add some debugging info to the patch in
order to dump some extra information when this scenario happens.

BR.,
Emerson





On Tue, Jan 15, 2019 at 11:50, Willy Tarreau <w...@1wt.eu> wrote:

> Hello Emerson,
>
> On Mon, Jan 14, 2019 at 10:26:40PM +0100, Emerson Gomes wrote:
> > Hello Tim,
> >
> > Sorry for the delayed answer.
> > The segfaults I had experienced apparently were related to something else -
> > maybe some issue in my env.
> > At first I tried to apply the patch to 1.9.0, but after applying it to
> > 1.8.7, I no longer had the segfaults.
> >
> > So far I haven't experienced the underflow issue again.
> > I think it would be nice to merge this change to next releases - Not sure
> > how this is managed around here without the tracking tool :)
>
> Thanks for the report! Tim, could you elaborate a little bit more on how
> the race reproduces ? I'm asking because if we only apply the underflow
> check, it will mean we'll constantly accumulate wrong values under load
> since until the counter crosses zero, the double discount is not detected.
> I'd rather be sure to address the cause (why do we decrement it twice)
> than the consequence (value becomes negative).
>
> thanks!
> Willy
>
