On Sunday, September 11, 2016 7:57:41 PM EDT Willy Tarreau wrote:
> > > Also I've been thinking about this issue of the infinite loop that you
> > > solved already. As long as c > 1 I don't think it can happen at all,
> > > because for any server having a load strictly greater than the average
> > > load, it means there exists at least one server with a load smaller than
> > > or equal to the average. Otherwise it means there's no more server in
> > > the ring because all servers are down, and then the initial lookup will
> > > simply return NULL. Maybe there's an issue with the current lookup
> > > method, we'll have to study this.
> > 
> > Agreed again, it should be impossible as long as c > 1, but I ran into it.
> > I assumed it was some problem or misunderstanding in my code.
> 
> Don't worry I trust you, I was trying to figure what exact case could
> cause this and couldn't find a single possible case :-/

I've encountered this again in my rewritten branch. I think it has to do with
the case where all servers are draining for shutdown. Whenever I do a restart
(haproxy -sf oldpid) under load, the new process starts up but the old process
never exits, and perf shows it using 100% CPU in chash_server_is_eligible, so
it must be looping and deciding that nothing is eligible. Can you think of
anything special that needs to be done to handle graceful shutdown?
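
To make the suspected failure mode concrete, here is a minimal sketch in
Python. It is not HAProxy code: the names capacity, is_eligible and pick, the
dict layout, and the idea that a draining server shows up with weight 0 are
all assumptions made for illustration. It shows that with c > 1 and at least
one weighted server the eligibility walk always finds a candidate, while a
ring where every server has weight 0 yields no eligible server, so an
unbounded walk would spin forever.

```python
from math import ceil

def capacity(total_load, weight, total_weight, c=1.25):
    """Hypothetical per-server bound: c times the fair share of (total_load + 1)."""
    if total_weight <= 0 or weight <= 0:
        # Assumption: a draining server is modelled as weight 0 and gets no slots.
        return 0
    return ceil(c * (total_load + 1) * weight / total_weight)

def is_eligible(server, servers, c=1.25):
    """A server is eligible while its load is strictly below its bound."""
    total_load = sum(s["load"] for s in servers)
    total_weight = sum(s["weight"] for s in servers)
    return server["load"] < capacity(total_load, server["weight"], total_weight, c)

def pick(servers, start, c=1.25):
    """Walk the ring from index `start`, visiting each server at most once."""
    n = len(servers)
    for step in range(n):
        candidate = servers[(start + step) % n]
        if is_eligible(candidate, servers, c):
            return candidate
    # Nothing is eligible: give up instead of circling the ring forever.
    return None

healthy = [
    {"name": "a", "load": 5, "weight": 1},
    {"name": "b", "load": 0, "weight": 1},
    {"name": "c", "load": 5, "weight": 1},
]
# Same ring, but every server is draining (weight 0 under the assumption above).
draining = [dict(s, weight=0) for s in healthy]

print(pick(healthy, 0))
print(pick(draining, 0))
```

Bounding the walk to one pass over the ring is only one possible guard;
falling back to the server found by the initial lookup would be another.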

Thanks,

Andrew
