Hi Krishna,

On Thu, Oct 11, 2018 at 12:04:31PM +0530, Krishna Kumar (Engineering) wrote:
> I must say the improvements are pretty impressive!
> 
>     Earlier number reported with 24 processes:       519K
>     Earlier number reported with 24 threads:          79K
>     New RPS with system irq tuning, today's git,
>                configuration changes, 24 threads:    353K
>     Old code with same tuning gave:                  290K

OK, that's much better, but I'm still horrified by the time taken in
the load balancing algorithm. I thought it could be fwlc_reposition(),
which contains an eb32_delete()+eb32_insert() pair, so I replaced the
pair with a new eb32_move() which moves the node within the tree, and
it didn't change anything here. Also, I cannot manage to reproduce
anywhere near that much time spent in this lock (300ms here vs 58s for
you). There is one possible difference that might explain it: do you
have a maxconn setting on your servers? If so, is it possible that
it's reached? You can take a look at your stats page and see if the
"Queue/Max" entry for any backend is non-null.

Indeed, I'm seeing that once a server is saturated, we skip it and move
on to the next one. This part can be expensive. Ideally we should remove
such servers from the tree until they're unblocked, but there is one
special case making this difficult, which is the dynamic limitation
(minconn+maxconn+fullconn). However, I think we could improve this so
that only this use case would be affected and not the other ones.
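For reference, the dynamic limitation I mean looks like this (illustrative names and addresses, not your configuration): when minconn is set together with fullconn, the server's effective connection limit varies between minconn and maxconn depending on the backend's total load, so a "saturated" server may become usable again without any event on the server itself, which is what makes removing it from the tree tricky:

```haproxy
backend app
    balance leastconn
    fullconn 1000
    # effective limit grows from minconn toward maxconn as the
    # backend's total sessions approach fullconn
    server s1 192.168.0.11:80 minconn 50 maxconn 500
    server s2 192.168.0.12:80 minconn 50 maxconn 500
```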

I'm also seeing that this lock could be replaced by an RW lock. But before
taking a deeper look, I'm interested in verifying that it's indeed the
situation you're facing.

Thanks,
Willy
