+1

Regardless of using pthread's conditional variable or not, the idea is 
consistent.

It's truly awesome that a test script is provided. Great work Mala! and 
Twittervese for investigating the problem.

-Toru

-----Original Message-----
From: dormando <[email protected]>
Sender: [email protected]
Date: Tue, 24 Aug 2010 23:57:09 
To: memcached<[email protected]>
Reply-To: [email protected]
Subject: Re: memcached race condition problem

>
> [Quick Fix]
> Set memcached to only listen to a single interface.
>
> example memcached setting:
> > memcached -U 0 -u nobody -p 11222 -t 4 -m 16000 -C -c 1000 -l 192.168.0.1 -v
>
>
> [Reproducing]
> Use attack script (thanks to mala): http://gist.github.com/522741
>
> w/ -l interface restriction: we have seen over 70 hours of stability
>  - yes, you will see "Too many open connections." but that's not an
> issue here
>
> w/o -l interface restriction: memcached quits w/ attack script
>
>
> Please give us some feedback on attach patch.  This should fix the
> race condition we have experienced.
>
> At last, we would like to thank numerous contributors on twitter that
> helped us nail this problem.  http://togetter.com/li/41702
>
> Shigeki

Excellent! Thanks for the detailed report and the patch. I'll see if I can
verify it here and get it applied asap. Might move the mutex to the
threads file, but the idea is sound.

Since you're using the clock event to re-enable connections, this means
that hitting maxconns will lock up memcached for a worst case minimum of 1
full second?

Perhaps a conditional could be used to wake up the main thread faster?

On a similar note (but wouldn't block this patch at all), I've been
wanting to change the default behavior in 1.6 to never close the accept()
routine, but instead instant-drop excessive connections. It makes more
sense for what memcached is, and I'd like to see an option in 1.4.x for
enabling such behavior to test with.

That avoids the issue in another way entirely, but I think we'll need both
fixes :P

-Dormando

Reply via email to