Zhiwei, thank you for the info. But I am still not sure that this is related to hash table growth (see my answer to Dormando in this thread): it went on for about 3 hours and then disappeared... Or am I missing that part of the code (do_item_alloc is small, but with a fancy idea :) )?
-denis

On Tuesday, July 1, 2014 10:24:20 PM UTC-7, Zhiwei Chan wrote:
>
> Hi,
> I think it is the same bug as issue #370. I have found a way to reproduce
> it and submitted a fix patch to GitHub.
>
> On Wednesday, July 2, 2014 at 5:43:49 AM UTC+8, Dormando wrote:
>>
>> Hey,
>>
>> Can you presize the hash table (-o hashpower=nn) to be large enough on
>> those servers such that hash expansion won't happen at runtime? You can
>> see what hashpower is on a long running server via stats to know what to
>> set the value to.
>>
>> If that helps, we might still have a bug in hash expansion. I see someone
>> finally reproduced a possible issue there under .20. .17/.19 fix other
>> causes of the problem pretty thoroughly though.
>>
>> On Tue, 1 Jul 2014, Denis Samoylov wrote:
>>
>> > Hi,
>> > We had sporadic memory corruption due to tail repair in pre-.20 versions,
>> > so we updated some of our servers to .20. This Monday we observed several
>> > crashes in the .15 version and tons of "allocation failure" in the .20
>> > version. This is expected, as .20 just disables "tail repair", but it
>> > seems the problem is still there. What is interesting:
>> > 1) there is no visible change in traffic, and usually only one slab is
>> >    affected.
>> > 2) this always happens on several but not all servers :)
>> >
>> > Is there any way to catch this and help with debugging? I have all slab
>> > and item stats for the time around the incident for the .15 and .20
>> > versions. .15 is clearly memory corruption: gdb shows that the hash
>> > function returned 0
>> > (line 115 uint32_t hv = hash(ITEM_key(search), search->nkey, 0);).
>> >
>> > So we seem to be hitting this comment:
>> > /* Old rare bug could cause a refcount leak. We haven't seen
>> >  * it in years, but we leave this code in to prevent failures
>> >  * just in case */
>> >
>> > :)
>> >
>> > Thank you,
>> > Denis
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups "memcached" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> > an email to [email protected].
>> > For more options, visit https://groups.google.com/d/optout.
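
For anyone who wants to try Dormando's suggestion of presizing the hash table, here is a rough sketch of how a value for `-o hashpower=nn` might be picked. It reads `hash_power_level` out of captured `stats` output and also estimates a value from an expected item count. The helper names, the sample stats text, and the item count are made up for illustration. The 1.5 load factor and the default of 16 are my assumptions about how memcached decides when to grow the table, so please check them against assoc.c for your version before relying on them.

```python
import re

# Assumption: memcached starts with a table of 2**16 buckets by default.
DEFAULT_HASHPOWER = 16


def parse_hash_power_level(stats_text):
    """Return hash_power_level from raw `stats` output, or None if absent."""
    match = re.search(r"^STAT hash_power_level (\d+)\s*$", stats_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def min_hashpower(expected_items, load_factor=1.5):
    """Smallest hashpower that should hold expected_items without growing.

    Assumption: expansion starts once the item count exceeds
    load_factor * 2**hashpower.
    """
    power = DEFAULT_HASHPOWER
    while expected_items > load_factor * (1 << power):
        power += 1
    return power


# Hypothetical stats capture from a long-running server.
sample = "STAT curr_items 3000000\r\nSTAT hash_power_level 21\r\nEND\r\n"

current = parse_hash_power_level(sample)
# Never suggest less than what the long-running server already grew to.
suggested = max(current or DEFAULT_HASHPOWER, min_hashpower(3_000_000))
print(f"-o hashpower={suggested}")
```

Taking the larger of the observed level and the estimate is deliberate: the observed value is what a real server actually needed, while the estimate only covers the item count you expect.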
