I did a bzr update just now, to commit, and I see that this problem is
fixed :-)
Please ignore....

On 04/08/2014 06:48 PM, Tsantilas Christos wrote:
> The problem is that the peer_select code trying to access a std::vector
> which is already destroyed.
> If we call peerSelect for fwd FwdState object eg:
>     peerSelect(&fwd->serverDestinations, ..., fwd);
> 
> and the fwd state object become invalid (eg because client closed the
> connection) then it is possible to see this crash.
> 
> I believe the fix for these cases is easy, we need to add the following
> code at the beggining of peerSelectDnsResults and peerSelectDnsPaths
> methods (and any other method uses the vector):
> 
>   if (!cbdataReferenceValid(psstate->callback_data)) {
>     delete psstate;
>     return;
>   }
> 
> Regards,
>    Christos
> 
> On 03/26/2014 11:13 AM, Kinkie wrote:
>> And instead..
>> the code testing Alex's suspects triggered in peer_select.cc:337, this
>> is the relevant snippet of stack trace:
>>
>>
>> #3  0x0000000000632f2d in Vector<RefCount<Comm::Connection> >::size (
>>     this=<optimized out>, this=<optimized out>) at ../src/base/Vector.h:355
>> #4  0x0000000000638a45 in peerSelectDnsResults (ia=0x29ff8b0, details=...,
>>     data=0x2cf59b0) at peer_select.cc:337
>> #5  0x000000000060d4e6 in ipcacheCallback (i=i@entry=0x29ff890,
>>     wait=wait@entry=19) at ipcache.cc:347
>> #6  0x000000000060dbd4 in ipcacheHandleReply (data=<optimized out>,
>>     answers=<optimized out>, na=<optimized out>, error_message=0x0)
>>     at ipcache.cc:497
>> #7  0x000000000058b357 in idnsCallback (q=0x24eec80, q@entry=0x24f5370,
>>     error=error@entry=0x0) at dns_internal.cc:1127
>>
>>
>> I'm currently investigating more in depth.
>>
>> On Wed, Mar 26, 2014 at 8:32 AM, Kinkie <gkin...@gmail.com> wrote:
>>> Hi,
>>>   I think I have isolated the changelet in FwdState::connectionList.
>>> However WHAT in that changelet may be causing the crashes is still
>>> unknown. The hunt for iterate-after-clear has so far turned up
>>> nothing.
>>>
>>> On Wed, Mar 26, 2014 at 8:01 AM, Niki Gorchilov <n...@gorchilov.com> wrote:
>>>> Hello there,
>>>>
>>>> Is there any progress regarding the random coredumps related to
>>>> std::vector migration?
>>>>
>>>> Best,
>>>> Niki
>>>>
>>>> On Mon, Mar 17, 2014 at 6:41 PM, Kinkie <gkin...@gmail.com> wrote:
>>>>> Yes, I will.
>>>>>
>>>>> On Mon, Mar 17, 2014 at 5:24 PM, Alex Rousskov
>>>>> <rouss...@measurement-factory.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>>     I have discovered one difference between std::vector and Squid's own
>>>>>> Vector implementation that might explain some of the random crashes that
>>>>>> some of us have been seeing after the Vector class was replaced with
>>>>>> std::vector in trunk.
>>>>>>
>>>>>> Squid Vector sets the number of stored elements to zero on destruction.
>>>>>> Std::vector does not do that. Here is the output of a simple test code
>>>>>> quoted in the postscriptum:
>>>>>>
>>>>>>> alive std::vector size:    1
>>>>>>> deleted std::vector size:  3
>>>>>>> alive Squid Vector size:   1
>>>>>>> deleted Squid Vector size: 0
>>>>>>
>>>>>> Both vectors behave correctly, but std::vector code is optimized not to
>>>>>> do extra work such as maintaining the size member. This means that
>>>>>> iterating a deleted Squid Vector is often safe (until the freed memory
>>>>>> is overwritten) because the broken caller code would just discover that
>>>>>> there are no items in the vector and move on.
>>>>>>
>>>>>> Iterating the deleted std::vector usually leads to crashes because all
>>>>>> iteration-related methods such as size() rely on immediately deleted and
>>>>>> changed pointers (std::vector does not store its size as a data member
>>>>>> but uses memory pointers to compute the number of stored elements).
>>>>>>
>>>>>> This difference leads to std::vector migration problems such as this
>>>>>> trivially reproducible one:
>>>>>>
>>>>>>> Adaptation::AccessRules &
>>>>>>> Adaptation::AllRules()
>>>>>>> {
>>>>>>>     static AccessRules TheRules;
>>>>>>>     return TheRules;
>>>>>>> }
>>>>>>
>>>>>> After TheRules array is deallocated during global destructions,
>>>>>> iterating AllRules() becomes unsafe. The old Vector-based code usually
>>>>>> worked because deallocated TheRules had zero size.
>>>>>>
>>>>>> The specific bug mentioned above is trivial to work around by allocating
>>>>>> TheRules dynamically (and never deleting it). However, if similar bugs
>>>>>> exist in other code where Vector is used as a data member of some
>>>>>> transaction-related structure, then they will lead to crashes. It is
>>>>>> quite possible that the affected structure's memory itself is never
>>>>>> deleted when the offending code accesses it (due to cbdata locks, for
>>>>>> example) so the [equally buggy] Vector-based code "works".
>>>>>>
>>>>>> One way we can try to catch such cases is by temporary switching back to
>>>>>> Vector and then:
>>>>>>
>>>>>> * Modifying Vector to mark deleted Vector objects:
>>>>>>
>>>>>> Vector::~Vector() {
>>>>>>     clean();
>>>>>>     deleted = true;
>>>>>> }
>>>>>>
>>>>>> * And adjusting *every* Vector method to assert if a deleted object is
>>>>>> still being used. For example:
>>>>>>
>>>>>> template<class E>
>>>>>> size_t
>>>>>> Vector<E>::size() const
>>>>>> {
>>>>>>     assert(!deleted);
>>>>>>     return count;
>>>>>> }
>>>>>>
>>>>>> If one of those asserts is triggered, we would be closer to solving this
>>>>>> mystery.
>>>>>>
>>>>>>
>>>>>> Kinkie, if you agree with this analysis, would you be able to make the
>>>>>> above modifications and test?
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Alex.
>>>>>>
>>>>>>
>>>>>> Goes into Squid's main.cc:
>>>>>>> +#include <vector>
>>>>>>> +#include "base/Vector.h"
>>>>>>>  int
>>>>>>>  main(int argc, char **argv)
>>>>>>>  {
>>>>>>> +       typedef std::vector<int> StdVector;
>>>>>>> +       StdVector *stdv = new StdVector;
>>>>>>> +       stdv->push_back(1);
>>>>>>> +       std::cout << "alive std::vector size:    " << stdv->size() << 
>>>>>>> "\n";
>>>>>>> +       delete stdv;
>>>>>>> +       std::cout << "deleted std::vector size:  " << stdv->size() << 
>>>>>>> "\n";
>>>>>>> +
>>>>>>> +       typedef Vector<int> SqdVector;
>>>>>>> +       SqdVector *sqdv = new SqdVector;
>>>>>>> +       sqdv->push_back(1);
>>>>>>> +       std::cout << "alive Squid Vector size:   " << sqdv->size() << 
>>>>>>> "\n";
>>>>>>> +       delete sqdv;
>>>>>>> +       std::cout << "deleted Squid Vector size: " << sqdv->size() << 
>>>>>>> "\n";
>>>>>>> +       if (true) return 0;
>>>>>>>      return SquidMainSafe(argc, argv);
>>>>>>>  }
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>     Francesco
>>>
>>>
>>>
>>> --
>>>     Francesco
>>
>>
>>
> 
> 


-- 
Tsantilas Christos
Network and Systems Engineer
email:chris...@chtsanti.net
  web:http://www.chtsanti.net
Phone:+30 6977678842

Reply via email to