[ 
https://issues.apache.org/jira/browse/QPID-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614430#comment-16614430
 ] 

Cliff Jansen commented on QPID-8209:
------------------------------------

The value Queue::deleted can be set arbitrarily late because the callback from 
QueueRegistry to Queue::destroyed() can itself be arbitrarily delayed.  This 
can leave the Queue in a semi-deleted state from the viewpoint of some threads.

The subject code has been refactored numerous times due to deadlocks from 
multiple threads accessing the queue and the registry for different but 
interrelated purposes.

One solution is to have a separate lock just for setting and getting the 
deleted value.  As a leaf-level lock (no other lock is acquired while it is 
held), it will be safe from deadlock.
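
A minimal sketch of that idea, assuming std::mutex in place of the broker's 
own qpid::sys::Mutex.  The class below is a hypothetical stand-in for Queue 
that models only the flag; markDeleted(), isDeleted() and deletedLock are 
illustrative names, not the existing broker API.

```cpp
#include <mutex>

// Hypothetical stand-in for the broker's Queue: only the deleted flag is modelled.
class Queue {
  public:
    // Sets the flag. Returns true only for the one caller that actually
    // performs the not-deleted -> deleted transition.
    bool markDeleted() {
        std::lock_guard<std::mutex> guard(deletedLock);
        if (deleted) return false;
        deleted = true;
        return true;
    }

    // Reads the flag under the same dedicated lock.
    bool isDeleted() const {
        std::lock_guard<std::mutex> guard(deletedLock);
        return deleted;
    }

  private:
    // Leaf lock: while it is held, no other lock is acquired and no
    // callback is invoked, so it cannot be part of a lock-order cycle.
    mutable std::mutex deletedLock;
    bool deleted = false;
};
```

Because markDeleted() reports whether the caller won the transition, two 
threads racing to delete the same queue could use the return value to decide 
which of them continues with the teardown.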

Related "deleted" problem code is of the form:

  foo(string &queueName, args1){
    Queue::shared_ptr q = queueRegistry.find(queueName);
    if (q && shouldDelete(q))
      queueRegistry.destroy(q->name, args2);
  }

The call to destroy() can occur after the found queue has already been 
destroyed and replaced with a same named queue by other threads, resulting in 
an unintended queue being deleted.

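Changing the signature of destroy() so that it takes a queue pointer instead of 
a string name can get around this problem, since the registry can then check 
that the queue it holds under that name is still the one that was found.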
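
As a rough illustration of the pointer-based variant, here is a sketch under 
stated assumptions: Queue and QueueRegistry below are simplified, hypothetical 
stand-ins (the real destroy() also takes connectionId and userId), and 
std::shared_ptr / std::mutex replace the broker's own types.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical, simplified stand-ins for the broker classes.
struct Queue {
    explicit Queue(const std::string& n) : name(n) {}
    std::string name;
};

class QueueRegistry {
  public:
    typedef std::shared_ptr<Queue> QueuePtr;

    // Returns the queue registered under name, creating it if absent.
    QueuePtr declare(const std::string& name) {
        std::lock_guard<std::mutex> guard(lock);
        QueuePtr& slot = queues[name];
        if (!slot) slot = std::make_shared<Queue>(name);
        return slot;
    }

    // Returns the queue registered under name, or a null pointer.
    QueuePtr find(const std::string& name) const {
        std::lock_guard<std::mutex> guard(lock);
        std::map<std::string, QueuePtr>::const_iterator i = queues.find(name);
        return i == queues.end() ? QueuePtr() : i->second;
    }

    // Removes q only if the entry registered under q->name is still this
    // exact object. A same-named replacement queue is left untouched.
    bool destroy(const QueuePtr& q) {
        if (!q) return false;
        std::lock_guard<std::mutex> guard(lock);
        std::map<std::string, QueuePtr>::iterator i = queues.find(q->name);
        if (i == queues.end() || i->second != q) return false;
        queues.erase(i);
        return true;
    }

  private:
    mutable std::mutex lock;
    std::map<std::string, QueuePtr> queues;
};
```

With this shape, a caller that found a queue and later asks for its 
destruction gets false back if another thread has already destroyed it and 
declared a new queue with the same name.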

> qpidd segfault with huge backtrace when deleting autoDel queue just being 
> auto-deleted
> --------------------------------------------------------------------------------------
>
>                 Key: QPID-8209
>                 URL: https://issues.apache.org/jira/browse/QPID-8209
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: qpid-cpp-1.38.0
>            Reporter: Pavel Moravec
>            Priority: Major
>
> Description of problem:
> When the two actions below happen concurrently on an auto-delete queue 
> (without an auto-delete timeout), qpidd segfaults.
> The two actions:
> - detaching the last consumer of the auto-delete queue
> - deleting the queue in either way (i.e. via QMF or by sending the proper AMQP 
> performative)
> result in the following:
> - segfaulting thread from the detach event has backtrace like:
> #0  0x00007f9af745f40d in ScopedLock (this=0x7f9adc6c5a88, expectedVersion=1) 
> at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/sys/Mutex.h:33
> #1  qpid::broker::Queue::tryAutoDelete (this=0x7f9adc6c5a88, 
> expectedVersion=1) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1348
> #2  0x00007f9af745eed4 in qpid::broker::Queue::scheduleAutoDelete 
> (this=0x7f9adc6c5a88, immediate=false) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1339
> #3  0x00007f9af745f524 in qpid::broker::Queue::tryAutoDelete 
> (this=0x7f9adc6c5a88, expectedVersion=<value optimized out>) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1367
> #4  0x00007f9af745eed4 in qpid::broker::Queue::scheduleAutoDelete 
> (this=0x7f9adc6c5a88, immediate=false) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1339
> ..
> #12012 0x00007f9af745eed4 in qpid::broker::Queue::scheduleAutoDelete 
> (this=0x7f9adc6c5a88, immediate=false) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1339
> #12013 0x00007f9af74601c6 in qpid::broker::Queue::cancel 
> (this=0x7f9adc6c5a88, c=..., 
> connectionId="qpid.127.0.0.1:5672-127.0.0.1:56736", userId="anonymous")
>     at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:637
> ..
> - other thread trying to delete the queue has bt like:
> #0  0x000000343ee11016 in qpid::broker::Exchange::propagateFedOp 
> (this=0x260ee30, routingKey="autoDel_93", tags="", op="U", origin="", 
> extra_args=0x0)
>     at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Exchange.cpp:344
> #1  0x000000343ee66baf in qpid::broker::DirectExchange::unbind 
> (this=0x260edd0, queue=..., routingKey="autoDel_93", args=<value optimized 
> out>)
>     at /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/DirectExchange.cpp:159
> #2  0x000000343eeac7c9 in qpid::broker::QueueBindings::unbind (this=<value 
> optimized out>, exchanges=..., queue=...) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/QueueBindings.cpp:47
> #3  0x000000343ee39b1d in qpid::broker::Queue::unbind (this=<value optimized 
> out>, exchanges=<value optimized out>) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1208
> #4  0x000000343ee39ca6 in qpid::broker::Queue::destroyed 
> (this=0x7ff820308810) at 
> /usr/src/debug/qpid-cpp-1.36.0/src/qpid/broker/Queue.cpp:1162
> #5  0x000000343eeb1027 in qpid::broker::QueueRegistry::destroy 
> (this=0x2606278, name="autoDel_93", 
> connectionId="qpid.127.0.0.1:5672-127.0.0.1:39240", userId="anonymous")
> ..
> The segfault happens because the depth of the backtrace exceeds some limit.
> Version-Release number of selected component (if applicable):
> current upstream
> 1.36
> 1.38
> How reproducible:
> randomly, but 100% in 1 hour
> Steps to Reproduce:
> compile C++ program to delete a queue via QMF - attached at the bottom
> run below script with optionally updated parameters, such that numbers of 
> "resource-deleted" (from qpid-receive) and "Delete failed. No such queue" 
> (from delete_queue) errors are similar (i.e. the events to trigger the race 
> condition happen usually at very similar time)
> queues=125
> slept=0.9922
> for i in $(seq 1 $queues); do
>         while true; do
>                 qpid-receive -a "autoDel_${i}; {create:always, node:{ x-declare:{auto-delete:True}}}" --timeout=1 &
>                 sleep $slept
>                 delete_queue autoDel_${i}
>                 sleep 1
>         done &
>         sleep 0.1
> done
> Actual results:
> within an hour, segfault with above backtraces
> Expected results:
> no segfault
> Additional info:
> delete_queue.cpp :
> #include <cstdlib>
> #include <iostream>
> #include <sstream>
> #include <qpid/messaging/Address.h>
> #include <qpid/messaging/Connection.h>
> #include <qpid/messaging/Message.h>
> #include <qpid/messaging/Sender.h>
> #include <qpid/messaging/Receiver.h>
> #include <qpid/messaging/Session.h>
> using namespace qpid::messaging;
> using namespace qpid::types;
> using std::stringstream;
> using std::string;
> int main(int argc, char** argv) {
>     const char* queue_name = argc>1 ? argv[1] : "queue_name";
>     const char* url = argc>2 ? argv[2] : "amqp:tcp:127.0.0.1:5672";
>     Connection connection(url/*, connectionOptions*/);
>     try {
>         connection.open();
>         Session session = connection.createSession();
>         Sender sender = session.createSender("qmf.default.direct/broker");
>         Address responseQueue("#reply-queue; {create:always, node:{x-declare:{auto-delete:true}}}");
>         Receiver receiver = session.createReceiver(responseQueue);
>         Message message;
>         Variant::Map content;
>         Variant::Map OID;
>         Variant::Map arguments;
>         OID["_object_name"] = "org.apache.qpid.broker:broker:amqp-broker";
>         arguments["type"] = "queue";
>         arguments["name"] = queue_name;
>
>         content["_object_id"] = OID;
>         content["_method_name"] = "delete";
>         content["_arguments"] = arguments;
>
>         encode(content, message);
>         message.setReplyTo(responseQueue);
>         message.setProperty("x-amqp-0-10.app-id", "qmf2");
>         message.setProperty("qmf.opcode", "_method_request");
>         sender.send(message, true);
>
>         Message response;
>         if (receiver.fetch(response,qpid::messaging::Duration(30000)) == true)
>         {
>             qpid::types::Variant::Map recv_props = response.getProperties();
>             if (recv_props["x-amqp-0-10.app-id"] == "qmf2")
>                 if (recv_props["qmf.opcode"] == "_method_response")
>                     std::cout << "Response: OK" << std::endl;
>                 else if (recv_props["qmf.opcode"] == "_exception")
>                     std::cerr << "Error: " << response.getContent() << std::endl;
>                 else
>                     std::cerr << "Invalid response received!" << std::endl;
>             else
>                 std::cerr << "Invalid response not of qmf2 type received!" << std::endl;
>         }
>         else
>             std::cout << "Timeout: No response received within 30 seconds!" << std::endl;
>         connection.close();
>         return 0;
>     } catch(const std::exception& error) {
>         std::cout << error.what() << std::endl;
>         connection.close();
>     }
>     return 1;
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

