[
https://issues.apache.org/jira/browse/QPID-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256223#comment-16256223
]
Chris Richardson commented on QPID-7991:
----------------------------------------
Just a note for posterity: the segfault under discussion did not seem to be
triggered when creating the routes with the current version of qpid-route,
which has recently changed to use the Broker::create API rather than the
Link::Bridge approach (code comments suggest the latter should be deprecated;
see the changes under https://issues.apache.org/jira/browse/QPID-7876).
However, it DID (prior to Alan's submitted fix) rear its ugly head when the
route was created with the supposedly identical call from the C++ broker
management library I authored at https://github.com/fourceu/fourc-qpid-manager,
and I have not yet been able to determine the exact cause. Since this fix
appears to remedy the issue in either case, I will abandon the investigation
unless additional issues arise.
> Segfault in broker while processing active bridges
> --------------------------------------------------
>
> Key: QPID-7991
> URL: https://issues.apache.org/jira/browse/QPID-7991
> Project: Qpid
> Issue Type: Bug
> Components: C++ Broker
> Affects Versions: qpid-cpp-1.36.0, qpid-cpp-1.37.0
> Environment: Ubuntu 17.10 x86_64, gcc 7.
> Reporter: Chris Richardson
> Assignee: Alan Conway
> Priority: Critical
> Fix For: qpid-cpp-1.37.0
>
> Attachments: segfault stack trace, segfault-fix.patch,
> segfault-repoduce.tar.gz, std_remove_if_with_smart_ptr.cpp
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> Segfault occurs on a background thread within about 5-10 seconds of broker
> startup at src/qpid/broker/Link.cpp:465. [^segfault stack trace] is attached;
> frames #3 and #5 are of particular relevance.
> The unchecked Bridge::shared_ptr derived from the iterator is null and the
> invocation of bridge->closed() triggers the segfault. Adding a simple null
> check (as per attached [^segfault-fix.patch]) fixes the segfault but not the
> underlying reason for the null pointer.
> The segfault appears to be related to how a second broker (henceforth
> "broker1") is configured; this is the broker to which the links are
> established. Without broker1, the segfaulting broker (henceforth "broker2")
> does not crash. It may be that broker1 returns invalid data to broker2, but
> that is outside the scope of this bug report, which focuses on the segfault.
> h2. Reproduce
> Unfortunately the steps to arrive at this situation are not clear, so the
> reproduction is a bit hacky: the data directory, config file and some certs
> for the two brokers are attached as a tarball in the hope that they can be
> arranged in such a way as to reproduce the problem, in lieu of a purely
> step-based procedure.
> Steps to reproduce:
> * Temporarily add a DNS alias named "octopussy" for the local machine
> (necessary due to cert config and durable link config in broker2's data store)
> * Extract the attached [^segfault-repoduce.tar.gz] to an empty directory
> (assumed to be cwd)
> * Start broker1 with "qpidd --config broker1/qpidd.conf"
> * In another shell with the same cwd, start broker2 with "qpidd --config
> broker2/qpidd.conf"
> * Observe segfault in broker2 after 5-10 seconds.
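
For readers unfamiliar with the failure mode, the null check described in the issue can be illustrated with a minimal sketch. This is not the code from Link.cpp or from the attached patch: the Bridge struct, its wasClosed flag and the closeActiveBridges function are hypothetical stand-ins, and the sketch only assumes what the description states, namely that a shared pointer taken from a container of bridges may be null when closed() is invoked on it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the broker's Bridge class; only closed() matters here.
struct Bridge {
    bool wasClosed = false;
    void closed() { wasClosed = true; }
};

using BridgePtr = std::shared_ptr<Bridge>;

// Calls closed() on every non-null bridge and returns the number of null
// entries that were skipped. Without the guard, bridge->closed() on a null
// shared_ptr is undefined behaviour and typically shows up as a segfault.
std::size_t closeActiveBridges(const std::vector<BridgePtr>& bridges) {
    std::size_t skipped = 0;
    for (const BridgePtr& bridge : bridges) {
        if (!bridge) {  // the "simple null check"
            ++skipped;
            continue;
        }
        bridge->closed();
    }
    return skipped;
}
```

As the description notes, a guard of this kind only avoids the crash; it does not explain why a null pointer ended up in the container in the first place, so counting or logging the skipped entries may help when looking for the underlying cause.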