[
https://issues.apache.org/jira/browse/PROTON-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544161#comment-17544161
]
Clifford Jansen commented on PROTON-2543:
-----------------------------------------
Have you had a chance to gather any further information about the crash you
are seeing?
It would help me a great deal if you could provide more details about the
environment where the crash occurs:
* CPU hardware type and model
* OS and version
* Compiler (gcc/clang/other)
* Number of concurrent threads servicing proactor event batches
* Number of active proactors in the failing process (usually 1)
* Running on bare hardware, in a VM, or in a container
* Whether the crash occurs during main operation, on shutdown, or both
* Types of connections and listeners
** All outgoing connections
** All incoming connections and listeners
** Mix of both (describe)
** Mainly/only pn_raw_connection_t or pn_connection_t connections
** Whether connections run over a network, a virtual network, or loopback
If you are having difficulty reproducing the crash in debug mode, perhaps I
could provide an instrumented version of epoll.c that could give us recent
proactor history and help debug the problem.
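As a rough idea of what such instrumentation could look like (a hypothetical
sketch only, not the actual changes to epoll.c; every name below is made up
for illustration), recent events could be recorded in a small fixed-size ring
buffer that can be inspected from the debugger after a crash:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not code from epoll.c. */
#define HIST_SIZE 64           /* number of recent events retained */

typedef struct {
  uint64_t seq;                /* monotonically increasing event number */
  const char *what;            /* static string naming the event */
  const void *task;            /* task or context involved, if any */
} hist_entry_t;

typedef struct {
  hist_entry_t entries[HIST_SIZE];
  uint64_t next_seq;           /* total number of events recorded so far */
} hist_t;

/* Record one event, overwriting the oldest entry once the buffer is full.
   In a multi-threaded proactor this would need to run under a lock. */
static void hist_record(hist_t *h, const char *what, const void *task) {
  assert(h != NULL);
  hist_entry_t *e = &h->entries[h->next_seq % HIST_SIZE];
  e->seq = h->next_seq++;
  e->what = what;
  e->task = task;
}

/* Number of entries currently valid (at most HIST_SIZE). */
static size_t hist_count(const hist_t *h) {
  return h->next_seq < HIST_SIZE ? (size_t)h->next_seq : (size_t)HIST_SIZE;
}

/* Return the i-th most recent entry (0 = newest), or NULL if out of range. */
static const hist_entry_t *hist_recent(const hist_t *h, size_t i) {
  if (i >= hist_count(h)) return NULL;
  return &h->entries[(h->next_seq - 1 - i) % HIST_SIZE];
}
```

With something like this in place, the last HIST_SIZE scheduling events would
still be in memory at the time of the segfault, even in a non-debug build.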
Also, if you could provide a debugger dump of the failing pn_proactor_t at the
time of one of your crashes, that might help me think of other things to explore.
Thank you for any information you can provide.
> Crash in epoll.c resched_pop_front
> ----------------------------------
>
> Key: PROTON-2543
> URL: https://issues.apache.org/jira/browse/PROTON-2543
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Reporter: Fredrik Hallenberg
> Assignee: Clifford Jansen
> Priority: Major
> Attachments: qpid-epoll-crash.patch
>
>
> During stress testing it is fairly easy to reproduce a segfault in
> resched_pop_front. Using gdb, one can see that polled_resched_front can be
> zero on entry to this function, which causes the value to wrap and leads to
> a crash in later calls.
> polled_resched_front is not checked at one of the call sites of this
> function. The trivial fix of checking this value, shown in the attached
> patch, seems to work.
> Tested with Qpid Proton C++ 0.37.
>
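For reference, the failure mode described in the quoted report (a value that
is zero on entry and then wraps) can be illustrated with a minimal sketch.
This is not Proton's code and not the attached patch: the struct, its fields,
and both functions are hypothetical stand-ins, and the real types in epoll.c
may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queue of rescheduled tasks.
   Field and function names are illustrative, not Proton's. */
typedef struct {
  size_t count;   /* number of queued items; unsigned, so 0 - 1 wraps */
  size_t front;   /* index of the first queued item */
} demo_queue_t;

/* Unguarded pop: if count is already 0, the decrement wraps to SIZE_MAX,
   and later code that trusts count will walk off the end of the queue. */
static void demo_pop_unchecked(demo_queue_t *q) {
  q->count--;
  q->front++;
}

/* Guarded pop: refuse to pop from an empty queue and tell the caller. */
static bool demo_pop_checked(demo_queue_t *q) {
  if (q->count == 0) return false;
  q->count--;
  q->front++;
  return true;
}
```

The guarded variant corresponds to the general idea of checking the value
before popping; whether the check belongs in the function or at the call site
is a separate design question.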