[
https://issues.apache.org/jira/browse/DISPATCH-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ken Giusti reassigned DISPATCH-2059:
------------------------------------
Assignee: Jiri Daněk
> Support running router under rr during test execution
> -----------------------------------------------------
>
> Key: DISPATCH-2059
> URL: https://issues.apache.org/jira/browse/DISPATCH-2059
> Project: Qpid Dispatch
> Issue Type: Wish
> Components: Tests
> Affects Versions: 1.15.0
> Reporter: Jiri Daněk
> Assignee: Jiri Daněk
> Priority: Major
>
> Dispatch has an environment variable, {{QPID_DISPATCH_RUNNER}}, which
> (according to a code comment) is intended for running tests under valgrind.
> That comment is outdated, because memory checking is now handled in a
> different way, in {{RuntimeChecks.cmake}}. One tool that would make sense to
> use as a wrapper around dispatch is rr, the record-and-replay debugger from
> Mozilla (https://rr-project.org/).
> I've previously tried rr with (very) limited success in DISPATCH-782.
> [~aconway] considered it while working on DISPATCH-902 and used it on other
> issues.
> There has been an earlier attempt to use rr, described in
> https://issues.apache.org/jira/browse/DISPATCH-739?focusedCommentId=15983719&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15983719
> but it did not survive in the mainline to the present day.
> I have two problems with rr:
> # Dispatch system tests send SIGTERM to the subprocess they started, which
> is rr itself. What is needed instead is to signal rr's children. Killing rr
> terminates the recording abruptly. When I press ^C on a
> {{rr record qdrouterd -c ...}} in the terminal, the signal correctly reaches
> the child. I am not sure where the difference in the test comes from.
> Explicitly killing only the children in the system test does the right
> thing. Sadly, doing that requires hacks, because Python's subprocess module
> offers no easy way to list a process's children. The os module has some
> ways; psutil is the easiest, but that's a third-party dependency.
> # The CLion debugger disconnects during replay when qdrouterd receives
> SIGTERM, even though the router handles that signal and continues running
> (to clean up).
> One awesome feature of rr is that a recording can be replayed many times,
> backwards and forwards, and all memory addresses stay the same on every
> replay. This means one can set {{watch -l *0x0000000}} watchpoints on
> specific memory locations and use the {{reverse-cont}} gdb command. (rr
> presents the gdb UI; if I understand correctly, it is actually a wrapper
> around gdb.)
> h3. Chaos mode
> rr has a {{--chaos}} switch that tries to explore different thread
> schedules in order to reveal more crashes; that could be useful.
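
For the first problem in the description (signalling rr's children rather
than rr itself), one possible approach that avoids psutil is to scan /proc
for processes whose parent is the rr process. The following is only a
minimal sketch under stated assumptions: it is Linux-only, it assumes the
recorded qdrouterd is a direct child of the rr process, and the helper
names child_pids and terminate_children are hypothetical, not part of the
Dispatch test framework.

```python
import os
import signal


def child_pids(parent_pid):
    """Return PIDs of direct children of parent_pid by scanning /proc.

    Linux-only sketch; hypothetical helper, not existing Dispatch code.
    """
    children = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is wrapped in parentheses and may itself contain
        # spaces or ')', so split on the last ')'. The fields that follow
        # are: state, ppid, ...
        fields = stat.rsplit(")", 1)[1].split()
        if int(fields[1]) == parent_pid:
            children.append(int(entry))
    return children


def terminate_children(parent_pid, sig=signal.SIGTERM):
    """Send sig to each direct child of parent_pid; return the PIDs found."""
    pids = child_pids(parent_pid)
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # child already gone
    return pids
```

In a system test this might be called with the PID of the subprocess that
was started (rr, under the assumption above), in place of terminating that
subprocess directly. Whether rr then finishes the recording cleanly would
still need to be verified.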