Charles E. Rolke created DISPATCH-1968:
------------------------------------------
Summary: Crash after running series of 1Mb iperf3 against TCP
adaptor
Key: DISPATCH-1968
URL: https://issues.apache.org/jira/browse/DISPATCH-1968
Project: Qpid Dispatch
Issue Type: Bug
Components: Protocol Adaptors
Affects Versions: 1.15.0
Environment: Fedora 32 bare metal 64-bit.
Dispatch at 1.15 release
Proton git branch master @ 5e7d7af8f
Reporter: Charles E. Rolke
h2. Setup
Running with a minimal TCP adaptor listener / connector on a single router. See
attached INTA.conf. These processes run on a single laptop.
Start an iperf3 server on the default port 5201:
iperf3 -s
Run the iperf3 client in a loop against port 5202, which is served by the TCP adaptor:
iperf3 -c hostname -p 5202 -n 1000000
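The report does not show how the loop was driven. A minimal Python sketch of one way to do it follows. The helper name run_client_loop, the iteration count, and the timeout are assumptions made for illustration; only the iperf3 command line comes from the report. The timeout is there so that a hung client shows up as a distinct outcome.

```python
import subprocess

# Client command from the report; "hostname" is the placeholder used there.
IPERF_CLIENT = ["iperf3", "-c", "hostname", "-p", "5202", "-n", "1000000"]


def run_client_loop(cmd, iterations=20, timeout_s=30.0):
    """Run cmd up to `iterations` times and stop at the first failure.

    Returns a list of (iteration, outcome) tuples, where outcome is
    "ok", "exit <code>", or "timeout".
    """
    results = []
    for i in range(iterations):
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
            outcome = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this.
            outcome = "timeout"
        results.append((i, outcome))
        if outcome != "ok":
            break
    return results
```

Calling run_client_loop(IPERF_CLIENT) with the router and the iperf3 server running would repeat the client until it fails, hangs past the timeout, or completes all iterations.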
h2. Issues
After a few loops the router crashes, with malloc reporting a corrupted doubly
linked list.
Sometimes the test client hangs for a few seconds until the iperf server times
out.
The qdstat tool shows many leaked qd_buffer_t and stream data objects.
h2. Observations
h3. Tracing a single iperf3 session
A Wireshark trace of a single iperf3 session shows the client opening two
connections to the router and the router opening two connections to the server.
This is expected.
As the test runs there is a certain amount of chat between the client and
server that works as expected. These messages are test setup and are not part
of the iperf mission payload data.
Then the payload data starts. After the server has accepted 8 kbytes of iperf
payload (in sixteen 512-byte network packets!!!) the server closes the connection to
the TCP connector with a FIN. A few microseconds later the TCP connector sends
another 512-byte packet, to which the iperf server responds with a RST.
Shortly thereafter the connections close with a bunch of TCP FIN packets.
The router did not crash.
h3. Running with asan and valgrind memcheck
Running with either of these tools was inconclusive and did not reveal any
stray memory writes or double frees that could corrupt the malloc heap.
h2. Next steps
Having the network peer of the TCP connector close the connection mid-stream is
a pattern that is not tested in the self tests. A test to generate this pattern
is in progress.
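The pattern described above can be approximated outside the router with plain sockets. The following Python sketch is not the self test mentioned in the report; the function names, the 8192-byte threshold, and the 512-byte chunk size are assumptions chosen to mirror the trace. The server side stands in for the iperf3 server and closes after reading a fixed number of bytes, while the sender stands in for the TCP connector and keeps writing until the peer rejects the data.

```python
import socket


def serve_and_close_early(listener, close_after=8192):
    """Accept one connection, read close_after bytes, then close mid-stream.

    Returns the number of bytes read before closing. Depending on whether
    unread data is still queued at close time, the kernel may send a FIN
    or a RST; either way the peer sees the stream cut short.
    """
    conn, _ = listener.accept()
    received = 0
    with conn:
        while received < close_after:
            chunk = conn.recv(min(512, close_after - received))
            if not chunk:
                break  # peer closed first
            received += len(chunk)
    return received


def send_until_error(addr, total=1_000_000, chunk=512):
    """Send about `total` bytes in `chunk`-sized writes.

    Stops early if the peer resets or closes the connection.
    Returns the number of bytes handed to the kernel.
    """
    sent = 0
    payload = b"x" * chunk
    with socket.create_connection(addr) as sock:
        try:
            while sent < total:
                sock.sendall(payload)
                sent += chunk
        except ConnectionError:
            # Covers BrokenPipeError and ConnectionResetError.
            pass
    return sent
```

To aim this at the router, the listener would take the place of the iperf3 server behind the TCP connector and the sender would connect to the TCP adaptor listener port. On loopback the sender may buffer all of its data before the close is noticed, so the number of bytes sent is not a reliable signal on its own.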