the figure -- Re: Please help: ssh_exchange_identification: read: Connection reset by peer

Gábor LENCSE Sun, 03 Sep 2023 13:07:49 -0700

I am sorry.

A somewhat different but at least visible version of the test setup is available here: https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-benchmarking-stateful#test_setup_sfnat64_multi


Gábor

On 9/3/2023 at 8:45 PM, Gabor LENCSE wrote:

Dear List Members,
I have a weird problem when I try to ssh to an OpenBSD server. (I use OpenBSD 7.3 with the GENERIC.MP #1125 kernel.)
I perform benchmarking tests to measure the performance of OpenBSD PF. I use the test setup below:
2001:2::[0000-ffff]:2/64                  198.19.0.0/15 - 198.19.255.254/15
           \  +--------------------------------------+  /
  IPv6      \ |Initiator                    Responder| /
+-------------|                Tester                |<------------+
| addresses   |                         [state table]| public IPv4 |
|             +--------------------------------------+             |
|                                                                  |
|             +--------------------------------------+             |
| 2001:2::1/64|                 DUT:                 | public IPv4 |
+------------>|        Stateful NAT64 gateway        |-------------+
 IPv6 address |     [connection tracking table]      | \
              +--------------------------------------+  \
                                                    198.18.0.1/15
(As for the actual tests, I use only sub-ranges of the potential IP address ranges shown above.)
The Tester is executed on a Linux server. During my tests, a bash shell script (running on the Linux server) executes various commands on the DUT (Device Under Test), which is the OpenBSD server. To that end, I use ssh with key-based authentication. Usually everything goes well, but after a while, things "go wrong", and I cannot ssh from the Linux server to the OpenBSD server any more. I get the following error message:
root@tester:~/siitperf# ssh 172.16.17.102
ssh_exchange_identification: read: Connection reset by peer
root@tester:~/siitperf#

Then I cannot even ssh from the OpenBSD server to itself:

dut# ssh localhost
getsockname failed: Connection reset by peer
banner exchange: Connection to 127.0.0.1 port -1: Broken pipe
dut# ssh 172.16.17.102
getsockname failed: Connection reset by peer
banner exchange: Connection to UNKNOWN port -1: Broken pipe
dut#
To be able to perform the tests, I set various things with my scripts, and perhaps one of them could be the culprit, but I cannot find it. I execute the scripts in the /root/DUT-settings directory of the OpenBSD server from the bash shell script running on the Tester using ssh. The relevant scripts are:
dut# pwd
/root/DUT-settings

dut# cat set-nat64-varip # this one sets static NDP and ARP entries
/root/DUT-settings/set-ndm-left 0 3999
/root/DUT-settings/set-arpm-right 2 1001

dut# cat set-ndm-left
for i in $(seq $1 $2)
do
  h=$(printf "%x" $i)
  ndp -s 2001:2::$h:2 24:6e:96:3c:3f:40 permanent
done

dut# cat set-ndm-right
for i in $(seq $1 $2)
do
  h=$(printf "%x" $i)
  ndp -s 2001:2:0:8000::$h:2 24:6e:96:3c:3f:42 permanent
done

dut# cat set-pf
pfctl -f /etc/pf-set-nat64

dut# cat /etc/pf-set-nat64
#       $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
#
# See pf.conf(5) and /etc/examples/pf.conf

set skip on lo

block return    # block stateless traffic
pass            # establish keep-state

# By default, do not permit remote connections to X11
block return in on ! lo0 proto tcp to port 6000:6010

# Port build user does not need network
block return out log proto {tcp udp} user _pbuild

set skip on em1 # to protect ssh
set limit states 1000000000 # 1000M
set timeout interval 3600 # 1 hour
pass in on ix0 inet6 from any to 64:ff9b::/96 af-to inet from 198.19.0.1

dut#
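
To make the address range covered by the static NDP entries easier to see, here is a small dry-run sketch. It only prints the addresses that set-ndm-left would pass to ndp for a given index, using the same printf format as the script above. The function name ndm_left_addr is made up for this illustration and is not part of the original scripts.

```sh
#!/bin/sh
# Dry run: print the IPv6 address that set-ndm-left would configure
# for index $1 (same "%x" conversion as in the script, no ndp call).
# ndm_left_addr is a hypothetical helper name used only in this sketch.
ndm_left_addr() {
    printf '2001:2::%x:2\n' "$1"
}

# First and last index used by "set-ndm-left 0 3999"
ndm_left_addr 0       # prints 2001:2::0:2
ndm_left_addr 3999    # prints 2001:2::f9f:2
```

So the 4000 entries span 2001:2::0:2 through 2001:2::f9f:2, a sub-range of the 2001:2::[0000-ffff]:2 range shown in the figure.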

When everything is set, then the test follows. I have two kinds of tests.
1) Maximum connection establishment rate test. It sends 4M test frames, each with a different combination of source and destination IP addresses, to establish 4M connections. The test uses a binary search to find the highest rate at which all connections are established. (In fact, that is not checked directly. What is checked is that all test frames arrive back at the Tester.)
2) Throughput test. First, the 4M connections are loaded into the connection tracking table of PF. Then comes the throughput test with bidirectional traffic. One elementary test lasts for 60 s. A binary search is used to find the highest rate at which all frames are forwarded.
In both tests, I reboot the DUT after each elementary step of the binary search. The aim is to completely clear the connection tracking table of PF. And, IMHO, it should put the OpenBSD server into a well-defined, clean state, after which it should behave the same way every time.
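
The binary search described above can be sketched roughly as follows. This is only a generic illustration, not the actual measurement script: run_step is a hypothetical stand-in for one elementary test (here it just compares the rate with a simulated limit), and the sketch assumes that the lower bound passes and that passing is monotonic in the rate.

```sh
#!/bin/sh
# Hypothetical stand-in for one elementary test step: succeeds if the
# DUT "handles" the given rate. The limit is simulated, not measured.
run_step() {
    [ "$1" -le "${SIMULATED_LIMIT:-12345}" ]
}

# Binary search for the highest passing rate in [lo, hi].
# Assumes the rate lo passes and that results are monotonic.
find_max_rate() {
    lo=$1
    hi=$2
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))   # round up so the loop always progresses
        if run_step "$mid"; then
            lo=$mid                    # mid passed: the answer is at least mid
        else
            hi=$(( mid - 1 ))          # mid failed: the answer is below mid
        fi
        # In the real setup, the DUT would be rebooted here before the next step.
    done
    echo "$lo"
}

find_max_rate 0 100000    # prints 12345 with the default simulated limit
```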
And now come the weird things. The maximum connection establishment rate test was successful. The binary search was executed 10 times without any problem. As for the throughput test, the binary search was completed fully once. (That means 9 steps.)
Here is the first result:
No, Size, Dir, n, m, Duration, Initial Rate, N, M, R, T, D, Error, Date, Iterations needed, rate
1, 84, b, 2, 2, 60, 200000, 4000000, 4000000, 80000, 500, 51000, 1000, 2023-09-03 18:23:27, 9, 361718
root@tester:~/siitperf#
And when the binary search was executed the second time, it stopped working after the second iteration. This is the relevant part of the nohup.out file:
Preliminary frames received: 4000000
Info: Preliminary phase finished.
Info: Testing initiated at 2023-09-03 18:31:35
Info: Forward sender's sending took 59.9999967862 seconds.
Forward frames sent: 18000000
Info: Reverse sender's sending took 59.9999967887 seconds.
Reverse frames sent: 18000000
Forward frames received: 17726784
Reverse frames received: 17782805
Info: Test finished.
Rebooting the DUT and then waiting for 240 seconds...
ssh_exchange_identification: read: Connection reset by peer^M
Done.
The script waited for 240 s and then continued its work, but from this point on it could never ssh to the DUT again, and thus all its further results are rubbish...
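
One way to keep the script from producing rubbish results after such a failure would be to probe the DUT after the reboot and abort if it never becomes reachable. The following is a minimal sketch under that assumption; wait_until is a hypothetical helper, not part of the existing scripts, and it takes the probe command as arguments so that it can be tried without a real host.

```sh
#!/bin/sh
# wait_until TRIES DELAY CMD [ARGS...]
# Runs CMD up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
# (Hypothetical helper for illustration only.)
wait_until() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only if another attempt follows.
        if [ "$n" -lt "$tries" ]; then
            sleep "$delay"
        fi
        n=$(( n + 1 ))
    done
    return 1
}

# Possible use after the reboot (BatchMode and ConnectTimeout are
# standard OpenSSH client options; the retry counts are arbitrary):
#   wait_until 48 5 ssh -o BatchMode=yes -o ConnectTimeout=5 172.16.17.102 true \
#       || { echo "DUT unreachable, aborting" >&2; exit 1; }
```

This does not explain the connection resets; it only stops the measurement at the first step where the DUT cannot be reached.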
Some more information: I could execute the throughput test without any problem when I used only a single IP address pair and 4M different port number combinations. This makes me think that perhaps the high number of IP addresses (4000 static NDP and 1000 static ARP entries are set) could cause the problem. But I reboot the system after every single step. Why does it not have a clean state then?
Could there be some random event?
Did I make a mistake in pf.conf? I am not familiar with PF, so that is a possibility, too!
Could you please advise me?

Thank you very much in advance!

Best regards,

Gábor
p.s.: Although I do not suspect any hardware problem, I have attached the dmesg of the DUT.
