On 24.10.2016. 23:36, Mike Belopuhov wrote:
> On Mon, Oct 24, 2016 at 19:04 +0200, Hrvoje Popovski wrote:
>> Hi all,
>>
>> OpenBSD box acts as transit router for /8 networks without pf and with
>> this sysctls
>>
>> ddb.console=1
>> kern.pool_debug=0
>> net.inet.ip.forwarding=1
>> net.inet.ip.ifq.maxlen=8192
>>
>> netstat
>> 11/8 192.168.11.2 UGS 0 114466419 - 8 ix0
>> 12/8 192.168.12.2 UGS 0 0 - 8 ix1
>> 13/8 192.168.13.2 UGS 0 0 - 8 myx0
>> 14/8 192.168.14.2 UGS 0 0 - 8 myx1
>> 15/8 192.168.15.2 UGS 0 0 - 8 em3
>> 16/8 192.168.16.2 UGS 0 89907239 - 8 em2
>> 17/8 192.168.17.2 UGS 0 65791508 - 8 bge0
>> 18/8 192.168.18.2 UGS 0 0 - 8 bge1
>>
>> while testing dlg@ "mcl2k2 mbuf clusters" patch with todays -current i
>> saw that performance with plain -current drops for about 300Kpps vs
>> -current from 06.10.2016. by bisecting cvs tree it seems that this
>> commit is guilty for this
>>
>> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/if_ethersubr.c?rev=1.240&content-type=text/x-cvsweb-markup
>>
>
> I don't see how this change can affect performance in such a way
> unless you're sending jumbo packets, but then the packet rates
> are too high. Are you 100% sure it's this particular change?
>
No, i'm not 100% sure. Here is what i did to try to find the bottleneck:
cvs -q checkout -D "2016-10-XX" -P src
2016-10-06 - 900kpps
2016-10-07 - 900kpps
2016-10-10 - 900kpps
2016-10-11 - 650kpps
2016-10-11 with if_ethersubr.c 1.239 - 900kpps
...
2016-10-14 - 650kpps
2016-10-14 with dlg@ patch - 900kpps
2016-10-14 with dlg@ patch and with if_ethersubr.c 1.239 - 880kpps
....
2016-10-24 - results are in mail ...
and then i looked at the networking diffs between 2016-10-10 and
2016-10-11, and it seems that if_ethersubr.c is guilty
tests were done over ix only ...
although as you can see, with today's plain -current i'm getting 690kpps
and with today's -current with if_ethersubr.c 1.239 i'm getting 910kpps,
so i thought that there must be something in if_ethersubr.c
> What kind of traffic are you testing this with?
> I assume small IP or UDP packets, correct?
>
yes, 64-byte UDP without flow control.
> Actually I'd like to know what causes this.
>
> So far I've noticed that the code generating ICMP error doesn't
> reserve space for the link header but it's unlikely a culprit.
> (The diff was only compile tested so far...)
>
> diff --git sys/netinet/ip_icmp.c sys/netinet/ip_icmp.c
> index cdd60aa..b3ddea4 100644
> --- sys/netinet/ip_icmp.c
> +++ sys/netinet/ip_icmp.c
> @@ -208,19 +208,21 @@ icmp_do_error(struct mbuf *n, int type, int code, u_int32_t dest, int destmtu)
>
>  	if (icmplen + ICMP_MINLEN > MCLBYTES)
>  		icmplen = MCLBYTES - ICMP_MINLEN - sizeof (struct ip);
>
>  	m = m_gethdr(M_DONTWAIT, MT_HEADER);
> -	if (m && (sizeof (struct ip) + icmplen + ICMP_MINLEN > MHLEN)) {
> +	if (m && (max_linkhdr + sizeof(struct ip) + icmplen +
> +	    ICMP_MINLEN > MHLEN)) {
>  		MCLGET(m, M_DONTWAIT);
>  		if ((m->m_flags & M_EXT) == 0) {
>  			m_freem(m);
>  			m = NULL;
>  		}
>  	}
>  	if (m == NULL)
>  		goto freeit;
> +	m->m_data += max_linkhdr;
>  	/* keep in same rtable */
>  	m->m_pkthdr.ph_rtableid = n->m_pkthdr.ph_rtableid;
>  	m->m_len = icmplen + ICMP_MINLEN;
>  	if ((m->m_flags & M_EXT) == 0)
>  		MH_ALIGN(m, m->m_len);
>
with -current from a few minutes ago and with this diff i'm getting a panic:
login: panic: pool_do_get: mbufpl free list modified: page 0xffffff00697e8000; item addr 0xffffff00697e8800; offset 0x0=0x3800004500081c56 != 0xf2a4b1392c5839b2
Stopped at Debugger+0x9: leave
TID PID UID PRFLAGS PFLAGS CPU COMMAND
*11010 11010 83 0x100012 0 2 ntpd
Debugger() at Debugger+0x9
panic() at panic+0xfe
pool_runqueue() at pool_runqueue
pool_get() at pool_get+0xb5
m_get() at m_get+0x28
m_getuio() at m_getuio+0x5c
sosend() at sosend+0x268
sendit() at sendit+0x258
sys_sendmsg() at sys_sendmsg+0xc0
syscall() at syscall+0x27b
--- syscall (number 28) ---
end of kernel
end trace frame: 0x7f7fffff11f0, count: 5
0xd9f5f7f362a:
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{2}> show panic
pool_do_get: mbufpl free list modified: page 0xffffff00697e8000; item addr 0xffffff00697e8800; offset 0x0=0x3800004500081c56 != 0xf2a4b1392c5839b2
ddb{2}> trace
Debugger() at Debugger+0x9
panic() at panic+0xfe
pool_runqueue() at pool_runqueue
pool_get() at pool_get+0xb5
m_get() at m_get+0x28
m_getuio() at m_getuio+0x5c
sosend() at sosend+0x268
sendit() at sendit+0x258
sys_sendmsg() at sys_sendmsg+0xc0
syscall() at syscall+0x27b
--- syscall (number 28) ---
end of kernel
end trace frame: 0x7f7fffff11f0, count: -10
0xd9f5f7f362a:
ddb{2}> ps
TID PPID PGRP UID S FLAGS WAIT COMMAND
3659 1 3659 0 3 0x100083 ttyin getty
80732 1 80732 0 3 0x100083 ttyin getty
48644 1 48644 0 3 0x100083 ttyin getty
99800 1 99800 0 3 0x100083 ttyin getty
15583 1 15583 0 3 0x100083 ttyin getty
14129 1 14129 0 3 0x100083 ttyin getty
32286 1 32286 0 3 0x100098 poll cron
45330 81046 81046 720 3 0x90 kqread lldpd
81046 1 81046 0 3 0x80 netio lldpd
24352 16850 16850 95 3 0x100092 kqread smtpd
29475 16850 16850 103 3 0x100092 kqread smtpd
21846 16850 16850 95 3 0x100092 kqread smtpd
30670 16850 16850 95 3 0x100092 kqread smtpd
50042 16850 16850 95 3 0x100092 kqread smtpd
72825 16850 16850 95 3 0x100092 kqread smtpd
16850 1 16850 0 3 0x100080 kqread smtpd
57122 1 57122 0 3 0x80 select sshd
51132 11010 70867 83 3 0x100092 poll ntpd
*11010 70867 70867 83 7 0x100012 ntpd
70867 1 70867 0 3 0x100080 poll ntpd
50211 57585 57585 73 3 0x100090 kqread syslogd
57585 1 57585 0 3 0x100082 netio syslogd
67046 0 0 0 3 0x14200 pgzero zerothread
59733 0 0 0 3 0x14200 aiodoned aiodoned
67761 0 0 0 3 0x14200 syncer update
8369 0 0 0 3 0x14200 cleaner cleaner
21760 0 0 0 3 0x14200 reaper reaper
97104 0 0 0 3 0x14200 pgdaemon pagedaemon
77249 0 0 0 3 0x14200 bored crynlk
33513 0 0 0 3 0x14200 bored crypto
61197 0 0 0 3 0x14200 pftm pfpurge
71518 0 0 0 3 0x14200 usbtsk usbtask
23439 0 0 0 3 0x14200 usbatsk usbatsk
33777 0 0 0 3 0x40014200 acpi0 acpi0
19603 0 0 0 7 0x40014200 idle11
85142 0 0 0 7 0x40014200 idle10
41110 0 0 0 7 0x40014200 idle9
76515 0 0 0 7 0x40014200 idle8
31558 0 0 0 7 0x40014200 idle7
62066 0 0 0 7 0x40014200 idle6
82073 0 0 0 7 0x40014200 idle5
21747 0 0 0 7 0x40014200 idle4
28460 0 0 0 7 0x40014200 idle3
71 0 0 0 3 0x40014200 idle2
93974 0 0 0 7 0x40014200 idle1
76684 0 0 0 3 0x14200 bored sensors
21314 0 0 0 3 0x14200 bored softnet
45074 0 0 0 3 0x14200 bored systqmp
91514 0 0 0 3 0x14200 bored systq
18114 0 0 0 3 0x40014200 bored softclock
8873 0 0 0 7 0x40014200 idle0
63311 0 0 0 3 0x14200 bored sbar
1 0 1 0 3 0x82 wait init
0 -1 0 0 3 0x10200 scheduler swapper