[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-10-12 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

Michael Tuexen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|In Progress                 |Closed

--- Comment #35 from Michael Tuexen  ---
Closing this, as I think it is fixed. Please reopen if the problem still
exists.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-10-12 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #34 from Richard Scheffenegger  ---
I believe we can close this bug, as we haven't had any reports of issues from
those affected after updating/patching.



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #33 from commit-h...@freebsd.org ---
A commit in branch stable/13 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=0612d3000b974f31de15c90c77bf43f121fc8656

commit 0612d3000b974f31de15c90c77bf43f121fc8656
Author: Michael Tuexen 
AuthorDate: 2022-09-19 10:42:43 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:54:18 +

tcp: fix computation of offset

Only update the offset if actually retransmitting from the
scoreboard. If not done correctly, this may result in
trying to (re)transmit data that is not in the socket
buffer, resulting in a panic.

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36626

(cherry picked from commit 6d9e911fbadf3b409802a211c1dae9b47cb5a2b8)

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #32 from commit-h...@freebsd.org ---
A commit in branch stable/13 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=f9edad0054652e020b8214f61c0e454fd48101a6

commit f9edad0054652e020b8214f61c0e454fd48101a6
Author: Michael Tuexen 
AuthorDate: 2022-09-22 10:12:11 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:55:41 +

tcp: send ACKs when requested

When doing Limited Transmit, send an ACK when needed by the protocol
processing (like sending ACKs with a DSACK block).

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36631

(cherry picked from commit 5ae83e0d871bc7cbe4dcc9a33d37eb689e631efe)

 sys/netinet/tcp_input.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #31 from commit-h...@freebsd.org ---
A commit in branch stable/13 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=c1f9a81e7bfe354dfa4f191d5180426f76bc514b

commit c1f9a81e7bfe354dfa4f191d5180426f76bc514b
Author: Richard Scheffenegger 
AuthorDate: 2022-09-22 10:55:25 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:56:28 +

tcp: fix cwnd restricted SACK retransmission loop

While sending the initial SACK retransmission segment while heavily cwnd
constrained, tcp_output can erroneously send out the entire send buffer
again. This may happen after a retransmission timeout, which resets
snd_nxt to snd_una while the SACK scoreboard is still populated.

Reviewed By:tuexen, #transport
PR: 264257
PR: 263445
PR: 260393
MFC after:  3 days
Sponsored by:   NetApp, Inc.
Differential Revision:  https://reviews.freebsd.org/D36637

(cherry picked from commit a743fc8826fa348b09d219632594c537f8e5690e)

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #30 from commit-h...@freebsd.org ---
A commit in branch stable/12 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=3651c4f42285644938e2f5bc924ab8c7ed857f83

commit 3651c4f42285644938e2f5bc924ab8c7ed857f83
Author: Richard Scheffenegger 
AuthorDate: 2022-09-22 10:55:25 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:52:56 +

tcp: fix cwnd restricted SACK retransmission loop

While sending the initial SACK retransmission segment while heavily cwnd
constrained, tcp_output can erroneously send out the entire send buffer
again. This may happen after a retransmission timeout, which resets
snd_nxt to snd_una while the SACK scoreboard is still populated.

Reviewed By:tuexen, #transport
PR: 264257
PR: 263445
PR: 260393
MFC after:  3 days
Sponsored by:   NetApp, Inc.
Differential Revision:  https://reviews.freebsd.org/D36637

(cherry picked from commit a743fc8826fa348b09d219632594c537f8e5690e)

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #29 from commit-h...@freebsd.org ---
A commit in branch stable/12 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=26370413d43bfd65500270ff331ae6bdf0f54133

commit 26370413d43bfd65500270ff331ae6bdf0f54133
Author: Michael Tuexen 
AuthorDate: 2022-09-19 10:42:43 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:41:54 +

tcp: fix computation of offset

Only update the offset if actually retransmitting from the
scoreboard. If not done correctly, this may result in
trying to (re)transmit data that is not in the socket
buffer, resulting in a panic.

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36626

(cherry picked from commit 6d9e911fbadf3b409802a211c1dae9b47cb5a2b8)

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-25 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #28 from commit-h...@freebsd.org ---
A commit in branch stable/12 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=9e69e009c86f259653610f3c337253b79381c7a7

commit 9e69e009c86f259653610f3c337253b79381c7a7
Author: Michael Tuexen 
AuthorDate: 2022-09-22 10:12:11 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-25 08:46:54 +

tcp: send ACKs when requested

When doing Limited Transmit, send an ACK when needed by the protocol
processing (like sending ACKs with a DSACK block).

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36631

(cherry picked from commit 5ae83e0d871bc7cbe4dcc9a33d37eb689e631efe)

 sys/netinet/tcp_input.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #27 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=a743fc8826fa348b09d219632594c537f8e5690e

commit a743fc8826fa348b09d219632594c537f8e5690e
Author: Richard Scheffenegger 
AuthorDate: 2022-09-22 10:55:25 +
Commit: Richard Scheffenegger 
CommitDate: 2022-09-22 11:28:43 +

tcp: fix cwnd restricted SACK retransmission loop

While sending the initial SACK retransmission segment while heavily cwnd
constrained, tcp_output can erroneously send out the entire send buffer
again. This may happen after a retransmission timeout, which resets
snd_nxt to snd_una while the SACK scoreboard is still populated.

Reviewed By:tuexen, #transport
PR: 264257
PR: 263445
PR: 260393
MFC after:  3 days
Sponsored by:   NetApp, Inc.
Differential Revision:  https://reviews.freebsd.org/D36637

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #26 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=5ae83e0d871bc7cbe4dcc9a33d37eb689e631efe

commit 5ae83e0d871bc7cbe4dcc9a33d37eb689e631efe
Author: Michael Tuexen 
AuthorDate: 2022-09-22 10:12:11 +
Commit: Michael Tuexen 
CommitDate: 2022-09-22 10:12:11 +

tcp: send ACKs when requested

When doing Limited Transmit, send an ACK when needed by the protocol
processing (like sending ACKs with a DSACK block).

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36631

 sys/netinet/tcp_input.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-09-19 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #25 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=6d9e911fbadf3b409802a211c1dae9b47cb5a2b8

commit 6d9e911fbadf3b409802a211c1dae9b47cb5a2b8
Author: Michael Tuexen 
AuthorDate: 2022-09-19 10:42:43 +
Commit: Michael Tuexen 
CommitDate: 2022-09-19 10:49:31 +

tcp: fix computation of offset

Only update the offset if actually retransmitting from the
scoreboard. If not done correctly, this may result in
trying to (re)transmit data that is not in the socket
buffer, resulting in a panic.

PR: 264257
PR: 263445
PR: 260393
Reviewed by:rscheff@
MFC after:  3 days
Sponsored by:   Netflix, Inc.
Differential Revision:  https://reviews.freebsd.org/D36626

 sys/netinet/tcp_output.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
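
The failure mode described in this commit message (a retransmission that
points outside the socket buffer) can be illustrated with a small bounds
check. This is a minimal sketch under stated assumptions, not the code from
D36626: the helper name range_in_sockbuf and its parameters are hypothetical,
and the send buffer is assumed to hold sb_avail bytes starting at sequence
number snd_una.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;	/* TCP sequence numbers wrap modulo 2^32 */

/*
 * Hypothetical helper, not kernel code: report whether the byte range
 * [seq, seq + len) lies inside a send buffer that holds sb_avail bytes
 * starting at sequence number snd_una.
 */
static bool
range_in_sockbuf(tcp_seq snd_una, uint32_t sb_avail, tcp_seq seq, uint32_t len)
{
	/* Offset of seq into the buffer; wraps to a huge value if seq < snd_una. */
	uint32_t off = seq - snd_una;

	if (off > sb_avail)
		return (false);
	/* Compare against the remaining space so off + len cannot overflow. */
	return (len <= sb_avail - off);
}
```

With snd_una = 1000 and 500 buffered bytes, a 300-byte segment at sequence
1200 fits exactly, while a 301-byte segment at the same position, or any
segment starting before snd_una, is rejected.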



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-10 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #24 from Richard Scheffenegger  ---
The current thinking is that SACK rescue retransmissions (in FreeBSD 13 this is
gated by net.inet.tcp.rfc6675_pipe=1) very rarely create an entry that
apparently lies beyond the valid data range.

While under most common circumstances, a final FIN bit in the sequence space is
taken care of, it seems that there may be some double-counting for the FIN bit.

In most of the inspected cores, we found:

TCP state: LAST_ACK (FIN received and also FIN sent)
SACK loss recovery triggered
A cumulative ACK before all outstanding data was received
The remote client "disappears" for a significant amount of time (7 to 12
retransmission timeouts), but may reappear just prior.
snd_max consistently 2 counts above the last data, instead of the expected 1
(for the FIN bit).

However, it is still unclear under what circumstances this double-counting
happens, possibly when the persist timer triggers, and a few other conditions
are also fulfilled - maybe a race condition between normal packet processing
and a timer firing.

In short: disabling the RFC 6675 enhanced SACK features (more correct pipe
accounting, rescue retransmissions; net.inet.tcp.rfc6675_pipe=0) should avoid
the panic, while not addressing the root cause of when/why the FIN bit is
double-counted...

Would you be willing to run an instrumented kernel, which either panics (full
core dump) or prints various state when inconsistencies are detected in
this space, while ignoring/addressing them "on the fly" without panicking?
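
The snd_max observation from the cores can be written down as a consistency
check. The following is only a sketch, assuming that the end of the buffered
data corresponds to snd_una plus the number of bytes in the send buffer; the
function names and the sb_avail parameter are hypothetical and this is not
the instrumentation that was actually offered.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;	/* TCP sequence numbers wrap modulo 2^32 */

/*
 * Hypothetical helper: signed distance by which snd_max exceeds the end
 * of the data held in the send buffer (snd_una + sb_avail). Negative
 * values mean that not all buffered data has been sent yet.
 */
static int32_t
snd_max_excess(tcp_seq snd_una, uint32_t sb_avail, tcp_seq snd_max)
{
	return ((int32_t)(snd_max - (snd_una + sb_avail)));
}

/*
 * A sent FIN occupies exactly one sequence number, so snd_max may be at
 * most 1 beyond the last data byte once the FIN is out, and 0 otherwise.
 */
static bool
fin_accounting_ok(tcp_seq snd_una, uint32_t sb_avail, tcp_seq snd_max,
    bool fin_sent)
{
	return (snd_max_excess(snd_una, sb_avail, snd_max) <= (fin_sent ? 1 : 0));
}
```

An excess of 2 with the FIN sent, as seen in the inspected cores, fails this
check; an excess of 1 passes.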



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #23 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=57317c8971df76bd6faeb7dfdc4379097d004caf

commit 57317c8971df76bd6faeb7dfdc4379097d004caf
Author: Richard Scheffenegger 
AuthorDate: 2022-06-08 12:21:28 +
Commit: Richard Scheffenegger 
CommitDate: 2022-06-08 12:51:31 +

tcp: exclude KASSERTS when rescue retransmissions are in play.

The KASSERT criteria need to be checked against the
send buffer so_snd in a subsequent version.

Reviewed By:tuexen, #transport
PR: 263445
MFC after:  1 week
Sponsored by:   NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35431

 sys/netinet/tcp_sack.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #22 from Marek Zarychta  ---
*** Bug 264534 has been marked as a duplicate of this bug. ***



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #21 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=ce2525c8108a830d08d75771621d1bc580edd82c

commit ce2525c8108a830d08d75771621d1bc580edd82c
Author: Richard Scheffenegger 
AuthorDate: 2022-06-08 07:14:16 +
Commit: Richard Scheffenegger 
CommitDate: 2022-06-08 07:18:32 +

tcp: remove goto and address another NULL deref in SACK

Missed another NULL dereference during KASSERTS after traversing
the scoreboard. While at it, scratch the goto by making the
traversal conditional, and remove duplicate checks using an
unconditional loop with all checks inside.

Reviewed By:hselasky
PR: 263445
MFC after:  1 week
Sponsored by:   NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35428

 sys/netinet/tcp_sack.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-07 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #20 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=231e0dd5d1fb7778b1cb285e5ebee5502d5ad253

commit 231e0dd5d1fb7778b1cb285e5ebee5502d5ad253
Author: Richard Scheffenegger 
AuthorDate: 2022-06-07 16:16:54 +
Commit: Richard Scheffenegger 
CommitDate: 2022-06-07 16:18:42 +

tcp: skip sackhole checks on NULL

Inadvertently introduced NULL pointer dereference during
sackhole sanity check in D35387.

Reviewed By:glebius
PR: 263445
MFC after:  1 week
Sponsored by:   NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35423

 sys/netinet/tcp_sack.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-07 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #19 from commit-h...@freebsd.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=91d6afe6e2a912fd5059fc11dbeffc85474897af

commit 91d6afe6e2a912fd5059fc11dbeffc85474897af
Author: Richard Scheffenegger 
AuthorDate: 2022-06-07 07:07:09 +
Commit: Richard Scheffenegger 
CommitDate: 2022-06-07 07:38:16 +

tcp: Sanity check of SACK holes on retransmissions

Add a few KASSERT()s to validate the sanity of SACK holes, and
bail out if a SACK hole is inconsistent to avoid panicking non-INVARIANTS
builds.

Reviewed By:hselasky, glebius
PR: 263445
MFC after:  1 week
Sponsored by:   NetApp, Inc.
Differential Revision:  https://reviews.freebsd.org/D35387

 sys/netinet/tcp_sack.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
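
As an illustration of what a SACK hole sanity check can look like, here is a
minimal user-space sketch. It is not the set of KASSERT()s added in D35387;
the struct layout, the helper name hole_is_sane and the exact conditions are
assumptions chosen for the example. The comparison macros follow the usual
wraparound-safe idiom for TCP sequence numbers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;	/* TCP sequence numbers wrap modulo 2^32 */

/* Wraparound-safe sequence comparisons (defined locally for this sketch). */
#define	SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define	SEQ_LEQ(a, b)	((int32_t)((a) - (b)) <= 0)

/* Simplified stand-in for a scoreboard hole; field names are illustrative. */
struct hole {
	tcp_seq start;	/* first missing sequence number */
	tcp_seq end;	/* one past the last missing sequence number */
	tcp_seq rxmit;	/* next sequence number to retransmit */
};

/*
 * Hypothetical check: the hole must be non-empty, rxmit must lie within
 * it, and the hole must sit inside the window [snd_una, snd_max].
 * A NULL hole is treated as "not sane" so callers can bail out early.
 */
static bool
hole_is_sane(const struct hole *h, tcp_seq snd_una, tcp_seq snd_max)
{
	if (h == NULL)
		return (false);
	return (SEQ_LT(h->start, h->end) &&
	    SEQ_LEQ(h->start, h->rxmit) && SEQ_LEQ(h->rxmit, h->end) &&
	    SEQ_LEQ(snd_una, h->start) && SEQ_LEQ(h->end, snd_max));
}
```

A caller would skip retransmitting from a hole for which this returns false
instead of dereferencing data outside the send buffer.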



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-07 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #18 from Igor A. Valkov  ---
(In reply to Michael Tuexen from comment #16)
Ok. Now rebuilding kernel without options KERN_TLS.

In the previous configuration, uptime is:
10:11  up 7 days, 11 hrs, 2 users, load averages: 17,39 16,85 17,08
:-) crashes are very random.



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-07 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

Richard Scheffenegger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |In Progress

--- Comment #17 from Richard Scheffenegger  ---
While we don't yet understand how the TCPCB ends up in the peculiar state it is
in when the panic happens, there appears to be a symptomatic treatment: ignore
invalid SACK scoreboard state, which is nevertheless in approximately the
correct sequence space (thus not fully random memory contents).

See https://reviews.freebsd.org/D35387



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-06 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #16 from Michael Tuexen  ---
Can you check whether the problem also occurs with KTLS disabled?



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-06 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

Michael Tuexen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264257



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-05 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #15 from Igor A. Valkov  ---

now

% uptime
12:33  up 5 days, 13:22, 1 user, load averages: 18,27 20,72 20,07

% zpool status
  pool: vol
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        vol         ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

% cat /etc/sysctl.conf 
vfs.zfs.min_auto_ashift=12
vfs.zfs.arc_min=8589934592
vfs.zfs.arc_max=68719476736
vfs.zfs.txg.timeout=30

kern.maxdsiz=274877906944
kern.dfldsiz=274877906944
kern.maxtsiz=274877906944

net.inet.tcp.fastopen.server_enable=0
net.inet.tcp.fastopen.client_enable=0

kern.ipc.shm_use_phys=1
kern.ipc.maxsockbuf=157286400
kern.ipc.soacceptqueue=16384

kern.ipc.tls.enable=1
kern.ipc.tls.cbc_enable=1

net.route.netisr_maxqlen=2048
net.inet.ip.intr_queue_maxlen=2048

#net.inet.tcp.functions_default=bbr
#net.inet.tcp.functions_inherit_listen_socket_stack=0

net.inet.tcp.rfc6675_pipe=1
net.inet.tcp.mssdflt=1460
net.inet.tcp.minmss=536
net.inet.tcp.abc_l_var=44
net.inet.tcp.initcwnd_segments=44

net.inet.tcp.recvbuf_max=4194304
net.inet.tcp.recvspace=1048576

net.inet.tcp.sendbuf_inc=65536
net.inet.tcp.sendbuf_max=4194304
net.inet.tcp.sendspace=1048576

net.inet.tcp.finwait2_timeout=15000

kern.corefile=/export/coredumps/%N.core



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-05 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #14 from Igor A. Valkov  ---
(In reply to Michael Tuexen from comment #13)

> How long does a server need to run until it crashes?
Sometimes several hours but sometimes several days.
Here are the timings of the incidents:

May 11 21:00
May 13 15:02
May 14 08:05
May 15 15:10
May 15 16:51
May 19 10:31 vmcore.1
May 20 01:04 vmcore.2
May 20 15:20 vmcore.3
May 30 20:17 vmcore.4
May 30 23:12 vmcore.5


> Are the servers running at high load?
Yes. Web serving on a 10Gb link (Intel X550T) with nginx, on 64 Opteron 6380
cores.

> Regarding "custom build kernel": We were wondering if you can compile a 
> kernel with specific options turned on (like INVARIANTS and other things) and 
> potentially have some modifications to the source code? The intention is to
> get more information about what is going on...

Yes. I can.



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-05 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #13 from Michael Tuexen  ---
(In reply to Igor A. Valkov from comment #12)
Thanks for providing the core files. We want to see if there is a pattern.

Regarding "custom build kernel": We were wondering if you can compile a kernel
with specific options turned on (like INVARIANTS and other things) and
potentially have some modifications to the source code? The intention is to
get more information about what is going on...

How long does a server need to run until it crashes? Are the servers running at
high load?



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #12 from Igor A. Valkov  ---
(In reply to Michael Tuexen from comment #11)
1. Coredumps from 13.1-RELEASE with debuginfo are available here:
https://cloud.mediatoday.ru/d/0368a517063047758a0b/

2. Does "custom kernel" mean a 13.1-RELEASE GENERIC kernel?



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #11 from Michael Tuexen  ---
(In reply to Igor A. Valkov from comment #6)
Hi Igor,
we discussed this bug on today's transport call. Two questions:
1. Can you also make core files from 13.1-RELEASE available?
2. Would you be willing and able to run a custom kernel? We might want to
provide a kernel that lets us gather more data on the conditions under which the
problem occurs.



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #10 from Michael Tuexen  ---
(In reply to Michael Tuexen from comment #9)
The confusion was on my part: I used a 13.1-RELEASE kernel instead of a 13.1-RC3
kernel.



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

--- Comment #9 from Michael Tuexen  ---
(In reply to Richard Scheffenegger from comment #8)
Hi Richard,

I'm looking at the tracefile provided by Igor and see:

(kgdb) f 10
#10 0x80dd7eed in tcp_do_segment (m=, th=, so=, tp=0xfe025fb86518, drop_hdrlen=52,
    tlen=, iptos=0 '\000') at /usr/src/sys/netinet/tcp_input.c:2637
2637            (void) tp->t_fb->tfb_tcp_output(tp);
(kgdb) p tp->t_state
$10 = 6
(kgdb) p *tp->sackhint.nexthole
$11 = {start = 1529400226, end = 1529409856, rxmit = 1529409855, scblink =
{tqe_next = 0x0, tqe_prev = 0xfe025fb86658}}
(kgdb)

Do you really see 8 as the state (which is TCPS_LAST_ACK)? I see 6 (which is
TCPS_FIN_WAIT_1).
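
For reference when reading t_state values out of a core, a small lookup like
the following can help. The numbering matches the two values quoted above
(6 and 8) and FreeBSD's tcp_fsm.h; the helper name tcp_state_name is
hypothetical and exists only for this sketch.

```c
#include <stddef.h>

/* TCP state numbering as used for tp->t_state. */
enum tcp_state {
	TCPS_CLOSED = 0,
	TCPS_LISTEN = 1,
	TCPS_SYN_SENT = 2,
	TCPS_SYN_RECEIVED = 3,
	TCPS_ESTABLISHED = 4,
	TCPS_CLOSE_WAIT = 5,
	TCPS_FIN_WAIT_1 = 6,
	TCPS_CLOSING = 7,
	TCPS_LAST_ACK = 8,
	TCPS_FIN_WAIT_2 = 9,
	TCPS_TIME_WAIT = 10
};

/* Hypothetical helper: map a t_state value to a readable name. */
static const char *
tcp_state_name(int state)
{
	static const char *const names[] = {
		"CLOSED", "LISTEN", "SYN_SENT", "SYN_RECEIVED", "ESTABLISHED",
		"CLOSE_WAIT", "FIN_WAIT_1", "CLOSING", "LAST_ACK", "FIN_WAIT_2",
		"TIME_WAIT"
	};

	/* Reject values outside the table instead of indexing past it. */
	if (state < 0 || (size_t)state >= sizeof(names) / sizeof(names[0]))
		return ("UNKNOWN");
	return (names[state]);
}
```

Passing the value printed by kgdb ($10 = 6) yields FIN_WAIT_1, and 8 yields
LAST_ACK.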



[Bug 263445] [tcp] Fatal trap 12: page fault while in kernel mode // supervisor read data, page not present // 13.1-RC3

2022-06-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263445

Michael Tuexen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Fatal trap 12: page fault   |[tcp] Fatal trap 12: page
                   |while in kernel mode //     |fault while in kernel mode
                   |supervisor read data, page  |// supervisor read data,
                   |not present // 13.1-RC3     |page not present //
                   |                            |13.1-RC3
