[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

Jan Kokemüller changed:

           What    |Removed |Added
---------------------------------------
 Attachment #192213|0       |1
        is obsolete|        |
 Attachment #192214|0       |1
        is obsolete|        |
 Attachment #192216|0       |1
        is obsolete|        |

--- Comment #5 from Jan Kokemüller  ---
Created attachment 192350
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192350&action=edit
tar archive with test program, DTrace script and log output

I've attached a program that reproduces the bug faster. I've also updated the
DTrace script and added some new log output. The DTrace script now prints all
socantrcvmore() calls, not only those from "a.out".

It turns out that the kernel sometimes closes the socket in the unp garbage
collector (look for sockbuf f800adec3b50 in the debug log). So this is
probably not a cache coherency issue after all.

Investigating further...

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

--- Comment #4 from Jan Kokemüller  ---
To uncover the bug, it helps to run multiple instances at the same time and
either to remove the debug output or to redirect it to /dev/null. After a few
seconds, at least some instances should be dead.

I've tested only amd64 for now. I could reproduce it on stock FreeBSD 10/11
systems:


FreeBSD 10.3-RELEASE-p24 FreeBSD 10.3-RELEASE-p24 #0: Wed Nov 15 04:57:40 UTC
2017 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
hw.model: Intel(R) Celeron(R) CPU  N3150  @ 1.60GHz


FreeBSD 11.1-RELEASE-p7 FreeBSD 11.1-RELEASE-p7 #0: Tue Mar  6 09:33:30 UTC
2018 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
hw.model: Intel(R) Celeron(R) CPU G530 @ 2.40GHz



[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

Jan Kokemüller changed:

           What    |Removed     |Added
---------------------------------------------
 Attachment #192211|0           |1
        is obsolete|            |
 Attachment #192216|text/x-csrc |text/plain
          mime type|            |

--- Comment #3 from Jan Kokemüller  ---
Created attachment 192216
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192216&action=edit
test program (fixed)

I mistakenly uploaded the program with a workaround applied. Here is a fixed
version with the workaround behind an #ifdef.



[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

--- Comment #2 from Jan Kokemüller  ---
Created attachment 192214
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192214&action=edit
excerpt of DTrace script output

Here is part of the DTrace debug log.

When 'got err' is printed it means that the parent could not read the byte from
the child. At this point the address of the receive sockbuf is
'f801071df148' and the SBS_CANTRCVMORE flag is set (CPU 1).

However, further above, this sockbuf looks fine in the child (CPU 3) and there
aren't any calls to socantrcvmore_locked() in the meantime.

Even further above, this sockbuf was destroyed by the parent in a previous
iteration (CPU 1), which is why SBS_CANTRCVMORE was set at that point. But
this should not affect the current iteration. It looks as if the writes made
by the child process on CPU 3 are not properly made visible to CPU 1.



[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

Jan Kokemüller changed:

           What    |Removed     |Added
---------------------------------------------
 Attachment #192213|text/x-dsrc |text/plain
          mime type|            |

--- Comment #1 from Jan Kokemüller  ---
Created attachment 192213
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192213&action=edit
DTrace script for debugging

Here is a DTrace script that could be helpful in tracking down the issue.



[Bug 227285] File descriptor passing does not work reliably on SMP system (cache coherency issue?)

2018-04-04 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227285

            Bug ID: 227285
           Summary: File descriptor passing does not work reliably on SMP
                    system (cache coherency issue?)
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: jan.kokemuel...@gmail.com
 Attachment #192211 text/plain
         mime type:

Created attachment 192211
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=192211&action=edit
test program

The attached program repeatedly spawns a child process that creates a
socketpair. One socket is passed back to the parent. One byte is written into
the other end. The parent should be able to read this byte, but sometimes 0
(EOF) is returned.

It seems that the parent reuses stale socket/sockbuf memory from prior
iterations.

To simplify the fd passing I've used fd_send/fd_recv from libnv. Those use
SCM_RIGHTS under the hood. The program must be compiled with 'cc -O3 fdpass.c
-lnv'.
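
For readers without the attachment, one iteration of the scenario described
above can be sketched roughly as follows. This is not the attached program:
it uses sendmsg()/recvmsg() with SCM_RIGHTS directly instead of
fd_send/fd_recv from libnv, so it needs no -lnv, and the names send_fd,
recv_fd and run_iteration are hypothetical. The exact ordering of the send
and the write in the child is also an assumption.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: pass fd over the UNIX-domain socket sock. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* one data byte carries the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);  /* the buffer above always fits one header */
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}

/*
 * One iteration. Returns what read() returned in the parent:
 * 1 is the expected result, 0 is the unexpected EOF, -1 is an error.
 */
int run_iteration(void)
{
    int chan[2];  /* channel used to pass the descriptor to the parent */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(chan[0]);
        close(chan[1]);
        return -1;
    }
    if (pid == 0) {
        int sp[2];
        char c = 'x';

        close(chan[0]);
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) < 0)
            _exit(1);
        if (send_fd(chan[1], sp[0]) < 0)  /* one end goes to the parent */
            _exit(1);
        if (write(sp[1], &c, 1) != 1)     /* one byte into the other end */
            _exit(1);
        _exit(0);
    }
    close(chan[1]);
    int fd = recv_fd(chan[0]);
    char c;
    ssize_t n = (fd < 0) ? -1 : read(fd, &c, 1);
    if (fd >= 0)
        close(fd);
    close(chan[0]);
    waitpid(pid, NULL, 0);
    return (int)n;
}
```

A reproduction attempt along these lines would call run_iteration() in a
loop and stop as soon as it returns 0. On a system without the bug every
call should return 1.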
