On Sat, Jun 27, 2026 at 03:08:37PM -0400, Robert Edmonds wrote:

! However, it seems to me that there might be a race condition in BIND
! when running on FreeBSD.
! 
! fstrm_iothr_init() in the fstrm library is the function that is
! responsible for making dnstap UNIX socket connection attempts. It looks
! like BIND reaches that function via the function call sequence:
! 
!     main()
!     setup()
!     named_server_create()
!     isc_loop_setup(...run_server...)
!     run_server()
!     load_configuration()
!     configure_view()
!     configure_dnstap()
!     dns_dt_create()
!     fstrm_iothr_init()

Great! This is what I wished to have yesterday. ;)
That gave me the idea to run truss on named (one doesn't always
think of the best option right away), and I found this:

socket(PF_LOCAL,SOCK_STREAM|SOCK_CLOEXEC,0)      = 24 (0x18)
fcntl(24,F_GETFD,)                               = 1 (0x1)
fcntl(24,F_SETFD,FD_CLOEXEC)                     = 0 (0x0)
setsockopt(24,SOL_SOCKET,SO_NOSIGPIPE,0x824189eec,4) = 0 (0x0)
connect(24,{ AF_UNIX "/var/run/dnstap.sock" },106) ERR#13 'Permission denied'
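
For anyone who wants to repeat that connect() without truss, here is a
minimal sketch in Python. The helper name try_connect is made up for
illustration; the socket path is the one from the trace. Run it as the
user named ends up running as. On FreeBSD and Linux, errno 13 is EACCES.

```python
import errno
import socket
from typing import Optional


def try_connect(path: str) -> Optional[str]:
    """Try a stream connect to a UNIX socket.

    Returns None on success, otherwise the symbolic errno name.
    (Hypothetical helper, not part of BIND or fstrm.)
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        s.close()


if __name__ == "__main__":
    # Symbolic name for the ERR#13 seen in the truss output.
    print(errno.errorcode[13])
    # Path taken from the trace; the result depends on who runs this.
    print(try_connect("/var/run/dnstap.sock"))
```

A result of None means the connect succeeded for the calling user.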

! On FreeBSD, it looks like BIND changes UID/GID via the function call
! sequence:

Now it gets interesting :)
The socket is apparently created by fstrm_capture, and by default it
looks like so:

root@conr:~ # dir /var/named/var/run/
total 3
drwxr-xr-x  3 bind bind  uarch 5 Jun 26 20:28 .
drwxr-xr-x  7 root wheel uarch 7 Jun 10 16:15 ..
srwxr-xr-x  1 root bind  uarch 0 Jun 26 20:28 dnstap.sock
...

I'd think this socket cannot be written by UID/GID bind/bind: the mode
gives group bind only r-x, and connect() on a UNIX socket needs write
permission. So why does it work this way for my intranet server and not
for the authoritative one?
Maybe the question would be: WHEN do we switch UID, and WHEN does that
socket get opened, and can that differ by workload?
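
The permission reasoning can be spelled out as a small check. This is
only a sketch of the classic owner/group/other rule for the write bit;
it ignores ACLs and the search permission on the parent directories.
The function name may_connect and the numeric IDs (0 for root, 53 for
bind) are assumptions for illustration.

```python
import stat


def may_connect(mode, owner_uid, owner_gid, uid, gids):
    """Return True if the write bit that connect() needs is granted.

    Simplified owner/group/other check (hypothetical helper).
    """
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


ROOT, BIND = 0, 53  # assumed numeric IDs, illustration only

# srwxr-xr-x root:bind, as in the listing
print(may_connect(0o755, ROOT, BIND, uid=BIND, gids={BIND}))
# same socket, but connecting while still root
print(may_connect(0o755, ROOT, BIND, uid=ROOT, gids={0}))
```

Under this rule the first call is denied and the second is allowed,
which would fit a connect() that only succeeds before the UID switch.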

The truss logs will likely show that, but I'll look into them only
once I'm no longer sweating excessively... Anyway, the final solution
might well be to just put a "umask 002" before the invocation of
fstrm_capture, and be done. We will see.
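
To see what the umask does to a freshly bound UNIX socket, here is a
small sketch. It assumes that the program creating the socket leaves
the mode to the umask and does not chmod() afterwards; whether
fstrm_capture behaves that way is an assumption, although the
srwxr-xr-x above would match a default umask of 022. The function name
is made up.

```python
import os
import socket
import stat
import tempfile


def socket_mode_under_umask(mask):
    """Bind a UNIX socket in a temp dir under `mask`, return its mode bits."""
    old = os.umask(mask)
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(path)  # creates the socket file with mode 0777 & ~umask
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        s.close()
        os.umask(old)  # restore the caller's umask
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    for mask in (0o022, 0o002):
        print(oct(mask), "->", oct(socket_mode_under_umask(mask)))
```

With 002 the group gets the write bit, so a process running with GID
bind should be able to connect, provided the socket's group is bind.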

! In load_configuration(), the calls to named_os_changeuser() occur
! *after* the calls to configure_view(), which is how fstrm_iothr_init()
! is reached. So, it could be that libfstrm's background I/O thread makes
! its initial socket connection attempt with root privileges, before
! named_os_changeuser() has performed the setuid(), so the filesystem
! permissions don't matter. Or it could be that the background I/O thread
! is delayed from executing long enough that named_os_changeuser() has
! already run, and the filesystem permissions must be correct in order for
! the socket connection attempt to succeed.
! 
! On a Linux system with BIND built against libcap, it looks like the
! setuid() occurs much earlier in the process lifetime via the function
! call sequence:
! 
!     main()
!     setup()
!     named_os_minprivs()  <--- occurs before named_server_create()
!     named_os_changeuser()
! 
! That would complete prior to fstrm_iothr_init() being invoked, so there
! would not be such a race condition on a Linux system.

Oh great thanks! You now provided all the details I was reluctant
to figure out from the code :))

So on Linux this would fail completely and everywhere. And here on
my site a relevant difference might be that the authoritative server
has only a _default view, while the intranet server has a dozen
and talks to itself all the time.
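
The ordering question can be illustrated with a toy simulation. This is
not BIND or fstrm code: a dict stands in for the process credentials,
and an Event forces one of the two schedules described in the quoted
text. All names are hypothetical.

```python
import threading


def simulate(io_thread_delayed):
    """Return the identity seen by the simulated first connect attempt."""
    state = {"uid": "root"}      # stand-in for the process credentials
    dropped = threading.Event()  # set once the simulated setuid() is done
    seen = []

    def io_thread():
        if io_thread_delayed:
            dropped.wait()       # thread only runs after the drop
        seen.append(state["uid"])  # identity at the simulated connect()

    t = threading.Thread(target=io_thread)
    t.start()                    # stands in for fstrm_iothr_init()
    if not io_thread_delayed:
        t.join()                 # connect attempt finishes before the drop
    state["uid"] = "bind"        # stands in for named_os_changeuser()
    dropped.set()
    t.join()
    return seen[0]


if __name__ == "__main__":
    print(simulate(io_thread_delayed=False))
    print(simulate(io_thread_delayed=True))
```

In the real process nothing forces either schedule, so which identity
the first connect attempt sees would depend on timing.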

A thought: is that EACCES from the connect() (as shown above)
propagated to any logging? It would make things easier. I had no
idea how to read that socket low-level to see if anything arrives,
netcat didn't really work, and then one is quite blind at that point.
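
A possible reason why netcat shows nothing: as far as I understand the
Frame Streams protocol that fstrm speaks, a writer on a socket first
sends a READY control frame and waits for an ACCEPT before it sends
START and any data, and a passive listener never answers. Below is a
sketch of the control frame layout as I understand it (zero escape
word, frame length, control type, optional content-type fields, all
32-bit big-endian). The constants are taken from memory of fstrm's
control.h and should be checked against the installed header; the
function names are made up.

```python
import struct

# Control frame types and field id (assumed, verify against fstrm/control.h)
ACCEPT, START, STOP, READY, FINISH = 1, 2, 3, 4, 5
FIELD_CONTENT_TYPE = 1


def encode_control(ctype, content_type=None):
    """Build a control frame: escape(0), length, type, optional field."""
    payload = struct.pack("!I", ctype)
    if content_type is not None:
        payload += struct.pack("!II", FIELD_CONTENT_TYPE, len(content_type))
        payload += content_type
    return struct.pack("!II", 0, len(payload)) + payload


def decode_control(frame):
    """Parse a control frame into (type, [content types])."""
    escape, length = struct.unpack_from("!II", frame, 0)
    if escape != 0:
        raise ValueError("not a control frame")
    payload = frame[8:8 + length]
    (ctype,) = struct.unpack_from("!I", payload, 0)
    content_types = []
    off = 4
    while off < len(payload):
        field, flen = struct.unpack_from("!II", payload, off)
        off += 8
        if field == FIELD_CONTENT_TYPE:
            content_types.append(payload[off:off + flen])
        off += flen
    return ctype, content_types


if __name__ == "__main__":
    ready = encode_control(READY, b"protobuf:dnstap.Dnstap")
    print(decode_control(ready))
```

A test listener would read the READY frame from the accepted
connection, reply with an ACCEPT carrying the same content type, and
only then expect START and data frames.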

cheerio & thanks!
PMc
