Not a lot of people are familiar with fd passing so I'll give
a short description:
By using AF_UNIX sockets between processes, a process can use
sendmsg() to send a file descriptor through the socket, and the
other process does a recvmsg() to pick up the descriptor.
The "problem" is that if a descriptor is in transit/inflight
and the sending process closes the file, it still needs to
remain open for the recipient.
What can happen is:
process A: sendmsg(descriptor)
process A: exit
process B: exit (without ever calling recvmsg())
Without the garbage collection we'd have leaked a file descriptor
inside the kernel.
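For readers who have not used fd passing, here is a minimal userland
sketch of the mechanism described above. It is an illustration only
and not part of the patch: the names send_fd, recv_fd, demo_inflight
and union fd_ctl are made up for this example, and it uses a
socketpair() inside a single process instead of two separate
processes to keep it short.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor over an AF_UNIX socket as SCM_RIGHTS control data. */
int
send_fd(int sock, int fd)
{
	char byte = 'x';	/* one byte of ordinary data carries the cmsg */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cm;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cm = CMSG_FIRSTHDR(&msg);
	assert(cm != NULL);	/* msg_controllen has room for one header */
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor; returns the new descriptor or -1. */
int
recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cm;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cm = CMSG_FIRSTHDR(&msg);
	if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
	    cm->cmsg_type != SCM_RIGHTS ||
	    cm->cmsg_len < CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cm), sizeof(int));
	return (fd);
}

/*
 * The sender closes its copy while the descriptor is still in flight;
 * the receiver can still pick it up and read from it.
 */
int
demo_inflight(void)
{
	int sv[2], p[2], fd;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(p) == -1)
		return (-1);
	if (write(p[1], "k", 1) != 1 || send_fd(sv[0], p[0]) == -1)
		return (-1);
	close(p[0]);		/* sender's copy is gone, file stays open */
	fd = recv_fd(sv[1]);
	if (fd == -1 || read(fd, &c, 1) != 1)
		return (-1);
	close(fd); close(p[1]); close(sv[0]); close(sv[1]);
	return (c == 'k' ? 0 : -1);
}
```

Calling demo_inflight() from a main() should return 0. If recv_fd()
were never called and both socket ends were closed, the in-flight
reference is what the kernel has to clean up on its own.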
There's a pretty complex loop in sys/kern/uipc_usrreq.c that
deals with garbage collecting these inflight descriptors.
The problems with the garbage collection routine are:
1) it's expensive, as it walks all the open files in the system at
least twice.
2) it's ugly/hackish.
3) it will need to acquire global locks on kernel structure lists
for significant amounts of time.
4) it complicates the code because certain things need to be done
out of order, i.e. sorflush before sofree (which does the sorflush
anyway).
The solution is actually taken from Linux. In Linux, all network
buffers can have a free routine callback that is run when the
network buffer is deallocated.
FreeBSD only has a free routine available for M_EXT buffers
(buffers with external storage); the routine is called when
(m_flags & M_EXT) != 0 && m_type != EXT_CLUSTER
To achieve my goal I made it so that all fd passing requires an
mbuf cluster and took responsibility for freeing the mbuf
cluster in my callback.
I set m_type to EXT_CMSG_DATA and provide my own free routine
until the descriptors are read by the receiving process. Once the
descriptors are read, I restore it back to a "normal"
mbuf with an attached cluster to be free()'d.
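The idea can be modelled in userland roughly as follows. This is a
sketch under loud assumptions and is not the code in the diff:
struct buf and the buf_* functions are hypothetical stand-ins for an
mbuf+cluster and its free routine, and dup()/close() stand in for
the kernel taking and dropping a reference on the file.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical userland model; this is NOT the kernel's struct mbuf. */
struct buf {
	int	*fds;				/* references carried in flight */
	int	 nfds;
	void	(*ext_free)(struct buf *);	/* run when the buffer is released */
};

/* "Normal" release: only the storage is freed. */
void
buf_free_plain(struct buf *b)
{
	free(b->fds);
	free(b);
}

/* In-flight release: nobody read the descriptors, so drop them first. */
void
buf_free_inflight(struct buf *b)
{
	for (int i = 0; i < b->nfds; i++)
		close(b->fds[i]);
	buf_free_plain(b);
}

/* Release a buffer through whichever free routine is installed. */
void
buf_release(struct buf *b)
{
	assert(b != NULL && b->ext_free != NULL);
	b->ext_free(b);
}

/* "sendmsg": take a reference on each descriptor, install the callback. */
struct buf *
buf_send(const int *fds, int nfds)
{
	struct buf *b;

	if (nfds <= 0 || (b = malloc(sizeof(*b))) == NULL)
		return (NULL);
	b->nfds = 0;
	b->ext_free = buf_free_inflight;
	b->fds = malloc((size_t)nfds * sizeof(int));
	if (b->fds == NULL) {
		free(b);
		return (NULL);
	}
	for (int i = 0; i < nfds; i++) {
		int d = dup(fds[i]);

		if (d == -1) {
			buf_release(b);	/* drops the references taken so far */
			return (NULL);
		}
		b->fds[b->nfds++] = d;
	}
	return (b);
}

/* "recvmsg": hand the references over and restore the plain free routine. */
int
buf_recv(struct buf *b, int *out, int max)
{
	if (b->ext_free != buf_free_inflight || max < b->nfds)
		return (-1);
	for (int i = 0; i < b->nfds; i++)
		out[i] = b->fds[i];
	b->ext_free = buf_free_plain;	/* receiver owns the descriptors now */
	return (b->nfds);
}
```

In this model, releasing a buffer that was never read closes the
in-flight references without any scan of other open files, while a
buffer that was read is released like any ordinary buffer.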
Good things about this patch:
1) simplifies
a) locking
b) descriptor management
c) the code in general
2) less latency; the gc routine can be expensive
3) some comments are added describing some other stuff that needs
fixing (problems with rfork threads).
4) shrinks struct file by one int
Problems with this patch:
1) most fd passing probably only sends one descriptor at a time, so
by allocating clusters I'm wasting a lot more space and taking
more time to do the allocation.
2) the mbuf subsystem should provide macros to do what I'm doing
(hijacking the free routine on an mbuf+cluster)
3) the mbuf subsystem should provide a way to get a callback on
a single mbuf without a cluster attached.
http://people.FreeBSD.org/~alfred/inflight.diff
thanks,
--
-Alfred Perlstein - [EMAIL PROTECTED]
"I have the heart of a child; I keep it in a jar on my desk."