On Jan 7, 2005, at 12:33 PM, Danny MacMillan wrote:

I haven't looked at the code, but your assertion is extremely unlikely.
I really want to say "impossible" but as I said, I haven't looked at
the code.  If FreeBSD loaded entire executable images into RAM when
starting new processes, it would perform very poorly.  What is more
likely is that the kernel keeps the image file open during program
execution.  When the xterm binary is replaced, the old binary is still
on disk in its old location, it just doesn't have any directory
entries pointing to it.  Since the kernel still has the file open it
won't be overwritten.  Hence the kernel can and will still load
pages from the old image.  This is a function of the same behaviour
that causes df and du output to differ in some cases.

The lsof(8) utility seems to bear this out, as each process seems to
keep each image (program and shared object files) open during
execution.

A new instance of xterm would use the new, upgraded binary.


When you run a program, the program that launches it first makes a copy of itself in the process table, and the two share code pages. This is done through fork(). At that point the new process, called the child, calls one of the exec() functions, which in turn all end up in a single syscall, execve(). execve() uses namei() to get the vnode pointer. Each vnode has three reference counts: v_usecount, v_holdcnt and v_writecount. A vnode is not recycled until both the usecount and holdcnt are 0. When namei() is called it calls VREF(), which is vref(), which does


        vp->v_usecount++;

so if it's running, the vnode can't be recycled, and that holds from a point in time before the program is actually loaded into memory. execve() calls exec_map_first_page(). Without tearing this apart, I'm going to guess that this memory maps the first page of text (code) through the VM subsystem, as evidenced by the conspicuous calls to vm_page*() functions, so I'd conclude the file is memory mapped.

Presuming it turns out the command you're calling isn't a shell script or other script, execve() cleans up the environment so file descriptors and signal handlers don't get shared, sets up the process's environment, lets the calling (forking) process know it can continue on its merry way, sets uid/gid if necessary/possible, and it looks like the scheduler takes care of the rest (I'll be honest here: as far as I can tell, the code trails off at this point into parts that are jumped to in case of error). In any case we have an increased usecount.
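For reference, the userland side of that sequence can be sketched as below. This is only an illustration of fork() followed by an exec() call, not anything taken from the kernel source; the helper name spawn_and_wait() and the use of /bin/sh as the program to run are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700	/* expose the POSIX prototypes under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child, have it exec "/bin/sh -c script",
 * and return the child's exit status, or -1 on failure.
 */
int
spawn_and_wait(const char *script)
{
	pid_t pid;
	int status;

	assert(script != NULL);

	pid = fork();		/* the child starts as a copy of the parent */
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Child: replace the process image; execl() ends up in execve(). */
		execl("/bin/sh", "sh", "-c", script, (char *)NULL);
		_exit(127);	/* only reached if the exec failed */
	}

	/* Parent: wait for the child and report how it exited. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system that has /bin/sh, spawn_and_wait("exit 7") should return 7.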

Now we are going to unlink that file and create a new one.

After some basic checks (you can't remove the root of a file system, for example), unlink() will call VOP_REMOVE(), which calls vrele(), which decrements the usecount when it's greater than one. In this case it MUST be, because the xterm process has one count on it and the file entry has another (hard links to the file may have additional counts on it).
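A toy model of that bookkeeping, for illustration only. It is not FreeBSD's struct vnode, and every name in it is hypothetical. It also keeps the directory-entry count and the in-use count as two separate counters, which is a simplification chosen for readability rather than something read out of the kernel source. The point it shows is just that removing the name does not release the file while something still uses it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a file; NOT the kernel's vnode structure. */
struct toy_file {
	int  nlink;	/* directory entries that name the file */
	int  usecount;	/* active users, e.g. a running program image */
	bool freed;	/* set once the storage would be released */
};

/* Release the storage only when nothing names or uses the file. */
static void
toy_maybe_free(struct toy_file *f)
{
	if (f->nlink == 0 && f->usecount == 0)
		f->freed = true;
}

/* Take a reference, as starting the program would. */
void
toy_ref(struct toy_file *f)
{
	f->usecount++;
}

/* Drop a reference, as process exit would. */
void
toy_rele(struct toy_file *f)
{
	assert(f->usecount > 0);
	f->usecount--;
	toy_maybe_free(f);
}

/* Remove one directory entry, as unlink() would. */
void
toy_unlink(struct toy_file *f)
{
	assert(f->nlink > 0);
	f->nlink--;
	toy_maybe_free(f);
}
```

Starting from one directory entry and no users, the sequence toy_ref(), toy_unlink(), toy_rele() leaves freed false until the final toy_rele().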

Therefore it appears that you can unlink the file and install a new copy: the old file will remain on the disk to serve the memory-mapped image used by the running process. I'm going to presume that when a process exits it decrements the usecount for the vnode, which, on reaching 0, should put the vnode on the free list.
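The same behaviour can be observed from userland with a small experiment. This is a sketch under assumptions: it uses an ordinary data file rather than an executable, the scratch path under /tmp and the function name are made up for the example, and it only demonstrates the visible effect, not how the kernel implements it.

```c
#define _XOPEN_SOURCE 700	/* expose mkstemp() and friends under strict -std modes */
#include <assert.h>		/* for callers that assert on the result */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a file, unlink it, create a new file under the same name, and
 * check that the mapping still shows the old contents.
 * Returns 0 if it does, -1 on any failure.
 */
int
old_image_survives_replace(void)
{
	char path[] = "/tmp/unlink_demo_XXXXXX";	/* hypothetical scratch path */
	const char old_data[] = "old image";
	const char new_data[] = "new image";
	char *map;
	int oldfd, newfd, rc = -1;

	oldfd = mkstemp(path);
	if (oldfd == -1)
		return -1;
	if (write(oldfd, old_data, sizeof(old_data)) != (ssize_t)sizeof(old_data)) {
		close(oldfd);
		unlink(path);
		return -1;
	}

	map = mmap(NULL, sizeof(old_data), PROT_READ, MAP_PRIVATE, oldfd, 0);
	close(oldfd);		/* the mapping alone keeps the file in use */
	if (map == MAP_FAILED) {
		unlink(path);
		return -1;
	}

	/* "Upgrade": remove the old name, then create a new file under it. */
	if (unlink(path) == 0) {
		newfd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
		if (newfd != -1) {
			if (write(newfd, new_data, sizeof(new_data)) ==
			    (ssize_t)sizeof(new_data) &&
			    strcmp(map, old_data) == 0)
				rc = 0;	/* mapping still shows the old bytes */
			close(newfd);
			unlink(path);	/* clean up the replacement file */
		}
	}

	munmap(map, sizeof(old_data));
	return rc;
}
```

A return value of 0 means the old contents stayed readable through the mapping even though the name now referred to a different file.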

_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
