> This bug, 6793488, has been created recently for a
> case I have open with Sun Support. This is a very
> severe case to me and my customer. I would like to
> get it fixed as quickly as possible. I was told the
> bug was dictating this case, which in turn, affects
> the priority of my case being resolved. If this
> affects others please add comments to this post. If
> an engineer can also help it would be greatly
> appreciated. Another bug similar to this one is
> 6571565. Here are the links:
>
> http://bugs.opensolaris.org/view_bug.do?bug_id=6793488

Hey Josh,

I ran across your posting by accident. Since I'm the engineer who closed this
bug as "not a bug", I'd like to share the reasoning that I put into the bug's
evaluation, which is not visible to you. I hope this helps you to better
understand the issue and the overall picture.

This is not a bug (tm): it is not a bug in lofs(7FS), not in ufs(7FS), and not
in mv(1) either. Rather, it is a collection of expectations about how these
three things work together that does not reflect reality and does not match
the actual defined system behavior.

The key to all those misunderstandings is the use of mv(1) here. Let's start
with some trivia first; from this, it'll become obvious what happens.

So what is mv(1) supposed to do?

http://www.opengroup.org/onlinepubs/009695399/utilities/mv.html

<snip>
The mv utility shall perform actions equivalent to the rename() function
defined in the System Interfaces volume of IEEE Std 1003.1-2001, called with
the following arguments:

    The source_file operand is used as the old argument.
    The destination path is used as the new argument.

If the destination path exists, mv shall attempt to remove it.
<snip end>

I.e. mv(1) is essentially a rename(2): a possibly existing target file is
lost, and the source file is renamed to the target name.

So what is rename(2) supposed to do?

http://www.opengroup.org/onlinepubs/009695399/functions/rename.html

<snip>
The rename() function shall change the name of a file. The old argument points
to the pathname of the file to be renamed. The new argument points to the new
pathname of the file.

If the link named by the new argument exists, it shall be removed and old
renamed to new.

If the link named by the new argument exists and the file's link count becomes
0 when it is removed and no process has the file open, the space occupied by
the file shall be freed and the file shall no longer be accessible.
If one or more processes have the file open when the last link is removed, the
link shall be removed before rename() returns, but the removal of the file
contents shall be postponed until all references to the file are closed.
<snip end>

To picture this, here's the code flow from mv(1) down into ufs(7FS), leaving
lofs(7FS) aside for the moment as it is not relevant to the basic idea.

usr/src/cmd/mv/mv.c:cpymve()

        if (mve) {
                if (rename(source, target) >= 0)
                        return (0);

usr/src/uts/common/syscall/rename.c:rename()

        if (error = vn_rename(from, to, UIO_USERSPACE))

usr/src/uts/common/fs/vnode.c:vn_rename()->vn_renameat()

        error = VOP_RENAME(fromvp, fpn.pn_path, tovp, tpn.pn_path, CRED(),

usr/src/uts/common/fs/ufs/ufs_vnops.c:ufs_rename()

         * Link source to the target.  If a target exists, return its
         * vnode pointer in tvp.  We'll release it after sending the
         * vnevent.

### in here, we rename the entry in the directory tdp so that it points to
### the source inode # instead of target inode #
### i.e. the target name in the namespace now points to the source inode #
### if it existed previously, the source name disappears from the namespace
### this happens in ufs_dirrename()

        if (error = ufs_direnter_lr(tdp, tnm, DE_RENAME, sdp, sip, cr, &tvp)) {
                /*
                 * ESAME isn't really an error; it indicates that the
                 * operation should not be done because the source and target
                 * are the same file, but that no error should be reported.
                 */
                if (error == ESAME)
                        error = 0;
                goto errout;
        }
[...]
         * Unlink the source.
         * Remove the source entry.  ufs_dirremove() checks that the entry
         * still reflects sip, and returns an error if it doesn't.
         * If the entry has changed just forget about it.  Release
         * the source inode.
         */
        if ((error = ufs_dirremove(sdp, snm, sip, (struct vnode *)0,
            DR_RENAME, cr, NULL)) == ENOENT)

So what are the implications of this with respect
to the behavior complained about in this bug, i.e. the example from the
description:

<snip>
Maybe the following example could make things a little clearer:

# echo user:pass1 > /etc/curpassword
# touch /etc/mirrorpasswd
# mount -F lofs -o ro /etc/curpassword /etc/mirrorpasswd
# cat /etc/mirrorpasswd
user:pass1
# echo user:pass2 > /etc/newpassword
# mv /etc/newpassword /etc/curpassword
# cat /etc/mirrorpasswd
user:pass1

Here /etc/curpassword is the password or shadow table provided by the global
zone, and /etc/mirrorpasswd would be the Read-Only view given in a local zone.
Changing the content of /etc/curpassword with a rename (here mv, but vipw does
just the same) is not reflected in the /etc/mirrorpasswd view.
<snip end>

Let's redo this in a bit more detail:

1) create the first file:

opteron.root./export/home/batschul/test.=> echo user:pass1 > curpassword
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:03 .
         4   10 drwxr-xr-x  86 batschul other       5120 Jan 23 13:02 ..
     29577    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 curpassword

We've got inode number #29577 for the UFS file 'curpassword'.

2) create the new file that becomes the lofs mount point 'mirrorpasswd':

opteron.root./export/home/batschul/test.=> touch mirrorpasswd
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:03 .
         4   10 drwxr-xr-x  86 batschul other       5120 Jan 23 13:02 ..
     29577    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 curpassword
     29578    0 -rw-r--r--   1 root     root           0 Jan 23 13:03 mirrorpasswd

For this, we've got inode number #29578 for the UFS file.

3) now perform the loopback mount of file 1) onto file 2):

opteron.root./export/home/batschul/test.=> mount -F lofs -o ro curpassword `pwd`/mirrorpasswd
opteron.root./export/home/batschul/test.=> mount -v|grep lofs
curpassword on /export/home/batschul/test/mirrorpasswd type lofs read-only/setuid/devices/dev=1980004 on Fri Jan 23 13:04:24 2009
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:03 .
         4   10 drwxr-xr-x  86 batschul other       5120 Jan 23 13:02 ..
     29577    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 curpassword
     29578    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 mirrorpasswd

4) verify that our loopback mount works:

opteron.root./export/home/batschul/test.=> cat mirrorpasswd
user:pass1

5) create the second new file 'newpassword':

opteron.root./export/home/batschul/test.=> echo user:pass2 > newpassword
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:06 .
         4   10 drwxr-xr-x  86 batschul other       5120 Jan 23 13:02 ..
     29577    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 curpassword
     29578    2 -rw-r--r--   1 root     root          11 Jan 23 13:03 mirrorpasswd
     29579    2 -rw-r--r--   1 root     root          11 Jan 23 13:06 newpassword

For this, we've got inode number #29579 for the UFS file 'newpassword'.

6) now move the second file 'newpassword' over the original first file
   'curpassword':

opteron.root./export/home/batschul/test.=> mv newpassword curpassword
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:06 .
         4   10 drwxr-xr-x  86 batschul other       5120 Jan 23 13:02 ..
     29579    2 -rw-r--r--   1 root     root          11 Jan 23 13:06 curpassword
     29578    2 -rw-r--r--   0 root     root          11 Jan 23 13:03 mirrorpasswd

See what happened?
The original first file 'curpassword' with inode #29577 is gone, and the
second file 'newpassword' with inode #29579 has been renamed to 'curpassword'.

7) now check the content of our file directly and via the loopback mount:

opteron.root./export/home/batschul/test.=> cat mirrorpasswd
user:pass1
opteron.root./export/home/batschul/test.=> cat curpassword
user:pass2

Bummer, they are different! Now what happened?

Quite easy: remember the preliminaries about mv(1) and rename(2). Access via
the loopback mount 'mirrorpasswd' still shows us the old content from before
the mv(1), because that file has been deleted from the namespace as part of
the mv(1) but there is still a VN_HOLD on its vnode, i.e. file 1) is still
open, and that is via the loopback mount. Hence its content is still
accessible via the loopback mount: while file 1) has been removed from the
namespace, its content is not yet deleted.

The renamed file 2), however, now exists in the namespace under its new name
'curpassword', which was the former name of file 1), yet it is a different
file with different content and not associated with a loopback mount at all.
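
The same effect can be reproduced without lofs at all, since it follows from
rename(2) alone. Here is a minimal sketch, assuming a POSIX shell and
mktemp(1); the scratch directory, the function name and the use of file
descriptor 3 are illustrative choices of this sketch, not anything taken from
the bug report. The open descriptor stands in for the hold that lofs keeps on
the file it shadows.

```shell
#!/bin/sh
# Sketch: rename over a file that is still held open (no lofs involved).
# File names mirror the ones used in this thread; the scratch directory
# and the function name are hypothetical.

rename_over_open_demo() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1

        echo user:pass1 > curpassword
        # Hold the original file open on descriptor 3, playing the role
        # of the reference that the loopback mount keeps.
        exec 3< curpassword

        echo user:pass2 > newpassword
        # rename(2): the name 'curpassword' now refers to the new inode.
        mv newpassword curpassword

        # Read through the descriptor that was opened before the rename.
        echo "via open descriptor: $(cat <&3)"
        # Read through a fresh lookup of the name.
        echo "via name lookup:     $(cat curpassword)"

        exec 3<&-
    )
    status=$?
    rm -rf "$dir"
    return $status
}

rename_over_open_demo
```

On a local POSIX filesystem the first line should report user:pass1 and the
second user:pass2, which is the same split seen between 'mirrorpasswd' and
'curpassword' in step 7).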

To picture this from the kernel point of view _after_ the mv(1) has happened
in 6):

8) check the dnlc for the 'curpassword' entry:

> ::dnlc!grep curpassword
VP               DVP              NAME
ffffff01b1b67e80 ffffff01b5b9aa00 curpassword

9) the corresponding UFS vnode:

> ffffff01b1b67e80::print vnode_t
{
    v_lock = {
        _opaque = [ 0 ]
    }
    v_flag = 0x10000
    v_count = 0x1
    v_data = 0xffffff01b1b68de8        ### UFS inode #29579 from 5)
    v_vfsp = 0xffffff01ac655aa0        ### UFS /export/home
    v_stream = 0
    v_type = 1 (VREG)
    v_rdev = 0xffffffffffffffff
    v_vfsmountedhere = 0
    v_op = 0xffffff01a8ec4780
    v_pages = 0xffffff0004f0e2e0
    v_filocks = 0
    v_shrlocks = 0
    v_nbllock = {
        _opaque = [ 0 ]
    }
    v_cv = {
        _opaque = 0
    }
    v_locality = 0
    v_femhead = 0
    v_path = 0xffffff01bb97f1e8 "/export/home/batschul/test/newpassword"
        (that's because rename on UFS does not update v_path)
    v_rdcnt = 0
    v_wrcnt = 0
    v_mmap_read = 0
    v_mmap_write = 0
    v_mpssdata = 0
    v_fopdata = 0
    v_vsd = 0
    v_xattrdir = 0
    v_count_dnlc = 0x1
}

10) verify the inode number and link count:

> ffffff01b1b68de8::inode -v
            ADDR    INUMBER T MODE   SIZE     DEVICE FLAG
ffffff01b1b68de8      29579 - 0644      b 6600000004 <REF>
        2009 Jan 23 13:06:03    /export/home/batschul/test/newpassword

> ffffff01b1b68de8::print inode_t i_ic
{
    i_ic.ic_smode = 0x81a4
    i_ic.ic_nlink = 0x1

11) check the dnlc for the 'mirrorpasswd' lofs mount point:

> ::dnlc!grep mirrorpasswd
ffffff01b5ba6140 ffffff01b5b9aa00 mirrorpasswd

12) the corresponding UFS vnode acting as the lofs mount point 'mirrorpasswd':

> ffffff01b5ba6140::print vnode_t
{
    v_lock = {
        _opaque = [ 0 ]
    }
    v_flag = 0x10100
    v_count = 0x2
    v_data = 0xffffff01b5ba59c8        ### UFS inode #29578 from 2)
    v_vfsp = 0xffffff01ac655aa0        ### UFS /export/home
    v_stream = 0
    v_type = 1 (VREG)
    v_rdev = 0xffffffffffffffff
    v_vfsmountedhere = 0xffffff01ae0b2908      ### lofs mount
    v_op = 0xffffff01a8ec4780
    v_pages = 0
    v_filocks = 0
    v_shrlocks = 0
    v_nbllock = {
        _opaque = [ 0 ]
    }
    v_cv = {
        _opaque = 0
    }
    v_locality = 0
    v_femhead = 0
    v_path = 0xffffff01bc8661e0 "/export/home/batschul/test/mirrorpasswd"
    v_rdcnt = 0
    v_wrcnt = 0
    v_mmap_read = 0
    v_mmap_write = 0
    v_mpssdata = 0
    v_fopdata = 0
    v_vsd = 0
    v_xattrdir = 0
    v_count_dnlc = 0x1
}

13) verify the inode number and link count:

> 0xffffff01b5ba59c8::inode -v
            ADDR    INUMBER T MODE   SIZE     DEVICE FLAG
ffffff01b5ba59c8      29578 - 0644      0 6600000004 <MODTIME,REF>
        2009 Jan 23 13:03:33    /export/home/batschul/test/mirrorpasswd

The VFS mounted here is the lofs loopback mount
/export/home/batschul/test/mirrorpasswd.

14) the lofs mount's VFS:

> 0xffffff01ae0b2908::print vfs_t
{
    vfs_next = root
    vfs_prev = 0xffffff01ae0b29d8
    vfs_op = vfssw+0x538
    vfs_vnodecovered = 0xffffff01b5ba6140
        ### covered vnode is UFS inode #29578 from 2), our "mountpoint"
    vfs_flag = 0x2401
    vfs_bsize = 0x2000
    vfs_fstype = 0xa
    vfs_fsid = {
        val = [ 0x1980004, 0x2 ]
    }
    vfs_data = 0xffffff01b7659e30      ### loinfo struct
    vfs_dev = 0x6600000004
    vfs_bcount = 0
    vfs_list = 0
    vfs_hash = 0
    vfs_reflock = {
        _opaque = [ 0, 0 ]
    }
    vfs_count = 0x1
    vfs_mntopts = {
        mo_count = 0x11
        mo_list = 0xffffff01adeba000
    }
    vfs_resource = 0xffffff01ad1ecb08
    vfs_mntpt = 0xffffff01aadd3d80
    vfs_mtime = 2009 Jan 23 13:04:24
    vfs_implp = 0xffffff01bfc95500
    vfs_zone = zone0
    vfs_zone_next = root
    vfs_zone_prev = 0xffffff01ae0b29d8
    vfs_femhead = 0
    vfs_lofi_minor = 0
}

15) grab the corresponding lofs loinfo struct:

> 0xffffff01b7659e30::print 'struct loinfo'
{
    li_realvfs = 0xffffff01ac655aa0    ### UFS /export/home
    li_mountvfs = 0xffffff01ae0b2908   ### LOFS /export/home/batschul/test/mirrorpasswd
    li_rootvp = 0xffffff01b6ebe880
    li_mflag = 0x1
    li_dflag = 0
    li_refct = 0x1
    li_htsize = 0x1
    li_hashtable = 0xffffff01b155d8c0
    li_lfs = 0
    li_lfslock = {
        _opaque = [ 0 ]
    }
    li_htlock = {
        _opaque = [ 0 ]
    }
    li_retired = 0
    li_flag = 0
}

16) look at the root lofs vnode shadowing our UFS mountpoint 'mirrorpasswd':

> 0xffffff01b6ebe880::print vnode_t
{
    v_lock = {
        _opaque = [ 0 ]
    }
    v_flag = 0x1
    v_count = 0x1
    v_data = 0xffffff01ab298f68        ### lofs lnode
    v_vfsp = 0xffffff01ae0b2908        ### LOFS mount /export/home/batschul/test/mirrorpasswd
    v_stream = 0
    v_type = 1 (VREG)
    v_rdev = 0xffffffffffffffff
    v_vfsmountedhere = 0
    v_op = 0xffffff01aa208340
    v_pages = 0
    v_filocks = 0
    v_shrlocks = 0
    v_nbllock = {
        _opaque = [ 0 ]
    }
    v_cv = {
        _opaque = 0
    }
    v_locality = 0
    v_femhead = 0
    v_path = 0xffffff01bc8d1810 "/export/home/batschul/test/mirrorpasswd"
    v_rdcnt = 0
    v_wrcnt = 0
    v_mmap_read = 0
    v_mmap_write = 0
    v_mpssdata = 0
    v_fopdata = 0
    v_vsd = 0
    v_xattrdir = 0
    v_count_dnlc = 0
}

17) the corresponding lofs lnode still shadows our original UFS inode/vnode
    from step 1) !!!

> 0xffffff01ab298f68::print lnode_t
{
    lo_next = 0
    lo_vp = 0xffffff01b395ee00         ### original, UFS inode #29577 from 1) !!!
    lo_looping = 0
    lo_vnode = 0xffffff01b6ebe880      ### our loinfo->li_rootvp
}

/*
 * The lnode is the "inode" for loop-back files.  It contains
 * all the information necessary to handle loop-back file on the
 * client side.
 */
typedef struct lnode {
        struct lnode    *lo_next;       /* link for hash chain */
        struct vnode    *lo_vp;         /* pointer to real vnode */
        uint_t          lo_looping;     /* looping flags (see below) */
        struct vnode    *lo_vnode;      /* place holder vnode for file */
} lnode_t;

The lo_vp, the real vnode, points to the deleted original UFS inode #29577
from 1).

18) verify that the original UFS inode/vnode from step 1) is still alive:

> 0xffffff01b395ee00::print vnode_t
{
    v_lock = {
        _opaque = [ 0 ]
    }
    v_flag = 0x10000
    v_count = 0x1                      ### only 1 VN_HOLD left, from LOFS
    v_data = 0xffffff01b3960678
    v_vfsp = 0xffffff01ac655aa0        ### UFS /export/home
    v_stream = 0
    v_type = 1 (VREG)
    v_rdev = 0xffffffffffffffff
    v_vfsmountedhere = 0
    v_op = 0xffffff01a8ec4780
    v_pages = 0xffffff0005649898
    v_filocks = 0
    v_shrlocks = 0
    v_nbllock = {
        _opaque = [ 0 ]
    }
    v_cv = {
        _opaque = 0
    }
    v_locality = 0
    v_femhead = 0
    v_path = 0xffffff01bb5c55b8 "/export/home/batschul/test/curpassword"
    v_rdcnt = 0
    v_wrcnt = 0
    v_mmap_read = 0
    v_mmap_write = 0
    v_mpssdata = 0
    v_fopdata = 0
    v_vsd = 0
    v_xattrdir = 0
    v_count_dnlc = 0
}

> 0xffffff01b3960678::inode -v
            ADDR    INUMBER T MODE   SIZE     DEVICE FLAG
ffffff01b3960678      29577 - 0644      b 6600000004 <REF>
        2009 Jan 23 13:03:22    /export/home/batschul/test/curpassword

> ffffff01b3960678::print inode_t i_ic
{
    i_ic.ic_smode = 0x81a4
    i_ic.ic_nlink = 0                  ### deleted inode !

19) So what happens after we unmount the loopback mount?

opteron.root./export/home/batschul/test.=> umount /export/home/batschul/test/mirrorpasswd
opteron.root./export/home/batschul/test.=> ls -lisa
     29575    2 drwxr-xr-x   2 batschul other        512 Jan 23 13:06 .
         4   10 drwxr-xr-x  87 batschul other       5120 Jan 23 17:35 ..
     29579    2 -rw-r--r--   1 root     root          11 Jan 23 13:06 curpassword
     29578    0 -rw-r--r--   1 root     root           0 Jan 23 13:03 mirrorpasswd

Correct: the mount point is an empty file again.

So every layer did what it was asked to do, including, in the end, lofs, which
still shadows the original file created in 1), and that is all it was asked to
do.

So the business summary of all this is: replacing the original target of a
loopback mount with a new file by means of mv(1) or rename(2), and assuming
that the loopback mount somehow magically shadows the new replacement, is a
wrong assumption.

This is, by the way, no different from doing the same replacement operation on
a file that some process still has open: that process will always see the
original file until it closes it. lofs does not change anything in that
picture. There's no magic refresh operation inside lofs.

cheers
frankB
--
This message posted from opensolaris.org