Date: Mon, 10 Jul 2017 17:40:54 +0000
From: David Holland <dholland-t...@netbsd.org>
Message-ID: <20170710174054.ga22...@netbsd.org>

| Union mounts are complicated in this regard because when the directory
| involved is a union mount point, some layer of the union mount needs
| to be chosen to invoke the filesystem-level operation;

I don't think so, directory ops all happen at the upper level (or
nowhere).

| Directory operations can be divided into five categories:
| - lookup (ordinary directory traversal, operations like stat, open
|   without O_CREATE, etc.)
| - nonexclusive create (open without O_CREATE)

You mean without O_EXCL, it has to have O_CREATE or it isn't a create
at all, just an open of an existing file, which is just a lookup
(it doesn't matter if it is a read, write, or read-write open).

| - exclusive create (mkdir, symlink, open with O_CREATE|O_EXCL, etc.)
| - remove (rmdir, unlink)
| - rename

Forget rename(), the relevant operation is link() - rename is just
link(), unlink() with idempotent semantics.

| For lookup,

Agreed, no question. And to answer mouse, if there's a whiteout
found, the search terminates, and the file was not found.

| For nonexclusive create, we should do the same, and if we run out of
| layers start at the top again

No, if the file does not exist, it is created, in the top level, there
is no "start again".

| For an exclusive create, however, we need to ascertain that the name
| doesn't exist before we try creating anything.

As you do for nonexclusive create - the only difference is what
happens when the name does exist. For one it is an error, for the
other the open just uses the existing file.

| Various security
| properties depend on exclusive create actually being exclusive, and I
| don't think having union mounts weaken this is healthy.

Of course.

| So I think we need to test all layers before creating anything.

Of course.

| (It also means we need to lock all layers,

The top layer needs to be locked, and remain that way, I expect
(though we just have a normal race if it is unlocked, then locked
again later, the only effect would be, I think, that the top directory
would need to be checked again in case the file appeared in the
meantime.)

That is, anyone creating the same file name will put it in the upper
layer, and it is just a question of who gets there first, which is
something we do not need to answer, just make sure there is only one
winner. If someone at the same time is creating a file in the
alternative name for the under layer then "so what", that's not
a problem.

| Once we've ascertained that the name doesn't exist, we use the topmost
| read-write layer;

Huh? Where does that come from? You use the upper layer. If for
any reason the file cannot be created there, the operation fails. No
second chances.

| For remove, I think the correct thing to do is to descend until we
| find the topmost layer where the target name exists, if any,

We look to see if the file exists, yes, if not there is nothing to do
(error.)

| and then operate at that layer.

No, all changes in the top layer, the file is "removed" by creating
a whiteout, which will then cause any lookup to fail.

| And for rename, [...]

Just consider it as link+unlink (and keep the locking to make it
idempotent, which tends to be the complex part...)

The unlink part was just covered. For link() the first filename
is just a lookup, and no different than any other. The second
filename is then processed as for an exclusive create. Simple (except
there needs to be the EXDEV check added.)

| Plan 9 has a mount flag (mount -c) that it uses to pick the layer
| where new objects get created, rather than going by readonly vs.
| read-write; we don't have that but could implement it.

We don't have the option, but we do have the picked layer - the top one.

| Does this seem reasonable?

Far too complicated. Keep it simple.

kre
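
The rules argued for in the reply above can be sketched as a small toy model. This is only an illustration under stated assumptions, not NetBSD code: the class `UnionDir`, the `WHITEOUT` marker, and all method names are hypothetical, each layer is modelled as a plain dictionary, and locking and the EXDEV check are left out. For simplicity, `remove` always records a whiteout in the upper layer, even when the name exists only there.

```python
WHITEOUT = object()  # hypothetical marker: hides a name in the layers below


class UnionDir:
    """Toy model of one directory in a union mount.

    layers[0] is the upper (top) layer. Only that layer is ever modified.
    """

    def __init__(self, *layers):
        self.layers = list(layers)  # each layer maps name -> contents

    def lookup(self, name):
        # Search from the top down. A whiteout terminates the search:
        # the file was not found.
        for layer in self.layers:
            if name in layer:
                entry = layer[name]
                if entry is WHITEOUT:
                    raise FileNotFoundError(name)
                return entry
        raise FileNotFoundError(name)

    def exists(self, name):
        try:
            self.lookup(name)
        except FileNotFoundError:
            return False
        return True

    def create(self, name, contents, exclusive=False):
        # Both kinds of create first check the name across all layers.
        # They differ only in what happens when the name already exists.
        if self.exists(name):
            if exclusive:
                raise FileExistsError(name)
            return self.lookup(name)  # just use the existing file
        # New objects always go in the upper layer, replacing any whiteout.
        # If that fails, the operation fails: no second chances.
        self.layers[0][name] = contents
        return contents

    def remove(self, name):
        self.lookup(name)  # raises if there is nothing to remove
        # The change is recorded in the upper layer only, as a whiteout.
        self.layers[0][name] = WHITEOUT

    def link(self, old, new):
        # First name: an ordinary lookup. Second name: an exclusive create.
        target = self.lookup(old)
        self.create(new, target, exclusive=True)
```

For example, with `upper, lower = {}, {"a": "lower-a"}` and `d = UnionDir(upper, lower)`, calling `d.remove("a")` leaves `lower` untouched and stores a whiteout in `upper`, after which `d.lookup("a")` raises `FileNotFoundError`.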