Al Viro: > BTW, I wonder what's the right locking for that sucker; overlayfs is probably > too heavy - we are talking about copying a file from one fs to another, which > can obviously take quite a while, so holding ->i_mutex on _parent_ all along > is asking for very serious contention. OTOH, there's a pile of unpleasant
Yes, holding parent->i_mutex can be longer. Using splice function for copy-up is simple and (probably) fastest. But it doesn't support a "hole" in the file (sparse file). All holes are filled with NUL byte and consumes a disk block on the upper layer. It is a problem, especially for users who have smaller tmpfs as his upper layer. The copy-up with considering a hole may cost more, but it can save the storage consumtion. > Another fun issue is copyup vs. copyup - we want to sit and wait for copyup > attempt in progress to complete, rather than start another one in parallel. > And whoever comes the second must check if copyup has succeeded, obviously - > it's possible to have user run into 5% limit and fail copyup, followed by > root doing it successfully. "5% limit" means the reserved are for a superuser on a filesystem, right? As far as I know, overlayfs (UnionMount too?) solves this problem as changing the task credential and the capability. But I am not sure whether it solves the similar problem around the resource limit like RLIMIT_CPU, RLIMIT_FSIZE or something. > Another one: overwriting rename() vs. copyup. Similar to unlink() vs. > copyup(). Hmm, do you mean that, just after the parent dir lock on the underlying fs, the copyup routine should confirm whether the target is still alive? If so, I agree. > Another one: read-only open() vs. copyup(). Hell knows - we obviously don't > want to open a partially copied file; might want to wait or pretend that this > open() has come first and give it the underlying file. The same goes for > stat() vs. copyup(). Exactly. Moreover users don't want to refer to the lower file which is obsoleted by copyup. > FWIW, something like "lock parent, ->create(), ->unlink(), unlock parent, > copy data and metadata, lock parent, allocate a new dentry in covering layer > and do ->lookup() on it, verify that it is negative and not a whiteout, lock > child, use ->link() to put it into directory, unlock everything" would > probably DTRT for unlink/copyup and rename/copyup. The rest... hell knows; Please let me make sure. You are saying, - create the file on the upper layer - get the "struct file" object - hide it from users - before copying the file, unlock the parent in order to stop the long period locking - copy the file without the parent lock. it doesn't matter since the file is invisible to users. - confirm that the target name is still available for copyup - make the completed file visible by ->link() It is ineteresting to ->link() with the unlinked file. While vfs_unlink() rejects such case, it may not matter for the underlying fs. Need to verify FS including jounals. J. R. Okajima -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/