Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
On 21 February 2018 at 17:11, Christoph Hellwig wrote:
> On Wed, Feb 21, 2018 at 08:51:15AM +1100, Dave Chinner wrote:
>> IOWs, if the filesystem is designed with strictly ordered metadata,
>> then fsync()ing a new file also implies that all references to the
>> new file are also on stable storage because they happened before the
>> fsync on the file was issued. i.e. the directory is fsync'd
>> implicitly because it was modified by the same operation that
>> created the file. Hence if the file creation is made stable, so must
>> be the directory modification done during file creation.
>>
>> This has nothing to do with POSIX or what the "linux standard" is -
>> this is testing whether the implementation of strictly ordered
>> metadata journalling is correct or not. If gfs2 does not have
>> strictly ordered metadata journalling, then it probably shouldn't
>> run these tests
>
> Exactly. Also this is not just for new entries but also things like
> rename. So trying to come up with some ad hoc hacks here seems
> wrong.
>
> That being said as far as I know gfs2 does transactional metadata
> updates and has one single global log. Why doesn't it get these
> things right by default?

GFS2 does do metadata journaling. I was under the assumption that
gfs2's ordering model differs, but it turns out that all that was
missing was a log flush in iop->fsync in case the inode is clean but a
log flush hasn't been done for it yet.

Thanks,
Andreas
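
The case Andreas describes (inode clean, but its latest log entry not yet flushed) can be sketched with a toy model. This is only an illustration under stated assumptions, not gfs2 code: the classes `Inode` and `Log` and the functions `fsync_buggy`, `fsync_fixed`, and `run` are all hypothetical names, and the model assumes that a rename logs a transaction without marking the renamed inode dirty.

```python
class Inode:
    """Hypothetical in-memory inode state for the toy model."""

    def __init__(self):
        self.dirty = False   # the inode itself has unwritten changes
        self.last_tx = 0     # newest log sequence number involving this inode


class Log:
    """Hypothetical single global log; only a flushed prefix survives a crash."""

    def __init__(self):
        self.entries = []
        self.flushed = 0     # number of leading entries on stable storage

    def commit(self, description, inode):
        self.entries.append(description)
        inode.last_tx = len(self.entries)

    def flush(self):
        self.flushed = len(self.entries)

    def after_crash(self):
        return self.entries[:self.flushed]


def fsync_buggy(log, inode):
    # Shortcut: a clean inode is assumed to need no work at all.
    if not inode.dirty:
        return
    log.flush()
    inode.dirty = False


def fsync_fixed(log, inode):
    # Also flush when the inode is clean but its newest transaction
    # has not reached stable storage yet.
    if inode.dirty or inode.last_tx > log.flushed:
        log.flush()
        inode.dirty = False


def run(fsync):
    """Replay create, fsync, rename, fsync, crash; return what survives."""
    log, foo = Log(), Inode()
    log.commit("create foo", foo)
    foo.dirty = True
    fsync(log, foo)
    log.commit("rename foo -> bar", foo)  # assumption: foo stays clean here
    fsync(log, foo)
    return log.after_crash()


print(run(fsync_buggy))
print(run(fsync_fixed))
```

In this model the buggy variant loses the rename after a crash, while the fixed variant keeps both entries, which mirrors the create, fsync, rename, fsync sequence exercised by generic/322.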
Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
On Wed, Feb 21, 2018 at 08:51:15AM +1100, Dave Chinner wrote:
> IOWs, if the filesystem is designed with strictly ordered metadata,
> then fsync()ing a new file also implies that all references to the
> new file are also on stable storage because they happened before the
> fsync on the file was issued. i.e. the directory is fsync'd
> implicitly because it was modified by the same operation that
> created the file. Hence if the file creation is made stable, so must
> be the directory modification done during file creation.
>
> This has nothing to do with POSIX or what the "linux standard" is -
> this is testing whether the implementation of strictly ordered
> metadata journalling is correct or not. If gfs2 does not have
> strictly ordered metadata journalling, then it probably shouldn't
> run these tests

Exactly. Also this is not just for new entries but also things like
rename. So trying to come up with some ad hoc hacks here seems
wrong.

That being said as far as I know gfs2 does transactional metadata
updates and has one single global log. Why doesn't it get these
things right by default?
Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
On Tue, Feb 20, 2018 at 09:53:59PM +0100, Andreas Gruenbacher wrote:
> On 20 February 2018 at 20:46, Christoph Hellwig wrote:
> > On Tue, Feb 20, 2018 at 12:22:01AM +0100, Andreas Gruenbacher wrote:
> >> When fsyncing a new file, also fsync the directory the file is in,
> >> recursively. This is how Linux filesystems should behave nowadays,
> >> even if not mandated by POSIX.
> >
> > I think that is bullshit. Maybe it is what google wants for ext4
> > non-journal mode which no one else uses anyway. but it certainly
> > is anything but normal Linux semantics.
>
> Here's some code from xfstest generic/322:
>
> _mount_flakey
> $XFS_IO_PROG -f -c "pwrite 0 1M" -c "fsync" $SCRATCH_MNT/foo \
>         > $seqres.full 2>&1 || _fail "xfs_io failed"
> mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
> $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
> md5sum $SCRATCH_MNT/bar | _filter_scratch
>
> _flakey_drop_and_remount
>
> md5sum $SCRATCH_MNT/bar | _filter_scratch
> _unmount_flakey
>
> Note that there is no fsync for the parent directory ($SCRATCH_MNT),
> yet the test obviously expects the directory to be synced as well.
> This isn't implemented as in this patch on all filesystems, but the
> major ones all show this behavior. So where's the bullshit?

This test is for filesystems that have strictly ordered metadata
journalling. All the filesystems that fstests supports via
_require_metadata_journalling() have strictly ordered metadata
journalling/crash recovery semantics. (i.e. xfs, ext4, btrfs, and
f2fs (IIRC)).

IOWs, if the filesystem is designed with strictly ordered metadata,
then fsync()ing a new file also implies that all references to the
new file are also on stable storage because they happened before the
fsync on the file was issued. i.e. the directory is fsync'd
implicitly because it was modified by the same operation that
created the file. Hence if the file creation is made stable, so must
be the directory modification done during file creation.

This has nothing to do with POSIX or what the "linux standard" is -
this is testing whether the implementation of strictly ordered
metadata journalling is correct or not. If gfs2 does not have
strictly ordered metadata journalling, then it probably shouldn't
run these tests

Cheers,

Dave.
--
Dave Chinner
dchin...@redhat.com
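
The ordering argument above can be illustrated with a small toy model. It is a sketch under assumptions, not the journalling code of any real filesystem: `OrderedJournal` and its methods are hypothetical names, and the model assumes that file creation logs the new inode and its directory entry as one transaction and that a crash preserves exactly the flushed prefix of the log.

```python
class OrderedJournal:
    """Hypothetical strictly ordered metadata log."""

    def __init__(self):
        self.entries = []   # transactions in commit order
        self.flushed = 0    # number of leading transactions on stable storage

    def commit(self, description):
        """Append a transaction and return its 1-based sequence number."""
        self.entries.append(description)
        return len(self.entries)

    def flush_up_to(self, seq):
        """Make every transaction up to and including seq durable, in order."""
        self.flushed = max(self.flushed, seq)

    def after_crash(self):
        """Only the flushed prefix of the log survives a crash."""
        return self.entries[:self.flushed]


journal = OrderedJournal()
journal.commit("create foo: new inode + entry in parent directory")
write_seq = journal.commit("write foo: update size and block map")
journal.commit("create baz: new inode + entry in parent directory")

# fsync(foo) only needs foo's newest transaction to be stable, but in an
# ordered log that drags along everything committed before it.
journal.flush_up_to(write_seq)

survivors = journal.after_crash()
for entry in survivors:
    print(entry)
```

Because the directory entry for foo was committed before the transaction that fsync had to make stable, it survives the simulated crash without any separate fsync of the directory; the later, unrelated creation of baz does not.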
Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
On 20 February 2018 at 20:46, Christoph Hellwig wrote:
> On Tue, Feb 20, 2018 at 12:22:01AM +0100, Andreas Gruenbacher wrote:
>> When fsyncing a new file, also fsync the directory the file is in,
>> recursively. This is how Linux filesystems should behave nowadays,
>> even if not mandated by POSIX.
>
> I think that is bullshit. Maybe it is what google wants for ext4
> non-journal mode which no one else uses anyway. but it certainly
> is anything but normal Linux semantics.

Here's some code from xfstest generic/322:

  _mount_flakey
  $XFS_IO_PROG -f -c "pwrite 0 1M" -c "fsync" $SCRATCH_MNT/foo \
          > $seqres.full 2>&1 || _fail "xfs_io failed"
  mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
  md5sum $SCRATCH_MNT/bar | _filter_scratch

  _flakey_drop_and_remount

  md5sum $SCRATCH_MNT/bar | _filter_scratch
  _unmount_flakey

Note that there is no fsync for the parent directory ($SCRATCH_MNT),
yet the test obviously expects the directory to be synced as well.
This isn't implemented as in this patch on all filesystems, but the
major ones all show this behavior. So where's the bullshit?

Thanks,
Andreas
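
For comparison, here is a userspace rendering of the same write, fsync, rename, fsync sequence, followed by the explicit parent directory fsync that the test leaves out. It is a minimal sketch for Linux-like systems: the helper names `fsync_path` and `write_rename_sync` are hypothetical, a temporary directory stands in for $SCRATCH_MNT, and no crash is simulated, so it only shows the order of the calls.

```python
import os
import tempfile


def fsync_path(path, extra_flags=0):
    """Open path read-only, fsync it, and close it again."""
    fd = os.open(path, os.O_RDONLY | extra_flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def write_rename_sync(directory, old_name="foo", new_name="bar", size=1 << 20):
    """Write and fsync a file, rename it, fsync it again, then fsync the parent."""
    old_path = os.path.join(directory, old_name)
    new_path = os.path.join(directory, new_name)

    # Counterpart of: xfs_io -f -c "pwrite 0 1M" -c "fsync" foo
    with open(old_path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()
        os.fsync(f.fileno())

    # Counterpart of: mv foo bar; xfs_io -c "fsync" bar
    os.rename(old_path, new_path)
    fsync_path(new_path)

    # Extra step the test does not perform: sync the directory itself, for
    # callers that cannot rely on ordered metadata journalling.
    fsync_path(directory, getattr(os, "O_DIRECTORY", 0))
    return new_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(os.path.getsize(write_rename_sync(scratch)))
```

On the filesystems discussed in this thread the last fsync_path() call should be redundant; it is what an application would add when it has to be durable on filesystems without that guarantee.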
Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
On Tue, Feb 20, 2018 at 12:22:01AM +0100, Andreas Gruenbacher wrote:
> When fsyncing a new file, also fsync the directory the file is in,
> recursively. This is how Linux filesystems should behave nowadays,
> even if not mandated by POSIX.

I think that is bullshit. Maybe it is what google wants for ext4
non-journal mode which no one else uses anyway. but it certainly
is anything but normal Linux semantics.
Re: [Cluster-devel] [PATCH] gfs2: Fsync parent directories
Hi Andreas,

- Original Message -
| When fsyncing a new file, also fsync the directory the file is in,
| recursively. This is how Linux filesystems should behave nowadays,
| even if not mandated by POSIX.
|
| Based on ext4 commits 14ece1028, d59729f4e, and 9f713878f.
|
| Fixes xfstests generic/322, generic/376.
|
| Signed-off-by: Andreas Gruenbacher
| ---

It seems like the patch should be calling gfs2_inode_lookup on the parent
directory or something, rather than a simple i_grab, and possibly even
holding (nw) the parent directory's i_gl glock. Otherwise, the call to
gfs2_ail_flush may reference an i_gl that might not exist.

I'm concerned about other nodes in the cluster referencing and/or changing
the parent directory inode while this is happening. I'm not sure if it's
possible. Maybe Nate has a test to check cluster coherency for directories
as well as files?

Regards,

Bob Peterson
Red Hat File Systems