On Thu, Feb 11, 2021 at 10:27:06AM +0100, Jean-Pierre André wrote:
> Hi Richard,
> 
> [I quote your report here, as you wanted to do so, but faced some issue]
> 
> >Upstream bug:  https://bugzilla.kernel.org/show_bug.cgi?id=211167
> >Fedora bug:    https://bugzilla.redhat.com/show_bug.cgi?id=1926954
> >ArchLinux bug: https://bbs.archlinux.org/viewtopic.php?id=262243
> >
> >fstrim of ntfs-3g filesystems is broken in newer Linux kernels.
> >
> >I have confirmed this and bisected it to a particular kernel commit:
> >
> >    block: Do not discard buffers under a mounted filesystem
> >    Discarding blocks and buffers under a mounted filesystem is hardly
> >    anything admin wants to do. Usually it will confuse the filesystem and
> 
> IMHO it is the admin's job to optimize their system, even one running 24/7.
> 
> >    sometimes the loss of buffer_head state (including b_private field) can
> >    even cause crashes like:
> 
> So there is a missing serialization. It is each file system's job
> to make sure blocks are not reused or reallocated while they are
> being discarded.
> 
> >    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> >    PGD 0 P4D 0
> >    Oops: 0002 [#1] SMP PTI
> >    CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G O --------- -  - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1
> >    Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015
> >    RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2]
> >    ...
> >    Call Trace:
> >     __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2]
> >     jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2]
> >     kjournald2+0xbd/0x270 [jbd2]
> >
> >    So if we don't have block device open with O_EXCL already, claim the
> >    block device while we truncate buffer cache. This makes sure any
> >    exclusive block device user (such as filesystem) cannot operate on the
> >    device while we are discarding buffer cache.
> >    Reported-by: Ye Bin <yebi...@huawei.com>
> >    Signed-off-by: Jan Kara <j...@suse.cz>
> >    Reviewed-by: Christoph Hellwig <h...@lst.de>
> >    [axboe: fix !CONFIG_BLOCK error in truncate_bdev_range()]
> >    Signed-off-by: Jens Axboe <ax...@kernel.dk>
> >
> >As far as I can tell the commit is intended to stop you from doing
> >blkdiscard on a block device which is in use by a (kernel) filesystem,
> >which sounds like a good idea.  Breakage of FUSE devices that use
> >block devices is possibly an unintentional side-effect.
> 
> So how is fstrim(8) supposed to be used ? Currently it requires a
> mounted file system as its last argument.
> 
> Technically ntfs-3g can close the device (without unmounting), reopen
> it without O_EXCL for trimming, and reopen again with O_EXCL, but this
> sounds like a very bad way of doing it.
> 
> How does the ext4 driver allow the discarding to take place ?

I'm definitely not an expert here!  However if you look at the
problematic commit:

https://github.com/torvalds/linux/commit/384d87ef2c954fc58e6c5fd8253e4a1984f5fe02

you'll see it only adds the new checks on the ioctl path, i.e. when
coming from userspace.  I suppose that in-kernel drivers like ext4
would not use this code path so it wouldn't be a problem for them.

I think it's just a kernel bug - they didn't anticipate the way that
ntfs-3g or other FUSE filesystems implement fstrim.

> Jean-Pierre
> 
> >
> >Although reverting the commit fixes it, I've no idea how to fix it
> >properly.
> >
> >Rich.
> >
> >(BTW I'm unable to subscribe to ntfs-3g-devel ...)

I think I was able to subscribe in the end.  Hopefully this
message will go through ...

Rich.

> >-- 
> >Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> >Read my programming and virtualization blog: http://rwmj.wordpress.com
> >virt-builder quickly builds VMs from scratch
> >http://libguestfs.org/virt-builder.1.html

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



_______________________________________________
ntfs-3g-devel mailing list
ntfs-3g-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel