Hi!

I moved one of the SMR disks to another box with a 3.18.21 kernel.

I formatted and mounted like this:

   /opt/f2fs-tools/sbin/mkfs.f2fs -lTEST -s90 -t0 -a0 /dev/vg_test/test
   mount -t f2fs -onoatime,flush_merge,no_heap /dev/vg_test/test /mnt

I then copied (tar | tar) 2.1TB of data to the disk. This took about 6
hours, which is roughly the read speed of this data set (so the write speed
was very good).
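
As a rough check of that figure, here is a small sketch of the arithmetic.
It assumes "TB" means 10^12 bytes and takes the 6 hours at face value; the
function name avg_mb_per_s is made up for illustration.

```shell
#!/bin/sh
# Sketch: average throughput in MB/s (10^6 bytes), integer arithmetic only.
# avg_mb_per_s is a hypothetical helper name, not part of any tool.
avg_mb_per_s() {
    bytes=$1
    seconds=$2
    echo $(( bytes / seconds / 1000000 ))
}

# 2.1TB (assumed decimal) in 6 hours:
avg_mb_per_s 2100000000000 $(( 6 * 3600 ))
```

Under those assumptions this prints 97, i.e. a little under 100MB/s on
average.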

When I came back after ~10 hours, I found a number of hung task messages
in syslog, and when I ran sync, sync was consuming 100% system time.

I took a snapshot of /sys/kernel/debug/f2fs/status before sync, and the
values were "frozen", i.e. they didn't change.
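
A "frozen" check of this kind can be sketched as follows: take two
snapshots of the status file some seconds apart and compare them. The
function name status_frozen and the 10 second default interval are
assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: report whether a file's contents change between two snapshots.
# status_frozen and the default interval are illustrative assumptions.
status_frozen() {
    file=$1
    interval=${2:-10}
    before=$(cat "$file") || return 2   # first snapshot
    sleep "$interval"
    after=$(cat "$file") || return 2    # second snapshot
    if [ "$before" = "$after" ]; then
        echo frozen
    else
        echo changing
    fi
}

# Example (needs root and a mounted debugfs):
# status_frozen /sys/kernel/debug/f2fs/status 10
```

An unchanged file only shows that the counters did not move during the
interval, so a longer interval gives a more convincing result.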

I was able to read from the mounted filesystem normally, and I was able to
read and write the block device itself, so the disk is responsive.

After ~1h in this state, I tried to umount, which made the filesystem
mountpoint go away, but umount hung, and /sys/kernel/debug/f2fs/status still
didn't change.

This is the output of /sys/kernel/debug/f2fs/status:

http://ue.tst.eu/d88ce0e21a7ca0fb74b1ecadfa475df0.txt

I then tried to delete the device, but the echo 1 >/sys/block/sde/device/delete
also hung.

Here are /proc/.../stack outputs of sync, umount and bash(echo):

   sync:
   [<ffffffffffffffff>] 0xffffffffffffffff

   umount:
   [<ffffffff8139ba03>] call_rwsem_down_write_failed+0x13/0x20
   [<ffffffff811e7ee6>] deactivate_super+0x46/0x70
   [<ffffffff81204733>] cleanup_mnt+0x43/0x90
   [<ffffffff812047d2>] __cleanup_mnt+0x12/0x20
   [<ffffffff8108e8a4>] task_work_run+0xc4/0xe0
   [<ffffffff81012fa7>] do_notify_resume+0x97/0xb0
   [<ffffffff8178896f>] int_signal+0x12/0x17
   [<ffffffffffffffff>] 0xffffffffffffffff

   bash (delete):
   [<ffffffff810d8917>] msleep+0x37/0x50
   [<ffffffff8135d686>] __blk_drain_queue+0xa6/0x1a0
   [<ffffffff8135da05>] blk_cleanup_queue+0x1b5/0x1c0
   [<ffffffff8152082a>] __scsi_remove_device+0x5a/0xe0
   [<ffffffff815208d6>] scsi_remove_device+0x26/0x40
   [<ffffffff81520917>] sdev_store_delete+0x27/0x30
   [<ffffffff814bf748>] dev_attr_store+0x18/0x30
   [<ffffffff8125bc4d>] sysfs_kf_write+0x3d/0x50
   [<ffffffff8125b154>] kernfs_fop_write+0xe4/0x160
   [<ffffffff811e51a7>] vfs_write+0xb7/0x1f0
   [<ffffffff811e5c26>] SyS_write+0x46/0xb0
   [<ffffffff817886cd>] system_call_fastpath+0x16/0x1b
   [<ffffffffffffffff>] 0xffffffffffffffff
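
Stack dumps like the ones above can be collected along these lines. This
is only a sketch: dump_stacks and the PROC_ROOT override are hypothetical
names introduced here, and reading /proc/PID/stack normally requires root.

```shell
#!/bin/sh
# Sketch: print /proc/PID/stack for every process whose comm matches
# one of the given names. dump_stacks and PROC_ROOT are illustrative
# assumptions; PROC_ROOT only exists to point at a different directory.
dump_stacks() {
    proc=${PROC_ROOT:-/proc}
    for name in "$@"; do
        for dir in "$proc"/[0-9]*; do
            [ -r "$dir/comm" ] || continue
            [ "$(cat "$dir/comm" 2>/dev/null)" = "$name" ] || continue
            echo "$name (pid ${dir##*/}):"
            cat "$dir/stack"
        done
    done
}

# Example (as root):
# dump_stacks sync umount bash
```

Matching on comm means every process with that name is listed, so for
bash the output has to be narrowed down to the shell running the echo.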

After a forced reboot, I ran fsck and got the output below, which looks good
except for the "Wrong segment type" message, which is hopefully harmless:

http://ue.tst.eu/4c750d2301a581cb07249d607aa0e6d0.txt

After mounting, status was this (and was changing):

http://ue.tst.eu/6462606ac3aa85bde0d6674365c86318.txt

Note that 1.4TB of data are missing(!)

This large amount of missing data was certainly unexpected. I assume f2fs
stopped checkpointing at some earlier point, and that data is only safe
after a checkpoint, but being able to write 1.4TB of data without it ever
reaching the disk is very unexpected behaviour for a filesystem (which
normally loses about half a minute of data at most).

Minor question: since the disk actually has 4K physical sectors, and fsck
says sector size = 512, is there a way to teach f2fs that the physical
sector size is actually 4K, or does this not matter because f2fs will do
page-sized writes anyway?
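
For reference, the sector sizes the kernel itself reports for a disk can
be read from sysfs. The following is a sketch only: sector_sizes and the
SYS_BLOCK override are hypothetical names, and it says nothing about what
f2fs or mkfs.f2fs actually do with these values.

```shell
#!/bin/sh
# Sketch: print the logical and physical block size the kernel reports
# for a whole disk (e.g. sde). sector_sizes and SYS_BLOCK are
# illustrative assumptions; SYS_BLOCK just overrides /sys/block.
sector_sizes() {
    dev=$1
    sys=${SYS_BLOCK:-/sys/block}
    logical=$(cat "$sys/$dev/queue/logical_block_size") || return 1
    physical=$(cat "$sys/$dev/queue/physical_block_size") || return 1
    echo "logical=$logical physical=$physical"
}

# Example:
# sector_sizes sde
# The same values are also available via:
# blockdev --getss --getpbsz /dev/sde
```

A drive with 512-byte emulation would be expected to show logical=512 and
physical=4096 here.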

In any case, any insights would be appreciated. I will attempt to upgrade
this box to Linux 4.2.1 to see if that helps, but 3.18.x is the only
kernel known to work with SMR drives without any issues.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schm...@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\
