[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-23 Thread Thimo E
Hi Matthew, sorry for the late reply. Today I triggered another fstrim with the linux-image-5.4.0-75-generic kernel and made a final check on the RAID. No trouble has occurred for me so far. Thank you for pursuing this topic so persistently and for finally getting the patches into the Ubuntu kernel. Best

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-22 Thread Matthew Ruffell
Hi Thimo, The SRU cycle has completed, and all kernels containing the Raid10 block discard performance patches have now been released to -updates. Note that the versions are different than the kernels in -proposed, due to the kernel team needing to do a last minute respin to fix two sets of

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-17 Thread Matthew Ruffell
Hi Thimo, Just checking in. Are you still running 5.4.0-75-generic on your server? Is everything nice and stable? Is your data fully intact, and no signs of corruption at all? My server has been running for two weeks now, and it does a fstrim every 30 minutes, and everything appears to be

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-10 Thread Matthew Ruffell
Hi Thimo, Thanks for letting me know, and great to hear that things are working as expected. I'll check in with you in one week's time, to double check things are still going okay. I spent some time today performing verification on all the kernels in -proposed, testing block discard performance

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-10 Thread Thimo E
Hi Matthew, Thanks for your effort to add this feature to the Ubuntu kernels. I installed linux-image-5.4.0-75-generic on 2021-06-08. No problems so far, neither during normal work nor during manual fstrim. Best regards, Thimo -- You received this bug notification because you are a member

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-06-07 Thread Matthew Ruffell
Hi Thimo, The kernel team have built all of the kernels for this SRU cycle, and have placed them into -proposed for verification. We now need to do some thorough testing and make sure that Raid10 arrays function with good performance, ensure data integrity and make sure we won't be introducing

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-27 Thread Matthew Ruffell
Hi Thimo, As I mentioned in my previous message, I submitted the patches to the Ubuntu kernel mailing list for SRU. These patches have now gotten 2 acks [1][2] from senior kernel team members, and the patches have now been applied [3] to the 4.15, 5.4, 5.8 and 5.11 kernels. [1]

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-19 Thread Matthew Ruffell
Hi Thimo, Thanks for helping test! I really appreciate it. It is great to hear that you haven't had any trouble with the test kernel. Just a quick update on the state of the Raid10 patchset. I submitted them for SRU for the current cycle, and the kernel team wrote back to me asking for more

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-10 Thread Thimo E
Hi Matthew, thank you for your continuous effort. I have been testing your 5.4.0-72-generic #80+TEST1896578v20210504b1-Ubuntu kernel without trouble so far. I also ran fstrim manually on a machine that had not run it for some time because the fstrim service was disabled. Regards, Thimo

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-06 Thread Matthew Ruffell
Hi Thimo, I have been doing quite a bit of regression testing, and so far everything is looking good. The performance of the block discard is there, and I haven't come across any data corruption. I have also spent some time running through the testcase you created for this bug, and I have the

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-05 Thread Thimo E
Hi Matthew, thank you for providing the test-kernel and instructions. I will give it a try. Regards, Thimo -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1907262 Title: raid10: discard leads to

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-05 Thread Matthew Ruffell
Hi Thimo, As promised yesterday, the new re-spins of the test kernels have finished building and are now available in the following ppa: https://launchpad.net/~mruffell/+archive/ubuntu/lp1896578-test The patches used are the ones I will be submitting for SRU, and are more or less identical to

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-04 Thread Matthew Ruffell
Hi Thimo, Thanks for writing back, great timing! So, the new revision of the patches that we have been testing since February has just been merged into mainline. The md/raid10 patches got merged on Friday, and the dm/raid patches got merged on Saturday, and will be tagged into 5.13-rc1. There's

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-05-03 Thread Thimo E
Hi Matthew, are these tests still relevant for you? BR, Thimo

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-04-29 Thread Matthew Ruffell
** Changed in: linux (Ubuntu Groovy) Assignee: Sinclair Willis (yousure1222) => (unassigned)

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-04-29 Thread Sinclair Willis
** Changed in: linux (Ubuntu Groovy) Assignee: (unassigned) => Sinclair Willis (yousure1222)

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-02-14 Thread Matthew Ruffell
Hi Thimo, Recently, Xiao Ni, the original author of the Raid10 block discard patchset, has posted a new revision of the patchset to the linux-raid mailing list for feedback. Xiao has fixed the two bugs that caused the regression. The first was incorrectly calculating the start offset for block

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2021-01-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.8.0-36.40+21.04.1 --- linux (5.8.0-36.40+21.04.1) hirsute; urgency=medium * Packaging resync (LP: #1786013) - update dkms package versions [ Ubuntu: 5.8.0-36.40 ] * debian/scripts/file-downloader does not handle positive

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.15.0-128.131 --- linux (4.15.0-128.131) bionic; urgency=medium * bionic/linux: 4.15.0-128.131 -proposed tracker (LP: #1907354) * Packaging resync (LP: #1786013) - update dkms package versions * raid10: discard leads to corrupted

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.4.0-58.64 --- linux (5.4.0-58.64) focal; urgency=medium * focal/linux: 5.4.0-58.64 -proposed tracker (LP: #1907390) * Packaging resync (LP: #1786013) - update dkms package versions * raid10: discard leads to corrupted file

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.8.0-33.36 --- linux (5.8.0-33.36) groovy; urgency=medium * groovy/linux: 5.8.0-33.36 -proposed tracker (LP: #1907408) * raid10: discard leads to corrupted file system (LP: #1907262) - Revert "dm raid: remove unnecessary discard
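As a rough illustration of the three "This bug was fixed" notices above, the following Python sketch checks whether a kernel release string is at or past the fixed build for its series. The function name `contains_revert` and the table `FIXED_ABI` are hypothetical helpers, not part of any Ubuntu tooling; only the version numbers come from the notices. Note the assumption baked in: a `False` result only means the kernel predates the fixed build, and it does not distinguish kernels that are older than the regression itself.

```python
import re

# ABI number of the first fixed build per kernel series, taken from the
# "This bug was fixed in the package linux" notices in this thread.
# The table name and layout are illustrative assumptions.
FIXED_ABI = {
    "4.15.0": 128,  # bionic: linux 4.15.0-128.131
    "5.4.0": 58,    # focal:  linux 5.4.0-58.64
    "5.8.0": 33,    # groovy: linux 5.8.0-33.36
}


def contains_revert(release):
    """Return True or False for a known series, or None if unknown.

    `release` is a string such as the output of `uname -r`,
    for example "5.4.0-58-generic".
    """
    match = re.match(r"^(\d+\.\d+\.\d+)-(\d+)", release)
    if match is None:
        return None  # not in the <series>-<abi> form
    series, abi = match.group(1), int(match.group(2))
    fixed = FIXED_ABI.get(series)
    if fixed is None:
        return None  # series not mentioned in the notices above
    return abi >= fixed
```

On a live system the input could come from `platform.release()` in the standard library, which returns the same string as `uname -r`.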

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-11 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed- groovy' to 'verification-done-groovy'. If the problem still exists, change the tag

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-10 Thread Matthew Ruffell
Performing verification for Focal. I spun up a m5d.4xlarge instance on AWS, to utilise the 2x 300GB NVMe drives that support block discard. I enabled -proposed, and installed the 5.4.0-58-generic kernel. The following is the repro session running through the full testcase:

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-10 Thread Matthew Ruffell
Performing verification for Bionic. I spun up a m5d.4xlarge instance on AWS, to utilise the 2x 300GB NVMe drives that support block discard. I enabled -proposed, and installed the 4.15.0-128-generic kernel. The following is the repro session running through the full testcase:

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-10 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed- focal' to 'verification-done-focal'. If the problem still exists, change the tag

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-10 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed- bionic' to 'verification-done-bionic'. If the problem still exists, change the tag

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Eric Desrochers
For Trusty and Xenial, fstrim is scheduled via cron[0] to run weekly, every Sunday at 6:47[1]. For Bionic onward, fstrim is scheduled via a systemd timer and also runs weekly[2]. Impacted users may want to take action before the next scheduled run by downgrading the running kernel or disabling the

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Eric Desrochers
For Trusty and Xenial, fstrim is scheduled via cron[0] to run weekly, every Sunday at 6:47[1]. For Bionic onward, fstrim is scheduled via a systemd timer and also runs weekly[2]. Impacted users may want to take action before the next scheduled run by downgrading the running kernel or temporarily

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Eric Desrochers
@voidlily, I would assume you are running a HWE kernel (v4.15) on Xenial. If it's the case, fixing the Bionic kernel will generate a new HWE (4.15) kernel for Xenial. ** Changed in: linux (Ubuntu Xenial) Status: New => Invalid ** Changed in: linux (Ubuntu Trusty) Status: New =>

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Eric Desrochers
** Also affects: linux (Ubuntu Xenial) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Trusty) Importance: Undecided Status: New

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread voidlily
This issue is also affecting xenial, or at least the package was pulled from xenial as well. When I try to click the "add distribution" button in launchpad I'm getting an oops error, so posting a comment about xenial being affected in the meantime.

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Thimo E
This is just the least damaging procedure I found. Data loss may still happen (and actually did happen on some of our systems). Re-adding the second component to the RAID first (after zeroing it) and then running fsck probably leads to the exact same result, but I wanted to keep the second component

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Jay Vosburgh
Thimo, Thanks for the update; just to clarify, for your "procedure to recover," are you saying that that procedure will always resolve the damage, or that even after that procedure, there may be corruption?

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Thimo E
Hi Matthew and all, thank you for taking action immediately. I really appreciate your effort. After investigating the issue further I have to add that the mount option discard seems to trigger the issue, too. @Trent The general problem here is that RAID10 can balance single read streams to all
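Since the message above reports that the `discard` mount option can trigger the issue as well as fstrim, a quick way to spot affected mounts is to scan the mount table. The Python sketch below is a minimal illustration, assuming the standard `/proc/mounts` field layout (device, mount point, filesystem type, comma-separated options). The function name `mounts_with_discard` is hypothetical, and treating `discard=...` variants as matches is an assumption on my part.

```python
def mounts_with_discard(mounts_text):
    """Return (device, mount_point) pairs whose options include discard.

    `mounts_text` is text in /proc/mounts format, one mount per line.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        opts = options.split(",")
        # Exact match avoids false positives on options such as "nodiscard".
        if any(o == "discard" or o.startswith("discard=") for o in opts):
            hits.append((device, mount_point))
    return hits
```

On a live system the function could be fed the contents of `/proc/mounts`, for example `mounts_with_discard(open("/proc/mounts").read())`. It only reports mount options; it says nothing about whether the underlying device is a raid10 array.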

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Khaled El Mously
** Changed in: linux (Ubuntu Focal) Status: In Progress => Fix Committed ** Changed in: linux (Ubuntu Groovy) Status: In Progress => Fix Committed

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Trent Lloyd
** Attachment added: "blktrace-lp1907262.tar.gz" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/+attachment/5442212/+files/blktrace-lp1907262.tar.gz

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-09 Thread Trent Lloyd
I can reproduce this on a Google Cloud n1-standard-16 using 2x Local NVMe disks. Then partition nvme0n1 and nvme0n2 with only an 8GB partition, then format directly with ext4 (skip LVM). In this setup each 'check' takes <1 min so speeds up testing considerably. Example details - seems

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Khaled El Mously
** Changed in: linux (Ubuntu Bionic) Status: In Progress => Fix Committed

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Matthew Ruffell
Hi Thimo, Firstly, thank you for your bug report, we really, really appreciate it. You are correct, the recent raid10 patches appear to cause filesystem corruption on raid10 arrays. I have spent the day reproducing, and I can confirm that the 4.15.0-126-generic, 5.4.0-56-generic and

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Matthew Ruffell
** Also affects: linux (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Groovy) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Bionic) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Bionic)

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Nivedita Singhvi
** Tags added: sts

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Matthew Ruffell
Hi Thimo, Thank you for the very detailed bug report. I will start investigating this immediately. Thanks, Matthew

[Bug 1907262] Re: raid10: discard leads to corrupted file system

2020-12-08 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users. ** Changed in: linux (Ubuntu) Status: New => Confirmed