** Description changed:

- Can you please cherry-pick those 2 important commits from upstream?
+ BugLink: https://bugs.launchpad.net/bugs/2093871
  
- 1. net: introduce helper sendpages_ok()
- (https://github.com/torvalds/linux/commit/23a55f44)
+ [Impact]
  
- Network drivers are using sendpage_ok() to check the first page of an
- iterator in order to disable MSG_SPLICE_PAGES. The iterator can
- represent list of contiguous pages.
+ Currently the nvme-tcp and drbd subsystems try to set the MSG_SPLICE_PAGES
+ flag on pages to be written. When MSG_SPLICE_PAGES is set, the send path
+ eventually calls skb_splice_from_iter(), which checks every page with
+ sendpage_ok() to confirm that all the pages are sendable.
  
- When MSG_SPLICE_PAGES is enabled skb_splice_from_iter() is being used,
- it requires all pages in the iterator to be sendable. Therefore it needs
- to check that each page is sendable.
+ At the moment, both subsystems call sendpage_ok() only on the first page of a
+ potentially contiguous block of pages. If the first page passes, they assume
+ the rest are sendpage_ok() too and submit the I/O, leaving the problem to be
+ found later by skb_splice_from_iter(). If one or more pages in the contiguous
+ block are not sendpage_ok(), a warning is printed and the data transfer is
+ aborted. In the nvme-tcp case, I/O then hangs.
  
- The patch introduces a helper sendpages_ok(), it returns true if all the
- contiguous pages are sendable.
+ This patchset introduces sendpages_ok() which iterates over each page in a
+ contiguous block, checks if it is sendpage_ok(), and only returns true if all
+ of them are.
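
The per-page check described above can be sketched in userspace C. This is a
hedged model, not the kernel source: struct model_page, the MODEL_* constants
and the model_* function names are hypothetical stand-ins, and the per-page
predicate (not a slab page, reference count of at least one) is an assumption
about what sendpage_ok() tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page geometry for the model (4 KiB pages assumed). */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the kernel's struct page. */
struct model_page {
	bool is_slab;   /* page belongs to a slab cache */
	int  refcount;  /* page reference count */
};

/* Models the single-page check (assumed semantics of sendpage_ok()). */
static bool model_sendpage_ok(const struct model_page *page)
{
	return !page->is_slab && page->refcount >= 1;
}

/*
 * Models the new helper: start at the page that contains 'offset', then
 * check one page per MODEL_PAGE_SIZE bytes of 'len'. Returns true only if
 * every checked page passes.
 */
static bool model_sendpages_ok(const struct model_page *page, size_t len,
			       size_t offset)
{
	const struct model_page *p = page + (offset >> MODEL_PAGE_SHIFT);
	size_t count = 0;

	while (count < len) {
		if (!model_sendpage_ok(p))
			return false;
		p++;
		count += MODEL_PAGE_SIZE;
	}

	return true;
}
```

With a three-page block whose middle page is a slab page, checking only the
first page passes, while model_sendpages_ok() over the whole block returns
false, which is the case the old first-page-only check missed.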
  
- Drivers who want to send contiguous pages with MSG_SPLICE_PAGES may use
- this helper to check whether the page list is OK. If the helper does not
- return true, the driver should remove MSG_SPLICE_PAGES flag.
+ This makes the decision to set MSG_SPLICE_PAGES reliable, since callers can
+ now depend on the result of sendpages_ok() instead of assuming the remaining
+ pages are okay.
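
On the caller side, the pattern is to clear MSG_SPLICE_PAGES when the block
check fails. The sketch below is illustrative only: the MODEL_* flag values
are placeholders rather than values from kernel headers, and
model_send_flags() is a hypothetical function, not a driver entry point.

```c
#include <stdbool.h>

/* Placeholder flag bits for illustration; not taken from kernel headers. */
#define MODEL_MSG_DONTWAIT     0x1u
#define MODEL_MSG_SPLICE_PAGES 0x2u

/*
 * Start with splicing requested, then drop the splice flag if the
 * whole-block check (the result of sendpages_ok() in the real drivers)
 * reported that at least one page is not sendable.
 */
static unsigned int model_send_flags(bool all_pages_sendable)
{
	unsigned int flags = MODEL_MSG_DONTWAIT | MODEL_MSG_SPLICE_PAGES;

	if (!all_pages_sendable)
		flags &= ~MODEL_MSG_SPLICE_PAGES;

	return flags;
}
```

Clearing the flag up front means the send proceeds without splicing instead
of reaching the warning in skb_splice_from_iter().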
  
+ This issue is how bug 2075110 [0] was discovered in the first place: that bug
+ produced contiguous blocks of pages where the first page was sendpage_ok(),
+ but pages further into the block were not.
  
- 2. nvme-tcp: use sendpages_ok() instead of sendpage_ok()
- (https://github.com/torvalds/linux/commit/6af7331a)
+ [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2075110
  
- Currently nvme_tcp_try_send_data() use sendpage_ok() in order to disable
- MSG_SPLICE_PAGES, it check the first page of the iterator, the iterator
- may represent contiguous pages.
+ Even with "md/md-bitmap: fix writing non bitmap pages" applied, the issue can
+ still happen, e.g. with merged IO pages, so this fix is still needed to
+ eliminate the issue.
  
- MSG_SPLICE_PAGES enables skb_splice_from_iter() which checks all the
- pages it sends with sendpage_ok().
+ [Fix]
  
- When nvme_tcp_try_send_data() sends an iterator that the first page is
- sendable, but one of the other pages isn't skb_splice_from_iter() warns
- and aborts the data transfer.
+ The fixes landed in mainline 6.12-rc1:
  
- Using the new helper sendpages_ok() in order to disable MSG_SPLICE_PAGES
- solves the issue.
+ commit 23a55f4492fcf868d068da31a2cd30c15f46207d
+ Author: Ofir Gal <[email protected]>
+ Date:   Thu Jul 18 11:45:12 2024 +0300
+ Subject: net: introduce helper sendpages_ok()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23a55f4492fcf868d068da31a2cd30c15f46207d
+     
+ commit 6af7331a70b4888df43ec1d7e1803ae2c43b6981
+ Author: Ofir Gal <[email protected]>
+ Date:   Thu Jul 18 11:45:13 2024 +0300
+ Subject: nvme-tcp: use sendpages_ok() instead of sendpage_ok()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6af7331a70b4888df43ec1d7e1803ae2c43b6981
+     
+ commit 7960af373ade3b39e10106ef415e43a1d2aa48c6
+ Author: Ofir Gal <[email protected]>
+ Date:   Thu Jul 18 11:45:14 2024 +0300
+ Subject: drbd: use sendpages_ok() instead of sendpage_ok()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7960af373ade3b39e10106ef415e43a1d2aa48c6
+ 
+ They are needed for noble and oracular.
+ 
+ [Testcase]
+ 
+ This is the same testcase as the original bug 2075110 [0], as the fix is
+ designed to prevent it or similar other bugs from happening again.
+ 
+ [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2075110
+ 
+ Because of this, the fix:
+ 
+ commit ab99a87542f194f28e2364a42afbf9fb48b1c724
+ Author: Ofir Gal <[email protected]>
+ Date: Fri Jun 7 10:27:44 2024 +0300
+ Subject: md/md-bitmap: fix writing non bitmap pages
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab99a87542f194f28e2364a42afbf9fb48b1c724
+ 
+ needs to be reverted during your test runs, or you won't see the issue
+ reproduce.
+ 
+ You can use this ppa for updated kernels with the revert to trigger the
+ issue:
+ 
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-revert
+ 
+ This can be reproduced by running blktests md/001 [1], which the author
+ of the fix created to act as a regression test for this issue.
+ 
+ [1] https://github.com/osandov/blktests/commit/a24a7b462816fbad7dc6c175e53fcc764ad0a822
+ 
+ Deploy a fresh Noble VM that has a scratch NVMe disk.
+ 
+ $ sudo apt install build-essential fio
+ $ git clone https://github.com/osandov/blktests.git
+ $ cd blktests
+ $ make
+ $ echo "TEST_DEVS=(/dev/nvme0n1)" > config
+ $ sudo ./check md/001
+ 
+ The md/001 test will hang an affected system, and the warning from
+ skb_splice_from_iter() described in [Impact] will be visible in dmesg.
+ 
+ A test kernel is available in the following ppa:
+ 
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-test
+ 
+ This has the fixes for both this bug and bug 2075110. The issue will not
+ reproduce.
+ 
+ There is also a test kernel available with the fix for this bug present, and
+ the fix for bug 2075110 reverted, so you can see the impact of these patches
+ only:
+ 
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-repro
+ 
+ This will also not reproduce the issue anymore.
+ 
+ [Where problems could occur]
+ 
+ What we are changing is rather simple. Instead of checking the first page and
+ assuming all the rest in the contiguous block are sendpage_ok(), we now
+ check each page in the contiguous block to see if all of them are
+ sendpage_ok().
+ 
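+ If any aren't, the driver removes the MSG_SPLICE_PAGES flag before sending,
+ instead of submitting the I/O and having skb_splice_from_iter() warn and
+ abort the transfer later. This saves us time.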
+ 
+ However, it does take longer to call sendpage_ok() on each of the pages in the
+ contiguous block, so there will be a minor performance hit.
+ 
+ A small performance hit in exchange for correctness should be acceptable.
+ 
+ Currently we are only applying this to the nvme-tcp and drbd subsystems. If a
+ regression were to occur, it would affect users of those subsystems only.
+ 
+ [Other info]
+ 
+ Upstream mailing list:
+ https://lore.kernel.org/all/[email protected]/T/#u

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2093871

Title:
  Introduce and use sendpages_ok() instead of sendpage_ok() in nvme-tcp
  and drbd
