[Kernel-packages] [Bug 1842271] Re: Infinite loop in __blkdev_issue_discard() consumes the entire system memory when formatting a raid array
The patch can be found in the Bionic linux-gcp master branch and in Eoan; marking this one as Fix Released.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Fix Released

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1842271

Title:
  Infinite loop in __blkdev_issue_discard() consumes the entire system memory when formatting a raid array

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/1842271

  [Impact]

  There is a regression in the block layer in 4.15.0-56 which causes the machine to consume all system memory, bringing out the OOM reaper, when attempting to format a newly created raid array on a series of NVMe disks with an xfs filesystem. The symptom is all system memory ending up in the kmalloc-256 slab, with a nondescript call trace being printed.

  The problem is caused by the two commits below:

  commit 3c2f83d8bcbedeb89efcaf55ae64a99dce9d7e34
  Author: Ming Lei
  Date: Fri Oct 12 15:53:10 2018 +0800
  Subject: block: don't deal with discard limit in blkdev_issue_discard()
  BugLink: https://bugs.launchpad.net/bugs/1836802
  Upstream-commit: 744889b7cbb56a64f957e65ade7cb65fe3f35714
  Pastebin: https://paste.ubuntu.com/p/WKsmwKgCMc/

  commit b515257f186e532e0668f7deabcb04b5d27505cf
  Author: Ming Lei
  Date: Mon Oct 29 20:57:17 2018 +0800
  Subject: block: make sure discard bio is aligned with logical block size
  BugLink: https://bugs.launchpad.net/bugs/1836802
  Upstream-commit: 1adfc5e4136f5967d591c399aff95b3b035f16b7
  Pastebin: https://paste.ubuntu.com/p/TcnTck64PJ/

  The fault was triggered in two stages.
  Firstly, in "block: don't deal with discard limit in blkdev_issue_discard()", a while loop was changed such that an infinite loop is possible if __blkdev_issue_discard() is called with nr_sects > 0 and req_sects somehow becomes 0:

    int __blkdev_issue_discard(..., sector_t nr_sects, ...)
    {
        ...
        while (nr_sects) {
            unsigned int req_sects = nr_sects;
            sector_t end_sect;

            end_sect = sector + req_sects;
            ...
            nr_sects -= req_sects;
            sector = end_sect;
            ...
        }

  If req_sects is 0, then end_sect is always equal to sector, and, most importantly, nr_sects is only decremented in one place, by req_sects; if req_sects is 0, nr_sects never shrinks, which is the infinite loop condition. Since req_sects is initially equal to nr_sects, the loop would never be entered in the first place if nr_sects were 0.

  This is where the second commit, "block: make sure discard bio is aligned with logical block size", comes in. It adds a line to the above loop that allows req_sects to be set to a new value:

    int __blkdev_issue_discard(..., sector_t nr_sects, ...)
    {
        ...
        while (nr_sects) {
            unsigned int req_sects = nr_sects;
            sector_t end_sect;

            req_sects = min(req_sects, bio_allowed_max_sectors(q));
            end_sect = sector + req_sects;
            ...
            nr_sects -= req_sects;
            sector = end_sect;
            ...
        }

  req_sects is now the minimum of itself and bio_allowed_max_sectors(q), a new function introduced by the same commit:

    static inline unsigned int bio_allowed_max_sectors(struct request_queue *q)
    {
        return round_down(UINT_MAX, queue_logical_block_size(q)) >> 9;
    }

  queue_logical_block_size(q) queries the hardware for the logical block size of the underlying device:

    static inline unsigned short queue_logical_block_size(struct request_queue *q)
    {
        int retval = 512;

        if (q && q->limits.logical_block_size)
            retval = q->limits.logical_block_size;

        return retval;
    }

  If q->limits.logical_block_size is 512 or smaller, then the resulting value, bit-shifted right by 9, yields 0.
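  The failure mode can be sketched in a small userspace model (hypothetical names and simplified types, not the kernel code itself). Here max_sects stands in for whatever bio_allowed_max_sectors(q) evaluates to; when it is 0, the min() pins req_sects to 0 and nr_sects never shrinks:

```c
/* Userspace model of the __blkdev_issue_discard() loop after the two
 * commits above. Returns the number of iterations executed, capped at
 * 'limit' so the model itself cannot hang the way the real loop does. */
static unsigned long long discard_loop_iterations(unsigned long long nr_sects,
                                                  unsigned long long max_sects,
                                                  unsigned long long limit)
{
    unsigned long long iters = 0;

    while (nr_sects && iters < limit) {
        unsigned long long req_sects = nr_sects;

        /* models: req_sects = min(req_sects, bio_allowed_max_sectors(q)); */
        if (req_sects > max_sects)
            req_sects = max_sects;

        nr_sects -= req_sects; /* never decrements when max_sects == 0 */
        iters++;
    }
    return iters;
}
```

  With max_sects == 0 the model runs until the iteration cap; the real loop has no such cap and allocates a fresh bio on every pass, which is what floods the kmalloc-256 slab.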
  bio_allowed_max_sectors() will return 0, and the min() against req_sects == nr_sects will favour the new 0. nr_sects is then never decremented, since req_sects is 0, and req_sects never changes, since the min() against itself always yields the 0. From there the infinite loop iterates, filling up the kmalloc-256 slab with newly created bio entries until all memory is exhausted and the OOM reaper comes out and starts killing processes, which is ineffective since this is a kernel memory leak.

  [Fix]

  The fix comes in the form of:

  commit a55264933f12c2fdc28a66841c4724021e8c1caf
  Author: Mikulas Patocka
  Date: Tue Jul 3 13:34:22 2018 -0400
  Subject: block: fix infinite loop if the device loses discard capability
  BugLink: https://bugs.launchpad.net/bugs/1837257
  Upstream-commit: b88aef36b87c9787a4db724923ec4f57dfd513f3
  Pastebin: https://paste.ubuntu.com/p/bgC9vQnB8x/

  This adds a check right after the min(req_sects, bio_allowed_max_sectors(q)) to test whether req_sects has been set to 0, and if so, to bail out of the loop and fail the discard with an error instead of looping forever.
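  In the same userspace model, the fix amounts to one extra check right after the min(): bail out with an error when req_sects clamps to 0 (the -1 below is a stand-in of my own for the error return, not the kernel's actual error code):

```c
/* Userspace model of the discard loop with the fix applied. */
static int discard_loop_fixed(unsigned long long nr_sects,
                              unsigned long long max_sects)
{
    while (nr_sects) {
        unsigned long long req_sects = nr_sects;

        if (req_sects > max_sects)
            req_sects = max_sects;

        if (!req_sects)
            return -1; /* the added check: fail instead of spinning */

        nr_sects -= req_sects;
    }
    return 0; /* whole range issued */
}
```

  The range that previously looped forever now fails cleanly, and well-formed requests still complete as before.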
This was released in 4.15.0-59-generic and works as intended. Marking Fix Released. Note that this still needs to land in linux-gcp for Bionic due to some technical reasons.

** Changed in: linux (Ubuntu Bionic)
   Status: Fix Committed => Fix Released