On Sat, Jul 20, 2019 at 8:00 AM Zhihao Cheng wrote:
>
> Running stress-test test_2 in mtd-utils on ubi device, sometimes we can
> get following oops message:
>
> BUG: unable to handle page fault for address: 0140
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 280a067 P4D 280a067 PUD 0
> Oops: [#1] SMP
> CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0
> -0-ga698c8995f-prebuilt.qemu.org 04/01/2014
> Workqueue: writeback wb_workfn (flush-ubifs_0_0)
> RIP: 0010:rb_next_postorder+0x2e/0xb0
> Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db
> 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a
> 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03
> RSP: 0018:c9887758 EFLAGS: 00010202
> RAX: 888129ae4700 RBX: 888138b08400 RCX: 8081
> RDX: 0130 RSI: 80800024 RDI: 888138b08400
> RBP: 888138b08400 R08: ea0004a6b920 R09:
> R10: c9887740 R11: 0001 R12: 888128d48000
> R13: 0800 R14: 011e R15: 07c8
> FS: () GS:88813ba0()
> knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 0140 CR3: 00013789d000 CR4: 06f0
> DR0: DR1: DR2:
> DR3: DR6: fffe0ff0 DR7: 0400
> Call Trace:
> destroy_old_idx+0x5d/0xa0 [ubifs]
> ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs]
> do_commit+0x3eb/0x830 [ubifs]
> ubifs_run_commit+0xdc/0x1c0 [ubifs]
>
> Above Oops are due to the slab-out-of-bounds happened in do-while of
> function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In
> function layout_in_gaps, there is a do-while loop placing index nodes
> into the gaps created by obsolete index nodes in non-empty index LEBs
> until rest index nodes can totally be placed into pre-allocated empty
> LEBs. @c->gap_lebs points to a memory area(integer array) which records
> LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB
> is found, corresponding lnum will be incrementally written into the
> memory area pointed by @c->gap_lebs. The size
> ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before
> do-while loop and can not be changed in the loop. But @c->lst.idx_lebs
> could be increased by function ubifs_change_lp (called by
> layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the
> loop. So, sometimes oob happens when number of cycles in do-while loop
> exceeds the original value of @c->lst.idx_lebs. See detail in
> https://bugzilla.kernel.org/show_bug.cgi?id=204229.
> This patch fixes oob in layout_in_gaps.
>
> Signed-off-by: Zhihao Cheng
> ---
> fs/ubifs/tnc_commit.c | 34 +++---
> 1 file changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
> index a384a0f..234be1c 100644
> --- a/fs/ubifs/tnc_commit.c
> +++ b/fs/ubifs/tnc_commit.c
> @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union
> ubifs_key *key,
> /**
> * layout_leb_in_gaps - layout index nodes using in-the-gaps method.
> * @c: UBIFS file-system description object
> - * @p: return LEB number here
> + * @p: return LEB number in @c->gap_lebs[p]
> *
> * This function lays out new index nodes for dirty znodes using in-the-gaps
> * method of TNC commit.
> @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union
> ubifs_key *key,
> * This function returns the number of index nodes written into the gaps, or
> a
> * negative error code on failure.
> */
> -static int layout_leb_in_gaps(struct ubifs_info *c, int *p)
> +static int layout_leb_in_gaps(struct ubifs_info *c, int p)
> {
> struct ubifs_scan_leb *sleb;
> struct ubifs_scan_node *snod;
> @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int
> *p)
> * filled, however we do not check there at present.
> */
> return lnum; /* Error code */
> - *p = lnum;
> + c->gap_lebs[p] = lnum;
> dbg_gc("LEB %d", lnum);
> /*
> * Scan the index LEB. We use the generic scan for this even though
> @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt)
> */
> static int layout_in_gaps(struct ubifs_info *c, int cnt)
> {
> - int err, leb_needed_cnt, written, *p;
> + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, *gap_lebs;
>
> dbg_gc("%d znodes to write", cnt);
>
> @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt)
> if (!c->gap_lebs)
> return -ENOMEM;
>