[PATCH v2] char: tpm: fix error return code in tpm_cr50_i2c_tis_recv()
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 3a253caaad11 ("char: tpm: add i2c driver for cr50") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/char/tpm/tpm_tis_i2c_cr50.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c index ec9a65e7887d..f19c227d20f4 100644 --- a/drivers/char/tpm/tpm_tis_i2c_cr50.c +++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c @@ -483,6 +483,7 @@ static int tpm_cr50_i2c_tis_recv(struct tpm_chip *chip, u8 *buf, size_t buf_len) expected = be32_to_cpup((__be32 *)(buf + 2)); if (expected > buf_len) { dev_err(>dev, "Buffer too small to receive i2c data\n"); + rc = -E2BIG; goto out_err; } -- 2.25.4
[PATCH] char: tpm: fix error return code in tpm_cr50_i2c_tis_recv()
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 3a253caaad11 ("char: tpm: add i2c driver for cr50") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/char/tpm/tpm_tis_i2c_cr50.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c index ec9a65e7887d..e908da78fbf4 100644 --- a/drivers/char/tpm/tpm_tis_i2c_cr50.c +++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c @@ -483,6 +483,7 @@ static int tpm_cr50_i2c_tis_recv(struct tpm_chip *chip, u8 *buf, size_t buf_len) expected = be32_to_cpup((__be32 *)(buf + 2)); if (expected > buf_len) { dev_err(>dev, "Buffer too small to receive i2c data\n"); + rc = -EIO; goto out_err; } -- 2.25.4
[PATCH] perf tools: Fix error return code in cmd_buildid_cache()
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: e3ed75bb537a8 ("perf buildid-cache: Move session...") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- tools/perf/builtin-buildid-cache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c index ecd0d3cb6f5c..f703ba11e12c 100644 --- a/tools/perf/builtin-buildid-cache.c +++ b/tools/perf/builtin-buildid-cache.c @@ -448,7 +448,8 @@ int cmd_buildid_cache(int argc, const char **argv) return PTR_ERR(session); } - if (symbol__init(session ? >header.env : NULL) < 0) + ret = symbol__init(session ? >header.env : NULL); + if (ret < 0) goto out; setup_pager(); -- 2.25.4
Re: [PATCH 3/4] ubifs: Update directory size when creating whiteouts
在 2021/1/25 15:55, Richard Weinberger 写道: The idea was that in the !whiteout case, sz_change is always 0. Oh, sz_change was initialized to 0, I missed it. Thanks.
Re: [PATCH 3/4] ubifs: Update directory size when creating whiteouts
在 2021/1/23 10:45, Zhihao Cheng 写道: @@ -430,6 +433,7 @@ static int do_tmpfile(struct inode *dir, struct dentry *dentry, return 0; out_cancel: Still one question: Does this need a judgment? Like this, if (whiteout) dir->i_size -= sz_change; + dir->i_size -= sz_change; mutex_unlock(_ui->ui_mutex); out_inode: make_bad_inode(inode); __ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/
Re: [PATCH 3/4] ubifs: Update directory size when creating whiteouts
在 2021/1/23 5:22, Richard Weinberger 写道: Although whiteouts are unlinked files they will get re-linked later, I just want to make sure, is this where the count is increased? do_rename -> inc_nlink(whiteout) therefore the size of the parent directory needs to be updated too. Cc: sta...@vger.kernel.org Fixes: 9e0a1fff8db5 ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Richard Weinberger --- fs/ubifs/dir.c | 4 1 file changed, 4 insertions(+) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 8a34e0118ee9..b5d523e5854f 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -361,6 +361,7 @@ static int do_tmpfile(struct inode *dir, struct dentry *dentry, struct ubifs_budget_req ino_req = { .dirtied_ino = 1 }; struct ubifs_inode *ui, *dir_ui = ubifs_inode(dir); int err, instantiated = 0; + int sz_change = 0; struct fscrypt_name nm; /* @@ -411,6 +412,8 @@ static int do_tmpfile(struct inode *dir, struct dentry *dentry, mark_inode_dirty(inode); drop_nlink(inode); *whiteout = inode; + sz_change = CALC_DENT_SIZE(fname_len()); + dir->i_size += sz_change; } else { d_tmpfile(dentry, inode); } @@ -430,6 +433,7 @@ static int do_tmpfile(struct inode *dir, struct dentry *dentry, return 0; out_cancel: Does this need a judgment? Like this, if (whiteout) dir->i_size -= sz_change; + dir->i_size -= sz_change; mutex_unlock(_ui->ui_mutex); out_inode: make_bad_inode(inode);
Re: [PATCH 1/4] ubifs: Correctly set inode size in ubifs_link()
在 2021/1/23 5:22, Richard Weinberger 写道: Reviewed-by: Zhihao Cheng diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 9a6b8660425a..04912dedca48 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -693,7 +693,7 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, struct inode *inode = d_inode(old_dentry); struct ubifs_inode *ui = ubifs_inode(inode); struct ubifs_inode *dir_ui = ubifs_inode(dir); - int err, sz_change = CALC_DENT_SIZE(dentry->d_name.len); + int err, sz_change; struct ubifs_budget_req req = { .new_dent = 1, .dirtied_ino = 2, .dirtied_ino_d = ALIGN(ui->data_len, 8) }; struct fscrypt_name nm; @@ -731,6 +731,8 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, if (inode->i_nlink == 0) ubifs_delete_orphan(c, inode->i_ino); + sz_change = CALC_DENT_SIZE(fname_len()); + inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode);
Re: [PATCH] ubifs: Fix memleak in ubifs_init_authentication
在 2021/1/5 14:03, Dinghao Liu 写道: When crypto_shash_digestsize() fails, c->hmac_tfm has not been freed before returning, which leads to memleak. Fixes: 49525e5eecca5 ("ubifs: Add helper functions for authentication support") Signed-off-by: Dinghao Liu Reviewed-by: Zhihao Cheng --- fs/ubifs/auth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ubifs/auth.c b/fs/ubifs/auth.c index 51a7c8c2c3f0..e564d5ff8781 100644 --- a/fs/ubifs/auth.c +++ b/fs/ubifs/auth.c @@ -327,7 +327,7 @@ int ubifs_init_authentication(struct ubifs_info *c) ubifs_err(c, "hmac %s is bigger than maximum allowed hmac size (%d > %d)", hmac_name, c->hmac_desc_len, UBIFS_HMAC_ARR_SZ); err = -EINVAL; - goto out_free_hash; + goto out_free_hmac; } err = crypto_shash_setkey(c->hmac_tfm, ukp->data, ukp->datalen);
Re: [PATCH v2] ubifs: Fix read out-of-bounds in ubifs_jnl_write_inode()
在 2020/12/24 7:07, Richard Weinberger 写道: Reproducer: 0. config KASAN && apply print.patch 1. mount ubifs on /root/temp 2. run test.sh What does test.sh do? Go to Link: https://bugzilla.kernel.org/show_bug.cgi?id=210865. test.sh creates a very long path file test_file, and then create a symbol link link_file for test_file, so ubifs inode for link_file will be assigned a big value for ui->data_len. When we change atime for link_file, ubifs_jnl_write_inode will be executed by wb_writeback. By this way, write_len could be not aligned with 8 bytes. 3. cd /root/temp && ls // change atime for link_file 4. wait 1~2 minutes In order to solve the read oob problem in ubifs_wbuf_write_nolock, just align the write_len to 8 bytes when alloc the memory. So that this patch will not affect the use of write_len in other functions, such as ubifs_jnl_write_inode->make_reservation and ubifs_jnl_write_inode->ubifs_node_calc_hash. I gave this a second thought and I'm not so sure anymore what exactly is going on. The problem is real, I fully agree with you but I need to dig deeper into the journal and wbuf code to double check that we really fix the right thing and not just paper other something. Thanks, //richard .
Re: [PATCH] ubifs: Fix read out-of-bounds in ubifs_jnl_write_inode()
在 2020/12/23 14:28, Chengsong Ke 写道: Reviewed-by: Zhihao Cheng From: kechengsong ubifs_jnl_write_inode() probably cause read out-of-bounds in some situation. There is kasan stack: [ 336.432159] BUG: KASAN: slab-out-of-bounds in ecc_sw_hamming_calculate+0x1dc/0x7d0 [ 336.433634] Read of size 4 at addr 888019612ff8 by task kworker/u8:4/135 [ 336.434605] [ 336.434830] CPU: 1 PID: 135 Comm: kworker/u8:4 Not tainted 5.10.0-11826-gaf2a097952f3-dirty #338 [ 336.436050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 336.437876] Workqueue: writeback wb_workfn (flush-ubifs_0_0) [ 336.438670] Call Trace: [ 336.439021] ? dump_stack+0xdd/0x126 [ 336.439513] ? print_address_description.constprop.0+0x2c/0x3c0 [ 336.440308] ? _raw_write_lock_irqsave+0x140/0x140 [ 336.440921] ? ecc_sw_hamming_calculate+0x1dc/0x7d0 [ 336.441546] ? ecc_sw_hamming_calculate+0x1dc/0x7d0 [ 336.442186] ? kasan_report.cold+0x5d/0xd8 [ 336.442711] ? nand_reset_op+0x280/0x310 [ 336.443218] ? ecc_sw_hamming_calculate+0x1dc/0x7d0 [ 336.443842] ? __asan_load4+0x77/0x120 [ 336.444334] ? ecc_sw_hamming_calculate+0x1dc/0x7d0 [ 336.444963] ? nand_ecc_sw_hamming_calculate+0x6c/0x80 [ 336.445619] ? rawnand_sw_hamming_calculate+0x12/0x20 [ 336.446263] ? nand_write_page_swecc+0xa9/0x160 [ 336.446849] ? nand_do_write_ops+0x390/0x830 [ 336.447406] ? __writeback_single_inode+0x6cc/0x880 [ 336.448041] ? nand_write_oob+0x78/0x100 [ 336.448568] ? mtd_write_oob_std+0xe2/0x160 [ 336.449127] ? mtd_write_oob+0xec/0x1b0 [ 336.449679] ? mtd_write+0x92/0xf0 [ 336.450128] ? mtd_write_oob+0x1b0/0x1b0 [ 336.450633] ? ubi_self_check_all_ff+0x82/0x2e0 [ubi] [ 336.451328] ? __list_add_valid+0x2b/0x130 [ 336.451865] ? ubi_io_write+0x2c2/0xa90 [ubi] [ 336.452472] ? _raw_read_lock_irq+0x90/0x90 [ 336.453078] ? kmem_cache_alloc_trace+0x465/0x8b0 [ 336.453749] ? do_sync_erase+0x350/0x350 [ubi] [ 336.454430] ? __kasan_check_write+0x20/0x30 [ 336.455050] ? down_write+0xf2/0x190 [ 336.455569] ? down_write_killable+0x1b0/0x1b0 [ 336.456221] ? check_mapping+0x2c/0x590 [ubi] [ 336.456890] ? ubi_eba_write_leb+0x58a/0xfa0 [ubi] [ 336.457618] ? __kmalloc+0x490/0x910 [ 336.458142] ? ubifs_jnl_write_inode.cold+0x6f/0x878 [ubifs] [ 336.459033] ? writeback_sb_inodes+0x3a9/0x9a0 [ 336.459672] ? __writeback_inodes_wb+0xc8/0x170 [ 336.460330] ? wb_writeback+0x637/0x700 [ 336.460882] ? wb_workfn+0x8af/0xb80 [ 336.461398] ? process_one_work+0x467/0x9f0 [ 336.462004] ? worker_thread+0x34d/0x8e0 [ 336.462582] ? kthread+0x204/0x280 [ 336.463047] ? ret_from_fork+0x1f/0x30 [ 336.463570] ? create_prof_cpu_mask+0x30/0x30 [ 336.464185] ? ubi_eba_read_leb_sg+0x1f0/0x1f0 [ubi] [ 336.464917] ? hrtimer_active+0x9b/0x100 [ 336.465468] ? ubi_leb_write+0x22c/0x2f0 [ubi] [ 336.466130] ? ubifs_leb_write+0xf2/0x1b0 [ubifs] [ 336.466851] ? ubifs_wbuf_write_nolock+0x412/0x1280 [ubifs] [ 336.467686] ? write_head+0xdf/0x1c0 [ubifs] [ 336.468355] ? ubifs_jnl_write_inode.cold+0x3ec/0x878 [ubifs] [ 336.469183] ? ret_from_fork+0x1e/0x30 [ 336.469707] ? ubifs_jnl_write_data+0x660/0x660 [ubifs] [ 336.470497] ? unwind_next_frame+0x247/0xca0 [ 336.471095] ? ret_from_fork+0x1f/0x30 [ 336.471574] ? fprop_reflect_period_percpu.isra.0+0x1f/0x1b0 [ 336.472335] ? generic_writepages+0x93/0x140 [ 336.472933] ? __kasan_check_write+0x20/0x30 [ 336.473526] ? mutex_lock+0xa6/0x110 [ 336.474031] ? __mutex_lock_slowpath+0x30/0x30 [ 336.474662] ? ubifs_write_inode+0x1c3/0x290 [ubifs] [ 336.475446] ? __writeback_single_inode+0x6cc/0x880 [ 336.476155] ? wbc_attach_and_unlock_inode+0x2b6/0x400 [ 336.476891] ? writeback_sb_inodes+0x3a9/0x9a0 [ 336.477528] ? write_inode_now+0x1e0/0x1e0 [ 336.478119] ? __writeback_inodes_wb+0xc8/0x170 [ 336.478770] ? wb_writeback+0x637/0x700 [ 336.479326] ? __writeback_inodes_wb+0x170/0x170 [ 336.479992] ? current_work+0xa0/0xa0 [ 336.480524] ? _find_next_bit.constprop.0+0x3e/0x140 [ 336.481241] ? find_next_bit+0x18/0x30 [ 336.481780] ? cpumask_next+0x2f/0x40 [ 336.482312] ? wb_workfn+0x8af/0xb80 [ 336.482832] ? update_cfs_group+0x1e/0x1b0 [ 336.483421] ? inode_wait_for_writeback+0x60/0x60 [ 336.484106] ? schedule+0xb7/0x240 [ 336.484595] ? finish_task_switch+0x14e/0x9a0 [ 336.485225] ? __kasan_check_write+0x20/0x30 [ 336.485841] ? __schedule+0x6f4/0x1600 [ 336.486382] ? __kasan_check_read+0x1d/0x30 [ 336.486981] ? read_word_at_a_time+0x16/0x30 [ 336.487594] ? process_one_work+0x467/0x9f0 [ 336.488198] ? worker_thread+0x34d/0x8e0 [ 336.488762] ? rescuer_thread+0x820/0x820 [ 336.489344] ? kthread+0x204/0x280 [ 336.489839] ? kthread_bind+0x50/0x50 [ 336.490367] ? ret_from_fork+0x1f/0x30 [ 336.490913] [ 336.491138] Allocated by task 135: [ 336.491629] kasan_save_stack+0x23/0x60 [ 336.492189] __kasan_kmalloc.constprop.0+0x10b/0x120
[PATCH v2] btrfs: free-space-cache: Fix error return code in __load_free_space_cache
Fix to return the error code(instead always 0) when memory allocating failed in __load_free_space_cache(). This lacks the analysis of consequences, so there's only one caller and that will treat values <=0 as 'cache not loaded'. There's no functional change but otherwise the error values should be there for clarity. Fixes: a67509c30079f4c50 ("Btrfs: add a io_ctl struct and helpers for dealing with the space cache") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- fs/btrfs/free-space-cache.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index af0013d3df63..ae4059ce2f84 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -744,8 +744,10 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, while (num_entries) { e = kmem_cache_zalloc(btrfs_free_space_cachep, GFP_NOFS); - if (!e) + if (!e) { + ret = -ENOMEM; goto free_cache; + } ret = io_ctl_read_entry(_ctl, e, ); if (ret) { @@ -764,6 +766,7 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, e->trim_state = BTRFS_TRIM_STATE_TRIMMED; if (!e->bytes) { + ret = -1; kmem_cache_free(btrfs_free_space_cachep, e); goto free_cache; } @@ -784,6 +787,7 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, e->bitmap = kmem_cache_zalloc( btrfs_free_space_bitmap_cachep, GFP_NOFS); if (!e->bitmap) { + ret = -ENOMEM; kmem_cache_free( btrfs_free_space_cachep, e); goto free_cache; -- 2.25.4
[PATCH] dmaengine: mv_xor_v2: Fix error return code in mv_xor_v2_probe()
Return the corresponding error code when first_msi_entry() returns NULL in mv_xor_v2_probe(). Fixes: 19a340b1a820430 ("dmaengine: mv_xor_v2: new driver") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/dma/mv_xor_v2.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/dma/mv_xor_v2.c b/drivers/dma/mv_xor_v2.c index 2753a6b916f6..9b0d463f89bb 100644 --- a/drivers/dma/mv_xor_v2.c +++ b/drivers/dma/mv_xor_v2.c @@ -771,8 +771,10 @@ static int mv_xor_v2_probe(struct platform_device *pdev) goto disable_clk; msi_desc = first_msi_entry(>dev); - if (!msi_desc) + if (!msi_desc) { + ret = -ENODEV; goto free_msi_irqs; + } xor_dev->msi_desc = msi_desc; ret = devm_request_irq(>dev, msi_desc->irq, -- 2.25.4
[PATCH] mmc: pxamci: Fix error return code in pxamci_probe
Fix to return the error code from devm_gpiod_get_optional() instaed of 0 in pxamci_probe(). Fixes: f54005b508b9a9d9c ("mmc: pxa: Use GPIO descriptor for power") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/mmc/host/pxamci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c index 29f6180a0036..316393c694d7 100644 --- a/drivers/mmc/host/pxamci.c +++ b/drivers/mmc/host/pxamci.c @@ -731,6 +731,7 @@ static int pxamci_probe(struct platform_device *pdev) host->power = devm_gpiod_get_optional(dev, "power", GPIOD_OUT_LOW); if (IS_ERR(host->power)) { + ret = PTR_ERR(host->power); dev_err(dev, "Failed requesting gpio_power\n"); goto out; } -- 2.25.4
[PATCH] btrfs: free-space-cache: Fix error return code in __load_free_space_cache
Fix to return the error code(instead always 0) when memory allocating failed in __load_free_space_cache(). Fixes: a67509c30079f4c50 ("Btrfs: add a io_ctl struct and helpers ...") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- fs/btrfs/free-space-cache.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index af0013d3df63..ae4059ce2f84 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -744,8 +744,10 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, while (num_entries) { e = kmem_cache_zalloc(btrfs_free_space_cachep, GFP_NOFS); - if (!e) + if (!e) { + ret = -ENOMEM; goto free_cache; + } ret = io_ctl_read_entry(_ctl, e, ); if (ret) { @@ -764,6 +766,7 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, e->trim_state = BTRFS_TRIM_STATE_TRIMMED; if (!e->bytes) { + ret = -1; kmem_cache_free(btrfs_free_space_cachep, e); goto free_cache; } @@ -784,6 +787,7 @@ static int __load_free_space_cache(struct btrfs_root *root, struct inode *inode, e->bitmap = kmem_cache_zalloc( btrfs_free_space_bitmap_cachep, GFP_NOFS); if (!e->bitmap) { + ret = -ENOMEM; kmem_cache_free( btrfs_free_space_cachep, e); goto free_cache; -- 2.25.4
[PATCH] drivers: soc: ti: knav_qmss_queue: Fix error return code in knav_queue_probe
Fix to return the error code from of_get_child_by_name() instaed of 0 in knav_queue_probe(). Fixes: 41f93af900a20d1a0a ("soc: ti: add Keystone Navigator QMSS driver") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/soc/ti/knav_qmss_queue.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/soc/ti/knav_qmss_queue.c b/drivers/soc/ti/knav_qmss_queue.c index a460f201bf8e..a6a2138d5c29 100644 --- a/drivers/soc/ti/knav_qmss_queue.c +++ b/drivers/soc/ti/knav_qmss_queue.c @@ -1851,9 +1851,10 @@ static int knav_queue_probe(struct platform_device *pdev) if (ret) goto err; - regions = of_get_child_by_name(node, "descriptor-regions"); + regions = of_get_child_by_name(node, "descriptor-regions"); if (!regions) { dev_err(dev, "descriptor-regions not specified\n"); + ret = -ENODEV; goto err; } ret = knav_queue_setup_regions(kdev, regions); -- 2.25.4
Re: [PATCH] ubifs: wbuf: Don't leak kernel memory to flash
在 2020/11/17 16:43, Richard Weinberger 写道: On Tue, Nov 17, 2020 at 2:28 AM Zhihao Cheng wrote: Reviewed-by: Zhihao Cheng Thanks for reviewing, highly appreciated! You're welcome. Actually I've been following the linux-mtd. It's just that this patch isn't complicated, so I checked it. :-)
Re: [PATCH] ubifs: wbuf: Don't leak kernel memory to flash
在 2020/11/17 5:05, Richard Weinberger 写道: Write buffers use a kmalloc()'ed buffer, they can leak up to seven bytes of kernel memory to flash if writes are not aligned. So use ubifs_pad() to fill these gaps with padding bytes. This was never a problem while scanning because the scanner logic manually aligns node lengths and skips over these gaps. Cc: Fixes: 1e51764a3c2ac05a2 ("UBIFS: add new flash file system") Signed-off-by: Richard Weinberger --- fs/ubifs/io.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c index 7e4bfaf2871f..eae9cf5a57b0 100644 --- a/fs/ubifs/io.c +++ b/fs/ubifs/io.c @@ -319,7 +319,7 @@ void ubifs_pad(const struct ubifs_info *c, void *buf, int pad) { uint32_t crc; - ubifs_assert(c, pad >= 0 && !(pad & 7)); + ubifs_assert(c, pad >= 0); if (pad >= UBIFS_PAD_NODE_SZ) { struct ubifs_ch *ch = buf; @@ -764,6 +764,10 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len) * write-buffer. */ memcpy(wbuf->buf + wbuf->used, buf, len); + if (aligned_len > len) { + ubifs_assert(c, aligned_len - len < 8); + ubifs_pad(c, wbuf->buf + wbuf->used + len, aligned_len - len); + } if (aligned_len == wbuf->avail) { dbg_io("flush jhead %s wbuf to LEB %d:%d", @@ -856,13 +860,18 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len) } spin_lock(>lock); - if (aligned_len) + if (aligned_len) { /* * And now we have what's left and what does not take whole * max. write unit, so write it to the write-buffer and we are * done. */ memcpy(wbuf->buf, buf + written, len); + if (aligned_len > len) { + ubifs_assert(c, aligned_len - len < 8); + ubifs_pad(c, wbuf->buf + len, aligned_len - len); + } + } if (c->leb_size - wbuf->offs >= c->max_write_size) wbuf->size = c->max_write_size; Reviewed-by: Zhihao Cheng
[PATCH v2] spi: cadence-quadspi: Fix error return code in cqspi_probe
Fix to return the error code from devm_reset_control_get_optional_exclusive() instaed of 0 in cqspi_probe(). Fixes: 31fb632b5d43ca ("spi: Move cadence-quadspi driver to drivers/spi/") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/spi/spi-cadence-quadspi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 40938cf3806d..ba7d40c2922f 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -1260,12 +1260,14 @@ static int cqspi_probe(struct platform_device *pdev) /* Obtain QSPI reset control */ rstc = devm_reset_control_get_optional_exclusive(dev, "qspi"); if (IS_ERR(rstc)) { + ret = PTR_ERR(rstc); dev_err(dev, "Cannot get QSPI reset.\n"); goto probe_reset_failed; } rstc_ocp = devm_reset_control_get_optional_exclusive(dev, "qspi-ocp"); if (IS_ERR(rstc_ocp)) { + ret = PTR_ERR(rstc_ocp); dev_err(dev, "Cannot get QSPI OCP reset.\n"); goto probe_reset_failed; } -- 2.25.4
[PATCH] i2c: qup: Fix error return code in qup_i2c_bam_schedule_desc()
Fix to return the error code from qup_i2c_change_state() instaed of 0 in qup_i2c_bam_schedule_desc(). Fixes: fbf9921f8b35d9b2 ("i2c: qup: Fix error handling") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/i2c/busses/i2c-qup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c index fbc04b60cfd1..5a47915869ae 100644 --- a/drivers/i2c/busses/i2c-qup.c +++ b/drivers/i2c/busses/i2c-qup.c @@ -801,7 +801,8 @@ static int qup_i2c_bam_schedule_desc(struct qup_i2c_dev *qup) if (ret || qup->bus_err || qup->qup_err) { reinit_completion(>xfer); - if (qup_i2c_change_state(qup, QUP_RUN_STATE)) { + ret = qup_i2c_change_state(qup, QUP_RUN_STATE); + if (ret) { dev_err(qup->dev, "change to run state timed out"); goto desc_err; } -- 2.25.4
[PATCH] spi: cadence-quadspi: Fix error return code in cqspi_probe
Fix to return the error code from devm_reset_control_get_optional_exclusive() instaed of 0 in cqspi_probe(). Fixes: 31fb632b5d43ca ("spi: Move cadence-quadspi driver to drivers/spi/") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- drivers/spi/spi-cadence-quadspi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 40938cf3806d..22d7158edb71 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -1260,12 +1260,14 @@ static int cqspi_probe(struct platform_device *pdev) /* Obtain QSPI reset control */ rstc = devm_reset_control_get_optional_exclusive(dev, "qspi"); if (IS_ERR(rstc)) { + ret = PTR_ERR(rstc); dev_err(dev, "Cannot get QSPI reset.\n"); goto probe_reset_failed; } rstc_ocp = devm_reset_control_get_optional_exclusive(dev, "qspi-ocp"); if (IS_ERR(rstc_ocp)) { + ret = PTR_ERR(rstc); dev_err(dev, "Cannot get QSPI OCP reset.\n"); goto probe_reset_failed; } -- 2.25.4
[PATCH v2] binfmt_elf_fdpic: return corresponding errcode if create_elf_fdpic_tables() fail
Function load_elf_fdpic_binary() may return 0 to caller if create_elf_fdpic_tables() fail, which will misslead caller to continue running without handling errors. Fixes: 1da177e4c3f41524e886 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng --- fs/binfmt_elf_fdpic.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index be4062b8ba75..6243abf3f8f3 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -434,8 +434,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm) current->mm->start_stack = current->mm->start_brk + stack_size; #endif - if (create_elf_fdpic_tables(bprm, current->mm, - _params, _params) < 0) + retval = create_elf_fdpic_tables(bprm, current->mm, + _params, _params); + if (retval < 0) goto error; kdebug("- start_code %lx", current->mm->start_code); -- 2.25.4
Re: [PATCH RFC 0/5] ubifs: Prevent memory oob accessing while dumping node
在 2020/6/16 15:11, Zhihao Cheng 写道: We use function ubifs_dump_node() to dump bad node caused by some reasons (Such as bit flipping caused by hardware error, writing bypass ubifs or unknown bugs in ubifs). The node content can not be trusted anymore, so we should prevent memory out-of-bounds accessing while dumping node in following situations: 1. bad node_len: Dumping data according to 'ch->len' which may exceed the size of memory allocated for node. 2. bad node content: Some kinds of node can record additional data, eg. index node and orphan node, make sure the size of additional data not beyond the node length. 3. node_type changes: Read data according to type A, but expected type B, before that, node is allocated according to type B's size. Length of type A node is greater than type B node. Commit acc5af3efa303d5f3 ("ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len") handles situation 1 for data node only, it would be better if we can solve problems in above situations for all kinds of nodes. Patch 1 adds a new parameter 'node_len'(size of memory which is allocated for the node) in function ubifs_dump_node(), safe dumping length of the node should be: minimum(ch->len, c->ranges[node_type].max_len, node_len). Besides, c->ranges[node_type].min_len can not greater than safe dumping length, which may caused by node_type changes(situation 3). Patch 2 reverts commit acc5af3efa303d5f ("ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len") to prepare for patch 3. Patch 3 replaces modified function ubifs_dump_node() in all node dumping places except for ubifs_dump_sleb(). Patch 4 removes unused function ubifs_dump_sleb(), Patch 5 allows ubifs_dump_node() to dump all branches of the index node. Some tests after patchset applied: https://bugzilla.kernel.org/show_bug.cgi?id=208203 Zhihao Cheng (5): ubifs: Limit dumping length by size of memory which is allocated for the node Revert "ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len" ubifs: Pass node length in all node dumping callers ubifs: ubifs_dump_sleb: Remove unused function ubifs: ubifs_dump_node: Dump all branches of the index node fs/ubifs/commit.c | 4 +- fs/ubifs/debug.c| 111 ++-- fs/ubifs/debug.h| 5 +- fs/ubifs/file.c | 2 +- fs/ubifs/io.c | 37 +-- fs/ubifs/journal.c | 3 +- fs/ubifs/master.c | 4 +- fs/ubifs/orphan.c | 6 ++- fs/ubifs/recovery.c | 6 +-- fs/ubifs/replay.c | 4 +- fs/ubifs/sb.c | 2 +- fs/ubifs/scan.c | 4 +- fs/ubifs/super.c| 2 +- fs/ubifs/tnc.c | 8 ++-- fs/ubifs/tnc_misc.c | 4 +- fs/ubifs/ubifs.h| 4 +- 16 files changed, 108 insertions(+), 98 deletions(-) ping, although it is not a serious problem for ubifs, but dumping extra memory by formating specified ubifs img may cause security problem.
[PATCH 1/3] ubifs: Fix a memleak after dumping authentication mount options
Fix a memory leak after dumping authentication mount options in error handling branch. Signed-off-by: Zhihao Cheng Cc: # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") --- fs/ubifs/super.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c index a2420c900275..6f85cd618766 100644 --- a/fs/ubifs/super.c +++ b/fs/ubifs/super.c @@ -1141,6 +1141,18 @@ static int ubifs_parse_options(struct ubifs_info *c, char *options, return 0; } +/* + * ubifs_release_options - release mount parameters which have been dumped. + * @c: UBIFS file-system description object + */ +static void ubifs_release_options(struct ubifs_info *c) +{ + kfree(c->auth_key_name); + c->auth_key_name = NULL; + kfree(c->auth_hash_name); + c->auth_hash_name = NULL; +} + /** * destroy_journal - destroy journal data structures. * @c: UBIFS file-system description object @@ -1650,8 +1662,7 @@ static void ubifs_umount(struct ubifs_info *c) ubifs_lpt_free(c, 0); ubifs_exit_authentication(c); - kfree(c->auth_key_name); - kfree(c->auth_hash_name); + ubifs_release_options(c); kfree(c->cbuf); kfree(c->rcvrd_mst_node); kfree(c->mst_node); @@ -2219,6 +2230,7 @@ static int ubifs_fill_super(struct super_block *sb, void *data, int silent) out_unlock: mutex_unlock(>umount_mutex); out_close: + ubifs_release_options(c); ubi_close_volume(c->ubi); out: return err; -- 2.25.4
[PATCH 2/3] ubifs: Don't parse authentication mount options in remount process
There is no need to dump authentication options while remounting, because authentication initialization can only be doing once in the first mount process. Dumping authentication mount options in remount process may cause memory leak if UBIFS has already been mounted with old authentication mount options. Signed-off-by: Zhihao Cheng Cc: # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") --- fs/ubifs/super.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c index 6f85cd618766..9796f5df2f7f 100644 --- a/fs/ubifs/super.c +++ b/fs/ubifs/super.c @@ -1110,14 +1110,20 @@ static int ubifs_parse_options(struct ubifs_info *c, char *options, break; } case Opt_auth_key: - c->auth_key_name = kstrdup(args[0].from, GFP_KERNEL); - if (!c->auth_key_name) - return -ENOMEM; + if (!is_remount) { + c->auth_key_name = kstrdup(args[0].from, + GFP_KERNEL); + if (!c->auth_key_name) + return -ENOMEM; + } break; case Opt_auth_hash_name: - c->auth_hash_name = kstrdup(args[0].from, GFP_KERNEL); - if (!c->auth_hash_name) - return -ENOMEM; + if (!is_remount) { + c->auth_hash_name = kstrdup(args[0].from, + GFP_KERNEL); + if (!c->auth_hash_name) + return -ENOMEM; + } break; case Opt_ignore: break; -- 2.25.4
[PATCH 3/3] ubifs: mount_ubifs: Release authentication resource in error handling path
Release the authentication related resource in some error handling branches in mount_ubifs(). Signed-off-by: Zhihao Cheng Cc: # 4.20+ Fixes: d8a22773a12c6d7 ("ubifs: Enable authentication support") --- fs/ubifs/super.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c index 9796f5df2f7f..732218ef6656 100644 --- a/fs/ubifs/super.c +++ b/fs/ubifs/super.c @@ -1331,7 +1331,7 @@ static int mount_ubifs(struct ubifs_info *c) err = ubifs_read_superblock(c); if (err) - goto out_free; + goto out_auth; c->probing = 0; @@ -1343,18 +1343,18 @@ static int mount_ubifs(struct ubifs_info *c) ubifs_err(c, "'compressor \"%s\" is not compiled in", ubifs_compr_name(c, c->default_compr)); err = -ENOTSUPP; - goto out_free; + goto out_auth; } err = init_constants_sb(c); if (err) - goto out_free; + goto out_auth; sz = ALIGN(c->max_idx_node_sz, c->min_io_size) * 2; c->cbuf = kmalloc(sz, GFP_NOFS); if (!c->cbuf) { err = -ENOMEM; - goto out_free; + goto out_auth; } err = alloc_wbufs(c); @@ -1629,6 +1629,8 @@ static int mount_ubifs(struct ubifs_info *c) free_wbufs(c); out_cbuf: kfree(c->cbuf); +out_auth: + ubifs_exit_authentication(c); out_free: kfree(c->write_reserve_buf); kfree(c->bu.buf); -- 2.25.4
Re: [PATCH 1/2] ubifs: xattr: Fix some potential memory leaks while iterating entries
在 2020/9/14 3:08, Richard Weinberger 写道: On Mon, Jun 1, 2020 at 11:11 AM Zhihao Cheng wrote: I agree that this needs fixing. Did you also look into getting rid of pxent? UBIFS uses the pxent pattern over and over and the same error got copy pasted a lot. :-( I thought about it. I'm not sure whether it is good to drop 'pxent'. Maybe you mean doing changes looks like following(Takes ubifs_jnl_write_inode() for example): diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index 4a5b06f8d812..fcd5ac047b34 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -879,13 +879,19 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) union ubifs_key key; struct fscrypt_name nm = {0}; struct inode *xino; - struct ubifs_dent_node *xent, *pxent = NULL; + struct ubifs_dent_node *xent; if (ui->xattr_cnt >= ubifs_xattr_max_cnt(c)) { ubifs_err(c, "Cannot delete inode, it has too much xattrs!"); goto out_release; } + fname_name() = kmalloc(UBIFS_MAX_NLEN, GFP_NOFS); + if (!fname_name()) { + err = -ENOMEM; + goto out_release; + } + lowest_xent_key(c, , inode->i_ino); while (1) { xent = ubifs_tnc_next_ent(c, , ); @@ -894,11 +900,12 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) if (err == -ENOENT) break; + kfree(fname_name()); goto out_release; } - fname_name() = xent->name; fname_len() = le16_to_cpu(xent->nlen); + strncpy(fname_name(), xent->name, fname_len()); xino = ubifs_iget(c->vfs_sb, le64_to_cpu(xent->inum)); if (IS_ERR(xino)) { @@ -907,6 +914,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) xent->name, err); ubifs_ro_mode(c, err); kfree(xent); + kfree(fname_name()); goto out_release; } ubifs_assert(c, ubifs_inode(xino)->xattr); @@ -916,11 +924,10 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) ino = (void *)ino + UBIFS_INO_NODE_SZ; iput(xino); - kfree(pxent); - pxent = xent; key_read(c, >key, ); + kfree(xent); } - kfree(pxent); + kfree(fname_name()); } pack_inode(c, ino, inode, 1); The kill_xattrs process is more intuitive without the pxent. However, the release process for the memory (stores xent->name) is similar to 'pxent'. If you think it's better than v1, I will send v2.
[PATCH] ubifs: setflags: Don't show error message when vfs_ioc_setflags_prepare() fails
Following process will trigger ubifs_err: 1. useradd -m freg(Under root) 2. cd /home/freg && mkdir mp (Under freg) 3. mount -t ubifs /dev/ubi0_0 /home/freg/mp (Under root) 4. cd /home/freg && echo 123 > mp/a (Under root) 5. cd mp && chown freg a && chgrp freg a && chmod 777 a (Under root) 6. chattr +i a(Under freg) UBIFS error (ubi0:0 pid 1723): ubifs_ioctl [ubifs]: can't modify inode 65 attributes chattr: Operation not permitted while setting flags on a This is not an UBIFS problem, it was caused by task priviliage checking on file operations. Remove error message printing from kernel just like other filesystems (eg. ext4), since we already have enough information from userspace tools. Signed-off-by: Zhihao Cheng --- fs/ubifs/ioctl.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/ubifs/ioctl.c b/fs/ubifs/ioctl.c index 3df9be2c684c..4363d85a3fd4 100644 --- a/fs/ubifs/ioctl.c +++ b/fs/ubifs/ioctl.c @@ -134,7 +134,6 @@ static int setflags(struct inode *inode, int flags) return err; out_unlock: - ubifs_err(c, "can't modify inode %lu attributes", inode->i_ino); mutex_unlock(>ui_mutex); ubifs_release_budget(c, ); return err; -- 2.25.4
[PATCH] ubifs: ubifs_jnl_change_xattr: Remove assertion 'nlink > 0' for host inode
Changing xattr of a temp file will trigger following assertion failed and make ubifs turn into readonly filesystem: ubifs_assert_failed [ubifs]: UBIFS assert failed: host->i_nlink > 0, in fs/ubifs/journal.c:1801 Reproducer: 1. fd = open(__O_TMPFILE) 2. fsetxattr(fd, key, value2, XATTR_CREATE) 3. fsetxattr(fd, key, value2, XATTR_REPLACE) Fix this by removing assertion 'nlink > 0' for host inode. Reported-by: Chengsong Ke Signed-off-by: Zhihao Cheng --- fs/ubifs/journal.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index 4a5b06f8d812..0b9f686d1d42 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -1798,7 +1798,6 @@ int ubifs_jnl_change_xattr(struct ubifs_info *c, const struct inode *inode, u8 hash[UBIFS_HASH_ARR_SZ]; dbg_jnl("ino %lu, ino %lu", host->i_ino, inode->i_ino); - ubifs_assert(c, host->i_nlink > 0); ubifs_assert(c, inode->i_nlink > 0); ubifs_assert(c, mutex_is_locked(_ui->ui_mutex)); -- 2.25.4
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
在 2020/8/8 3:29, Richard Weinberger 写道: On Fri, Aug 7, 2020 at 4:18 AM Zhihao Cheng wrote: Maybe it's just me being dense and in need for a vacation. ;-) I have quite a few ubi/ubifs patches in pending list, may you comment/check them before 5.9 ending please? thanks. \( ̄▽ ̄) For example: https://patchwork.ozlabs.org/project/linux-mtd/patch/20200601091037.3794172-2-chengzhih...@huawei.com/ https://patchwork.ozlabs.org/project/linux-mtd/patch/20200602112410.660785-1-chengzhih...@huawei.com/ https://patchwork.ozlabs.org/project/linux-mtd/cover/20200616071146.2607061-1-chengzhih...@huawei.com/
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
在 2020/8/7 4:15, Richard Weinberger 写道: On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng wrote: Er, I can't get the point. I can list two possible situations, did I miss other situations? Yes. You keep ignoring the case I brought up. Let's start from scratch, maybe I miss something. So I'm sorry for being persistent. Never mind, we're all trying to figure it out. :-) . Besides, I'm not good at expressing question in English. (In Practicing...) The ubi thread can be reduced to a loop like this one: 1. for (;;) { 2. if (kthread_should_stop()) 3. break; 4. 5. if ( /* no work pending*/ ){ 6. set_current_state(TASK_INTERRUPTIBLE); 7. schedule(); 8. continue; 9. } 10. 11. do_work(); 12. } syzcaller found a case where stopping the thread did not work. If another task tries to stop the thread while no work is pending and the program counter in the thread is between lines 5 and 6, the kthread_stop() instruction has no effect. It has no effect because the thread sets the thread state to interruptible sleep and then schedules away. This is a common anti-pattern in the Linux kernel, sadly. Yes, but UBIFS is the exception, my solution looks like UBIFS. int ubifs_bg_thread(void *info) { while(1) { if (kthread_should_stop()) break; set_current_state(TASK_INTERRUPTIBLE); if (!c->need_bgt) { /* * Nothing prevents us from going sleep now and * be never woken up and block the task which * could wait in 'kthread_stop()' forever. */ if (kthread_should_stop()) break; schedule(); continue; } } } Do you agree with me so far or do you think syzcaller found a different issue? Yes, I agree. Your patch changes the loop as follows: 1. for (;;) { 2. if (kthread_should_stop()) 3. break; 4. 5. if ( /* no work pending*/ ){ 6. set_current_state(TASK_INTERRUPTIBLE); 7. 8. if (kthread_should_stop()) { 9. set_current_state(TASK_RUNNING); 10. break; 11. } 12. 13. schedule(); 14. continue; 15. } 16. 17. do_work(); 18. } That way there is a higher chance that the thread sees the stop flag and gracefully terminates, I fully agree on that. There's no disagreement so far. But it does not completely solve the problem. If kthread_stop() happens while the program counter of the ubi thread is at line 12, the stop flag is still missed and we end up in interruptible sleep just like before. That's where we hold different views. I have 3 viewpoints(You can point out which one you disagree.): 1. If kthread_stop() happens at line 12, ubi thread is *marked* with stop flag, it will stop at kthread_should_stop() as long as it can reach the next iteration. 2. If task A is on runqueue and its state is TASK_RUNNING, task A will be scheduled to execute. 3. If kthread_stop() happens at line 12, after program counter going to line 14, ubi thead is on runqueue and its state is TASK_RUNNING. I have explained this in situation 1 in last session. I mean ubi thread is on runqueue with TASK_RUNNING state & stop flag after the process you described. Line 12 kthread_stop() set_bit(mark stop flag) && wake_up_process(enqueue && set TASK_RUNNING ) => TASK_RUNNING & stop flag & on runqueue Line 13 schedule() Do nothing but pick next task to execute So, to solve the problem entirely I suggest changing schedule() to schedule_timeout() and let the thread wake up periodically.
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
在 2020/8/5 5:56, Richard Weinberger 写道: On Tue, Aug 4, 2020 at 4:58 AM Zhihao Cheng wrote: Oh, you're thinking about influence by schedule(), I get it. But I think it still works. Because the ubi_thread is still on runqueue, it will be scheduled to execute later anyway. It will not get woken. This is the problem. opstate of ubi_thread on runqueue set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE Yes if (kthread_should_stop()) // not satisfy TASK_INTERRUPTIBLE Yes kthread_stop: wake_up_process ttwu_queue ttwu_do_activate ttwu_do_wakeup TASK_RUNNING Yes schedule __schedule(false) // prev->state is TASK_RUNNING, so we cannot move it from runqueue by deactivate_task(). So just pick next task to execute, ubi_thread is still on runqueue and will be scheduled to execute later. It will be in state TASK_RUNNING only if your check is reached. If kthread_stop() is called *before* your code: + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } ...everything is fine. But there is still a race window between your if (kthread_should_stop()) and schedule() in the next line. So if kthread_stop() is called right *after* the if and *before* schedule(), the task state is still TASK_INTERRUPTIBLE --> schedule() will not return unless the task is explicitly woken, which does not happen. Er, I can't get the point. I can list two possible situations, did I miss other situations? P1:ubi_thread set_current_state(TASK_INTERRUPTIBLE) if (kthread_should_stop()) { set_current_state(TASK_RUNNING) break } schedule() -> don't *remove* task from runqueue if *TASK_RUNNING*, removing operation is protected by rq_lock P2:kthread_stop set_bit(KTHREAD_SHOULD_STOP, >flags) wake_up_process(k) -> enqueue task & set *TASK_RUNNING*, these two operations are protected by rq_lock wait_for_completion(>exited) Situation 1: P1_set_current_state on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE P1_kthread_should_stop on-rq, TASK_INTERRUPTIBLE P2_set_bit on-rq, TASK_INTERRUPTIBLE , KTHREAD_SHOULD_STOP P2_wake_up_process on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING , KTHREAD_SHOULD_STOP P1_schedule on-rq, TASK_RUNNING , KTHREAD_SHOULD_STOP P2_wait_for_completion // wait for P1 exit Situation 2: P1_set_current_state on-rq, TASK_RUNNING -> TASK_INTERRUPTIBLE P1_kthread_should_stop on-rq, TASK_INTERRUPTIBLE P2_set_bit on-rq, TASK_INTERRUPTIBLE , KTHREAD_SHOULD_STOP P1_schedule off-rq, TASK_INTERRUPTIBLE , KTHREAD_SHOULD_STOP P2_wake_up_process on-rq, TASK_INTERRUPTIBLE -> TASK_RUNNING , KTHREAD_SHOULD_STOP P2_wait_for_completion // wait for P1 exit Before your patch, the race window was much larger, I fully agree, but your patch does not cure the problem it just makes it harder to hit. And using mdelay() to verify such a thing is also tricky because mdelay() will influence the task state.
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
在 2020/8/4 6:11, Richard Weinberger 写道: On Mon, Aug 3, 2020 at 4:01 AM Zhihao Cheng wrote: Hmm, I see the problem but I fear this patch does not cure the race completely. It just lowers the chance to hit it. What if KTHREAD_SHOULD_STOP is set right after you checked for it? The patch can handle this case. ubi_thread will exit at kthread_should_stop() in next iteration. How can it reach the next iteration? Maybe I didn't fully get your explanation. As far as I understand the problem correctly, the following happens: 1. ubi_thread is running and the program counter is somewhere between "if (kthread_should_stop())" and schedule() 2. While detaching kthread_stop() is called 3. Since the program counter in the thread is right before schedule(), it does not check KTHREAD_SHOULD_STOP and blindly calls into schedule() 4. The thread goes to sleep and nothing wakes it anymore -> endless wait. Is this correct so far? Oh, you're thinking about influence by schedule(), I get it. But I think it still works. Because the ubi_thread is still on runqueue, it will be scheduled to execute later anyway. op state of ubi_thread on runqueue set_current_state(TASK_INTERRUPTIBLE) TASK_INTERRUPTIBLE Yes if (kthread_should_stop()) // not satisfy TASK_INTERRUPTIBLE Yes kthread_stop: wake_up_process ttwu_queue ttwu_do_activate ttwu_do_wakeup TASK_RUNNING Yes schedule __schedule(false) // prev->state is TASK_RUNNING, so we cannot move it from runqueue by deactivate_task(). So just pick next task to execute, ubi_thread is still on runqueue and will be scheduled to execute later. The test patch added mdelay(5000) before schedule(), which can make sure kthread_stop()->wake_up_process() executed before schedule(). Previous analysis can be proved through test. @@ -1638,6 +1641,15 @@ int ubi_thread(void *u) !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) { set_current_state(TASK_INTERRUPTIBLE); spin_unlock(>wl_lock); + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + + pr_err("Check should stop B\n"); + mdelay(5000); + pr_err("delay 5000ms \n"); + schedule(); continue; } Your solution is putting another check for KTHREAD_SHOULD_STOP before schedule(). I argue that this will just reduce the chance to hit the race window because it can still happen that kthread_stop() is being called right after the second check and again before schedule(). Then we end up with the same situation.
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
在 2020/8/3 5:25, Richard Weinberger 写道: On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng wrote: A detach hung is possible when a race occurs between the detach process and the ubi background thread. The following sequences outline the race: ubi thread: if (list_empty(>works)... ubi detach: set_bit(KTHREAD_SHOULD_STOP, >flags) => by kthread_stop() wake_up_process() => ubi thread is still running, so 0 is returned ubi thread: set_current_state(TASK_INTERRUPTIBLE) schedule() => ubi thread will never be scheduled again ubi detach: wait_for_completion() => hung task! To fix that, we need to check kthread_should_stop() after we set the task state, so the ubi thread will either see the stop bit and exit or the task state is reset to runnable such that it isn't scheduled out indefinitely. Signed-off-by: Zhihao Cheng Cc: Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images") Reported-by: syzbot+853639d0cb16c31c7...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 5146cce5fe32..a4d4343053d7 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1628,6 +1628,19 @@ int ubi_thread(void *u) !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) { set_current_state(TASK_INTERRUPTIBLE); spin_unlock(>wl_lock); + + /* +* Check kthread_should_stop() after we set the task +* state to guarantee that we either see the stop bit +* and exit or the task state is reset to runnable such +* that it's not scheduled out indefinitely and detects +* the stop bit at kthread_should_stop(). +*/ + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + Hmm, I see the problem but I fear this patch does not cure the race completely. It just lowers the chance to hit it. What if KTHREAD_SHOULD_STOP is set right after you checked for it? The patch can handle this case. ubi_thread will exit at kthread_should_stop() in next iteration. You can apply following patch to verify it. (You may set 'kernel.hung_task_timeout_secs = 10' by sysctl) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 27636063ed1b..44047861c2a1 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -91,6 +91,7 @@ #include #include "ubi.h" #include "wl.h" +#include /* Number of physical eraseblocks reserved for wear-leveling purposes */ #define WL_RESERVED_PEBS 1 @@ -1627,8 +1628,10 @@ int ubi_thread(void *u) for (;;) { int err; - if (kthread_should_stop()) + if (kthread_should_stop()) { + pr_err("Exit at stop A\n"); break; + } if (try_to_freeze()) continue; @@ -1638,6 +1641,15 @@ int ubi_thread(void *u) !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) { set_current_state(TASK_INTERRUPTIBLE); spin_unlock(>wl_lock); + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + + pr_err("Check should stop B\n"); + mdelay(5000); + pr_err("delay 5000ms \n"); + schedule(); continue; } diff --git a/kernel/kthread.c b/kernel/kthread.c index 132f84a5fde3..18627acb9573 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -590,6 +590,10 @@ int kthread_stop(struct task_struct *k) get_task_struct(k); kthread = to_kthread(k); set_bit(KTHREAD_SHOULD_STOP, >flags); + + if (k->comm && !strncmp(k->comm, "ubi_bgt0d", strlen("ubi_bgt0d"))) + pr_err("Stop flag has been set for task %s\n", k->comm); + kthread_unpark(k); wake_up_process(k); wait_for_completion(>exited); kernel msg: [ 210.602005] Check should stop B # Please execute ubi_detach manually when you see this [ 211.347638] Stop flag has been set for task ubi_bgt0d [ 215.603026] delay 5000ms [ 215.605728] Exit at stop A schedule(); continue; }
[f2fs-dev][PATCH] f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
Current judgment condition of f2fs_bug_on in function update_sit_entry(): new_vblocks >> (sizeof(unsigned short) << 3) || new_vblocks > sbi->blocks_per_seg which equivalents to: new_vblocks < 0 || new_vblocks > sbi->blocks_per_seg The latter is more intuitive. Signed-off-by: Zhihao Cheng Reported-by: Jack Qiu --- fs/f2fs/segment.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 196f31503511..41836447418d 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -2140,7 +2140,7 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del) new_vblocks = se->valid_blocks + del; offset = GET_BLKOFF_FROM_SEG0(sbi, blkaddr); - f2fs_bug_on(sbi, (new_vblocks >> (sizeof(unsigned short) << 3) || + f2fs_bug_on(sbi, (new_vblocks < 0 || (new_vblocks > sbi->blocks_per_seg))); se->valid_blocks = new_vblocks; -- 2.25.4
Re: [PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
在 2020/7/11 14:37, Zhihao Cheng 写道: 在 2020/7/7 20:47, Richard Weinberger 写道: - Ursprüngliche Mail - Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks. On my test bed I didn't see a fs corruption, what I saw was a failing orphan self test while playing with O_TMPFILE and linkat(). Do we have a reproducer, or can I get the fail testcase? Is it a xfstest case? I think xfstests triggered it, yes. Later today I can check. :) Thanks, //richard . I think I have found the testcases, overlay/006 and overlay/041. The 'mv' and 'rm' operations will put lowertestfile into orphan list twice, so we must reseve the orphan deletion operation in ubifs_link(), otherwise the testcase fails and we will see the following msg: Sorry, not lowertestfile, it's tempfile which is generated by ovl copy-up (mv operation). The tempfile is linked after copy-up finished. The tempfile is then unlinked by 'rm' operation. overlay/006 2s ... - output mismatch (see /root/git/xfstests-dev/results//overlay/006.out.bad) --- tests/overlay/006.out 2020-07-07 21:42:57.73700 +0800 +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 14:31:55.34000 +0800 @@ -1,2 +1,4 @@ QA output created by 006 Silence is golden +rm: cannot remove '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument +lowertestfile ... [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: orphaned twice [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: orphan list not empty at unmount So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in function ubifs_link(). Following modifications applied in linux-5.8 has been tested by overlay/041, overlay/006 and other tmpfile cases (generic/531, generic/530, generic/509, generic/389, generic/004). diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..fd4443a5e8c6 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, err = ubifs_jnl_update(c, dir, , inode, 0, 0); if (err) goto out_cancel; + + /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ + if (inode->i_nlink == 1) + ubifs_delete_orphan(c, inode->i_ino); + unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); iput(inode); -- __ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/
Re: [PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
在 2020/7/11 14:37, Zhihao Cheng 写道: 在 2020/7/7 20:47, Richard Weinberger 写道: - Ursprüngliche Mail - Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks. On my test bed I didn't see a fs corruption, what I saw was a failing orphan self test while playing with O_TMPFILE and linkat(). Do we have a reproducer, or can I get the fail testcase? Is it a xfstest case? I think xfstests triggered it, yes. Later today I can check. :) Thanks, //richard . I think I have found the testcases, overlay/006 and overlay/041. The 'mv' and 'rm' operations will put lowertestfile into orphan list twice, so we must reseve the orphan deletion operation in ubifs_link(), otherwise the testcase fails and we will see the following msg: overlay/006 2s ... - output mismatch (see /root/git/xfstests-dev/results//overlay/006.out.bad) --- tests/overlay/006.out 2020-07-07 21:42:57.73700 +0800 +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 14:31:55.34000 +0800 @@ -1,2 +1,4 @@ QA output created by 006 Silence is golden +rm: cannot remove '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument +lowertestfile ... [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: orphaned twice [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: orphan list not empty at unmount So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in function ubifs_link(). Following modifications applied in linux-5.8 has been tested by overlay/041, overlay/006 and other tmpfile cases (generic/531, generic/530, generic/509, generic/389, generic/004). Results for testcases generic/530, generic/509, generic/389 and generic/004 are still "not run". diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..fd4443a5e8c6 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, err = ubifs_jnl_update(c, dir, , inode, 0, 0); if (err) goto out_cancel; + + /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ + if (inode->i_nlink == 1) + ubifs_delete_orphan(c, inode->i_ino); + unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); iput(inode); --
Re: [PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
在 2020/7/7 20:47, Richard Weinberger 写道: - Ursprüngliche Mail - Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks. On my test bed I didn't see a fs corruption, what I saw was a failing orphan self test while playing with O_TMPFILE and linkat(). Do we have a reproducer, or can I get the fail testcase? Is it a xfstest case? I think xfstests triggered it, yes. Later today I can check. :) Thanks, //richard . I think I have found the testcases, overlay/006 and overlay/041. The 'mv' and 'rm' operations will put lowertestfile into orphan list twice, so we must reseve the orphan deletion operation in ubifs_link(), otherwise the testcase fails and we will see the following msg: overlay/006 2s ... - output mismatch (see /root/git/xfstests-dev/results//overlay/006.out.bad) --- tests/overlay/006.out 2020-07-07 21:42:57.73700 +0800 +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 14:31:55.34000 +0800 @@ -1,2 +1,4 @@ QA output created by 006 Silence is golden +rm: cannot remove '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument +lowertestfile ... [ 382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: orphaned twice [ 382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: orphan list not empty at unmount So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in function ubifs_link(). Following modifications applied in linux-5.8 has been tested by overlay/041, overlay/006 and other tmpfile cases (generic/531, generic/530, generic/509, generic/389, generic/004). diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..fd4443a5e8c6 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, err = ubifs_jnl_update(c, dir, , inode, 0, 0); if (err) goto out_cancel; + + /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ + if (inode->i_nlink == 1) + ubifs_delete_orphan(c, inode->i_ino); + unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); iput(inode); --
[PATCH v3 1/2] ubi: fastmap: Don't produce the initial next anchor PEB when fastmap is disabled
Following process triggers a memleak caused by forgetting to release the initial next anchor PEB (CONFIG_MTD_UBI_FASTMAP is disabled): 1. attach -> __erase_worker -> produce the initial next anchor PEB 2. detach -> ubi_fastmap_close (Do nothing, it should have released the initial next anchor PEB) Don't produce the initial next anchor PEB in __erase_worker() when fastmap is disabled. Signed-off-by: Zhihao Cheng Suggested-by: Sascha Hauer Fixes: f9c34bb529975fe ("ubi: Fix producing anchor PEBs") Reported-by: syzbot+d9aab50b1154e3d16...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 27636063ed1b..42cac572f82d 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1086,7 +1086,8 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk) if (!err) { spin_lock(>wl_lock); - if (!ubi->fm_next_anchor && e->pnum < UBI_FM_MAX_START) { + if (!ubi->fm_disabled && !ubi->fm_next_anchor && + e->pnum < UBI_FM_MAX_START) { /* Abort anchor production, if needed it will be * enabled again in the wear leveling started below. */ -- 2.25.4
[PATCH v3 0/2] ubi: fastmap: Produce and release fm_anchor peb correctly
v1 -> v2: Adapt Sascha's suggestions for fm_diabled checking in __erase_worker(). v2 -> v3: Free fm_anchor peb during cloing fastmap. Zhihao Cheng (2): ubi: fastmap: Don't produce the initial next anchor PEB when fastmap is disabled ubi: fastmap: Free fastmap next anchor peb during detach drivers/mtd/ubi/fastmap-wl.c | 5 + drivers/mtd/ubi/wl.c | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) -- 2.25.4
[PATCH v3 2/2] ubi: fastmap: Free fastmap next anchor peb during detach
ubi_wl_entry related with the fm_next_anchor PEB is not freed during detach, which causes a memory leak. Don't forget to release fm_next_anchor PEB while detaching ubi from mtd when CONFIG_MTD_UBI_FASTMAP is enabled. Signed-off-by: Zhihao Cheng Fixes: 4b68bf9a69d22d ("ubi: Select fastmap anchor PEBs considering...") --- drivers/mtd/ubi/fastmap-wl.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c index 83afc00e365a..28f55f9cf715 100644 --- a/drivers/mtd/ubi/fastmap-wl.c +++ b/drivers/mtd/ubi/fastmap-wl.c @@ -381,6 +381,11 @@ static void ubi_fastmap_close(struct ubi_device *ubi) ubi->fm_anchor = NULL; } + if (ubi->fm_next_anchor) { + return_unused_peb(ubi, ubi->fm_next_anchor); + ubi->fm_next_anchor = NULL; + } + if (ubi->fm) { for (i = 0; i < ubi->fm->used_blocks; i++) kfree(ubi->fm->e[i]); -- 2.25.4
[PATCH v2] ubifs: Fix wrong orphan node deletion in ubifs_jnl_update|rename
There a wrong orphan node deleting in error handling path in ubifs_jnl_update() and ubifs_jnl_rename(), which may cause following error msg: UBIFS error (ubi0:0 pid 1522): ubifs_delete_orphan [ubifs]: missing orphan ino 65 Fix this by checking whether the node has been operated for adding to orphan list before being deleted, Signed-off-by: Zhihao Cheng Fixes: 823838a486888cf484e ("ubifs: Add hashes to the tree node cache") --- fs/ubifs/journal.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index e5ec1afe1c66..2cf05f87565c 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -539,7 +539,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, const struct fscrypt_name *nm, const struct inode *inode, int deletion, int xent) { - int err, dlen, ilen, len, lnum, ino_offs, dent_offs; + int err, dlen, ilen, len, lnum, ino_offs, dent_offs, orphan_added = 0; int aligned_dlen, aligned_ilen, sync = IS_DIRSYNC(dir); int last_reference = !!(deletion && inode->i_nlink == 0); struct ubifs_inode *ui = ubifs_inode(inode); @@ -630,6 +630,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, goto out_finish; } ui->del_cmtno = c->cmt_no; + orphan_added = 1; } err = write_head(c, BASEHD, dent, len, , _offs, sync); @@ -702,7 +703,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, kfree(dent); out_ro: ubifs_ro_mode(c, err); - if (last_reference) + if (orphan_added) ubifs_delete_orphan(c, inode->i_ino); finish_reservation(c); return err; @@ -1218,7 +1219,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir, void *p; union ubifs_key key; struct ubifs_dent_node *dent, *dent2; - int err, dlen1, dlen2, ilen, lnum, offs, len; + int err, dlen1, dlen2, ilen, lnum, offs, len, orphan_added = 0; int aligned_dlen1, aligned_dlen2, plen = UBIFS_INO_NODE_SZ; int last_reference = !!(new_inode && new_inode->i_nlink == 0); int move = (old_dir != new_dir); @@ -1334,6 +1335,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir, goto out_finish; } new_ui->del_cmtno = c->cmt_no; + orphan_added = 1; } err = write_head(c, BASEHD, dent, len, , , sync); @@ -1415,7 +1417,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir, release_head(c, BASEHD); out_ro: ubifs_ro_mode(c, err); - if (last_reference) + if (orphan_added) ubifs_delete_orphan(c, new_inode->i_ino); out_finish: finish_reservation(c); -- 2.25.4
Re: [PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
在 2020/7/7 20:09, Richard Weinberger 写道: - Ursprüngliche Mail - Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks. On my test bed I didn't see a fs corruption, what I saw was a failing orphan self test while playing with O_TMPFILE and linkat(). Do we have a reproducer, or can I get the fail testcase? Is it a xfstest case? When you create a tmpfile it has a link count of 0 and an orphan is installed. Such that the tmpfile is gone after a reboot but you can still use it prior to that. By using linkat() you can raise the link counter to 1 again. Thus, the orphan needs to be removed. This is pattern overlayfs uses a lot. Since UBIFS never supported raising the link counter from 0 to 1 we have many corner cases and fixing all these turned out into a nightmare. ...as you can see from the amount broken patches from me :-(. Thanks, //richard .
Re: [PATCH] ubifs: Fix wrong orphan node deletion in ubifs_jnl_update()
在 2020/7/7 19:52, Richard Weinberger 写道: On Thu, Jul 2, 2020 at 5:21 PM Zhihao Cheng wrote: There a wrong orphan node deleting in error handling path in ubifs_jnl_update(), which may cause following error msg: UBIFS error (ubi0:0 pid 1522): ubifs_delete_orphan [ubifs]: missing orphan ino 65 Fix this by checking whether the node has been operated for adding to orphan list before being deleted, Signed-off-by: Zhihao Cheng --- fs/ubifs/journal.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index e5ec1afe1c66..db0a80dd9d52 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -539,7 +539,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, const struct fscrypt_name *nm, const struct inode *inode, int deletion, int xent) { - int err, dlen, ilen, len, lnum, ino_offs, dent_offs; + int err, dlen, ilen, len, lnum, ino_offs, dent_offs, orphan_added = 0; int aligned_dlen, aligned_ilen, sync = IS_DIRSYNC(dir); int last_reference = !!(deletion && inode->i_nlink == 0); struct ubifs_inode *ui = ubifs_inode(inode); @@ -630,6 +630,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, goto out_finish; } ui->del_cmtno = c->cmt_no; + orphan_added = 1; } err = write_head(c, BASEHD, dent, len, , _offs, sync); @@ -702,7 +703,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, kfree(dent); out_ro: ubifs_ro_mode(c, err); - if (last_reference) + if (last_reference && orphan_added) I think you can just check for orphan_added here. Looks good otherwise, thanks for fixing! :-) Sounds reasonable. I will send v2.
Re: [PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
在 2020/7/7 19:26, Richard Weinberger 写道: On Wed, Jul 1, 2020 at 1:28 PM Zhihao Cheng wrote: There is a potential space leak problem while linking tmpfile, in which case, inode node (with nlink=0) is valid in tnc (on flash), which leads to space leak. Meanwhile, the corresponding data nodes won't be released from tnc. For example, (A reproducer can be found in Link): $ mount UBIFS [process A][process B] [TNC] [orphan area] ubifs_tmpfile inode_A (nlink=0) inode_A do_commit inode_A (nlink=0) inode_A ↑ (comment: It makes sure not replay inode_A in next mount) ubifs_link inode_A (nlink=0) inode_A ubifs_delete_orphan inode_A (nlink=0) do_commit inode_A (nlink=0) ---> POWERCUT <--- (ubifs_jnl_update) $ mount UBIFS inode_A will neither be replayed in ubifs_replay_journal() nor ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will always on tnc, it occupy space but is non-visable for users. Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing orphans.") handles problem in mistakenly deleting relinked tmpfile while replaying orphan area. Since that, tmpfile inode should always live in orphan area even it is linked. Fix it by reverting commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). Well, by reverting this commit you re-introduce the issue it fixes. ;-\ Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") wanted to fix. I think orphan area is used to remind filesystem don't forget to delete inodes (whose nlink is 0) in next unclean rebooting. Generally, the file system is not corrupted caused by replaying orphan nodes. Ralph reported a filesystem corruption in combination with overlayfs. Can you tell me the details about that problem? Thanks.
[PATCH] ubifs: Fix wrong orphan node deletion in ubifs_jnl_update()
There a wrong orphan node deleting in error handling path in ubifs_jnl_update(), which may cause following error msg: UBIFS error (ubi0:0 pid 1522): ubifs_delete_orphan [ubifs]: missing orphan ino 65 Fix this by checking whether the node has been operated for adding to orphan list before being deleted, Signed-off-by: Zhihao Cheng --- fs/ubifs/journal.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index e5ec1afe1c66..db0a80dd9d52 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -539,7 +539,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, const struct fscrypt_name *nm, const struct inode *inode, int deletion, int xent) { - int err, dlen, ilen, len, lnum, ino_offs, dent_offs; + int err, dlen, ilen, len, lnum, ino_offs, dent_offs, orphan_added = 0; int aligned_dlen, aligned_ilen, sync = IS_DIRSYNC(dir); int last_reference = !!(deletion && inode->i_nlink == 0); struct ubifs_inode *ui = ubifs_inode(inode); @@ -630,6 +630,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, goto out_finish; } ui->del_cmtno = c->cmt_no; + orphan_added = 1; } err = write_head(c, BASEHD, dent, len, , _offs, sync); @@ -702,7 +703,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir, kfree(dent); out_ro: ubifs_ro_mode(c, err); - if (last_reference) + if (last_reference && orphan_added) ubifs_delete_orphan(c, inode->i_ino); finish_reservation(c); return err; -- 2.25.4
[PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
There is a potential space leak problem while linking tmpfile, in which case, inode node (with nlink=0) is valid in tnc (on flash), which leads to space leak. Meanwhile, the corresponding data nodes won't be released from tnc. For example, (A reproducer can be found in Link): $ mount UBIFS [process A][process B] [TNC] [orphan area] ubifs_tmpfile inode_A (nlink=0) inode_A do_commit inode_A (nlink=0) inode_A ↑ (comment: It makes sure not replay inode_A in next mount) ubifs_link inode_A (nlink=0) inode_A ubifs_delete_orphan inode_A (nlink=0) do_commit inode_A (nlink=0) ---> POWERCUT <--- (ubifs_jnl_update) $ mount UBIFS inode_A will neither be replayed in ubifs_replay_journal() nor ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will always on tnc, it occupy space but is non-visable for users. Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing orphans.") handles problem in mistakenly deleting relinked tmpfile while replaying orphan area. Since that, tmpfile inode should always live in orphan area even it is linked. Fix it by reverting commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). Signed-off-by: Zhihao Cheng Cc: # v5.3+ Fixes: 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") Link: https://bugzilla.kernel.org/show_bug.cgi?id=208405 --- fs/ubifs/dir.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..9534c4bb598f 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -747,8 +742,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); iput(inode); -- 2.25.4
[PATCH] ubifs: Fix a potential space leak problem while linking tmpfile
There is a potential space leak problem while linking tmpfile, in which case, inode node (with nlink=0) is valid in tnc (on flash), which leads to space leak. Meanwhile, the corresponding data nodes won't be released from tnc. For example, (A reproducer can be found in Link): $ mount UBIFS [process A][process B] [TNC] [orphan area] ubifs_tmpfile inode_A (nlink=0) inode_A do_commit inode_A (nlink=0) inode_A ↑ (comment: It makes sure not replay inode_A in next mount) ubifs_link inode_A (nlink=0) inode_A ubifs_delete_orphan inode_A (nlink=0) do_commit inode_A (nlink=0) ---> POWERCUT <--- (ubifs_jnl_update) $ mount UBIFS inode_A will neither be replayed in ubifs_replay_journal() nor ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will always on tnc, it occupy space but is non-visable for users. Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing orphans.") handles problem in mistakenly deleting relinked tmpfile while replaying orphan area. Since that, tmpfile inode should always live in orphan area even it is linked. Fix it by reverting commit 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()"). Signed-off-by: Zhihao Cheng Cc: Fixes: 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()") Link: https://bugzilla.kernel.org/show_bug.cgi?id=208405 --- fs/ubifs/dir.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index ef85ec167a84..9534c4bb598f 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, goto out_fname; lock_2_inodes(dir, inode); - - /* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */ - if (inode->i_nlink == 0) - ubifs_delete_orphan(c, inode->i_ino); - inc_nlink(inode); ihold(inode); inode->i_ctime = current_time(inode); @@ -747,8 +742,6 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir, dir->i_size -= sz_change; dir_ui->ui_size = dir->i_size; drop_nlink(inode); - if (inode->i_nlink == 0) - ubifs_add_orphan(c, inode->i_ino); unlock_2_inodes(dir, inode); ubifs_release_budget(c, ); iput(inode); -- 2.25.4
[PATCH RFC 1/5] ubifs: Limit dumping length by size of memory which is allocated for the node
To prevent memory out-of-bounds accessing in ubifs_dump_node(), actual dumping length should be restricted by another condition(size of memory which is allocated for the node). This patch handles following situations (These situations may be caused by bit flipping due to hardware error, writing bypass ubifs, unknown bugs in ubifs, etc.): 1. bad node_len: Dumping data according to 'ch->len' which may exceed the size of memory allocated for node. 2. bad node content: Some kinds of node can record additional data, eg. index node and orphan node, make sure the size of additional data not beyond the node length. 3. node_type changes: Read data according to type A, but expected type B, before that, node is allocated according to type B's size. Length of type A node is greater than type B node. Signed-off-by: Zhihao Cheng --- fs/ubifs/debug.c | 63 ++-- fs/ubifs/debug.h | 3 ++- 2 files changed, 52 insertions(+), 14 deletions(-) diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index 31288d8fa2ce..e995e553541d 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -291,9 +291,9 @@ void ubifs_dump_inode(struct ubifs_info *c, const struct inode *inode) kfree(pdent); } -void ubifs_dump_node(const struct ubifs_info *c, const void *node) +void ubifs_dump_node(const struct ubifs_info *c, const void *node, int node_len) { - int i, n; + int i, n, type, safe_len, max_node_len, min_node_len; union ubifs_key key; const struct ubifs_ch *ch = node; char key_buf[DBG_KEY_BUF_LEN]; @@ -306,10 +306,40 @@ void ubifs_dump_node(const struct ubifs_info *c, const void *node) return; } + /* Skip dumping unknown type node */ + type = ch->node_type; + if (type < 0 || type >= UBIFS_NODE_TYPES_CNT) { + pr_err("node type %d was not recognized\n", type); + return; + } + spin_lock(_lock); dump_ch(node); - switch (ch->node_type) { + if (c->ranges[type].max_len == 0) { + max_node_len = min_node_len = c->ranges[type].len; + } else { + max_node_len = c->ranges[type].max_len; + min_node_len = c->ranges[type].min_len; + } + safe_len = le32_to_cpu(ch->len); + safe_len = safe_len > 0 ? safe_len : 0; + safe_len = min3(safe_len, max_node_len, node_len); + if (safe_len < min_node_len) { + pr_err("node len(%d) is too short for %s, left %d bytes:\n", + safe_len, dbg_ntype(type), + safe_len > UBIFS_CH_SZ ? + safe_len - (int)UBIFS_CH_SZ : 0); + if (safe_len > UBIFS_CH_SZ) + print_hex_dump(KERN_ERR, "", DUMP_PREFIX_OFFSET, 32, 1, + (void *)node + UBIFS_CH_SZ, + safe_len - UBIFS_CH_SZ, 0); + goto out_unlock; + } + if (safe_len != le32_to_cpu(ch->len)) + pr_err("\ttruncated node length %d\n", safe_len); + + switch (type) { case UBIFS_PAD_NODE: { const struct ubifs_pad_node *pad = node; @@ -453,7 +483,8 @@ void ubifs_dump_node(const struct ubifs_info *c, const void *node) pr_err("\tnlen %d\n", nlen); pr_err("\tname "); - if (nlen > UBIFS_MAX_NLEN) + if (nlen > UBIFS_MAX_NLEN || + nlen > safe_len - UBIFS_DENT_NODE_SZ) pr_err("(bad name length, not printing, bad or corrupted node)"); else { for (i = 0; i < nlen && dent->name[i]; i++) @@ -467,7 +498,6 @@ void ubifs_dump_node(const struct ubifs_info *c, const void *node) case UBIFS_DATA_NODE: { const struct ubifs_data_node *dn = node; - int dlen = le32_to_cpu(ch->len) - UBIFS_DATA_NODE_SZ; key_read(c, >key, ); pr_err("\tkey%s\n", @@ -475,10 +505,13 @@ void ubifs_dump_node(const struct ubifs_info *c, const void *node) pr_err("\tsize %u\n", le32_to_cpu(dn->size)); pr_err("\tcompr_typ %d\n", (int)le16_to_cpu(dn->compr_type)); - pr_err("\tdata size %d\n", dlen); - pr_err("\tdata:\n"); + pr_err("\tdata size %u\n", + le32_to_cpu(ch->len) - (unsigned int)UBIFS_DATA_NODE_SZ); + pr_err("\tdata (length = %d):\n", + safe_len - (int)UBIFS_DATA_NODE_SZ); print_hex_dump(KERN_ERR, &quo
[PATCH RFC 3/5] ubifs: Pass node length in all node dumping callers
Function ubifs_dump_node() has been modified to avoid memory oob accessing while dumping node, node length (corresponding to the size of allocated memory for node) should be passed into all node dumping callers. Signed-off-by: Zhihao Cheng --- fs/ubifs/commit.c | 4 ++-- fs/ubifs/debug.c| 30 +++--- fs/ubifs/file.c | 2 +- fs/ubifs/io.c | 23 +++ fs/ubifs/journal.c | 3 ++- fs/ubifs/master.c | 4 ++-- fs/ubifs/orphan.c | 6 -- fs/ubifs/recovery.c | 6 +++--- fs/ubifs/replay.c | 4 ++-- fs/ubifs/sb.c | 2 +- fs/ubifs/scan.c | 4 ++-- fs/ubifs/super.c| 2 +- fs/ubifs/tnc.c | 8 fs/ubifs/tnc_misc.c | 4 ++-- fs/ubifs/ubifs.h| 4 ++-- 15 files changed, 54 insertions(+), 52 deletions(-) diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c index ad292c5a43a9..2a3963e3b569 100644 --- a/fs/ubifs/commit.c +++ b/fs/ubifs/commit.c @@ -701,13 +701,13 @@ int dbg_check_old_index(struct ubifs_info *c, struct ubifs_zbranch *zroot) out_dump: ubifs_err(c, "dumping index node (iip=%d)", i->iip); - ubifs_dump_node(c, idx); + ubifs_dump_node(c, idx, ubifs_idx_node_sz(c, c->fanout)); list_del(>list); kfree(i); if (!list_empty()) { i = list_entry(list.prev, struct idx_node, list); ubifs_err(c, "dumping parent index node"); - ubifs_dump_node(c, >idx); + ubifs_dump_node(c, >idx, ubifs_idx_node_sz(c, c->fanout)); } out_free: while (!list_empty()) { diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index e995e553541d..24a6e6fb5e9a 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -871,7 +871,7 @@ void ubifs_dump_leb(const struct ubifs_info *c, int lnum) cond_resched(); pr_err("Dumping node at LEB %d:%d len %d\n", lnum, snod->offs, snod->len); - ubifs_dump_node(c, snod->node); + ubifs_dump_node(c, snod->node, c->leb_size - snod->offs); } pr_err("(pid %d) finish dumping LEB %d\n", current->pid, lnum); @@ -1248,7 +1248,7 @@ static int dbg_check_key_order(struct ubifs_info *c, struct ubifs_zbranch *zbr1, ubifs_err(c, "but it should have key %s according to tnc", dbg_snprintf_key(c, >key, key_buf, DBG_KEY_BUF_LEN)); - ubifs_dump_node(c, dent1); + ubifs_dump_node(c, dent1, UBIFS_MAX_DENT_NODE_SZ); goto out_free; } @@ -1260,7 +1260,7 @@ static int dbg_check_key_order(struct ubifs_info *c, struct ubifs_zbranch *zbr1, ubifs_err(c, "but it should have key %s according to tnc", dbg_snprintf_key(c, >key, key_buf, DBG_KEY_BUF_LEN)); - ubifs_dump_node(c, dent2); + ubifs_dump_node(c, dent2, UBIFS_MAX_DENT_NODE_SZ); goto out_free; } @@ -1279,9 +1279,9 @@ static int dbg_check_key_order(struct ubifs_info *c, struct ubifs_zbranch *zbr1, dbg_snprintf_key(c, , key_buf, DBG_KEY_BUF_LEN)); ubifs_msg(c, "first node at %d:%d\n", zbr1->lnum, zbr1->offs); - ubifs_dump_node(c, dent1); + ubifs_dump_node(c, dent1, UBIFS_MAX_DENT_NODE_SZ); ubifs_msg(c, "second node at %d:%d\n", zbr2->lnum, zbr2->offs); - ubifs_dump_node(c, dent2); + ubifs_dump_node(c, dent2, UBIFS_MAX_DENT_NODE_SZ); out_free: kfree(dent2); @@ -2146,7 +2146,7 @@ static int check_leaf(struct ubifs_info *c, struct ubifs_zbranch *zbr, out_dump: ubifs_msg(c, "dump of node at LEB %d:%d", zbr->lnum, zbr->offs); - ubifs_dump_node(c, node); + ubifs_dump_node(c, node, zbr->len); out_free: kfree(node); return err; @@ -2279,7 +2279,7 @@ static int check_inodes(struct ubifs_info *c, struct fsck_data *fsckd) ubifs_msg(c, "dump of the inode %lu sitting in LEB %d:%d", (unsigned long)fscki->inum, zbr->lnum, zbr->offs); - ubifs_dump_node(c, ino); + ubifs_dump_node(c, ino, zbr->len); kfree(ino); return -EINVAL; } @@ -2350,12 +2350,12 @@ int dbg_check_data_nodes_order(struct ubifs_info *c, struct list_head *head) if (sa->type != UBIFS_DATA_NODE) { ubifs_err(c, "bad node type %d", sa->type); - ubifs_dump_node(c, sa->node); + ubifs_dump_node(c, sa->node, c->leb_size - sa->offs); return -EINVAL; } if (sb->type != UBIFS_DATA_NODE) { ubifs
[PATCH RFC 0/5] ubifs: Prevent memory oob accessing while dumping node
We use function ubifs_dump_node() to dump bad node caused by some reasons (Such as bit flipping caused by hardware error, writing bypass ubifs or unknown bugs in ubifs). The node content can not be trusted anymore, so we should prevent memory out-of-bounds accessing while dumping node in following situations: 1. bad node_len: Dumping data according to 'ch->len' which may exceed the size of memory allocated for node. 2. bad node content: Some kinds of node can record additional data, eg. index node and orphan node, make sure the size of additional data not beyond the node length. 3. node_type changes: Read data according to type A, but expected type B, before that, node is allocated according to type B's size. Length of type A node is greater than type B node. Commit acc5af3efa303d5f3 ("ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len") handles situation 1 for data node only, it would be better if we can solve problems in above situations for all kinds of nodes. Patch 1 adds a new parameter 'node_len'(size of memory which is allocated for the node) in function ubifs_dump_node(), safe dumping length of the node should be: minimum(ch->len, c->ranges[node_type].max_len, node_len). Besides, c->ranges[node_type].min_len can not greater than safe dumping length, which may caused by node_type changes(situation 3). Patch 2 reverts commit acc5af3efa303d5f ("ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len") to prepare for patch 3. Patch 3 replaces modified function ubifs_dump_node() in all node dumping places except for ubifs_dump_sleb(). Patch 4 removes unused function ubifs_dump_sleb(), Patch 5 allows ubifs_dump_node() to dump all branches of the index node. Some tests after patchset applied: https://bugzilla.kernel.org/show_bug.cgi?id=208203 Zhihao Cheng (5): ubifs: Limit dumping length by size of memory which is allocated for the node Revert "ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len" ubifs: Pass node length in all node dumping callers ubifs: ubifs_dump_sleb: Remove unused function ubifs: ubifs_dump_node: Dump all branches of the index node fs/ubifs/commit.c | 4 +- fs/ubifs/debug.c| 111 ++-- fs/ubifs/debug.h| 5 +- fs/ubifs/file.c | 2 +- fs/ubifs/io.c | 37 +-- fs/ubifs/journal.c | 3 +- fs/ubifs/master.c | 4 +- fs/ubifs/orphan.c | 6 ++- fs/ubifs/recovery.c | 6 +-- fs/ubifs/replay.c | 4 +- fs/ubifs/sb.c | 2 +- fs/ubifs/scan.c | 4 +- fs/ubifs/super.c| 2 +- fs/ubifs/tnc.c | 8 ++-- fs/ubifs/tnc_misc.c | 4 +- fs/ubifs/ubifs.h| 4 +- 16 files changed, 108 insertions(+), 98 deletions(-) -- 2.25.4
[PATCH RFC 4/5] ubifs: ubifs_dump_sleb: Remove unused function
Function ubifs_dump_sleb() is defined but unused, it can be removed. Signed-off-by: Zhihao Cheng --- fs/ubifs/debug.c | 16 fs/ubifs/debug.h | 2 -- 2 files changed, 18 deletions(-) diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index 24a6e6fb5e9a..2d07615369f9 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -828,22 +828,6 @@ void ubifs_dump_lpt_info(struct ubifs_info *c) spin_unlock(_lock); } -void ubifs_dump_sleb(const struct ubifs_info *c, -const struct ubifs_scan_leb *sleb, int offs) -{ - struct ubifs_scan_node *snod; - - pr_err("(pid %d) start dumping scanned data from LEB %d:%d\n", - current->pid, sleb->lnum, offs); - - list_for_each_entry(snod, >nodes, list) { - cond_resched(); - pr_err("Dumping node at LEB %d:%d len %d\n", - sleb->lnum, snod->offs, snod->len); - ubifs_dump_node(c, snod->node); - } -} - void ubifs_dump_leb(const struct ubifs_info *c, int lnum) { struct ubifs_scan_leb *sleb; diff --git a/fs/ubifs/debug.h b/fs/ubifs/debug.h index 42610fa5f3a7..ed966108da80 100644 --- a/fs/ubifs/debug.h +++ b/fs/ubifs/debug.h @@ -252,8 +252,6 @@ void ubifs_dump_lprop(const struct ubifs_info *c, void ubifs_dump_lprops(struct ubifs_info *c); void ubifs_dump_lpt_info(struct ubifs_info *c); void ubifs_dump_leb(const struct ubifs_info *c, int lnum); -void ubifs_dump_sleb(const struct ubifs_info *c, -const struct ubifs_scan_leb *sleb, int offs); void ubifs_dump_znode(const struct ubifs_info *c, const struct ubifs_znode *znode); void ubifs_dump_heap(struct ubifs_info *c, struct ubifs_lpt_heap *heap, -- 2.25.4
[PATCH RFC 2/5] Revert "ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len"
This reverts commit acc5af3efa303d5f36cc8c0f61716161f6ca1384. No need to avoid memory oob in dumping for data node alone. Later, node length will be passed into function 'ubifs_dump_node()' which replaces all node dumping places. Signed-off-by: Zhihao Cheng --- fs/ubifs/io.c | 16 ++-- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c index 7e4bfaf2871f..8ceb51478800 100644 --- a/fs/ubifs/io.c +++ b/fs/ubifs/io.c @@ -225,7 +225,7 @@ int ubifs_is_mapped(const struct ubifs_info *c, int lnum) int ubifs_check_node(const struct ubifs_info *c, const void *buf, int lnum, int offs, int quiet, int must_chk_crc) { - int err = -EINVAL, type, node_len, dump_node = 1; + int err = -EINVAL, type, node_len; uint32_t crc, node_crc, magic; const struct ubifs_ch *ch = buf; @@ -278,22 +278,10 @@ int ubifs_check_node(const struct ubifs_info *c, const void *buf, int lnum, out_len: if (!quiet) ubifs_err(c, "bad node length %d", node_len); - if (type == UBIFS_DATA_NODE && node_len > UBIFS_DATA_NODE_SZ) - dump_node = 0; out: if (!quiet) { ubifs_err(c, "bad node at LEB %d:%d", lnum, offs); - if (dump_node) { - ubifs_dump_node(c, buf); - } else { - int safe_len = min3(node_len, c->leb_size - offs, - (int)UBIFS_MAX_DATA_NODE_SZ); - pr_err("\tprevent out-of-bounds memory access\n"); - pr_err("\ttruncated data node length %d\n", safe_len); - pr_err("\tcorrupted data node:\n"); - print_hex_dump(KERN_ERR, "\t", DUMP_PREFIX_OFFSET, 32, 1, - buf, safe_len, 0); - } + ubifs_dump_node(c, buf); dump_stack(); } return err; -- 2.25.4
[PATCH RFC 5/5] ubifs: ubifs_dump_node: Dump all branches of the index node
An index node can have up to c->fanout branches, all branches should be displayed while dumping index node. Signed-off-by: Zhihao Cheng --- fs/ubifs/debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index 2d07615369f9..a65f73e91b4f 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -537,7 +537,7 @@ void ubifs_dump_node(const struct ubifs_info *c, const void *node, int node_len) pr_err("\tlevel %d\n", (int)le16_to_cpu(idx->level)); pr_err("\tBranches:\n"); - for (i = 0; i < n && i < c->fanout - 1; i++) { + for (i = 0; i < n && i < c->fanout; i++) { const struct ubifs_branch *br; br = ubifs_idx_branch(c, idx, i); -- 2.25.4
[PATCH v2] ubi: fastmap: Don't produce the initial anchor PEB when fastmap is disabled
Following process triggers a memleak caused by forgetting to release the initial anchor PEB (CONFIG_MTD_UBI_FASTMAP is disabled): 1. attach -> __erase_worker -> produce the initial anchor PEB 2. detach -> ubi_fastmap_close (Do nothing, it should have released the initial anchor PEB) Don't produce the initial anchor PEB in __erase_worker() when fastmap is disabled. Signed-off-by: Zhihao Cheng Suggested-by: Sascha Hauer Fixes: f9c34bb529975fe ("ubi: Fix producing anchor PEBs") Reported-by: syzbot+d9aab50b1154e3d16...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 5146cce5fe32..f88486ccf5d5 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1079,7 +1079,8 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk) if (!err) { spin_lock(>wl_lock); - if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) { + if (!ubi->fm_disabled && !ubi->fm_anchor && + e->pnum < UBI_FM_MAX_START) { ubi->fm_anchor = e; ubi->fm_do_produce_anchor = 0; } else { -- 2.25.4
Re: [PATCH] ubi: fastmap: Don't produce the initial anchor PEB when fastmap is disabled
在 2020/6/2 17:23, Sascha Hauer 写道: Hi, On Mon, Jun 01, 2020 at 05:11:34PM +0800, Zhihao Cheng wrote: Following process triggers a memleak caused by forgetting to release the initial anchor PEB (CONFIG_MTD_UBI_FASTMAP is disabled): 1. attach -> __erase_worker -> produce the initial anchor PEB 2. detach -> ubi_fastmap_close (Do nothing, it should have released the initial anchor PEB) Don't produce the initial anchor PEB in __erase_worker() when fastmap is disabled. Signed-off-by: Zhihao Cheng Fixes: f9c34bb529975fe ("ubi: Fix producing anchor PEBs") Reported-by: syzbot+d9aab50b1154e3d16...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 5146cce5fe32..5ebe1084a8e7 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1079,13 +1079,19 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk) if (!err) { spin_lock(>wl_lock); - if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) { +#ifdef CONFIG_MTD_UBI_FASTMAP + if (!ubi->fm_disabled && !ubi->fm_anchor && + e->pnum < UBI_FM_MAX_START) { Rather than introducing another #ifdef you could do a if (IS_ENABLED(CONFIG_MTD_UBI_FASTMAP) && !ubi->fm_disabled && !ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) And I am not sure if the IS_ENABLED(CONFIG_MTD_UBI_FASTMAP) is necessary at all because we do a ubi->fm_disabled = 1 when fastmap is disabled. Regards, Sascha Agree.
[PATCH v3] afs: Fix memory leak in afs_put_sysnames()
Fix afs_put_sysnames() to actually free the specified afs_sysnames object after its reference count has been decreased to zero and its contents have been released. Signed-off-by: Zhihao Cheng Cc: # v4.17+ Fixes: 6f8880d8e681557 ("afs: Implement @sys substitution handling") --- fs/afs/proc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 468e1713bce1..6f34c84a0fd0 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -563,6 +563,7 @@ void afs_put_sysnames(struct afs_sysnames *sysnames) if (sysnames->subs[i] != afs_init_sysname && sysnames->subs[i] != sysnames->blank) kfree(sysnames->subs[i]); + kfree(sysnames); } } -- 2.25.4
Re: [PATCH 1/2] ubifs: Fix potential memory leaks while iterating entries
在 2020/6/1 20:00, Markus Elfring 写道: Fix some potential memory leaks in error handling branches while iterating xattr entries. Such information is useful. For example, function ubifs_tnc_remove_ino() forgets to free pxent if it exists. Similar problems also exist in ubifs_purge_xattrs(), ubifs_add_orphan() and ubifs_jnl_write_inode(). Can an other wording variant be a bit nicer? Thanks for reminding, I will improve this description. I suggest to avoid the specification of duplicate function calls (also for the desired exception handling). Will it be helpful to add a few jump targets? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162#n455 I've thought about using "goto err_tag_2" in kill_xattrs code block to release prev xent, but later it needs to jump to 'out_release tag‘ for releasing previously requested memory, which can clutter the code. It seems that two consecutive 'goto tags' will make the code less readable. Regards, Markus .
[PATCH v2] afs: Fix memory leak in afs_put_sysnames()
sysnames should be freed after refcnt being decreased to zero in afs_put_sysnames(). Signed-off-by: Zhihao Cheng Cc: # v4.17+ Fixes: 6f8880d8e681557 ("afs: Implement @sys substitution handling") --- fs/afs/proc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 468e1713bce1..6f34c84a0fd0 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -563,6 +563,7 @@ void afs_put_sysnames(struct afs_sysnames *sysnames) if (sysnames->subs[i] != afs_init_sysname && sysnames->subs[i] != sysnames->blank) kfree(sysnames->subs[i]); + kfree(sysnames); } } -- 2.25.4
[PATCH] afs: Fix memory leak in afs_put_sysnames()
sysnames should be freed after refcnt being decreased to zero in afs_put_sysnames(). Besides, it would be better set net->sysnames to 'NULL' after net->sysnames being released if afs_put_sysnames() aims on an afs_sysnames object. Signed-off-by: Zhihao Cheng Cc: # v4.17+ Fixes: 6f8880d8e681557 ("afs: Implement @sys substitution handling") --- fs/afs/dir.c | 2 +- fs/afs/internal.h | 2 ++ fs/afs/main.c | 4 ++-- fs/afs/proc.c | 25 - 4 files changed, 25 insertions(+), 8 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index d1e1caa23c8b..cb9d8aa91048 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -894,7 +894,7 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry, */ ret = NULL; out_s: - afs_put_sysnames(subs); + afs_put_sysnames_and_null(net); kfree(buf); out_p: key_put(key); diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 80255513e230..615dd5f9ad6f 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1093,12 +1093,14 @@ extern void __net_exit afs_proc_cleanup(struct afs_net *); extern int afs_proc_cell_setup(struct afs_cell *); extern void afs_proc_cell_remove(struct afs_cell *); extern void afs_put_sysnames(struct afs_sysnames *); +extern void afs_put_sysnames_and_null(struct afs_net *); #else static inline int afs_proc_init(struct afs_net *net) { return 0; } static inline void afs_proc_cleanup(struct afs_net *net) {} static inline int afs_proc_cell_setup(struct afs_cell *cell) { return 0; } static inline void afs_proc_cell_remove(struct afs_cell *cell) {} static inline void afs_put_sysnames(struct afs_sysnames *sysnames) {} +static inline void afs_put_sysnames_and_null(struct afs_net *net) {} #endif /* diff --git a/fs/afs/main.c b/fs/afs/main.c index c9c45d7078bd..6bf73fc65fb5 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -132,7 +132,7 @@ static int __net_init afs_net_init(struct net *net_ns) net->live = false; afs_proc_cleanup(net); error_proc: - afs_put_sysnames(net->sysnames); + afs_put_sysnames_and_null(net); error_sysnames: net->live = false; return ret; @@ -150,7 +150,7 @@ static void __net_exit afs_net_exit(struct net *net_ns) afs_purge_servers(net); afs_close_socket(net); afs_proc_cleanup(net); - afs_put_sysnames(net->sysnames); + afs_put_sysnames_and_null(net); } static struct pernet_operations afs_net_ops = { diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 468e1713bce1..26e1e73281a6 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -554,15 +554,30 @@ static int afs_proc_sysname_write(struct file *file, char *buf, size_t size) goto out; } -void afs_put_sysnames(struct afs_sysnames *sysnames) +static void afs_free_sysnames(struct afs_sysnames *sysnames) { int i; + for (i = 0; i < sysnames->nr; i++) + if (sysnames->subs[i] != afs_init_sysname && + sysnames->subs[i] != sysnames->blank) + kfree(sysnames->subs[i]); + kfree(sysnames); +} + +void afs_put_sysnames(struct afs_sysnames *sysnames) +{ + if (sysnames && refcount_dec_and_test(>usage)) + afs_free_sysnames(sysnames); +} + +void afs_put_sysnames_and_null(struct afs_net *net) +{ + struct afs_sysnames *sysnames = net->sysnames; + if (sysnames && refcount_dec_and_test(>usage)) { - for (i = 0; i < sysnames->nr; i++) - if (sysnames->subs[i] != afs_init_sysname && - sysnames->subs[i] != sysnames->blank) - kfree(sysnames->subs[i]); + afs_free_sysnames(sysnames); + net->sysnames = NULL; } } -- 2.25.4
[PATCH] ubi: check kthread_should_stop() after the setting of task state
A detach hung is possible when a race occurs between the detach process and the ubi background thread. The following sequences outline the race: ubi thread: if (list_empty(>works)... ubi detach: set_bit(KTHREAD_SHOULD_STOP, >flags) => by kthread_stop() wake_up_process() => ubi thread is still running, so 0 is returned ubi thread: set_current_state(TASK_INTERRUPTIBLE) schedule() => ubi thread will never be scheduled again ubi detach: wait_for_completion() => hung task! To fix that, we need to check kthread_should_stop() after we set the task state, so the ubi thread will either see the stop bit and exit or the task state is reset to runnable such that it isn't scheduled out indefinitely. Signed-off-by: Zhihao Cheng Cc: Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images") Reported-by: syzbot+853639d0cb16c31c7...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 5146cce5fe32..a4d4343053d7 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1628,6 +1628,19 @@ int ubi_thread(void *u) !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) { set_current_state(TASK_INTERRUPTIBLE); spin_unlock(>wl_lock); + + /* +* Check kthread_should_stop() after we set the task +* state to guarantee that we either see the stop bit +* and exit or the task state is reset to runnable such +* that it's not scheduled out indefinitely and detects +* the stop bit at kthread_should_stop(). +*/ + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + schedule(); continue; } -- 2.25.4
[PATCH] ubi: fastmap: Don't produce the initial anchor PEB when fastmap is disabled
Following process triggers a memleak caused by forgetting to release the initial anchor PEB (CONFIG_MTD_UBI_FASTMAP is disabled): 1. attach -> __erase_worker -> produce the initial anchor PEB 2. detach -> ubi_fastmap_close (Do nothing, it should have released the initial anchor PEB) Don't produce the initial anchor PEB in __erase_worker() when fastmap is disabled. Signed-off-by: Zhihao Cheng Fixes: f9c34bb529975fe ("ubi: Fix producing anchor PEBs") Reported-by: syzbot+d9aab50b1154e3d16...@syzkaller.appspotmail.com --- drivers/mtd/ubi/wl.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 5146cce5fe32..5ebe1084a8e7 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1079,13 +1079,19 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk) if (!err) { spin_lock(>wl_lock); - if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) { +#ifdef CONFIG_MTD_UBI_FASTMAP + if (!ubi->fm_disabled && !ubi->fm_anchor && + e->pnum < UBI_FM_MAX_START) { ubi->fm_anchor = e; ubi->fm_do_produce_anchor = 0; } else { wl_tree_add(e, >free); ubi->free_count++; } +#else + wl_tree_add(e, >free); + ubi->free_count++; +#endif spin_unlock(>wl_lock); -- 2.25.4
[PATCH 1/2] ubifs: xattr: Fix some potential memory leaks while iterating entries
Fix some potential memory leaks in error handling branches while iterating xattr entries. For example, function ubifs_tnc_remove_ino() forgets to free pxent if it exists. Similar problems also exist in ubifs_purge_xattrs(), ubifs_add_orphan() and ubifs_jnl_write_inode(). Signed-off-by: Zhihao Cheng Cc: Fixes: 1e51764a3c2ac05a2 ("UBIFS: add new flash file system") --- fs/ubifs/journal.c | 2 ++ fs/ubifs/orphan.c | 2 ++ fs/ubifs/tnc.c | 3 +++ fs/ubifs/xattr.c | 2 ++ 4 files changed, 9 insertions(+) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index e5ec1afe1c66..0b4b94b079a6 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -893,6 +893,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) if (err == -ENOENT) break; + kfree(pxent); goto out_release; } @@ -905,6 +906,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode) ubifs_err(c, "dead directory entry '%s', error %d", xent->name, err); ubifs_ro_mode(c, err); + kfree(pxent); kfree(xent); goto out_release; } diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c index 283f9eb48410..b0117878b3a0 100644 --- a/fs/ubifs/orphan.c +++ b/fs/ubifs/orphan.c @@ -173,6 +173,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum) err = PTR_ERR(xent); if (err == -ENOENT) break; + kfree(pxent); return err; } @@ -182,6 +183,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum) xattr_orphan = orphan_add(c, xattr_inum, orphan); if (IS_ERR(xattr_orphan)) { + kfree(pxent); kfree(xent); return PTR_ERR(xattr_orphan); } diff --git a/fs/ubifs/tnc.c b/fs/ubifs/tnc.c index e8e7b0e9532e..33742ee3945b 100644 --- a/fs/ubifs/tnc.c +++ b/fs/ubifs/tnc.c @@ -2885,6 +2885,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum) err = PTR_ERR(xent); if (err == -ENOENT) break; + kfree(pxent); return err; } @@ -2898,6 +2899,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum) fname_len() = le16_to_cpu(xent->nlen); err = ubifs_tnc_remove_nm(c, , ); if (err) { + kfree(pxent); kfree(xent); return err; } @@ -2906,6 +2908,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum) highest_ino_key(c, , xattr_inum); err = ubifs_tnc_remove_range(c, , ); if (err) { + kfree(pxent); kfree(xent); return err; } diff --git a/fs/ubifs/xattr.c b/fs/ubifs/xattr.c index 9aefbb60074f..a0b9b349efe6 100644 --- a/fs/ubifs/xattr.c +++ b/fs/ubifs/xattr.c @@ -522,6 +522,7 @@ int ubifs_purge_xattrs(struct inode *host) xent->name, err); ubifs_ro_mode(c, err); kfree(pxent); + kfree(xent); return err; } @@ -531,6 +532,7 @@ int ubifs_purge_xattrs(struct inode *host) err = remove_xattr(c, host, xino, ); if (err) { kfree(pxent); + kfree(xent); iput(xino); ubifs_err(c, "cannot remove xattr, error %d", err); return err; -- 2.25.4
[PATCH 2/2] ubifs: dent: Fix some potential memory leaks while iterating entries
Fix some potential memory leaks in error handling branches while iterating dent entries. For example, function dbg_check_dir() forgets to free pdent if it exists. Signed-off-by: Zhihao Cheng Cc: Fixes: 1e51764a3c2ac05a2 ("UBIFS: add new flash file system") --- fs/ubifs/debug.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index 0f5a480fe264..b4a3abf54df7 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -1123,6 +1123,7 @@ int dbg_check_dir(struct ubifs_info *c, const struct inode *dir) err = PTR_ERR(dent); if (err == -ENOENT) break; + kfree(pdent); return err; } -- 2.25.4
Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps
Can the current modification method be confirmed? 在 2019/9/16 6:00, Richard Weinberger 写道: > I need to give this another thought
[QUESTION] Hung task warning while running syzkaller test
Hi, everyone. We met a hung task problem while running syzkaller test. The stacks of hung tasks vary in net/fs/sched, and we provide a stable reproduce test case in fs. The higher the kernel version, the lower the probability of reproduce. Maybe the mainline has gradually optimized the scheduling and mutex. Environment: A. qemu(x86_64 8-core 16GB-RAM) B. physical machine (x86_64 8-core 314GB-RAM) ./syz-execprog -executor=/home/abc/syz-executor -repeat=0 -procs=16 -cover=0 repro repro is a configuration file containing syzkaller execution instructions, which shown as follows: syz_execute_func(&(0x7f000140)="f2aa984413e80f05953205830071f32ef30f1b6f002e67440f381d953b009fcc77a7141e8f6978e394db9600928640c4e2b140da6c4f086447deecf2460fd6c40f49100045660fc462c0f72644804740df6e32b8417e10bd61796e91565646bc16442ecbb1a978c33537771656c441add398b5000feb76f7f7210173dddfc421785a6600a32c9f5d04ecc7c764660f60050004c4035922770063c4217be62e450f8a016321f0c4e25dbe044c31e053b3eb53b3eb890f32d393400f383ca8faffec1f8dbf4f1e480404fb2e400f1ad30fae746d00ab07c4a2d538cb0ff803461439f5e3480f5140a3c4c4021bf7e8561eeaea0f6c3dce67460ffd1a000fb2430f12f5c423557904e774") socket(0x1, 0x8, 0x4) Hung task in kernel-4.4(See full message in hung_task_verbose.log): [ 420.762345] INFO: task syz-executor.1:8244 blocked for more than 140 seconds. [ 420.763691] Not tainted 4.4.186-514.55.6.9.x86_64 #1 [ 420.764645] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 420.765931] syz-executor.1 D 88040e6efc80 13728 8244 8242 0x [ 420.767189] 88040e6efc80 88040e71c990 8804 880077df3d80 [ 420.768497] 88040e71bd80 88040e6f 0246 88041f5007c0 [ 420.769800] 88040e71bd80 88040e6efc98 818c6ebc [ 420.771109] Call Trace: [ 420.771540] [] schedule+0x3c/0x90 [ 420.772369] [] schedule_preempt_disabled+0x15/0x20 [ 420.773437] [] mutex_lock_nested+0x182/0x500 [ 420.774421] [] ? walk_component+0x21f/0x310 [ 420.775396] [] ? __inode_permission+0x3a/0x80 [ 420.776391] [] walk_component+0x21f/0x310 [ 420.777333] [] ? path_lookupat+0x1b/0x110 [ 420.778273] [] path_lookupat+0x5d/0x110 [ 420.779197] [] filename_lookup+0xb1/0x180 [ 420.780130] [] ? rcu_read_lock_sched_held+0x6d/0x80 [ 420.781211] [] ? kmem_cache_alloc+0x240/0x2b0 [ 420.782212] [] ? debug_lockdep_rcu_enabled+0x1d/0x20 [ 420.783312] [] user_path_at_empty+0x36/0x40 [ 420.784284] [] path_removexattr+0x43/0xb0 [ 420.785229] [] ? lockdep_sys_exit_thunk+0x12/0x14 [ 420.786283] [] SyS_lremovexattr+0x10/0x20 [ 420.787232] [] entry_SYSCALL_64_fastpath+0x1e/0x9a [ 420.788302] 1 lock held by syz-executor.1/8244: [ 420.789051] #0: (>s_type->i_mutex_key#2){+.+.+.}, at: [] walk_component+0x21f/0x310 Hung task in kernel-5.3-rc6: [30391.827102] INFO: task syz-executor.6:12211 blocked for more than 143 seconds. [30391.827194] Not tainted 5.3.0-rc6-514.55.6.9.x86_64 #41 [30391.827214] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [30391.827239] syz-executor.6 D13904 12211 12143 0x [30391.827319] Call Trace: [30391.828583] ? __schedule+0x3cc/0x8b0 [30391.828669] schedule+0x30/0xb0 [30391.828785] rwsem_down_write_slowpath+0x2d2/0x730 [30391.829039] ? filename_create+0x9d/0x1d0 [30391.829110] ? filename_create+0x9d/0x1d0 [30391.829136] ? rwsem_down_write_slowpath+0x5/0x730 [30391.829163] filename_create+0x9d/0x1d0 [30391.829247] do_mkdirat+0x54/0x120 [30391.829361] do_syscall_64+0x85/0x380 [30391.829445] entry_SYSCALL_64_after_hwframe+0x49/0xbe [30391.829509] RIP: 0033:0x2148 [30391.829562] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f2 aa 98 44 13 e8 0f 05 <95> 32 05 83 00 00 00 71 f3 2e f3 0f 1b 6f 00 2e 67 66 66 44 0f 38 [30391.829604] RSP: 002b:7fd154213bd8 EFLAGS: 0203 ORIG_RAX: 0053 [30391.829638] RAX: ffda RBX: 0009 RCX: 2148 [30391.829659] RDX: da194cf4f57fa1d4 RSI: RDI: 7fd15421460a [30391.829680] RBP: 0045 R08: 0005 R09: 0006 [30391.829703] R10: 0007 R11: 0203 R12: 000b [30391.829724] R13: 014c R14: 000d R15: Intro of attachments: hung_task_verbose.log: verbose of hung task(with lockdep) repro: reproduction file which
Re: [PATCH xfstests] generic/192: Move 'cd /' to the place where the program exits
I agree with your proposal. Indeed, '$here' is referenced in other places where executable files under src are used, including '$here/src/feature', '$here/src/seek_sanity_test', etc. I have a question about why many test cases execute 'cd /' before the end. For example, generic/124, generic/122, generic/003. I wonder the intention of operation 'cd /'. 在 2019/10/13 20:46, Eryu Guan 写道: > On Wed, Oct 09, 2019 at 04:27:57PM +0800, Zhihao Cheng wrote: >> Running generic/192 with overlayfs(Let ubifs as base fs) yields the >> following output: >> >> generic/192 - output mismatch >> QA output created by 192 >> sleep for 5 seconds >> test >> +./common/rc: line 316: src/t_dir_type: No such file or directory >> delta1 is in range >> delta2 is in range >> ... >> >> When the use case fails, the call stack in generic/192 is: >> >> local unknowns=$(src/t_dir_type $dir u | wc -l)common/rc:316 >> _supports_filetype common/rc:299 >> _overlay_mount common/overlay:52 >> _overlay_test_mount >> common/overlay:93 >> _test_mountcommon/rc:407 >> _test_cycle_mount generic/192:50 >> >> Before _test_cycle_mount() being invoked, generic/192 executed 'cd /' >> to change work dir from 'xfstests-dev' to '/', so src/t_dir_type was not >> found. >> >> Signed-off-by: Zhihao Cheng > > Thanks for the debug! But I think the right fix is to call t_dir_type > via "$here", i.e. > > local unknowns=$($here/src/t_dir_type $dir u | wc -l) > > 'here', which points to the top level dir of xfstests source code, is > defined in every test in test setup, and is guaranteed not to be empty. > > Thanks, > Eryu > >> --- >> tests/generic/192 | 8 ++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/tests/generic/192 b/tests/generic/192 >> index 50b3d6fd..5550f39e 100755 >> --- a/tests/generic/192 >> +++ b/tests/generic/192 >> @@ -15,7 +15,12 @@ echo "QA output created by $seq" >> here=`pwd` >> tmp=/tmp/$$ >> status=1# failure is the default! >> -trap "exit \$status" 0 1 2 3 15 >> +trap "_cleanup; exit \$status" 0 1 2 3 15 >> + >> +_cleanup() >> +{ >> +cd / >> +} >> >> _access_time() >> { >> @@ -46,7 +51,6 @@ sleep $delay # sleep to allow time to move on for access >> cat $testfile >> time2=`_access_time $testfile | tee -a $seqres.full` >> >> -cd / >> _test_cycle_mount >> time3=`_access_time $testfile | tee -a $seqres.full` >> >> -- >> 2.13.6 >> > > . >
[PATCH xfstests] generic/192: Move 'cd /' to the place where the program exits
Running generic/192 with overlayfs(Let ubifs as base fs) yields the following output: generic/192 - output mismatch QA output created by 192 sleep for 5 seconds test +./common/rc: line 316: src/t_dir_type: No such file or directory delta1 is in range delta2 is in range ... When the use case fails, the call stack in generic/192 is: local unknowns=$(src/t_dir_type $dir u | wc -l) common/rc:316 _supports_filetypecommon/rc:299 _overlay_mountcommon/overlay:52 _overlay_test_mount common/overlay:93 _test_mount common/rc:407 _test_cycle_mount generic/192:50 Before _test_cycle_mount() being invoked, generic/192 executed 'cd /' to change work dir from 'xfstests-dev' to '/', so src/t_dir_type was not found. Signed-off-by: Zhihao Cheng --- tests/generic/192 | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tests/generic/192 b/tests/generic/192 index 50b3d6fd..5550f39e 100755 --- a/tests/generic/192 +++ b/tests/generic/192 @@ -15,7 +15,12 @@ echo "QA output created by $seq" here=`pwd` tmp=/tmp/$$ status=1 # failure is the default! -trap "exit \$status" 0 1 2 3 15 +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / +} _access_time() { @@ -46,7 +51,6 @@ sleep $delay # sleep to allow time to move on for access cat $testfile time2=`_access_time $testfile | tee -a $seqres.full` -cd / _test_cycle_mount time3=`_access_time $testfile | tee -a $seqres.full` -- 2.13.6
[PATCH xfstests v3] overlay: Enable character device to be the base fs partition
When running overlay tests using character devices as base fs partitions, all overlay usecase results become 'notrun'. Function '_overay_config_override' (common/config) detects that the current base fs partition is not a block device and will set FSTYP to base fs. The overlay usecase will check the current FSTYP, and if it is not 'overlay' or 'generic', it will skip the execution. For example, using UBIFS as base fs skips all overlay usecases: FSTYP -- ubifs # FSTYP should be overridden as 'overlay' MKFS_OPTIONS -- /dev/ubi0_1 # Character device MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch overlay/001 [not run] not suitable for this filesystem type: ubifs overlay/002 [not run] not suitable for this filesystem type: ubifs overlay/003 [not run] not suitable for this filesystem type: ubifs When checking that the base fs partition is a block/character device, FSTYP is overwritten as 'overlay'. This patch allows the base fs partition to be a character device that can also execute overlay usecases (such as ubifs). Signed-off-by: Zhihao Cheng Signed-off-by: Amir Goldstein --- common/config | 6 +++--- common/rc | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/common/config b/common/config index 4c86a49..4eda36c 100644 --- a/common/config +++ b/common/config @@ -532,7 +532,7 @@ _canonicalize_mountpoint() # When SCRATCH/TEST_* vars are defined in evironment and not # in config file, this function is called after vars have already # been overriden in the previous test. -# In that case, TEST_DEV is a directory and not a blockdev and +# In that case, TEST_DEV is a directory and not a blockdev/chardev and # the function will return without overriding the SCRATCH/TEST_* vars. _overlay_config_override() { @@ -550,7 +550,7 @@ _overlay_config_override() #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values #of the configured base fs and SCRATCH/TEST_DEV vars are set to the #overlayfs base and mount dirs inside base fs mount. - [ -b "$TEST_DEV" ] || return 0 + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 # Config file may specify base fs type, but we obay -overlay flag [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" @@ -570,7 +570,7 @@ _overlay_config_override() export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" - [ -b "$SCRATCH_DEV" ] || return 0 + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 # Store original base fs vars export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" diff --git a/common/rc b/common/rc index 66c7fd4..8d57c37 100644 --- a/common/rc +++ b/common/rc @@ -3100,7 +3100,7 @@ _require_scratch_shutdown() # SCRATCH_DEV, in this case OVL_BASE_SCRATCH_DEV # will be null, so check OVL_BASE_SCRATCH_DEV before # running shutdown to avoid shutting down base fs accidently. - _notrun "$SCRATCH_DEV is not a block device" + _notrun "This test requires a valid $OVL_BASE_SCRATCH_DEV as ovl base fs" else src/godown -f $OVL_BASE_SCRATCH_MNT 2>&1 \ || _notrun "Underlying filesystem does not support shutdown" -- 2.7.4
Re: [PATCH xfstests v2] overlay: Enable character device to be the base fs partition
There are indeed many '-b' options in xfstests. I only confirmed the line of overlay test. Other -b test options I need to reconfirm later. 在 2019/9/25 11:17, Darrick J. Wong 写道: > On Tue, Sep 24, 2019 at 08:05:50PM -0700, Darrick J. Wong wrote: >> On Wed, Sep 25, 2019 at 09:54:08AM +0800, Zhihao Cheng wrote: >>> There is a message in _supported_fs(): >>> _notrun "not suitable for this filesystem type: $FSTYP" >>> for when overlay usecases are executed on a chararcter device based base >> >> You can do that? >> >> What does that even look like? > > OH, ubifs. Ok. > > /me wonders if there are more places in xfstests with test -b that needs > fixing... > > --D > >> --D >> >>> fs. _overay_config_override() detects that the current base fs partition >>> is not a block device, and FSTYP won't be overwritten as 'overlay' before >>> executing usecases which results in all overlay usecases become 'notrun'. >>> In addition, all generic usecases are based on base fs rather than overlay. >>> >>> We want to rewrite FSTYP to 'overlay' before running the usecases. To do >>> this, we need to add additional character device judgments for TEST_DEV >>> and SCRATCH_DEV in _overay_config_override(). >>> >>> Signed-off-by: Zhihao Cheng >>> --- >>> common/config | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/common/config b/common/config >>> index 4c86a49..a22acdb 100644 >>> --- a/common/config >>> +++ b/common/config >>> @@ -550,7 +550,7 @@ _overlay_config_override() >>> #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values >>> #of the configured base fs and SCRATCH/TEST_DEV vars are set to the >>> #overlayfs base and mount dirs inside base fs mount. >>> - [ -b "$TEST_DEV" ] || return 0 >>> + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 >>> >>> # Config file may specify base fs type, but we obay -overlay flag >>> [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" >>> @@ -570,7 +570,7 @@ _overlay_config_override() >>> export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" >>> export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" >>> >>> - [ -b "$SCRATCH_DEV" ] || return 0 >>> + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 >>> >>> # Store original base fs vars >>> export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" >>> -- >>> 2.7.4 >>> > > . >
Re: [PATCH xfstests] overlay: Enable character device to be the base fs partition
Oh, You are right, I understood it wrong. Thanks for reminding. 在 2019/9/25 11:15, Eryu Guan 写道: > On Tue, Sep 24, 2019 at 10:19:38PM +0800, Zhihao Cheng wrote: >> As far as I know, _require_scratch_shutdown() is called after >> _overay_config_override(), at this moment, FSTYP equals to base fs. >> According the implementation of _require_scratch_shutdown: >> 3090 _require_scratch_shutdown() >> 3091 { >> 3092 [ -x src/godown ] || _notrun "src/godown executable not found" >> 3093 >> 3094 _scratch_mkfs > /dev/null 2>&1 || _notrun "_scratch_mkfs failed on >> $SCRATCH_DEV" >> 3095 _scratch_mount >> 3096 >> 3097 if [ $FSTYP = "overlay" ]; then >># FSTYP = base fs >> 3098 if [ -z $OVL_BASE_SCRATCH_DEV ]; then >> 3099 # In lagacy overlay usage, it may specify directory as >> 3100 # SCRATCH_DEV, in this case OVL_BASE_SCRATCH_DEV >> 3101 # will be null, so check OVL_BASE_SCRATCH_DEV before >> 3102 # running shutdown to avoid shutting down base fs >> accidently. >> 3103 _notrun "$SCRATCH_DEV is not a block device" >> 3104 else >> 3105 src/godown -f $OVL_BASE_SCRATCH_MNT 2>&1 \ >> 3106 || _notrun "Underlying filesystem does not support shutdown" >> 3107 fi >> 3108 else >> 3109 src/godown -f $SCRATCH_MNT 2>&1 \ >> 3110 || _notrun "$FSTYP does not support shutdown" >># Executes this path >> 3111 fi >> 3112 >> 3113 _scratch_unmount >> 3114 } >> So, we can't get output: _notrun "$SCRATCH_DEV is not a block device". >> Instead, the verbose should like: >> after _overlay_config_override FSTYP=ubifs# Additional print message >> FSTYP -- ubifs >> PLATFORM -- Linux/x86_64 >> MKFS_OPTIONS -- /dev/ubi0_1 >> MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch >> >> generic/042[not run] ubifs does not support shutdown >> >> But I'll consider describing error more concisely in v2. >> >> 在 2019/9/24 20:33, Amir Goldstein 写道: >>> On Tue, Sep 24, 2019 at 12:34 PM Zhihao Cheng >>> wrote: >>>> >>>> When running overlay tests using character devices as base fs partitions, >>>> all overlay usecase results become 'notrun'. Function >>>> '_overay_config_override' (common/config) detects that the current base >>>> fs partition is not a block device and will set FSTYP to base fs. The >>>> overlay usecase will check the current FSTYP, and if it is not 'overlay' >>>> or 'generic', it will skip the execution. >>>> >>>> For example, using UBIFS as base fs skips all overlay usecases: >>>> >>>> FSTYP -- ubifs # FSTYP should be overridden as 'overlay' >>>> MKFS_OPTIONS -- /dev/ubi0_1 # Character device >>>> MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch >>>> >>>> overlay/001 [not run] not suitable for this filesystem type: ubifs >>>> overlay/002 [not run] not suitable for this filesystem type: ubifs >>>> overlay/003 [not run] not suitable for this filesystem type: ubifs >>>> ... >>>> >>>> When checking that the base fs partition is a block/character device, >>>> FSTYP is overwritten as 'overlay'. This patch allows the base fs >>>> partition to be a character device that can also execute overlay >>>> usecases (such as ubifs). >>>> >>>> Signed-off-by: Zhihao Cheng >>>> --- >>>> common/config | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/common/config b/common/config >>>> index 4c86a49..a22acdb 100644 >>>> --- a/common/config >>>> +++ b/common/config >>>> @@ -550,7 +550,7 @@ _overlay_config_override() >>>> #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the >>>> values >>>> #of the configured base fs and SCRATCH/TEST_DEV vars are set >>>> to the >>>> #overlayfs base and mount dirs inside base fs mount. >>>> - [ -b "$TEST_DEV" ] || return 0 >>>> + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 >>>> >>>> # Config file ma
[PATCH xfstests v2] overlay: Enable character device to be the base fs partition
There is a message in _supported_fs(): _notrun "not suitable for this filesystem type: $FSTYP" for when overlay usecases are executed on a chararcter device based base fs. _overay_config_override() detects that the current base fs partition is not a block device, and FSTYP won't be overwritten as 'overlay' before executing usecases which results in all overlay usecases become 'notrun'. In addition, all generic usecases are based on base fs rather than overlay. We want to rewrite FSTYP to 'overlay' before running the usecases. To do this, we need to add additional character device judgments for TEST_DEV and SCRATCH_DEV in _overay_config_override(). Signed-off-by: Zhihao Cheng --- common/config | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/common/config b/common/config index 4c86a49..a22acdb 100644 --- a/common/config +++ b/common/config @@ -550,7 +550,7 @@ _overlay_config_override() #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values #of the configured base fs and SCRATCH/TEST_DEV vars are set to the #overlayfs base and mount dirs inside base fs mount. - [ -b "$TEST_DEV" ] || return 0 + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 # Config file may specify base fs type, but we obay -overlay flag [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" @@ -570,7 +570,7 @@ _overlay_config_override() export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" - [ -b "$SCRATCH_DEV" ] || return 0 + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 # Store original base fs vars export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" -- 2.7.4
Re: [PATCH xfstests] overlay: Enable character device to be the base fs partition
As far as I know, _require_scratch_shutdown() is called after _overay_config_override(), at this moment, FSTYP equals to base fs. According the implementation of _require_scratch_shutdown: 3090 _require_scratch_shutdown() 3091 { 3092 [ -x src/godown ] || _notrun "src/godown executable not found" 3093 3094 _scratch_mkfs > /dev/null 2>&1 || _notrun "_scratch_mkfs failed on $SCRATCH_DEV" 3095 _scratch_mount 3096 3097 if [ $FSTYP = "overlay" ]; then # FSTYP = base fs 3098 if [ -z $OVL_BASE_SCRATCH_DEV ]; then 3099 # In lagacy overlay usage, it may specify directory as 3100 # SCRATCH_DEV, in this case OVL_BASE_SCRATCH_DEV 3101 # will be null, so check OVL_BASE_SCRATCH_DEV before 3102 # running shutdown to avoid shutting down base fs accidently. 3103 _notrun "$SCRATCH_DEV is not a block device" 3104 else 3105 src/godown -f $OVL_BASE_SCRATCH_MNT 2>&1 \ 3106 || _notrun "Underlying filesystem does not support shutdown" 3107 fi 3108 else 3109 src/godown -f $SCRATCH_MNT 2>&1 \ 3110 || _notrun "$FSTYP does not support shutdown" # Executes this path 3111 fi 3112 3113 _scratch_unmount 3114 } So, we can't get output: _notrun "$SCRATCH_DEV is not a block device". Instead, the verbose should like: after _overlay_config_override FSTYP=ubifs# Additional print message FSTYP -- ubifs PLATFORM -- Linux/x86_64 MKFS_OPTIONS -- /dev/ubi0_1 MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch generic/042 [not run] ubifs does not support shutdown But I'll consider describing error more concisely in v2. 在 2019/9/24 20:33, Amir Goldstein 写道: > On Tue, Sep 24, 2019 at 12:34 PM Zhihao Cheng wrote: >> >> When running overlay tests using character devices as base fs partitions, >> all overlay usecase results become 'notrun'. Function >> '_overay_config_override' (common/config) detects that the current base >> fs partition is not a block device and will set FSTYP to base fs. The >> overlay usecase will check the current FSTYP, and if it is not 'overlay' >> or 'generic', it will skip the execution. >> >> For example, using UBIFS as base fs skips all overlay usecases: >> >> FSTYP -- ubifs # FSTYP should be overridden as 'overlay' >> MKFS_OPTIONS -- /dev/ubi0_1 # Character device >> MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch >> >> overlay/001 [not run] not suitable for this filesystem type: ubifs >> overlay/002 [not run] not suitable for this filesystem type: ubifs >> overlay/003 [not run] not suitable for this filesystem type: ubifs >> ... >> >> When checking that the base fs partition is a block/character device, >> FSTYP is overwritten as 'overlay'. This patch allows the base fs >> partition to be a character device that can also execute overlay >> usecases (such as ubifs). >> >> Signed-off-by: Zhihao Cheng >> --- >> common/config | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/common/config b/common/config >> index 4c86a49..a22acdb 100644 >> --- a/common/config >> +++ b/common/config >> @@ -550,7 +550,7 @@ _overlay_config_override() >> #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values >> #of the configured base fs and SCRATCH/TEST_DEV vars are set to >> the >> #overlayfs base and mount dirs inside base fs mount. >> - [ -b "$TEST_DEV" ] || return 0 >> + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 >> >> # Config file may specify base fs type, but we obay -overlay flag >> [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" >> @@ -570,7 +570,7 @@ _overlay_config_override() >> export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" >> export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" >> >> - [ -b "$SCRATCH_DEV" ] || return 0 >> + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 >> >> # Store original base fs vars >> export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" >> -- >> 2.7.4 >> > > Looks fine. > > One nit: there is a message in _require_scratch_shutdown(): > _notrun "$SCRATCH_DEV is not a block device" > for when $OVL_BASE_SCRATCH_DEV is not defined. > > Could probably use a better describing error anyway. > > Thanks, > Amir. > > . >
Re: [PATCH xfstests] overlay: Enable character device to be the base fs partition
After incorporating patches, use overlay usecases to test character device-based base fs, and all overlay usecases are executed: FSTYP -- overlay # FSTYP has be overridden as 'overlay' PLATFORM -- Linux/x86_64 localhost MKFS_OPTIONS -- /tmp/scratch MOUNT_OPTIONS -- /tmp/scratch /tmp/scratch/ovl-mnt overlay/001 [not run] This test requires at least 8GB free on /tmp/scratch to run overlay/002 1s overlay/003 0s overlay/004 0s overlay/005 1s overlay/006 0s overlay/007 0s overlay/008 0s overlay/009 0s overlay/010 0s overlay/011 1s overlay/012 0s overlay/013 0s overlay/014 1s ... Attachments: setup.sh: Create character device for base fs (UBIFS) local.config: Xfstests local configuration 在 2019/9/24 17:40, Zhihao Cheng 写道: > When running overlay tests using character devices as base fs partitions, > all overlay usecase results become 'notrun'. Function > '_overay_config_override' (common/config) detects that the current base > fs partition is not a block device and will set FSTYP to base fs. The > overlay usecase will check the current FSTYP, and if it is not 'overlay' > or 'generic', it will skip the execution. > > For example, using UBIFS as base fs skips all overlay usecases: > > FSTYP -- ubifs # FSTYP should be overridden as 'overlay' > MKFS_OPTIONS -- /dev/ubi0_1 # Character device > MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch > > overlay/001 [not run] not suitable for this filesystem type: ubifs > overlay/002 [not run] not suitable for this filesystem type: ubifs > overlay/003 [not run] not suitable for this filesystem type: ubifs > ... > > When checking that the base fs partition is a block/character device, > FSTYP is overwritten as 'overlay'. This patch allows the base fs > partition to be a character device that can also execute overlay > usecases (such as ubifs). > > Signed-off-by: Zhihao Cheng > --- > common/config | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/common/config b/common/config > index 4c86a49..a22acdb 100644 > --- a/common/config > +++ b/common/config > @@ -550,7 +550,7 @@ _overlay_config_override() > #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values > #of the configured base fs and SCRATCH/TEST_DEV vars are set to the > #overlayfs base and mount dirs inside base fs mount. > - [ -b "$TEST_DEV" ] || return 0 > + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 > > # Config file may specify base fs type, but we obay -overlay flag > [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" > @@ -570,7 +570,7 @@ _overlay_config_override() > export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" > export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" > > - [ -b "$SCRATCH_DEV" ] || return 0 > + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 > > # Store original base fs vars > export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" > export FSTYP=ubifs export TEST_DEV=/dev/ubi0_0 export TEST_DIR=/tmp/test export TEST_FS_MOUNT_OPTS="-t ubifs" export SCRATCH_DEV=/dev/ubi0_1 export SCRATCH_MNT=/tmp/scratch export MOUNT_OPTIONS="-t ubifs" #!/bin/bash set -e TMP=/tmp/test SMP=/tmp/scratch umount $TMP $SMP 2>/dev/null || true mkdir -p $TMP $SMP modprobe -r ubifs 2>/dev/null || true for i in $(seq 0 1) do ubidetach -p /dev/mtd$i 2>/dev/null || true done modprobe -r ubi 2>/dev/null || true modprobe -r nandsim 2>/dev/null || true mtd=/dev/mtd0 ubi=/dev/ubi0 ARCH=$(uname -m) if test "$ARCH" == ppc || test "$ARCH" == armv7l then # 512MB, 8-bits, page size 4KB, OOB 128B, block 128KB ID="0x20,0xac,0x00,0x16" TSIZE=128MiB SSIZE=350MiB else # 2GB, 8-bits, page size 4KB, OOB 128B, block 128KB ID="0x20,0xa5,0x00,0x16" TSIZE=400MiB SSIZE=1500MiB fi modprobe nandsim id_bytes=$ID flash_erase -q -j $mtd 0 0 modprobe ubi modprobe ubifs ubiattach -p $mtd ubimkvol $ubi -N test -s $TSIZE ubimkvol $ubi -N scratch -s $SSIZE exit 0
[PATCH xfstests] overlay: Enable character device to be the base fs partition
When running overlay tests using character devices as base fs partitions, all overlay usecase results become 'notrun'. Function '_overay_config_override' (common/config) detects that the current base fs partition is not a block device and will set FSTYP to base fs. The overlay usecase will check the current FSTYP, and if it is not 'overlay' or 'generic', it will skip the execution. For example, using UBIFS as base fs skips all overlay usecases: FSTYP -- ubifs # FSTYP should be overridden as 'overlay' MKFS_OPTIONS -- /dev/ubi0_1 # Character device MOUNT_OPTIONS -- -t ubifs /dev/ubi0_1 /tmp/scratch overlay/001 [not run] not suitable for this filesystem type: ubifs overlay/002 [not run] not suitable for this filesystem type: ubifs overlay/003 [not run] not suitable for this filesystem type: ubifs ... When checking that the base fs partition is a block/character device, FSTYP is overwritten as 'overlay'. This patch allows the base fs partition to be a character device that can also execute overlay usecases (such as ubifs). Signed-off-by: Zhihao Cheng --- common/config | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/common/config b/common/config index 4c86a49..a22acdb 100644 --- a/common/config +++ b/common/config @@ -550,7 +550,7 @@ _overlay_config_override() #the new OVL_BASE_SCRATCH/TEST_DEV/MNT vars are set to the values #of the configured base fs and SCRATCH/TEST_DEV vars are set to the #overlayfs base and mount dirs inside base fs mount. - [ -b "$TEST_DEV" ] || return 0 + [ -b "$TEST_DEV" ] || [ -c "$TEST_DEV" ] || return 0 # Config file may specify base fs type, but we obay -overlay flag [ "$FSTYP" == overlay ] || export OVL_BASE_FSTYP="$FSTYP" @@ -570,7 +570,7 @@ _overlay_config_override() export TEST_DIR="$OVL_BASE_TEST_DIR/$OVL_MNT" export MOUNT_OPTIONS="$OVERLAY_MOUNT_OPTIONS" - [ -b "$SCRATCH_DEV" ] || return 0 + [ -b "$SCRATCH_DEV" ] || [ -c "$SCRATCH_DEV" ] || return 0 # Store original base fs vars export OVL_BASE_SCRATCH_DEV="$SCRATCH_DEV" -- 2.7.4
Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps
在 2019/9/16 6:00, Richard Weinberger 写道: > On Fri, Aug 16, 2019 at 10:01 AM chengzhihao wrote: >> >>> ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); >> >> I've done 50 problem reproduces on different flash devices and made sure >> that the assertion was not triggered. See record.txt for details. > > Thanks for letting me know. :) > I need to give this another thought, then we can apply it for -rc2. > ACK. :)
[PATCH RFC v2] ubi: ubi_wl_get_peb: Increase the number of attempts while getting PEB
Running stress test io_paral (A pressure ubi test in mtd-utils) on an UBI device with fewer PEBs (fastmap enabled) may cause ENOSPC errors and make UBI device read-only, but there are still free PEBs on the UBI device. This problem can be easily reproduced by performing the following steps on a 2-core machine: $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80 $ modprobe ubi mtd="0,0" fm_autoconvert $ ./io_paral /dev/ubi0 We may see the following verbose: (output) [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2 [io_paral] update_volume():109: update: 97088 bytes [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device) [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1 (dmesg) ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-1-g5986cd0 #9 ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x85/0xba ubi_eba_write_leb+0xa1e/0xa40 [ubi] vol_cdev_write+0x307/0x520 [ubi] vfs_write+0xfa/0x280 ksys_pwrite64+0xc5/0xe0 __x64_sys_pwrite64+0x22/0x30 do_syscall_64+0xbf/0x440 In function ubi_wl_get_peb, the operation of filling the pool (ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is not atomic. After thread A filling the pool with free PEB, free PEB may be taken away by thread B. When thread A checks the expression again, the condition is still unsatisfactory. At this time, there may still be free PEBs on UBI that can be filled into the pool. This patch increases the number of attempts to obtain PEB. An extreme case (No free PEBs left after creating test volumes) has been tested on different type of machines for 100 times. The biggest number of attempts are shown below: x86_64 arm64 2-core4 4 4-core8 4 8-core4 4 Signed-off-by: Zhihao Cheng --- drivers/mtd/ubi/fastmap-wl.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c index d9e2e3a..c44c847 100644 --- a/drivers/mtd/ubi/fastmap-wl.c +++ b/drivers/mtd/ubi/fastmap-wl.c @@ -196,7 +196,7 @@ static int produce_free_peb(struct ubi_device *ubi) */ int ubi_wl_get_peb(struct ubi_device *ubi) { - int ret, retried = 0; + int ret, attempts = 0; struct ubi_fm_pool *pool = >fm_pool; struct ubi_fm_pool *wl_pool = >fm_wl_pool; @@ -221,12 +221,12 @@ int ubi_wl_get_peb(struct ubi_device *ubi) if (pool->used == pool->size) { spin_unlock(>wl_lock); - if (retried) { + attempts++; + if (attempts == 10) { ubi_err(ubi, "Unable to get a free PEB from user WL pool"); ret = -ENOSPC; goto out; } - retried = 1; up_read(>fm_eba_sem); ret = produce_free_peb(ubi); if (ret < 0) { -- 2.7.4
[PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB
Running pressure test io_paral (A pressure ubi test in mtd-utils) on an UBI device with fewer PEBs (fastmap enabled) may cause ENOSPC errors and make UBI device read-only, but there are still free PEBs on the UBI device. This problem can be easily reproduced by performing the following steps on a 2-core machine: $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80 $ modprobe ubi mtd="0,0" fm_autoconvert $ ./io_paral /dev/ubi0 We may see the following verbose: (output) [io_paral] update_volume():105: function write() failed with error 30 (Read-only file system) [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2 [io_paral] update_volume():109: update: 97088 bytes [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device) [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1 (dmesg) ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode ubi0 error: ubi_io_write [ubi]: read-only mode CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-1-g5986cd0 #9 ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x85/0xba ubi_eba_write_leb+0xa1e/0xa40 [ubi] vol_cdev_write+0x307/0x520 [ubi] ubi0 error: vol_cdev_write [ubi]: cannot accept more 380 bytes of data, error -30 vfs_write+0xfa/0x280 ksys_pwrite64+0xc5/0xe0 __x64_sys_pwrite64+0x22/0x30 do_syscall_64+0xbf/0x440 In function ubi_wl_get_peb, the operation of filling the pool (ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is not atomic. After thread A filling the pool with free PEB, free PEB may be taken away by thread B. When thread A checks the expression again, the condition is still unsatisfactory. At this time, there may still be free PEBs on UBI that can be filled into the pool. So, ubi_wl_get_peb (in fastmap-wil.c) should be implemented to obtain a free PEB by polling method. The polling exit condition is that there is no free PEBs on UBI, no free PEBs in pool, and ubi->works_count is 0. Signed-off-by: Zhihao Cheng --- drivers/mtd/ubi/fastmap-wl.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c index d9e2e3a..c5512cf 100644 --- a/drivers/mtd/ubi/fastmap-wl.c +++ b/drivers/mtd/ubi/fastmap-wl.c @@ -196,7 +196,7 @@ static int produce_free_peb(struct ubi_device *ubi) */ int ubi_wl_get_peb(struct ubi_device *ubi) { - int ret, retried = 0; + int ret; struct ubi_fm_pool *pool = >fm_pool; struct ubi_fm_pool *wl_pool = >fm_wl_pool; @@ -220,13 +220,14 @@ int ubi_wl_get_peb(struct ubi_device *ubi) } if (pool->used == pool->size) { - spin_unlock(>wl_lock); - if (retried) { + if (!ubi->free.rb_node && ubi->works_count == 0) { ubi_err(ubi, "Unable to get a free PEB from user WL pool"); + ubi_assert(list_empty(>works)); + spin_unlock(>wl_lock); ret = -ENOSPC; goto out; } - retried = 1; + spin_unlock(>wl_lock); up_read(>fm_eba_sem); ret = produce_free_peb(ubi); if (ret < 0) { -- 2.7.4
[RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB
Running pressure test io_paral (A pressure ubi test in mtd-utils) on an UBI device with fewer PEBs (fastmap enabled) may cause ENOSPC errors and make UBI device read-only, but there are still free PEBs on the UBI device. This problem can be easily reproduced by performing the following steps on a 2-core machine: $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80 $ modprobe ubi mtd="0,0" fm_autoconvert $ ./io_paral /dev/ubi0 We may see the following verbose: (output) [io_paral] update_volume():105: function write() failed with error 30 (Read-only file system) [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2 [io_paral] update_volume():109: update: 97088 bytes [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device) [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1 (dmesg) ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode ubi0 error: ubi_io_write [ubi]: read-only mode CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-1-g5986cd0 #9 ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x85/0xba ubi_eba_write_leb+0xa1e/0xa40 [ubi] vol_cdev_write+0x307/0x520 [ubi] ubi0 error: vol_cdev_write [ubi]: cannot accept more 380 bytes of data, error -30 vfs_write+0xfa/0x280 ksys_pwrite64+0xc5/0xe0 __x64_sys_pwrite64+0x22/0x30 do_syscall_64+0xbf/0x440 In function ubi_wl_get_peb, the operation of filling the pool (ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is not atomic. After thread A filling the pool with free PEB, free PEB may be taken away by thread B. When thread A checks the expression again, the condition is still unsatisfactory. At this time, there may still be free PEBs on UBI that can be filled into the pool. So, ubi_wl_get_peb (in fastmap-wil.c) should be implemented to obtain a free PEB by polling method. The polling exit condition is that there is no free PEBs on UBI, no free PEBs in pool, and ubi->works_count is 0. Signed-off-by: Zhihao Cheng --- drivers/mtd/ubi/fastmap-wl.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c index d9e2e3a..c5512cf 100644 --- a/drivers/mtd/ubi/fastmap-wl.c +++ b/drivers/mtd/ubi/fastmap-wl.c @@ -196,7 +196,7 @@ static int produce_free_peb(struct ubi_device *ubi) */ int ubi_wl_get_peb(struct ubi_device *ubi) { - int ret, retried = 0; + int ret; struct ubi_fm_pool *pool = >fm_pool; struct ubi_fm_pool *wl_pool = >fm_wl_pool; @@ -220,13 +220,14 @@ int ubi_wl_get_peb(struct ubi_device *ubi) } if (pool->used == pool->size) { - spin_unlock(>wl_lock); - if (retried) { + if (!ubi->free.rb_node && ubi->works_count == 0) { ubi_err(ubi, "Unable to get a free PEB from user WL pool"); + ubi_assert(list_empty(>works)); + spin_unlock(>wl_lock); ret = -ENOSPC; goto out; } - retried = 1; + spin_unlock(>wl_lock); up_read(>fm_eba_sem); ret = produce_free_peb(ubi); if (ret < 0) { -- 2.7.4
[PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps
Running stress-test test_2 in mtd-utils on ubi device, sometimes we can get following oops message: BUG: unable to handle page fault for address: 0140 #PF: supervisor read access in kernel mode #PF: error_code(0x) - not-present page PGD 280a067 P4D 280a067 PUD 0 Oops: [#1] SMP CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-ubifs_0_0) RIP: 0010:rb_next_postorder+0x2e/0xb0 Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 RSP: 0018:c9887758 EFLAGS: 00010202 RAX: 888129ae4700 RBX: 888138b08400 RCX: 8081 RDX: 0130 RSI: 80800024 RDI: 888138b08400 RBP: 888138b08400 R08: ea0004a6b920 R09: R10: c9887740 R11: 0001 R12: 888128d48000 R13: 0800 R14: 011e R15: 07c8 FS: () GS:88813ba0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0140 CR3: 00013789d000 CR4: 06f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: destroy_old_idx+0x5d/0xa0 [ubifs] ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] do_commit+0x3eb/0x830 [ubifs] ubifs_run_commit+0xdc/0x1c0 [ubifs] Above Oops are due to the slab-out-of-bounds happened in do-while of function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In function layout_in_gaps, there is a do-while loop placing index nodes into the gaps created by obsolete index nodes in non-empty index LEBs until rest index nodes can totally be placed into pre-allocated empty LEBs. @c->gap_lebs points to a memory area(integer array) which records LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB is found, corresponding lnum will be incrementally written into the memory area pointed by @c->gap_lebs. The size ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before do-while loop and can not be changed in the loop. But @c->lst.idx_lebs could be increased by function ubifs_change_lp (called by layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the loop. So, sometimes oob happens when number of cycles in do-while loop exceeds the original value of @c->lst.idx_lebs. See detail in https://bugzilla.kernel.org/show_bug.cgi?id=204229. This patch fixes oob in layout_in_gaps. Signed-off-by: Zhihao Cheng --- fs/ubifs/tnc_commit.c | 34 +++--- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c index a384a0f..234be1c 100644 --- a/fs/ubifs/tnc_commit.c +++ b/fs/ubifs/tnc_commit.c @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, /** * layout_leb_in_gaps - layout index nodes using in-the-gaps method. * @c: UBIFS file-system description object - * @p: return LEB number here + * @p: return LEB number in @c->gap_lebs[p] * * This function lays out new index nodes for dirty znodes using in-the-gaps * method of TNC commit. @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, * This function returns the number of index nodes written into the gaps, or a * negative error code on failure. */ -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) +static int layout_leb_in_gaps(struct ubifs_info *c, int p) { struct ubifs_scan_leb *sleb; struct ubifs_scan_node *snod; @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) * filled, however we do not check there at present. */ return lnum; /* Error code */ - *p = lnum; + c->gap_lebs[p] = lnum; dbg_gc("LEB %d", lnum); /* * Scan the index LEB. We use the generic scan for this even though @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) */ static int layout_in_gaps(struct ubifs_info *c, int cnt) { - int err, leb_needed_cnt, written, *p; + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, *gap_lebs; dbg_gc("%d znodes to write", cnt); @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) if (!c->gap_lebs) return -ENOMEM; - p = c->gap_lebs; + old_idx_lebs = c->lst.idx_lebs; do { - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); + ubifs_assert(c,
[PATCH RFC v2] mtd: ubi: Add fastmap sysfs attribute
The UBI device can be attached to a MTD device via fastmap by setting CONFIG_MTD_UBI_FASTMAP to 'y' (If there already exists a fastmap on the UBI device). To support some debugging scenarios, attaching process by fastmap can be confirmed in dmesg. If the UBI device is attached by fastmap, logs like following will appear in dmesg: ubi0: attached by fastmap If multiple UBI devices are attached to multiple MTD devices at the same time, how to distinguish which UBI devices are successfully attached by fastmap? Extracting attaching information for each UBI device one by one from dmesg is a way. A better method is to record fastmap existence in sysfs, so it can be obtained by userspace tools. This patch exposes fastmap on sysfs. Suppose you attach an UBI device to a MTD device by fastmap: if fastmap equals to '1', that is, the fastmap generated before last detaching operation is confirmed valid. Else, there may be some problems with old fastmap. Besides, userspace tool can also check whether the fastmap updating triggered by other operations (such as resize volume) is successful by reading this sysfs attribute. Signed-off-by: Zhihao Cheng --- Documentation/ABI/stable/sysfs-class-ubi | 15 +++ drivers/mtd/ubi/build.c | 9 - 2 files changed, 23 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/stable/sysfs-class-ubi b/Documentation/ABI/stable/sysfs-class-ubi index a6b3240..1d96cf0 100644 --- a/Documentation/ABI/stable/sysfs-class-ubi +++ b/Documentation/ABI/stable/sysfs-class-ubi @@ -116,6 +116,21 @@ Description: device, and "0\n" if it is cleared. UBI devices mark themselves as read-only when they detect an unrecoverable error. +What: /sys/class/ubi/ubiX/fastmap +Date: June 2019 +KernelVersion: 5.2 +Contact: linux-...@lists.infradead.org +Description: + Contains ASCII "1\n" if there exists a fastmap on UBI device, + and "0\n" if there not exists a fastmap on UBI device. After + attaching the UBI device to a MTD device via fastmap, userspace + tool can sense that there is a fastmap on UBI device by + checking sysfs attribute 'fastmap', that is, the fastmap + generated before last detaching operation is valid. In addition, + userspace tool can also check whether the fastmap updating + triggered by volume operation is successful by reading this + sysfs attribute. + What: /sys/class/ubi/ubiX/total_eraseblocks Date: July 2006 KernelVersion: 2.6.22 diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c index d636bbe..0cd6b8e 100644 --- a/drivers/mtd/ubi/build.c +++ b/drivers/mtd/ubi/build.c @@ -140,6 +140,8 @@ static struct device_attribute dev_mtd_num = __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL); static struct device_attribute dev_ro_mode = __ATTR(ro_mode, S_IRUGO, dev_attribute_show, NULL); +static struct device_attribute dev_fastmap = + __ATTR(fastmap, S_IRUGO, dev_attribute_show, NULL); /** * ubi_volume_notify - send a volume change notification. @@ -378,7 +380,11 @@ static ssize_t dev_attribute_show(struct device *dev, ret = sprintf(buf, "%d\n", ubi->mtd->index); else if (attr == _ro_mode) ret = sprintf(buf, "%d\n", ubi->ro_mode); - else + else if (attr == _fastmap) { + down_write(>fm_protect); + ret = sprintf(buf, "%d\n", ubi->fm ? 1 : 0); + up_write(>fm_protect); + } else ret = -EINVAL; ubi_put_device(ubi); @@ -398,6 +404,7 @@ static struct attribute *ubi_dev_attrs[] = { _bgt_enabled.attr, _mtd_num.attr, _ro_mode.attr, + _fastmap.attr, NULL }; ATTRIBUTE_GROUPS(ubi_dev); -- 2.7.4
[PATCH RFC] mtd: ubi: Add fastmap sysfs attribute
The UBI device can be attached to a MTD device via fastmap by set CONFIG_MTD_UBI_FASTMAP to 'y' (If there already exists a fastmap on the UBI device). To support some debugging scenarios, attaching process by fastmap can be confirmed in dmesg. If the UBI device is attached by fastmap, logs like following will appear in dmesg: ubi0: attached by fastmap If multiple UBI devices are attached to multiple MTD devices at the same time, how to distinguish which UBI devices are successfully attached by fastmap? Extracting attaching information for each UBI device one by one from dmesg is a way. A better method is to record fastmap existence in sysfs, so it can be obtained by userspace tools. This patch exposes fastmap on sysfs. Suppose you attach an UBI device to a MTD device by fastmap: if fastmap equals to '1', that is, the fastmap generated before last detaching operation is confirmed valid. Else, there may be some problems with old fastmap. Besides, userspace tool can also check whether the fastmap updating triggered by other operations (such as resize volume) is successful by reading this sysfs attribute. Signed-off-by: Zhihao Cheng --- Documentation/ABI/stable/sysfs-class-ubi | 15 +++ drivers/mtd/ubi/build.c | 5 + 2 files changed, 20 insertions(+) diff --git a/Documentation/ABI/stable/sysfs-class-ubi b/Documentation/ABI/stable/sysfs-class-ubi index a6b3240..1d96cf0 100644 --- a/Documentation/ABI/stable/sysfs-class-ubi +++ b/Documentation/ABI/stable/sysfs-class-ubi @@ -116,6 +116,21 @@ Description: device, and "0\n" if it is cleared. UBI devices mark themselves as read-only when they detect an unrecoverable error. +What: /sys/class/ubi/ubiX/fastmap +Date: June 2019 +KernelVersion: 5.2 +Contact: linux-...@lists.infradead.org +Description: + Contains ASCII "1\n" if there exists a fastmap on UBI device, + and "0\n" if there not exists a fastmap on UBI device. After + attaching the UBI device to a MTD device via fastmap, userspace + tool can sense that there is a fastmap on UBI device by + checking sysfs attribute 'fastmap', that is, the fastmap + generated before last detaching operation is valid. In addition, + userspace tool can also check whether the fastmap updating + triggered by volume operation is successful by reading this + sysfs attribute. + What: /sys/class/ubi/ubiX/total_eraseblocks Date: July 2006 KernelVersion: 2.6.22 diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c index d636bbe..7cce44f 100644 --- a/drivers/mtd/ubi/build.c +++ b/drivers/mtd/ubi/build.c @@ -140,6 +140,8 @@ static struct device_attribute dev_mtd_num = __ATTR(mtd_num, S_IRUGO, dev_attribute_show, NULL); static struct device_attribute dev_ro_mode = __ATTR(ro_mode, S_IRUGO, dev_attribute_show, NULL); +static struct device_attribute dev_fastmap = + __ATTR(fastmap, S_IRUGO, dev_attribute_show, NULL); /** * ubi_volume_notify - send a volume change notification. @@ -378,6 +380,8 @@ static ssize_t dev_attribute_show(struct device *dev, ret = sprintf(buf, "%d\n", ubi->mtd->index); else if (attr == _ro_mode) ret = sprintf(buf, "%d\n", ubi->ro_mode); + else if (attr == _fastmap) + ret = sprintf(buf, "%d\n", ubi->fm ? 1 : 0); else ret = -EINVAL; @@ -398,6 +402,7 @@ static struct attribute *ubi_dev_attrs[] = { _bgt_enabled.attr, _mtd_num.attr, _ro_mode.attr, + _fastmap.attr, NULL }; ATTRIBUTE_GROUPS(ubi_dev); -- 2.7.4