Hi Vyacheslav,
On Thu, 25 Apr 2013 14:28:37 +0400, Vyacheslav Dubeyko wrote:
> Hi Ryusuke,
>
> I think that this patch really needs in your review.
>
> Thanks,
> Vyacheslav Dubeyko.
> --
> From: Vyacheslav Dubeyko <[email protected]>
> Subject: [PATCH] nilfs2: fix issue with broken bmap for the case of block
> size lesser than 4 KB
>
> DESCRIPTION:
> There are use-cases when NILFS2 file system (formatted with block size lesser
> than 4 KB) can be remounted in RO mode because of encountering of "broken
> bmap" issue.
>
> The issue was reported by Anthony Doggett <[email protected]>:
> "The machine I've been trialling nilfs on is running Debian Testing, Linux
> version 3.2.0-4-686-pae ([email protected]) (gcc version 4.6.3
> (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2), but I've also reproduced it
> (identically) with Debian Unstable amd64 and Debian Experimental (using the
> 3.8-trunk kernel). The problematic partitions were formatted with
> "mkfs.nilfs2 -b 1024 -B 8192"."
>
> SYMPTOMS:
> (1) System log contains error messages likewise:
>
> [63102.496756] nilfs_direct_assign: invalid pointer: 0
> [63102.496786] NILFS error (device dm-17): nilfs_bmap_assign: broken bmap
> (inode number=28)
> [63102.496798]
> [63102.524403] Remounting filesystem read-only
>
> (2) The NILFS2 file system is remounted in RO mode.
>
> REPRODUSING PATH:
> (1) Create volume group with name "unencrypted" by means of vgcreate utility.
> (2) Run script (prepared by Anthony Doggett <[email protected]>):
>
> ----------------[BEGIN SCRIPT]--------------------
> #!/bin/bash
>
> VG=unencrypted
> #apt-get install nilfs-tools darcs
> lvcreate --size 2G --name ntest $VG
> mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
> mkdir /var/tmp/n
> mkdir /var/tmp/n/ntest
> mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
> mkdir /var/tmp/n/ntest/thedir
> cd /var/tmp/n/ntest/thedir
> sleep 2
> date
> darcs init
> sleep 2
> dmesg|tail -n 5
> date
> darcs whatsnew || true
> date
> sleep 2
> dmesg|tail -n 5
> ----------------[END SCRIPT]--------------------
>
> REPRODUCIBILITY: 100%
>
> INVESTIGATION:
> As it was discovered, the issue takes place during segment construction after
> executing such sequence of user-space operations:
>
> open("_darcs/index", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 7
> fstat(7, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> ftruncate(7, 60)
>
> Factually, the error message "NILFS error (device dm-17): nilfs_bmap_assign:
> broken bmap (inode number=28)" takes place because of trying to get block
> number for third block of the file with logical offset #3072 bytes. As it is
> possible to see from above output, the file has 60 bytes of the whole size.
> So, it is enough one block (1 KB in size) allocation for the whole file.
> Trying to operate with several blocks instead of one takes place because of
> discovering several dirty buffers for this file in nilfs_segctor_scan_file()
> method. The create_empty_buffers() method creates buffers for the whole page
> size. So, for the case of 1 KB block this method creates 4 empty buffers. As
> a result, in the case of block size lesser than 4 KB the NILFS2 code tries to
> operate with all buffers for the page without taking into account really
> allocated buffers count.
>
> FIX:
> The BH_NILFS_Allocated flag is used as a basis for the issue fix. Namely,
> this flag is a info that this buffer is a really allocated buffer. So, every
> buffer is marked as BH_NILFS_Allocated in the case of mapping on real block.
> And only buffers with flag BH_NILFS_Allocated are getting into operations
> with its.
>
> Reported-by: Anthony Doggett <[email protected]>
> Signed-off-by: Vyacheslav Dubeyko <[email protected]>
> CC: Ryusuke Konishi <[email protected]>
Adding BH_NILFS_Allocated flag looks overkill for this issue.
Why not modify btree or direct mapping code so that the buffer dirty
flag is propery set or cleared also for 1 KB block? Basically, the
block mapping code of NILFS is designed to be able to handle 1 KB
block.
If this issue occurs after file size extension with ftruncate(), I
guess handling of hole blocks created by the extension seems to have
problem.
I will look into the issue too.
Regards,
Ryusuke Konishi
> ---
> fs/nilfs2/btnode.c | 2 ++
> fs/nilfs2/btree.c | 2 ++
> fs/nilfs2/gcinode.c | 1 +
> fs/nilfs2/inode.c | 2 ++
> fs/nilfs2/mdt.c | 2 ++
> fs/nilfs2/page.c | 5 +++++
> fs/nilfs2/page.h | 1 +
> fs/nilfs2/recovery.c | 12 +++++++++++-
> fs/nilfs2/segbuf.c | 2 ++
> fs/nilfs2/segment.c | 5 +++--
> fs/nilfs2/super.c | 2 ++
> 11 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
> index a35ae35..cbc0057 100644
> --- a/fs/nilfs2/btnode.c
> +++ b/fs/nilfs2/btnode.c
> @@ -60,6 +60,7 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64
> blocknr)
> bh->b_blocknr = blocknr;
> set_buffer_mapped(bh);
> set_buffer_uptodate(bh);
> + set_buffer_nilfs_allocated(bh);
>
> unlock_page(bh->b_page);
> page_cache_release(bh->b_page);
> @@ -124,6 +125,7 @@ int nilfs_btnode_submit_block(struct address_space *btnc,
> __u64 blocknr,
> *submit_ptr = pblocknr;
> err = 0;
> found:
> + set_buffer_nilfs_allocated(bh);
> *pbh = bh;
>
> out_locked:
> diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
> index b2e3ff3..bbdaa1f 100644
> --- a/fs/nilfs2/btree.c
> +++ b/fs/nilfs2/btree.c
> @@ -2068,6 +2068,8 @@ static void nilfs_btree_lookup_dirty_buffers(struct
> nilfs_bmap *btree,
> for (i = 0; i < pagevec_count(&pvec); i++) {
> bh = head = page_buffers(pvec.pages[i]);
> do {
> + if (!buffer_nilfs_allocated(bh))
> + continue;
> if (buffer_dirty(bh))
> nilfs_btree_add_dirty_buffer(btree,
> lists, bh);
> diff --git a/fs/nilfs2/gcinode.c b/fs/nilfs2/gcinode.c
> index 57ceaf3..f7d714b 100644
> --- a/fs/nilfs2/gcinode.c
> +++ b/fs/nilfs2/gcinode.c
> @@ -111,6 +111,7 @@ int nilfs_gccache_submit_read_data(struct inode *inode,
> sector_t blkoff,
> bh->b_blocknr = vbn;
> out:
> err = 0;
> + set_buffer_nilfs_allocated(bh);
> *out_bh = bh;
>
> failed:
> diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
> index ba7a1da..8992291 100644
> --- a/fs/nilfs2/inode.c
> +++ b/fs/nilfs2/inode.c
> @@ -93,6 +93,7 @@ int nilfs_get_block(struct inode *inode, sector_t blkoff,
> map_bh(bh_result, inode->i_sb, blknum);
> if (ret > 0)
> bh_result->b_size = (ret << inode->i_blkbits);
> + set_buffer_nilfs_allocated(bh_result);
> goto out;
> }
> /* data block was not found */
> @@ -132,6 +133,7 @@ int nilfs_get_block(struct inode *inode, sector_t blkoff,
> set_buffer_delay(bh_result);
> map_bh(bh_result, inode->i_sb, 0); /* dbn must be changed
> to proper value */
> + set_buffer_nilfs_allocated(bh_result);
> } else if (ret == -ENOENT) {
> /* not found is not error (e.g. hole); must return without
> the mapped state flag. */
> diff --git a/fs/nilfs2/mdt.c b/fs/nilfs2/mdt.c
> index c4dcd1d..9156635 100644
> --- a/fs/nilfs2/mdt.c
> +++ b/fs/nilfs2/mdt.c
> @@ -57,6 +57,7 @@ nilfs_mdt_insert_new_block(struct inode *inode, unsigned
> long block,
> return ret;
>
> set_buffer_mapped(bh);
> + set_buffer_nilfs_allocated(bh);
>
> kaddr = kmap_atomic(bh->b_page);
> memset(kaddr + bh_offset(bh), 0, 1 << inode->i_blkbits);
> @@ -160,6 +161,7 @@ nilfs_mdt_submit_block(struct inode *inode, unsigned long
> blkoff,
> ret = 0;
> out:
> get_bh(bh);
> + set_buffer_nilfs_allocated(bh);
> *out_bh = bh;
>
> failed_bh:
> diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c
> index 4f07e0e..b95540e 100644
> --- a/fs/nilfs2/page.c
> +++ b/fs/nilfs2/page.c
> @@ -101,6 +101,7 @@ void nilfs_forget_buffer(struct buffer_head *bh)
> clear_buffer_uptodate(bh);
> clear_buffer_mapped(bh);
> bh->b_blocknr = -1;
> + clear_buffer_nilfs_allocated(bh);
> ClearPageUptodate(page);
> ClearPageMappedToDisk(page);
> unlock_buffer(bh);
> @@ -159,8 +160,11 @@ int nilfs_page_buffers_clean(struct page *page)
>
> bh = head = page_buffers(page);
> do {
> + if (!buffer_nilfs_allocated(bh))
> + goto next_buffer;
> if (buffer_dirty(bh))
> return 0;
> +next_buffer:
> bh = bh->b_this_page;
> } while (bh != head);
> return 1;
> @@ -435,6 +439,7 @@ void nilfs_clear_dirty_page(struct page *page, bool
> silent)
> clear_buffer_nilfs_checked(bh);
> clear_buffer_nilfs_redirected(bh);
> clear_buffer_uptodate(bh);
> + clear_buffer_nilfs_allocated(bh);
> clear_buffer_mapped(bh);
> unlock_buffer(bh);
> } while (bh = bh->b_this_page, bh != head);
> diff --git a/fs/nilfs2/page.h b/fs/nilfs2/page.h
> index ef30c5c..008e234 100644
> --- a/fs/nilfs2/page.h
> +++ b/fs/nilfs2/page.h
> @@ -38,6 +38,7 @@ enum {
> BH_NILFS_Redirected,
> };
>
> +BUFFER_FNS(NILFS_Allocated, nilfs_allocated)
> BUFFER_FNS(NILFS_Node, nilfs_node) /* nilfs node buffers */
> BUFFER_FNS(NILFS_Volatile, nilfs_volatile)
> BUFFER_FNS(NILFS_Checked, nilfs_checked) /* buffer is verified */
> diff --git a/fs/nilfs2/recovery.c b/fs/nilfs2/recovery.c
> index ff00a0b..bac3575 100644
> --- a/fs/nilfs2/recovery.c
> +++ b/fs/nilfs2/recovery.c
> @@ -122,6 +122,7 @@ static int nilfs_compute_checksum(struct the_nilfs *nilfs,
> bh = __bread(nilfs->ns_bdev, ++start, blocksize);
> if (!bh)
> return -EIO;
> + set_buffer_nilfs_allocated(bh);
> check_bytes -= size;
> size = min_t(u64, check_bytes, blocksize);
> crc = crc32_le(crc, bh->b_data, size);
> @@ -153,6 +154,7 @@ int nilfs_read_super_root_block(struct the_nilfs *nilfs,
> sector_t sr_block,
> ret = NILFS_SEG_FAIL_IO;
> goto failed;
> }
> + set_buffer_nilfs_allocated(bh_sr);
>
> sr = (struct nilfs_super_root *)bh_sr->b_data;
> if (check) {
> @@ -196,8 +198,10 @@ nilfs_read_log_header(struct the_nilfs *nilfs, sector_t
> start_blocknr,
> struct buffer_head *bh_sum;
>
> bh_sum = __bread(nilfs->ns_bdev, start_blocknr, nilfs->ns_blocksize);
> - if (bh_sum)
> + if (bh_sum) {
> + set_buffer_nilfs_allocated(bh_sum);
> *sum = (struct nilfs_segment_summary *)bh_sum->b_data;
> + }
> return bh_sum;
> }
>
> @@ -266,6 +270,7 @@ static void *nilfs_read_summary_info(struct the_nilfs
> *nilfs,
> nilfs->ns_blocksize);
> if (unlikely(!*pbh))
> return NULL;
> + set_buffer_nilfs_allocated(*pbh);
> *offset = 0;
> }
> ptr = (*pbh)->b_data + *offset;
> @@ -303,6 +308,8 @@ static void nilfs_skip_summary_info(struct the_nilfs
> *nilfs,
> brelse(*pbh);
> *pbh = __bread(nilfs->ns_bdev, blocknr + bcnt,
> nilfs->ns_blocksize);
> + if (*pbh)
> + set_buffer_nilfs_allocated(*pbh);
> }
> }
>
> @@ -333,6 +340,7 @@ static int nilfs_scan_dsync_log(struct the_nilfs *nilfs,
> sector_t start_blocknr,
> bh = __bread(nilfs->ns_bdev, start_blocknr, nilfs->ns_blocksize);
> if (unlikely(!bh))
> goto out;
> + set_buffer_nilfs_allocated(bh);
>
> offset = le16_to_cpu(sum->ss_bytes);
> for (;;) {
> @@ -492,6 +500,7 @@ static int nilfs_recovery_copy_block(struct the_nilfs
> *nilfs,
> bh_org = __bread(nilfs->ns_bdev, rb->blocknr, nilfs->ns_blocksize);
> if (unlikely(!bh_org))
> return -EIO;
> + set_buffer_nilfs_allocated(bh_org);
>
> kaddr = kmap_atomic(page);
> memcpy(kaddr + bh_offset(bh_org), bh_org->b_data, bh_org->b_size);
> @@ -712,6 +721,7 @@ static void nilfs_finish_roll_forward(struct the_nilfs
> *nilfs,
>
> bh = __getblk(nilfs->ns_bdev, ri->ri_lsegs_start, nilfs->ns_blocksize);
> BUG_ON(!bh);
> + set_buffer_nilfs_allocated(bh);
> memset(bh->b_data, 0, bh->b_size);
> set_buffer_dirty(bh);
> err = sync_dirty_buffer(bh);
> diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
> index dc9a913..c69a2b3 100644
> --- a/fs/nilfs2/segbuf.c
> +++ b/fs/nilfs2/segbuf.c
> @@ -113,6 +113,7 @@ int nilfs_segbuf_extend_segsum(struct
> nilfs_segment_buffer *segbuf)
> segbuf->sb_pseg_start + segbuf->sb_sum.nsumblk);
> if (unlikely(!bh))
> return -ENOMEM;
> + set_buffer_nilfs_allocated(bh);
>
> nilfs_segbuf_add_segsum_buffer(segbuf, bh);
> return 0;
> @@ -127,6 +128,7 @@ int nilfs_segbuf_extend_payload(struct
> nilfs_segment_buffer *segbuf,
> segbuf->sb_pseg_start + segbuf->sb_sum.nblocks);
> if (unlikely(!bh))
> return -ENOMEM;
> + set_buffer_nilfs_allocated(bh);
>
> nilfs_segbuf_add_payload_buffer(segbuf, bh);
> *bhp = bh;
> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> index a5752a58..3d4cdae 100644
> --- a/fs/nilfs2/segment.c
> +++ b/fs/nilfs2/segment.c
> @@ -665,7 +665,7 @@ static size_t nilfs_lookup_dirty_data_buffers(struct
> inode *inode,
>
> bh = head = page_buffers(page);
> do {
> - if (!buffer_dirty(bh))
> + if (!buffer_dirty(bh) || !buffer_nilfs_allocated(bh))
> continue;
> get_bh(bh);
> list_add_tail(&bh->b_assoc_buffers, listp);
> @@ -699,7 +699,8 @@ static void nilfs_lookup_dirty_node_buffers(struct inode
> *inode,
> for (i = 0; i < pagevec_count(&pvec); i++) {
> bh = head = page_buffers(pvec.pages[i]);
> do {
> - if (buffer_dirty(bh)) {
> + if (buffer_dirty(bh) &&
> + buffer_nilfs_allocated(bh)) {
> get_bh(bh);
> list_add_tail(&bh->b_assoc_buffers,
> listp);
> diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> index c7d1f9f..d190494 100644
> --- a/fs/nilfs2/super.c
> +++ b/fs/nilfs2/super.c
> @@ -384,6 +384,7 @@ static int nilfs_move_2nd_super(struct super_block *sb,
> loff_t sb2off)
> ret = -EIO;
> goto out;
> }
> + set_buffer_nilfs_allocated(nsbh);
> nsbp = (void *)nsbh->b_data + offset;
> memset(nsbp, 0, nilfs->ns_blocksize);
>
> @@ -836,6 +837,7 @@ struct nilfs_super_block *nilfs_read_super_block(struct
> super_block *sb,
> *pbh = sb_bread(sb, sb_index);
> if (!*pbh)
> return NULL;
> + set_buffer_nilfs_allocated(*pbh);
> return (struct nilfs_super_block *)((char *)(*pbh)->b_data + offset);
> }
>
> --
> 1.7.9.5
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html