Re: kernel BUG at fs/btrfs/extent-tree.c:6164!
(2011/06/08 0:46), Chris Mason wrote:
> Excerpts from liubo's message of 2011-06-07 04:36:56 -0400:
>> On 06/07/2011 04:24 PM, Tsutomu Itoh wrote:
>>> (2011/06/07 15:17), Tsutomu Itoh wrote:
>>>> (2011/06/07 14:59), Tsutomu Itoh wrote:
>>>>> Hi liubo,
>>>>>
>>>>> (2011/06/07 14:31), liubo wrote:
>>>>>> On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I encountered following panic using 'btrfs-unstable + for-linus'
>>>>>>> kernel.
>>>>>>>
>>>>>>> I ran "btrfs fi bal /test5" command, and mount option of /test5
>>>>>>> is as follows:
>>>>>>>
>>>>>>>   /dev/sdc3 on /test5 type btrfs
>>>>>>>   (rw,space_cache,compress=lzo,inode_cache)
>>>>>>>
>>>>>> So, just a "btrfs fi bal" would lead to the bug?
>>>>> I think so.
>
> It should be specific to the inode caching code.  The balancing code is
> finding the inode map cache extents, but it doesn't know how to relocate
> them.

However, the panic has occurred even if inode_cache is turned off.
Is this another problem?

---
Tsutomu

device fsid a46d03b5cb35c93-4713fead8acc709e devid 1 transid 7 /dev/sdc3
btrfs: enabling disk space caching
btrfs: use lzo compression
device fsid 914b303425ef9825-e448135c0d20babe devid 1 transid 7 /dev/sdd4
btrfs: disk space caching is enabled
btrfs: relocating block group 1103101952 flags 9
btrfs: found 540 extents
btrfs: found 540 extents
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent-tree.c:1424!
invalid opcode: [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c
 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
 parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt
 iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4
 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi
 ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]

Pid: 26884, comm: btrfs Not tainted 2.6.39btrfs-test+ #4 FUJITSU-SV PRIMERGY/D2399
RIP: 0010:[] [] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
RSP: 0018:8801475db748 EFLAGS: 00010202
RAX: 0001 RBX: 880141d1a6d0 RCX: 8801475da000
RDX: 0008 RSI: 8800 RDI:
RBP: 8801475db7e8 R08: 0001 R09: 6db6db6db6db6db7
R10: 0001 R11: 0014 R12: 00b8
R13: 880142bc8a08 R14: 0001 R15: 000d
FS: 7fbbaa8b0740() GS:88019fc0() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 0033cfeda340 CR3: 000145c04000 CR4: 06f0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Process btrfs (pid: 26884, threadinfo 8801475da000, task 880160806ab0)
Stack:
 8801475db778 a0331ca6 88018c1087c8 8801475db830
 0821 000181f43000 8801475db7e8 88012cc27800
 082e 0794475db9a9 000181f43000 004000a8
Call Trace:
 [] ? btrfs_mark_buffer_dirty+0xb6/0x130 [btrfs]
 [] insert_inline_extent_backref+0x69/0x100 [btrfs]
 [] ? kmem_cache_alloc+0x186/0x190
 [] __btrfs_inc_extent_ref+0xa3/0x1e0 [btrfs]
 [] ? update_block_group+0xd9/0x2a0 [btrfs]
 [] run_clustered_refs+0x664/0x7f0 [btrfs]
 [] btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
 [] btrfs_commit_transaction+0x7d/0x790 [btrfs]
 [] ? wake_up_bit+0x40/0x40
 [] prepare_to_merge+0x1fd/0x230 [btrfs]
 [] relocate_block_group+0x476/0x660 [btrfs]
 [] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
 [] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
 [] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
 [] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
 [] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
 [] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [] btrfs_balance+0x21a/0x2b0 [btrfs]
 [] ? path_openat+0x101/0x3d0
 [] btrfs_ioctl+0x51c/0xc40 [btrfs]
 [] ? handle_mm_fault+0x148/0x270
 [] ? do_page_fault+0x1d8/0x4b0
 [] do_vfs_ioctl+0x9a/0x540
 [] sys_ioctl+0xa1/0xb0
 [] system_call_fastpath+0x16/0x1b
Code: 48 8b 75 20 48 89 c3 48 8b 7d 18 e8 c9 bd ff ff 48 39 d8 77 26 b8
 1d 00 00 00 e9 15 ff ff ff a8 01 0f 85 8c fe ff ff 0f 0b eb fe <0f> 0b
 eb fe 0f 0b 0f 1f 84 00 00 00 00 00 eb f6 4c 89 fb 44 8b
RIP [] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
 RSP

>
> I think we need to switch the inode map cache over to regular extents
> that are not preallocated.  It will fix the overflow problem and it will
> fix the balancing.
>
> There are a lot of special cases for the free extent cache that don't
> apply to the inode map cache, and I think sharing the extent
> preallocation is hurting us.
>
> -chris
>
[PATCH] Btrfs: avoid stack bloat in btrfs_ioctl_fs_info()
The size of struct btrfs_ioctl_fs_info_args is as big as 1KB, so
don't declare the variable on stack.

Signed-off-by: Li Zefan
---
 fs/btrfs/ioctl.c |   23 ++-
 1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ac37040..9705c5c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2054,29 +2054,34 @@ static long btrfs_ioctl_rm_dev(struct btrfs_root *root, void __user *arg)
 
 static long btrfs_ioctl_fs_info(struct btrfs_root *root, void __user *arg)
 {
-	struct btrfs_ioctl_fs_info_args fi_args;
+	struct btrfs_ioctl_fs_info_args *fi_args;
 	struct btrfs_device *device;
 	struct btrfs_device *next;
 	struct btrfs_fs_devices *fs_devices = root->fs_info->fs_devices;
+	int ret = 0;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	fi_args.num_devices = fs_devices->num_devices;
-	fi_args.max_id = 0;
-	memcpy(&fi_args.fsid, root->fs_info->fsid, sizeof(fi_args.fsid));
+	fi_args = kzalloc(sizeof(*fi_args), GFP_KERNEL);
+	if (!fi_args)
+		return -ENOMEM;
+
+	fi_args->num_devices = fs_devices->num_devices;
+	memcpy(&fi_args->fsid, root->fs_info->fsid, sizeof(fi_args->fsid));
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
-		if (device->devid > fi_args.max_id)
-			fi_args.max_id = device->devid;
+		if (device->devid > fi_args->max_id)
+			fi_args->max_id = device->devid;
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 
-	if (copy_to_user(arg, &fi_args, sizeof(fi_args)))
-		return -EFAULT;
+	if (copy_to_user(arg, fi_args, sizeof(*fi_args)))
+		ret = -EFAULT;
 
-	return 0;
+	kfree(fi_args);
+	return ret;
 }
 
 static long btrfs_ioctl_dev_info(struct btrfs_root *root, void __user *arg)
-- 
1.7.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
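For readers who want to see the pattern outside the kernel, here is a rough userspace sketch of what the patch does: build the large result in zeroed heap memory, copy it out, and free it on a single exit path. Everything in it is an assumption made for illustration: struct fs_info_args, fs_info_query() and the field layout are hypothetical, calloc() stands in for kzalloc(), and memcpy() stands in for copy_to_user().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a large (1KB) ioctl argument structure. */
struct fs_info_args {
	unsigned long long max_id;
	unsigned long long num_devices;
	unsigned char fsid[16];
	unsigned long long reserved[124];
};

/*
 * Build the result on the heap instead of the stack, copy it to the
 * caller's buffer, and free it on the single exit path.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int fs_info_query(const unsigned long long *devids, size_t ndev,
			 const unsigned char fsid[16],
			 struct fs_info_args *out)
{
	struct fs_info_args *args;
	size_t i;

	/* calloc() zeroes the block, so max_id starts at 0 as with kzalloc(). */
	args = calloc(1, sizeof(*args));
	if (!args)
		return -1;

	args->num_devices = ndev;
	memcpy(args->fsid, fsid, sizeof(args->fsid));
	for (i = 0; i < ndev; i++) {
		if (devids[i] > args->max_id)
			args->max_id = devids[i];
	}

	/* sizeof(*args) is the struct size; sizeof(args) is only a pointer. */
	assert(sizeof(*args) > sizeof(args));
	memcpy(out, args, sizeof(*args));

	free(args);
	return 0;
}
```

The point to watch when converting a stack variable to a pointer is every sizeof() that used the variable name: once the name refers to a pointer, the size of the object has to be spelled sizeof(*ptr).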
[PATCH] Btrfs: use join_transaction in btrfs_evict_inode()
The WARN_ON() in start_transaction() was triggered while balancing.

The cause is btrfs_relocate_chunk() started a transaction and then
called iput() on the inode that stores free space cache, and iput()
called btrfs_start_transaction() again.

Reported-by: Tsutomu Itoh
Signed-off-by: Li Zefan
---
 fs/btrfs/inode.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 02ff4a1..4e9aa28 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3646,7 +3646,7 @@ void btrfs_evict_inode(struct inode *inode)
 	btrfs_i_size_write(inode, 0);
 
 	while (1) {
-		trans = btrfs_start_transaction(root, 0);
+		trans = btrfs_join_transaction(root);
 		BUG_ON(IS_ERR(trans));
 		trans->block_rsv = root->orphan_block_rsv;
 
-- 
1.7.3.1
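The nesting problem described above can be pictured with a toy model. This is only a sketch and not the kernel's implementation: it assumes a task remembers the handle it already holds, that a second "start" from the same task is what gets flagged, and that a "join" simply shares the existing handle. All names (toy_start_transaction, current_handle, nested_start_warnings) are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum trans_type { TRANS_START, TRANS_JOIN };

struct trans_handle {
	int use_count;	/* how many nested users share this handle */
};

/* Toy stand-ins for per-task state; not the kernel's data structures. */
static struct trans_handle *current_handle;
static struct trans_handle handle_storage;
static int nested_start_warnings;

static struct trans_handle *toy_start_transaction(enum trans_type type)
{
	if (current_handle) {
		/* Already inside a transaction: only a join is expected. */
		if (type != TRANS_JOIN)
			nested_start_warnings++;
		current_handle->use_count++;
		return current_handle;
	}
	handle_storage.use_count = 1;
	current_handle = &handle_storage;
	return current_handle;
}

static void toy_end_transaction(struct trans_handle *h)
{
	assert(h == current_handle && h->use_count > 0);
	if (--h->use_count == 0)
		current_handle = NULL;
}
```

In this model the relocation path plays the outer caller and the eviction path the inner one: an inner TRANS_JOIN reuses the outer handle without complaint, while an inner TRANS_START bumps the warning counter.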
Re: Delayed inode operations not doing the right thing with enospc
On 06/06/2011 09:39 PM, Miao Xie wrote:
> On fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote:
>> I got a lot of these when running stress.sh on my test box
>>
>> [ 9792.654889] ------------[ cut here ]------------
>> [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681
>> btrfs_alloc_free_block+0xca/0x27c [btrfs]()
>> [ 9792.654899] Hardware name: To Be Filled By O.E.M.
>> [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c
>> ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables
>> arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211
>> snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill
>> pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco
>> i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd
>> mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi
>> ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper
>> drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
>> [ 9792.654919] Pid: 2762, comm: rm Tainted: GW 2.6.39+ #1
>> [ 9792.654920] Call Trace:
>> [ 9792.654922] [] warn_slowpath_common+0x83/0x9b
>> [ 9792.654925] [] warn_slowpath_null+0x1a/0x1c
>> [ 9792.654933] [] btrfs_alloc_free_block+0xca/0x27c
>> [btrfs]
>> [ 9792.654945] [] ? map_extent_buffer+0x6e/0xa8 [btrfs]
>> [ 9792.654953] [] __btrfs_cow_block+0xfc/0x30c [btrfs]
>> [ 9792.654963] [] ? btrfs_buffer_uptodate+0x47/0x58
>> [btrfs]
>> [ 9792.654970] [] ? read_block_for_search+0x94/0x368
>> [btrfs]
>> [ 9792.654978] [] btrfs_cow_block+0xfe/0x146 [btrfs]
>> [ 9792.654986] [] btrfs_search_slot+0x14d/0x4b6 [btrfs]
>> [ 9792.654997] [] ? map_extent_buffer+0x6e/0xa8 [btrfs]
>> [ 9792.655022] [] btrfs_lookup_inode+0x2f/0x8f [btrfs]
>> [ 9792.655025] [] ? _cond_resched+0xe/0x22
>> [ 9792.655027] [] ? mutex_lock+0x29/0x50
>> [ 9792.655039] []
>> btrfs_update_delayed_inode+0x72/0x137 [btrfs]
>> [ 9792.655051] [] btrfs_run_delayed_items+0x90/0xdb
>> [btrfs]
>> [ 9792.655062] []
>> btrfs_commit_transaction+0x228/0x654 [btrfs]
>> [ 9792.655064] [] ? remove_wait_queue+0x3a/0x3a
>> [ 9792.655075] [] btrfs_evict_inode+0x14d/0x202 [btrfs]
>> [ 9792.655077] [] evict+0x71/0x111
>> [ 9792.655079] [] iput+0x12a/0x132
>> [ 9792.655081] [] do_unlinkat+0x106/0x155
>> [ 9792.655083] [] ? path_put+0x1f/0x23
>> [ 9792.655085] [] ? audit_syscall_entry+0x145/0x171
>> [ 9792.655087] [] ? putname+0x34/0x36
>> [ 9792.655090] [] sys_unlinkat+0x29/0x2b
>> [ 9792.655092] [] system_call_fastpath+0x16/0x1b
>> [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---
>>
>>
>> This is because use_block_rsv() is having to do a
>> reserve_metadata_bytes(), which shouldn't happen as we should have
>> reserved enough space for those operations to complete.  This is
>> happening because use_block_rsv() will call get_block_rsv(), which if
>> root->ref_cows is set (which is the case on all fs roots) we will use
>> trans->block_rsv, which will only have what the current transaction
>> starter had reserved.
>>
>> What needs to be done instead is we need to have a block reserve that
>> any reservation that is done at create time for these inodes is migrated
>> to this special reserve, and then when you run the delayed inode items
>> stuff you set trans->block_rsv to the special block reserve so the
>> accounting is all done properly.
>>
>> This is just off the top of my head, there may be a better way to do it,
>> I've not actually looked that the delayed inode code at all.
>>
>> I would do this myself but I have a ever increasing list of shit to do
>> so will somebody pick this up and fix it please?  Thanks,
>
> Sorry, it's my miss.
> I forgot to set trans->block_rsv to global_block_rsv, since we have migrated
> the space from trans_block_rsv to global_block_rsv.
>
> I'll fix it soon.
>

There is another problem, we're failing xfstest 204.  I tried making
reserve_metadata_bytes commit the transaction regardless of whether or
not there were pinned bytes but the test just hung there.  Usually it
takes 7 seconds to run and I ctrl+c'ed it after a couple of minutes.
204 just creates a crap ton of files, which is what is killing us.
There needs to be a way to start flushing delayed inode items so we can
reclaim the space they are holding onto so we don't get enospc, and it
needs to be better than just committing the transaction because that is
dog slow.  Thanks,

Josef
[PATCH 1/2] Btrfs: do transaction space reservation before joining the transaction
We have to do weird things when handling enospc in the transaction joining code. Because we've already joined the transaction we cannot commit the transaction within the reservation code since it will deadlock, so we have to return EAGAIN and then make sure we don't retry too many times. Instead of doing this, just do the reservation the normal way before we join the transaction, that way we can do whatever we want to try and reclaim space, and then if it fails we know for sure we are out of space and we can return ENOSPC. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/ctree.h |3 --- fs/btrfs/extent-tree.c | 20 fs/btrfs/transaction.c | 36 +--- 3 files changed, 17 insertions(+), 42 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 0c62c6c..6034a23 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2205,9 +2205,6 @@ void btrfs_set_inode_space_info(struct btrfs_root *root, struct inode *ionde); void btrfs_clear_space_info_full(struct btrfs_fs_info *info); int btrfs_check_data_free_space(struct inode *inode, u64 bytes); void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes); -int btrfs_trans_reserve_metadata(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - int num_items); void btrfs_trans_release_metadata(struct btrfs_trans_handle *trans, struct btrfs_root *root); int btrfs_orphan_reserve_metadata(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index aa2b592a..b1c3ff7 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3878,26 +3878,6 @@ int btrfs_truncate_reserve_metadata(struct btrfs_trans_handle *trans, return 0; } -int btrfs_trans_reserve_metadata(struct btrfs_trans_handle *trans, -struct btrfs_root *root, -int num_items) -{ - u64 num_bytes; - int ret; - - if (num_items == 0 || root->fs_info->chunk_root == root) - return 0; - - num_bytes = btrfs_calc_trans_metadata_size(root, num_items); - ret = btrfs_block_rsv_add(trans, root, 
&root->fs_info->trans_block_rsv, - num_bytes); - if (!ret) { - trans->bytes_reserved += num_bytes; - trans->block_rsv = &root->fs_info->trans_block_rsv; - } - return ret; -} - void btrfs_trans_release_metadata(struct btrfs_trans_handle *trans, struct btrfs_root *root) { diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index dd71966..c277448 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -203,7 +203,7 @@ static struct btrfs_trans_handle *start_transaction(struct btrfs_root *root, { struct btrfs_trans_handle *h; struct btrfs_transaction *cur_trans; - int retries = 0; + u64 num_bytes = 0; int ret; if (root->fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) @@ -217,6 +217,19 @@ static struct btrfs_trans_handle *start_transaction(struct btrfs_root *root, h->block_rsv = NULL; goto got_it; } + + /* +* Do the reservation before we join the transaction so we can do all +* the appropriate flushing if need be. +*/ + if (num_items > 0 && root != root->fs_info->chunk_root) { + num_bytes = btrfs_calc_trans_metadata_size(root, num_items); + ret = btrfs_block_rsv_add(NULL, root, + &root->fs_info->trans_block_rsv, + num_bytes); + if (ret) + return ERR_PTR(ret); + } again: h = kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS); if (!h) @@ -253,24 +266,9 @@ again: goto again; } - if (num_items > 0) { - ret = btrfs_trans_reserve_metadata(h, root, num_items); - if (ret == -EAGAIN && !retries) { - retries++; - btrfs_commit_transaction(h, root); - goto again; - } else if (ret == -EAGAIN) { - /* -* We have already retried and got EAGAIN, so really we -* don't have space, so set ret to -ENOSPC. -*/ - ret = -ENOSPC; - } - - if (ret < 0) { - btrfs_end_transaction(h, root); - return ERR_PTR(ret); - } + if (num_bytes) { + h->block_rsv = &root->fs_info->trans_block_rsv; + h->bytes_reserved = num_bytes; } got_it: -- 1.7.2.3 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in th
[PATCH 2/2] Btrfs: serialize flushers in reserve_metadata_bytes
We keep having problems with early enospc, and that's because our method of making space is inherently racy. The problem is we can have one guy trying to make space for himself, and in the meantime people come in and steal his reservation. In order to stop this we make a waitqueue and put anybody who comes into reserve_metadata_bytes on that waitqueue if somebody is trying to make more space. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/ctree.h |3 ++ fs/btrfs/extent-tree.c | 69 2 files changed, 49 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 6034a23..8857d82 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -756,6 +756,8 @@ struct btrfs_space_info { chunks for this space */ unsigned int chunk_alloc:1; /* set if we are allocating a chunk */ + unsigned int flush:1; /* set if we are trying to make space */ + unsigned int force_alloc; /* set if we need to force a chunk alloc for this space */ @@ -766,6 +768,7 @@ struct btrfs_space_info { spinlock_t lock; struct rw_semaphore groups_sem; atomic_t caching_threads; + wait_queue_head_t wait; }; struct btrfs_block_rsv { diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index b1c3ff7..d86f7c5 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2932,6 +2932,8 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags, found->full = 0; found->force_alloc = CHUNK_ALLOC_NO_FORCE; found->chunk_alloc = 0; + found->flush = 0; + init_waitqueue_head(&found->wait); *space_info = found; list_add_rcu(&found->list, &info->space_info); atomic_set(&found->caching_threads, 0); @@ -3314,9 +3316,13 @@ static int shrink_delalloc(struct btrfs_trans_handle *trans, if (reserved == 0) return 0; - /* nothing to shrink - nothing to reclaim */ - if (root->fs_info->delalloc_bytes == 0) + smp_mb(); + if (root->fs_info->delalloc_bytes == 0) { + if (trans) + return 0; + btrfs_wait_ordered_extents(root, 0, 0); return 0; + } max_reclaim = min(reserved, to_reclaim); @@ 
-3360,6 +3366,8 @@ static int shrink_delalloc(struct btrfs_trans_handle *trans, } } + if (reclaimed >= to_reclaim && !trans) + btrfs_wait_ordered_extents(root, 0, 0); return reclaimed >= to_reclaim; } @@ -3384,15 +3392,36 @@ static int reserve_metadata_bytes(struct btrfs_trans_handle *trans, u64 num_bytes = orig_bytes; int retries = 0; int ret = 0; - bool reserved = false; bool committed = false; + bool flushing = false; again: - ret = -ENOSPC; - if (reserved) - num_bytes = 0; - + ret = 0; spin_lock(&space_info->lock); + /* +* We only want to wait if somebody other than us is flushing and we are +* actually alloed to flush. +*/ + while (flush && !flushing && space_info->flush) { + spin_unlock(&space_info->lock); + /* +* If we have a trans handle we can't wait because the flusher +* may have to commit the transaction, which would mean we would +* deadlock since we are waiting for the flusher to finish, but +* hold the current transaction open. +*/ + if (trans) + return -EAGAIN; + ret = wait_event_interruptible(space_info->wait, + !space_info->flush); + /* Must have been interrupted, return */ + if (ret) + return -EINTR; + + spin_lock(&space_info->lock); + } + + ret = -ENOSPC; unused = space_info->bytes_used + space_info->bytes_reserved + space_info->bytes_pinned + space_info->bytes_readonly + space_info->bytes_may_use; @@ -3407,8 +3436,7 @@ again: if (unused <= space_info->total_bytes) { unused = space_info->total_bytes - unused; if (unused >= num_bytes) { - if (!reserved) - space_info->bytes_may_use += orig_bytes; + space_info->bytes_may_use += orig_bytes; ret = 0; } else { /* @@ -3433,17 +3461,14 @@ again: * to reclaim space we can actually use it instead of somebody else * stealing it from us. */ - if (ret && !reserved) { - space_info->bytes_may_use += orig_bytes; - reserved = true; + if (ret &&
[PATCH 0/2] Fix ENOSPC regression
Sergei accidentally introduced a regression with

c4f675cd40d955d539180506c09515c90169b15b

The problem isn't his patch, it's that we are entirely too touchy to
changes in this area because the way we deal with pressure is racy in
general.  The other problem is that even when delalloc bytes are 0 we may
still not have reclaimed space; we need to wait for the ordered extents
to reclaim it.  So this patch set does that, and it serializes the
flushers to close this race we've always had.  This fixes the normal
enospc cases we were seeing.  Thanks,

Josef
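The idea of serializing the flushers can be sketched in userspace with a mutex and a condition variable. This is a simplified model under stated assumptions, not the kernel code: struct space_info here is a hypothetical cut-down structure, the reclaimable field stands in for whatever a flush (delalloc, ordered extents) can give back, and pthread primitives replace the spinlock and waitqueue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified accounting; not struct btrfs_space_info. */
struct space_info {
	pthread_mutex_t lock;
	pthread_cond_t wait;		/* signalled when a flush finishes */
	bool flush;			/* true while somebody is making space */
	unsigned long total_bytes;
	unsigned long bytes_may_use;
	unsigned long reclaimable;	/* what a flush can give back */
	int flushes;			/* number of flushes that ran */
};

/* Returns 0 if num_bytes was reserved, -1 if we are really out of space. */
static int reserve_bytes(struct space_info *s, unsigned long num_bytes)
{
	bool flushed = false;
	int ret;

	pthread_mutex_lock(&s->lock);
	for (;;) {
		/* Wait out another task's flush instead of racing with it. */
		while (s->flush)
			pthread_cond_wait(&s->wait, &s->lock);

		if (s->total_bytes - s->bytes_may_use >= num_bytes) {
			s->bytes_may_use += num_bytes;
			ret = 0;
			break;
		}
		if (flushed || s->reclaimable == 0) {
			ret = -1;
			break;
		}

		/* Become the single flusher; others block on s->wait. */
		s->flush = true;
		pthread_mutex_unlock(&s->lock);
		/* ... the expensive reclaim work would run here, unlocked ... */
		pthread_mutex_lock(&s->lock);
		assert(s->bytes_may_use >= s->reclaimable);
		s->bytes_may_use -= s->reclaimable;
		s->reclaimable = 0;
		s->flushes++;
		s->flush = false;
		flushed = true;
		pthread_cond_broadcast(&s->wait);
	}
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

Because the flusher retakes the lock before clearing the flag and loops straight back to the space check while still holding it, it gets first pick of the space it reclaimed; tasks that arrived during the flush only re-check once it has dropped the lock.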
[PATCH] Btrfs: fix btrfs_update_reserved_bytes usage
For some reason btrfs_update_reserved_bytes was only ever updating the
bytes_reserved counter of the space info if the space info was data.  I
assume this is because the original enospc stuff used bytes_reserved to
account for space reserved for enospc accounting, but now that we're
using bytes_may_use that's incorrect.  So this patch fixes
btrfs_update_reserved_bytes to always update the space_info as well.
Also it fixes a weird case where we tried to add the space to the enospc
accounting stuff.  Rather than doing that just add it back to the space
info and then it can be accounted for later.  Thanks,

Signed-off-by: Josef Bacik
---
 fs/btrfs/ctree.h            |    2 +-
 fs/btrfs/extent-tree.c      |   76 --
 fs/btrfs/free-space-cache.c |    4 +-
 3 files changed, 25 insertions(+), 57 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 93a409f..0c62c6c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2177,7 +2177,7 @@ int btrfs_free_extent(struct btrfs_trans_handle *trans,
 
 int btrfs_free_reserved_extent(struct btrfs_root *root, u64 start, u64 len);
 int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache,
-				u64 num_bytes, int reserve, int sinfo);
+				u64 num_bytes, int reserve);
 int btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans,
 				struct btrfs_root *root);
 int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 933d7dc..aa2b592a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4194,44 +4194,33 @@ int btrfs_pin_extent(struct btrfs_root *root,
 /*
  * update size of reserved extents. this function may return -EAGAIN
- * if 'reserve' is true or 'sinfo' is false.
+ * if 'reserve' is true.
*/ int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int sinfo) + u64 num_bytes, int reserve) { int ret = 0; - if (sinfo) { - struct btrfs_space_info *space_info = cache->space_info; - spin_lock(&space_info->lock); - spin_lock(&cache->lock); - if (reserve) { - if (cache->ro) { - ret = -EAGAIN; - } else { - cache->reserved += num_bytes; - space_info->bytes_reserved += num_bytes; - } - } else { - if (cache->ro) - space_info->bytes_readonly += num_bytes; - cache->reserved -= num_bytes; - space_info->bytes_reserved -= num_bytes; - space_info->reservation_progress++; - } - spin_unlock(&cache->lock); - spin_unlock(&space_info->lock); - } else { - spin_lock(&cache->lock); + struct btrfs_space_info *space_info = cache->space_info; + + spin_lock(&space_info->lock); + spin_lock(&cache->lock); + if (reserve) { if (cache->ro) { ret = -EAGAIN; } else { - if (reserve) - cache->reserved += num_bytes; - else - cache->reserved -= num_bytes; + cache->reserved += num_bytes; + space_info->bytes_reserved += num_bytes; } - spin_unlock(&cache->lock); + } else { + if (cache->ro) + space_info->bytes_readonly += num_bytes; + cache->reserved -= num_bytes; + WARN_ON(space_info->bytes_reserved < num_bytes); + space_info->bytes_reserved -= num_bytes; + space_info->reservation_progress++; } + spin_unlock(&cache->lock); + spin_unlock(&space_info->lock); return ret; } @@ -4679,27 +4668,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, WARN_ON(test_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)); btrfs_add_free_space(cache, buf->start, buf->len); - ret = btrfs_update_reserved_bytes(cache, buf->len, 0, 0); - if (ret == -EAGAIN) { - /* block group became read-only */ - btrfs_update_reserved_bytes(cache, buf->len, 0, 1); - goto out; - } - - ret = 1; - spin_lock(&block_rsv->lock); - if (block_rsv->reserved < block_rsv->size) { - block_rsv->reserved += buf->len; - ret = 0; - } - spin_unlock(&block_rsv->lock); - -
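As a reading aid, the simplified control flow of the patched function can be mirrored in plain userspace C. This is a sketch under assumptions, not the kernel code: the two structs are cut-down hypothetical versions, both spinlocks are omitted because the model is single-threaded, and assert() stands in for WARN_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Cut-down, hypothetical versions of the kernel structures. */
struct space_info {
	uint64_t bytes_reserved;
	uint64_t bytes_readonly;
	uint64_t reservation_progress;
};

struct block_group {
	struct space_info *space_info;
	uint64_t reserved;
	int ro;
};

/*
 * Reserve or release num_bytes in a block group and always mirror the
 * change in the owning space_info.  Only a reserve against a read-only
 * block group can fail, with -EAGAIN.
 */
static int update_reserved_bytes(struct block_group *cache,
				 uint64_t num_bytes, int reserve)
{
	struct space_info *sinfo = cache->space_info;

	if (reserve) {
		if (cache->ro)
			return -EAGAIN;
		cache->reserved += num_bytes;
		sinfo->bytes_reserved += num_bytes;
		return 0;
	}

	if (cache->ro)
		sinfo->bytes_readonly += num_bytes;
	assert(sinfo->bytes_reserved >= num_bytes);
	cache->reserved -= num_bytes;
	sinfo->bytes_reserved -= num_bytes;
	sinfo->reservation_progress++;
	return 0;
}
```

With the 'sinfo' argument gone there is a single code path, so the block group counter and the space_info counter cannot drift apart depending on which caller released the bytes.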
[PATCH] Btrfs: account for space reservations properly V2
We have been using space_info->bytes_reserved in the metadata case to
cover our reservations for ENOSPC.  The problem with this is that's
horribly wrong.  We use bytes_reserved to keep track of how many bytes
the allocator has outstanding that haven't actually been made into
extents yet.  So what has been happening is that we've been using
bytes_reserved for our ENOSPC reservations and our allocations.
Currently that isn't a big deal, everything is being accounted for
appropriately.  The only thing this affects is how we allocate chunks, so
we've grown all these horrible things to make sure we don't end up with a
stupid amount of metadata chunks.  The problem is we think that the
entire space is used up because we use bytes_used and bytes_reserved to
get an idea of how much is actually in use by real data, but that's not
the case.  So switch over to using bytes_may_use, which the data space
info stuff has already been using for the same exact reason.  This will
allow us to go back to pre-emptively allocating chunks in the enospc
code.  Thanks,

Signed-off-by: Josef Bacik
---
V1->V2:
-fixed updating bytes_reserved in free_tree_block
-update bytes_may_use in unpin_extent_range

 fs/btrfs/ctree.h       |    2 +-
 fs/btrfs/extent-tree.c |   22 +++---
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 91806fe..93a409f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -745,7 +745,7 @@ struct btrfs_space_info {
 
 	/*
 	 * we bump reservation progress every time we decrement
-	 * bytes_reserved.  This way people waiting for reservations
+	 * bytes_may_use.  This way people waiting for reservations
 	 * know something good has happened and they can check
 	 * for progress.
The number here isn't to be trusted, it * just shows reclaim activity diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index b42efc2..933d7dc 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3308,7 +3308,7 @@ static int shrink_delalloc(struct btrfs_trans_handle *trans, space_info = block_rsv->space_info; smp_mb(); - reserved = space_info->bytes_reserved; + reserved = space_info->bytes_may_use; progress = space_info->reservation_progress; if (reserved == 0) @@ -3328,9 +3328,9 @@ static int shrink_delalloc(struct btrfs_trans_handle *trans, writeback_inodes_sb_nr_if_idle(root->fs_info->sb, nr_pages); spin_lock(&space_info->lock); - if (reserved > space_info->bytes_reserved) - reclaimed += reserved - space_info->bytes_reserved; - reserved = space_info->bytes_reserved; + if (reserved > space_info->bytes_may_use) + reclaimed += reserved - space_info->bytes_may_use; + reserved = space_info->bytes_may_use; spin_unlock(&space_info->lock); loops++; @@ -3408,7 +3408,7 @@ again: unused = space_info->total_bytes - unused; if (unused >= num_bytes) { if (!reserved) - space_info->bytes_reserved += orig_bytes; + space_info->bytes_may_use += orig_bytes; ret = 0; } else { /* @@ -3434,7 +3434,7 @@ again: * stealing it from us. 
*/ if (ret && !reserved) { - space_info->bytes_reserved += orig_bytes; + space_info->bytes_may_use += orig_bytes; reserved = true; } @@ -3495,7 +3495,7 @@ again: out: if (reserved) { spin_lock(&space_info->lock); - space_info->bytes_reserved -= orig_bytes; + space_info->bytes_may_use -= orig_bytes; spin_unlock(&space_info->lock); } @@ -3579,7 +3579,7 @@ static void block_rsv_release_bytes(struct btrfs_block_rsv *block_rsv, } if (num_bytes) { spin_lock(&space_info->lock); - space_info->bytes_reserved -= num_bytes; + space_info->bytes_may_use -= num_bytes; space_info->reservation_progress++; spin_unlock(&space_info->lock); } @@ -3791,12 +3791,12 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info) if (sinfo->total_bytes > num_bytes) { num_bytes = sinfo->total_bytes - num_bytes; block_rsv->reserved += num_bytes; - sinfo->bytes_reserved += num_bytes; + sinfo->bytes_may_use += num_bytes; } if (block_rsv->reserved >= block_rsv->size) { num_bytes = block_rsv->reserved - block_rsv->size; - sinfo->bytes_reserved -= num_bytes; + sinfo->bytes_may
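The distinction between the two counters can be shown with a small userspace model. It is a sketch under assumptions: the struct is a simplified hypothetical version of the space info, locking is left out, and the helper names (reserve_metadata, release_metadata, bytes_in_use) are invented for the example. The availability check follows the sum of counters used in reserve_metadata_bytes().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified, hypothetical space accounting. */
struct space_info {
	uint64_t total_bytes;
	uint64_t bytes_used;		/* space held by real extents */
	uint64_t bytes_reserved;	/* handed out by the allocator, no extent yet */
	uint64_t bytes_pinned;
	uint64_t bytes_readonly;
	uint64_t bytes_may_use;		/* ENOSPC reservations: may be needed later */
};

/* Take an ENOSPC reservation; it is tracked in bytes_may_use only. */
static int reserve_metadata(struct space_info *s, uint64_t num_bytes)
{
	uint64_t committed = s->bytes_used + s->bytes_reserved +
			     s->bytes_pinned + s->bytes_readonly +
			     s->bytes_may_use;

	if (committed <= s->total_bytes &&
	    s->total_bytes - committed >= num_bytes) {
		s->bytes_may_use += num_bytes;
		return 0;
	}
	return -ENOSPC;
}

static void release_metadata(struct space_info *s, uint64_t num_bytes)
{
	assert(s->bytes_may_use >= num_bytes);
	s->bytes_may_use -= num_bytes;
}

/* Space really in use, which is what chunk allocation should look at. */
static uint64_t bytes_in_use(const struct space_info *s)
{
	return s->bytes_used + s->bytes_reserved;
}
```

In this model a large ENOSPC reservation still counts against what can be reserved next, but it no longer inflates bytes_in_use(), so a chunk-allocation heuristic based on that value is not fooled into thinking the space is full of real data.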
Re: New btrfsck status
Me too.  I've got a 9TB filesystem that I can't mount since rebooting
during a rebalance.  I want to get the fs as repaired as possible, but I
am not in a hurry, and I have enough space at present to make a duplicate
and play with test versions of the repair.

--jeff

On Mon, Jun 6, 2011 at 9:41 AM, Christian Hesse wrote:
> Chris Mason on 10 Feb 13:17:
>> Excerpts from Ben Gamari's message of 2011-02-09 21:52:20 -0500:
>> > Over the last several months there have been many claims regarding
>> > the release of the rewritten btrfsck. Unfortunately, despite
>> > numerous claims that it will be released Real Soon Now(c), I have
>> > yet to see even a repository with preliminary code. Did I miss an
>> > announcement? There is something to be said for "release early,
>> > release often." Is there a timeline for getting btrfsck into some
>> > sort of usable form?
>>
>> Yes, but its still real soon now. I've been at about 90% done since
>> Christmas. It would have been out last week but I've been chasing a
>> debugging a very difficult corruption under load.
>>
>> I finally found a race in btrfs causing the corruption and now I'm
>> back on fsck full time again.
>
> This mail was about four month ago...
> Any news on this topic?
>
> I really would like to test btrfs on my desktop systems, but I still
> hesitate because of the missing fsck.
> --
> Schoene Gruesse
> Chris
Re: kernel BUG at fs/btrfs/extent-tree.c:6164!
Excerpts from liubo's message of 2011-06-07 04:36:56 -0400:
> On 06/07/2011 04:24 PM, Tsutomu Itoh wrote:
> > (2011/06/07 15:17), Tsutomu Itoh wrote:
> >> (2011/06/07 14:59), Tsutomu Itoh wrote:
> >>> Hi liubo,
> >>>
> >>> (2011/06/07 14:31), liubo wrote:
> >>>> On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I encountered following panic using 'btrfs-unstable + for-linus'
> >>>>> kernel.
> >>>>>
> >>>>> I ran "btrfs fi bal /test5" command, and mount option of /test5
> >>>>> is as follows:
> >>>>>
> >>>>>   /dev/sdc3 on /test5 type btrfs
> >>>>>   (rw,space_cache,compress=lzo,inode_cache)
> >>>>>
> >>>> So, just a "btrfs fi bal" would lead to the bug?
> >>> I think so.

It should be specific to the inode caching code.  The balancing code is
finding the inode map cache extents, but it doesn't know how to relocate
them.

I think we need to switch the inode map cache over to regular extents
that are not preallocated.  It will fix the overflow problem and it will
fix the balancing.

There are a lot of special cases for the free extent cache that don't
apply to the inode map cache, and I think sharing the extent
preallocation is hurting us.

-chris
Re: Delayed inode operations not doing the right thing with enospc
On 06/06/2011 09:39 PM, Miao Xie wrote:
> On fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote:
>> I got a lot of these when running stress.sh on my test box
>>
>> [ 9792.654889] ------------[ cut here ]------------
>> [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681 btrfs_alloc_free_block+0xca/0x27c [btrfs]()
>> [ 9792.654899] Hardware name: To Be Filled By O.E.M.
>> [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c
>> ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables
>> arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211
>> snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill
>> pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco
>> i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd
>> mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi
>> ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper
>> drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
>> [ 9792.654919] Pid: 2762, comm: rm Tainted: GW 2.6.39+ #1
>> [ 9792.654920] Call Trace:
>> [ 9792.654922] [] warn_slowpath_common+0x83/0x9b
>> [ 9792.654925] [] warn_slowpath_null+0x1a/0x1c
>> [ 9792.654933] [] btrfs_alloc_free_block+0xca/0x27c [btrfs]
>> [ 9792.654945] [] ? map_extent_buffer+0x6e/0xa8 [btrfs]
>> [ 9792.654953] [] __btrfs_cow_block+0xfc/0x30c [btrfs]
>> [ 9792.654963] [] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs]
>> [ 9792.654970] [] ? read_block_for_search+0x94/0x368 [btrfs]
>> [ 9792.654978] [] btrfs_cow_block+0xfe/0x146 [btrfs]
>> [ 9792.654986] [] btrfs_search_slot+0x14d/0x4b6 [btrfs]
>> [ 9792.654997] [] ? map_extent_buffer+0x6e/0xa8 [btrfs]
>> [ 9792.655022] [] btrfs_lookup_inode+0x2f/0x8f [btrfs]
>> [ 9792.655025] [] ? _cond_resched+0xe/0x22
>> [ 9792.655027] [] ? mutex_lock+0x29/0x50
>> [ 9792.655039] [] btrfs_update_delayed_inode+0x72/0x137 [btrfs]
>> [ 9792.655051] [] btrfs_run_delayed_items+0x90/0xdb [btrfs]
>> [ 9792.655062] [] btrfs_commit_transaction+0x228/0x654 [btrfs]
>> [ 9792.655064] [] ? remove_wait_queue+0x3a/0x3a
>> [ 9792.655075] [] btrfs_evict_inode+0x14d/0x202 [btrfs]
>> [ 9792.655077] [] evict+0x71/0x111
>> [ 9792.655079] [] iput+0x12a/0x132
>> [ 9792.655081] [] do_unlinkat+0x106/0x155
>> [ 9792.655083] [] ? path_put+0x1f/0x23
>> [ 9792.655085] [] ? audit_syscall_entry+0x145/0x171
>> [ 9792.655087] [] ? putname+0x34/0x36
>> [ 9792.655090] [] sys_unlinkat+0x29/0x2b
>> [ 9792.655092] [] system_call_fastpath+0x16/0x1b
>> [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]---
>>
>> This is because use_block_rsv() is having to do a
>> reserve_metadata_bytes(), which shouldn't happen as we should have
>> reserved enough space for those operations to complete.  This is
>> happening because use_block_rsv() will call get_block_rsv(), which if
>> root->ref_cows is set (which is the case on all fs roots) we will use
>> trans->block_rsv, which will only have what the current transaction
>> starter had reserved.
>>
>> What needs to be done instead is we need to have a block reserve that
>> any reservation that is done at create time for these inodes is migrated
>> to this special reserve, and then when you run the delayed inode items
>> stuff you set trans->block_rsv to the special block reserve so the
>> accounting is all done properly.
>>
>> This is just off the top of my head, there may be a better way to do it,
>> I've not actually looked that the delayed inode code at all.
>>
>> I would do this myself but I have a ever increasing list of shit to do
>> so will somebody pick this up and fix it please?  Thanks,
>
> Sorry, it's my miss.
> I forgot to set trans->block_rsv to global_block_rsv, since we have migrated
> the space from trans_block_rsv to global_block_rsv.
>
> I'll fix it soon.

Great thanks Miao,

Josef
btrfs: remove 64bit alignment padding to allow extent_buffer to fit into one fewer cacheline
Reorder extent_buffer to remove 8 bytes of alignment padding on 64 bit
builds. This shrinks its size to 128 bytes allowing it to fit into one
fewer cache line and allows more objects per slab in its kmem_cache.

slabinfo extent_buffer reports :-

before:-
Sizes (bytes)     Slabs
----------------------------------
Object :     136  Total  :     123
SlabObj:     136  Full   :     121
SlabSiz:    4096  Partial:       0
Loss   :       0  CpuSlab:       2
Align  :       8  Objects:      30

after :-
Object :     128  Total  :       4
SlabObj:     128  Full   :       2
SlabSiz:    4096  Partial:       0
Loss   :       0  CpuSlab:       2
Align  :       8  Objects:      32

Signed-off-by: Richard Kennedy
---
patch against v3.0-rc2
compiled & tested on x86_64

This has only had a little light testing on a scratch volume but it
still seems to work.

regards
Richard

diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 4e8445a..a11a92e 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -126,9 +126,9 @@ struct extent_buffer {
 	unsigned long map_len;
 	struct page *first_page;
 	unsigned long bflags;
-	atomic_t refs;
 	struct list_head leak_list;
 	struct rcu_head rcu_head;
+	atomic_t refs;
 
 	/* the spinlock is used to protect most operations */
 	spinlock_t lock;
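The effect of the reordering can be reproduced in user space with a reduced layout. This is a minimal sketch, not the real `struct extent_buffer`: `two_ptrs`, `eb_before`, `eb_after`, `padding_saved` and `print_layouts` are hypothetical names, `int` stands in for `atomic_t` and for a 4-byte `spinlock_t` (an assumption that holds only for non-debug lock configurations), and `rest` stands in for the 8-byte-aligned members that follow the lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for two-pointer kernel types such as list_head. */
struct two_ptrs {
	void *a;
	void *b;
};

/* Old order: each 4-byte member is followed by an 8-byte-aligned member. */
struct eb_before {
	unsigned long bflags;
	int refs;                   /* stand-in for atomic_t */
	struct two_ptrs leak_list;
	struct two_ptrs rcu_head;
	int lock;                   /* stand-in for spinlock_t (assumed 4 bytes) */
	void *rest;                 /* stand-in for the members that follow */
};

/* New order: the two 4-byte members sit next to each other. */
struct eb_after {
	unsigned long bflags;
	struct two_ptrs leak_list;
	struct two_ptrs rcu_head;
	int refs;
	int lock;
	void *rest;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t padding_saved(void)
{
	assert(sizeof(struct eb_after) <= sizeof(struct eb_before));
	return sizeof(struct eb_before) - sizeof(struct eb_after);
}

/* Print both sizes and where refs lands in each layout. */
static void print_layouts(void)
{
	printf("before: size %zu, refs at offset %zu\n",
	       sizeof(struct eb_before), offsetof(struct eb_before, refs));
	printf("after:  size %zu, refs at offset %zu\n",
	       sizeof(struct eb_after), offsetof(struct eb_after, refs));
	printf("saved:  %zu bytes\n", padding_saved());
}
```

Calling `print_layouts()` from a `main()` on a target with 8-byte `long` and pointers and 4-byte `int` should show the reduced layout shrinking by 8 bytes, the same amount the patch reports for the full structure. On 32-bit targets the two layouts come out the same size, which is consistent with the patch describing the saving as specific to 64 bit builds.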
Re: kernel BUG at fs/btrfs/extent-tree.c:6164!
On 06/07/2011 04:24 PM, Tsutomu Itoh wrote:
> (2011/06/07 15:17), Tsutomu Itoh wrote:
>> (2011/06/07 14:59), Tsutomu Itoh wrote:
>>> Hi liubo,
>>>
>>> (2011/06/07 14:31), liubo wrote:
>>>> On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
>>>>> Hi,
>>>>>
>>>>> I encountered following panic using 'btrfs-unstable + for-linus'
>>>>> kernel.
>>>>>
>>>>> I ran "btrfs fi bal /test5" command, and mount option of /test5
>>>>> is as follows:
>>>>>
>>>>> /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)
>>>>>
>>>> So, just a "btrfs fi bal" would lead to the bug?
>>> I think so.
>>>
>>>> I've figured out the warnings, but not reproduced the bug yet...
>>>> I used 'btrfs-unstable + for-linus" whose top commit is
>>>>
>>>> commit aa0467d8d2a00e75b2bb6a56a4ee6d70c5d1928f
>>>> Author: David Sterba
>>>> Date: Fri Jun 3 16:29:08 2011 +0200
>>>>
>>>> btrfs: fix uninitialized variable warning
>>> It's same of my environment.
>>>
>>>> and tried on 1) a single disk, 2) 2 disks and 3) 4 disks respectively,
>>>> but none of them leaded to the below bug.
>>> The test script and the volume composition that I am executing are
>>> same as following mail.
>>>
>>> http://marc.info/?l=linux-btrfs&m=130680171426371&w=2
>>>
>>> and, in my environment, panic is done within almost 30 minutes when
>>> test script is executed.
>
> I forgot to write.
> I am adding '-o inode_cache' to the mount option in my test script.
>

Yep, I've added this and reproduced it.
Seems that there are several bugs. Anyway, thanks for the report.
I'm trying to work it out. :)

thanks,
liubo

>> Another panic occurred when I executed it again.
>>
>
> I rebuilt the kernel with 3.0-rc2. but, same problem occurred.
>
>
> <4>[ 131.708325] WARNING: at fs/btrfs/transaction.c:213
> start_transaction+0x74/0x259 [btrfs]()
> <4>[ 131.708329] Hardware name: PRIMERGY
> <4>[ 131.708330] Modules linked in: autofs4 sunrpc 8021q garp stp llc
> cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c
> libcrc32c ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
> parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support
> tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod
> crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata
> scsi_mod floppy [last unloaded: microcode]
> <4>[ 131.708378] Pid: 3041, comm: btrfs Not tainted 3.0.0-rc2test #1
> <4>[ 131.708381] Call Trace:
> <4>[ 131.708388] [] warn_slowpath_common+0x85/0x9d
> <4>[ 131.708392] [] warn_slowpath_null+0x1a/0x1c
> <4>[ 131.708410] [] start_transaction+0x74/0x259 [btrfs]
> <4>[ 131.708430] [] ? btrfs_wait_ordered_range+0xf9/0x11d
> [btrfs]
> <4>[ 131.708448] [] btrfs_start_transaction+0x13/0x15
> [btrfs]
> <4>[ 131.708467] [] btrfs_evict_inode+0x113/0x22d [btrfs]
> <4>[ 131.708471] [] evict+0x77/0x118
> <4>[ 131.708475] [] iput+0x13d/0x146
> <4>[ 131.708489] [] btrfs_remove_block_group+0x14d/0x35b
> [btrfs]
> <4>[ 131.708508] [] btrfs_relocate_chunk+0x464/0x50d
> [btrfs]
> <4>[ 131.708527] [] ? btrfs_item_key_to_cpu+0x2a/0x46
> [btrfs]
> <4>[ 131.708545] [] btrfs_balance+0x1ca/0x219 [btrfs]
> <4>[ 131.708563] [] btrfs_ioctl+0x890/0xb87 [btrfs]
> <4>[ 131.708567] [] ? handle_mm_fault+0x233/0x24a
> <4>[ 131.708572] [] ? do_page_fault+0x340/0x3b2
> <4>[ 131.708577] [] do_vfs_ioctl+0x474/0x4c3
> <4>[ 131.708581] [] ? virt_to_head_page+0xe/0x31
> <4>[ 131.708585] [] ? kmem_cache_free+0x20/0xae
> <4>[ 131.708588] [] sys_ioctl+0x56/0x79
> <4>[ 131.708592] [] system_call_fastpath+0x16/0x1b
> <4>[ 131.708595] ---[ end trace 5f962f46d3ba5425 ]---
> <6>[ 131.708777] btrfs: relocating block group 29360128 flags 20
> <6>[ 132.385682] btrfs: found 85 extents
> <0>[ 132.798892] [ cut here ]
> <2>[ 132.799014] kernel BUG at fs/btrfs/extent-tree.c:1424!
> <0>[ 132.799014] invalid opcode: [#1] SMP
> <4>[ 132.799014] CPU 0
> <4>[ 132.799014] Modules linked in: autofs4 sunrpc 8021q garp stp llc
> cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c
> libcrc32c ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
> parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support
> tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod
> crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata
> scsi_mod floppy [last unloaded: microcode]
> <4>[ 132.799014]
> <4>[ 132.799014] Pid: 3041, comm: btrfs Tainted: GW 3.0.0-rc2test
> #1 FUJITSU-SV PRIMERGY/D2399
> <4>[ 132.799014] RIP: 0010:[] []
> lookup_inline_extent_backref+0xe3/0x3a9 [btrfs]
> <4>[ 132.799014] RSP: 0018:880193aa5808 EFLAGS: 00010202
> <4>[ 132.799014] RAX: 0001 RBX: 880192fac000 RCX:
> 0002
> <4>[ 132.799014] RDX: 0002 RSI: RDI:
>
> <4>[ 132.
Re: kernel BUG at fs/btrfs/extent-tree.c:6164!
(2011/06/07 15:17), Tsutomu Itoh wrote:
> (2011/06/07 14:59), Tsutomu Itoh wrote:
>> Hi liubo,
>>
>> (2011/06/07 14:31), liubo wrote:
>>> On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
>>>> Hi,
>>>>
>>>> I encountered following panic using 'btrfs-unstable + for-linus'
>>>> kernel.
>>>>
>>>> I ran "btrfs fi bal /test5" command, and mount option of /test5
>>>> is as follows:
>>>>
>>>> /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)
>>>
>>> So, just a "btrfs fi bal" would lead to the bug?
>>
>> I think so.
>>
>>>
>>> I've figured out the warnings, but not reproduced the bug yet...
>>> I used 'btrfs-unstable + for-linus" whose top commit is
>>>
>>> commit aa0467d8d2a00e75b2bb6a56a4ee6d70c5d1928f
>>> Author: David Sterba
>>> Date: Fri Jun 3 16:29:08 2011 +0200
>>>
>>> btrfs: fix uninitialized variable warning
>>
>> It's same of my environment.
>>
>>>
>>> and tried on 1) a single disk, 2) 2 disks and 3) 4 disks respectively,
>>> but none of them leaded to the below bug.
>>
>> The test script and the volume composition that I am executing are
>> same as following mail.
>>
>> http://marc.info/?l=linux-btrfs&m=130680171426371&w=2
>>
>> and, in my environment, panic is done within almost 30 minutes when
>> test script is executed.

I forgot to write.
I am adding '-o inode_cache' to the mount option in my test script.

>
> Another panic occurred when I executed it again.
>

I rebuilt the kernel with 3.0-rc2. but, same problem occurred.

<4>[ 131.708325] WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x74/0x259 [btrfs]()
<4>[ 131.708329] Hardware name: PRIMERGY
<4>[ 131.708330] Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
<4>[ 131.708378] Pid: 3041, comm: btrfs Not tainted 3.0.0-rc2test #1
<4>[ 131.708381] Call Trace:
<4>[ 131.708388] [] warn_slowpath_common+0x85/0x9d
<4>[ 131.708392] [] warn_slowpath_null+0x1a/0x1c
<4>[ 131.708410] [] start_transaction+0x74/0x259 [btrfs]
<4>[ 131.708430] [] ? btrfs_wait_ordered_range+0xf9/0x11d [btrfs]
<4>[ 131.708448] [] btrfs_start_transaction+0x13/0x15 [btrfs]
<4>[ 131.708467] [] btrfs_evict_inode+0x113/0x22d [btrfs]
<4>[ 131.708471] [] evict+0x77/0x118
<4>[ 131.708475] [] iput+0x13d/0x146
<4>[ 131.708489] [] btrfs_remove_block_group+0x14d/0x35b [btrfs]
<4>[ 131.708508] [] btrfs_relocate_chunk+0x464/0x50d [btrfs]
<4>[ 131.708527] [] ? btrfs_item_key_to_cpu+0x2a/0x46 [btrfs]
<4>[ 131.708545] [] btrfs_balance+0x1ca/0x219 [btrfs]
<4>[ 131.708563] [] btrfs_ioctl+0x890/0xb87 [btrfs]
<4>[ 131.708567] [] ? handle_mm_fault+0x233/0x24a
<4>[ 131.708572] [] ? do_page_fault+0x340/0x3b2
<4>[ 131.708577] [] do_vfs_ioctl+0x474/0x4c3
<4>[ 131.708581] [] ? virt_to_head_page+0xe/0x31
<4>[ 131.708585] [] ? kmem_cache_free+0x20/0xae
<4>[ 131.708588] [] sys_ioctl+0x56/0x79
<4>[ 131.708592] [] system_call_fastpath+0x16/0x1b
<4>[ 131.708595] ---[ end trace 5f962f46d3ba5425 ]---
<6>[ 131.708777] btrfs: relocating block group 29360128 flags 20
<6>[ 132.385682] btrfs: found 85 extents
<0>[ 132.798892] [ cut here ]
<2>[ 132.799014] kernel BUG at fs/btrfs/extent-tree.c:1424!
<0>[ 132.799014] invalid opcode: [#1] SMP
<4>[ 132.799014] CPU 0
<4>[ 132.799014] Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
<4>[ 132.799014]
<4>[ 132.799014] Pid: 3041, comm: btrfs Tainted: GW 3.0.0-rc2test #1 FUJITSU-SV PRIMERGY/D2399
<4>[ 132.799014] RIP: 0010:[] [] lookup_inline_extent_backref+0xe3/0x3a9 [btrfs]
<4>[ 132.799014] RSP: 0018:880193aa5808 EFLAGS: 00010202
<4>[ 132.799014] RAX: 0001 RBX: 880192fac000 RCX: 0002
<4>[ 132.799014] RDX: 0002 RSI: RDI:
<4>[ 132.799014] RBP: 880193aa58a8 R08: 029c R09: 880193aa56f0
<4>[ 132.799014] R10: 880193aa5648 R11: c2d107e744029d66 R12: 00b2
<4>[ 132.799014] R13: 880195075b88 R14: 0001 R15:
<4>[ 132.799014] FS: 7faaaf421740() GS:88019fc0() knlGS:
<4>[ 132.799014] CS