Re: [BUG 3.0-rc1] oops during file removal, severe lock contention

2011-06-02 Thread Andi Kleen
Dave Chinner da...@fromorbit.com writes:

 Also, there is massive lock contention while running these workloads.
 perf top shows this for the create after about 5m inodes have been
 created:

We saw pretty much the same thing in some simple tests on large systems
(extent io tree locking and higher level b*tree locks are a problem)
It is being looked at I believe.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[BUG 3.0-rc1] oops during file removal, severe lock contention

2011-06-01 Thread Dave Chinner
Hi Folks,

Running on 3.0-rc1 on an 8p/4G RAM VM with a 16TB filesystem (12
disk DM stripe) a 50 million inode 8-way fsmark creation workload
via:

$ /usr/bin/time ./fs_mark -D 1 -S0 -n 10 -s 0 -L 63 \
 -d /mnt/scratch/0 -d /mnt/scratch/1 \
 -d /mnt/scratch/2 -d /mnt/scratch/3 \
 -d /mnt/scratch/4 -d /mnt/scratch/5 \
 -d /mnt/scratch/6 -d /mnt/scratch/7

followed by an 8-way rm -rf on the result via:

$ for i in /mnt/scratch/*; do /usr/bin/time rm -rf $i 21  done

resulted in this oops:

[ 2671.052861] device fsid 84f7a99b2f193c6-c3228aae4c5a2f8a devid 1 transid 7 
/dev/vda
[ 8626.879250] BUG: unable to handle kernel paging request at 88012000
[ 8626.880020] IP: [81659a43] chksum_update+0x23/0x50
[ 8626.880020] PGD 1ef2063 PUD 11fffa067 PMD 0
[ 8626.880020] Oops:  [#1] SMP
[ 8626.880020] CPU 5
[ 8626.880020] Modules linked in:
[ 8626.880020]
[ 8626.880020] Pid: 3326, comm: btrfs-transacti Not tainted 3.0.0-rc1-dgc+ 
#1272 Bochs Bochs
[ 8626.880020] RIP: 0010:[81659a43]  [81659a43] 
chksum_update+0x23/0x50
[ 8626.880020] RSP: 0018:88010fba7a30  EFLAGS: 00010283
[ 8626.880020] RAX: 2dda3ac0 RBX: 88010fba7a50 RCX: 88012001
[ 8626.880020] RDX: 009d RSI: 88012000 RDI: 88010fba7a50
[ 8626.880020] RBP: 88010fba7a30 R08: 880217846000 R09: 1025
[ 8626.880020] R10: 88011affe0c0 R11: dead00200200 R12: 88010fba7bd0
[ 8626.880020] R13: 880117846025 R14:  R15: 0001
[ 8626.880020] FS:  () GS:88011fd4() 
knlGS:
[ 8626.880020] CS:  0010 DS:  ES:  CR0: 8005003b
[ 8626.880020] CR2: 88012000 CR3: 01ef1000 CR4: 06e0
[ 8626.880020] DR0:  DR1:  DR2: 
[ 8626.880020] DR3:  DR6: 0ff0 DR7: 0400
[ 8626.880020] Process btrfs-transacti (pid: 3326, threadinfo 88010fba6000, 
task 88011affe0c0)
[ 8626.880020] Stack:
[ 8626.880020]  88010fba7a40 81651688 88010fba7a90 
81688967
[ 8626.880020]  880119b065c0 0008 8801 
005005818de0
[ 8626.880020]  88010fba7bd0 880105818de0 88010fba7aa0 
8800cc6c63a0
[ 8626.880020] Call Trace:
[ 8626.880020]  [81651688] crypto_shash_update+0x18/0x30
[ 8626.880020]  [81688967] crc32c+0x47/0x60
[ 8626.880020]  [8159d5e2] btrfs_csum_data+0x12/0x20
[ 8626.880020]  [815df671] __btrfs_write_out_cache+0x601/0xc70
[ 8626.880020]  [815abdc6] ? __btrfs_prealloc_file_range+0x196/0x220
[ 8626.880020]  [81aff90e] ? _raw_spin_lock+0xe/0x20
[ 8626.880020]  [815dfd42] btrfs_write_out_ino_cache+0x62/0xb0
[ 8626.880020]  [8159b33e] btrfs_save_ino_cache+0x11e/0x210
[ 8626.880020]  [815a291d] commit_fs_roots+0xad/0x180
[ 8626.880020]  [81afe77e] ? mutex_lock+0x1e/0x50
[ 8626.880020]  [8158056a] ? btrfs_free_path+0x2a/0x40
[ 8626.880020]  [815a3855] btrfs_commit_transaction+0x375/0x7b0
[ 8626.880020]  [810a40e0] ? wake_up_bit+0x40/0x40
[ 8626.880020]  [8159cf63] transaction_kthread+0x293/0x2b0
[ 8626.880020]  [8159ccd0] ? btrfs_bio_wq_end_io+0x90/0x90
[ 8626.880020]  [810a3b96] kthread+0x96/0xa0
[ 8626.880020]  [81b08724] kernel_thread_helper+0x4/0x10
[ 8626.880020]  [810a3b00] ? kthread_worker_fn+0x190/0x190
[ 8626.880020]  [81b08720] ? gs_change+0x13/0x13
[ 8626.880020] Code: ea ff ff ff c9 c3 66 90 55 48 89 e5 66 66 66 66 90 8b 47 
10 85 d2 74 2d 48 8d 4e 01 44 8d 42 ff 4e 8d 04 01 eb 05 66 90 48 ff c1 0f b6 
16 48 89
[ 8626.880020] RIP  [81659a43] chksum_update+0x23/0x50
[ 8626.880020]  RSP 88010fba7a30
[ 8626.880020] CR2: 88012000
[ 8626.880020] ---[ end trace dad2f8b74a28cc71 ]---

Also, there is massive lock contention while running these workloads.
perf top shows this for the create after about 5m inodes have been
created:

   samples  pcnt function  DSO
 ___ _ _ _

20626.00 25.6% __ticket_spin_lock[kernel.kallsyms]
 5148.00  6.4% _raw_spin_unlock_irqrestore   [kernel.kallsyms]
 3769.00  4.7% test_range_bit[kernel.kallsyms]
 2239.00  2.8% chksum_update [kernel.kallsyms]
 2143.00  2.7% finish_task_switch[kernel.kallsyms]
 1912.00  2.4% inode_tree_add[kernel.kallsyms]
 1825.00  2.3% radix_tree_lookup [kernel.kallsyms]
 1449.00  1.8% generic_bin_search[kernel.kallsyms]
 1205.00  1.5% btrfs_search_slot [kernel.kallsyms]
 1198.00  1.5% btrfs_tree_lock   [kernel.kallsyms]
 1104.00  1.4% mutex_spin_on_owner