Re: btrfs panic problem

2018-09-25 Thread sunny.s.zhang




On 2018-09-25 16:31, Nikolay Borisov wrote:


On 25.09.2018 11:20, sunny.s.zhang wrote:

On 2018-09-20 02:36, Liu Bo wrote:

On Mon, Sep 17, 2018 at 5:28 PM, sunny.s.zhang wrote:

Hi All,

My OS (kernel 4.1.12) panicked in kmem_cache_alloc, which was called from
btrfs_get_or_create_delayed_node.

I found that the freelist of the SLUB cache is corrupted.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
    freelist = 0x2026,   <<< the value is the id of one inode
    tid = 29567861,
    page = 0xea0132168d00,
    partial = 0x0
}

And I found two different btrfs inodes pointing to the same delayed_node,
which means the same slab object is in use twice.

I think this object was freed twice, so its freelist next pointer points to
itself and the allocator hands out the same object twice.

When that object is used again, it corrupts the freelist.

The following interleaving would make the delayed node be freed twice, but
I haven't found the exact path that triggers it.

Process A (btrfs_evict_inode)              Process B

call btrfs_remove_delayed_node             call btrfs_get_delayed_node

                                           node = ACCESS_ONCE(btrfs_inode->delayed_node);

BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);

                                           if (node) {
                                                   atomic_inc(&node->refs);
                                                   return node;
                                           }

                                           ..

                                           btrfs_release_delayed_node(delayed_node);


By looking at the race, it seems the following commit has addressed it.

btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4


thanks,
liubo

I don't think so.
That patch fixed the radix_tree_lookup path. I don't think it resolves my
problem, where the race occurs after ACCESS_ONCE(btrfs_inode->delayed_node):
if ACCESS_ONCE(btrfs_inode->delayed_node) returns the node, then
btrfs_get_delayed_node returns immediately and never reaches the lookup.

Can you reproduce the problem on an upstream kernel with added delays?
The original report is from some RHEL-based distro (presumably oracle
unbreakable linux) so there is no indication currently that this is a
genuine problem in upstream kernels.

Not yet. I will try to reproduce it later,
but I don't have any clue about this race for now.
Thanks,
Sunny




Thanks,
Sunny


1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315         struct btrfs_delayed_node *delayed_node;
1316
1317         delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318         if (!delayed_node)
1319                 return;
1320
1321         BTRFS_I(inode)->delayed_node = NULL;
1322         btrfs_release_delayed_node(delayed_node);
1323 }


    87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
    88 {
    89         struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
    90         struct btrfs_root *root = btrfs_inode->root;
    91         u64 ino = btrfs_ino(inode);
    92         struct btrfs_delayed_node *node;
    93
    94         node = ACCESS_ONCE(btrfs_inode->delayed_node);
    95         if (node) {
    96                 atomic_inc(&node->refs);
    97                 return node;
    98         }


Thanks,

Sunny


PS:



panic information

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
   #0 [88130404f940] machine_kexec at 8105ec10
   #1 [88130404f9b0] crash_kexec at 811145b8
   #2 [88130404fa80] oops_end at 8101a868
   #3 [88130404fab0] no_context at 8106ea91
   #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
   #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
   #6 [88130404fb60] __do_page_fault at 8106f328
   #7 [88130404fbd0] do_page_fault at 8106f637
   #8 [88130404fc10] page_fault at 816f6308
  [exception RIP: kmem_cache_alloc+121]
  RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
  RAX:   RBX:   RCX: 01c32b76
  RDX: 01c32b75  RSI:   RDI: 000224b0
  RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
  R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
  R13: 2026  R14: 887e3e230a00  R15: a01abf49
  ORIG_RAX:   CS: 0010  SS: 0018
   #9 [88130404fd10] btrfs_get_or_create_delayed_node at a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12 [btrfs]
#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#1

Re: btrfs panic problem

2018-09-25 Thread Nikolay Borisov



On 25.09.2018 11:20, sunny.s.zhang wrote:
> 
> On 2018-09-20 02:36, Liu Bo wrote:
>> On Mon, Sep 17, 2018 at 5:28 PM, sunny.s.zhang wrote:
>>> Hi All,
>>>
>>> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
>>> btrfs_get_or_create_delayed_node.
>>>
>>> I found that the freelist of the slub is wrong.
>>>
>>> crash> struct kmem_cache_cpu 887e7d7a24b0
>>>
>>> struct kmem_cache_cpu {
>>>    freelist = 0x2026,   <<< the value is id of one inode
>>>    tid = 29567861,
>>>    page = 0xea0132168d00,
>>>    partial = 0x0
>>> }
>>>
>>> And I found two different btrfs inodes pointing to the same
>>> delayed_node, which means the same slab object is in use twice.
>>>
>>> I think this object was freed twice, so its freelist next pointer
>>> points to itself and the allocator hands out the same object twice.
>>>
>>> When that object is used again, it corrupts the freelist.
>>>
>>> The following interleaving would make the delayed node be freed twice,
>>> but I haven't found the exact path that triggers it.
>>>
>>> Process A (btrfs_evict_inode)           Process B
>>>
>>> call btrfs_remove_delayed_node          call btrfs_get_delayed_node
>>>
>>>                                         node = ACCESS_ONCE(btrfs_inode->delayed_node);
>>>
>>> BTRFS_I(inode)->delayed_node = NULL;
>>> btrfs_release_delayed_node(delayed_node);
>>>
>>>                                         if (node) {
>>>                                                 atomic_inc(&node->refs);
>>>                                                 return node;
>>>                                         }
>>>
>>>                                         ..
>>>
>>>                                         btrfs_release_delayed_node(delayed_node);
>>>
>> By looking at the race, it seems the following commit has addressed it.
>>
>> btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4
>>
>>
>> thanks,
>> liubo
> 
> I don't think so.
> That patch fixed the radix_tree_lookup path. I don't think it resolves
> my problem, where the race occurs after
> ACCESS_ONCE(btrfs_inode->delayed_node): if that load returns the node,
> btrfs_get_delayed_node returns immediately and never reaches the lookup.

Can you reproduce the problem on an upstream kernel with added delays?
The original report is from some RHEL-based distro (presumably oracle
unbreakable linux) so there is no indication currently that this is a
genuine problem in upstream kernels.

> 
> Thanks,
> Sunny
> 
>>
>>> 1313 void btrfs_remove_delayed_node(struct inode *inode)
>>> 1314 {
>>> 1315 struct btrfs_delayed_node *delayed_node;
>>> 1316
>>> 1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
>>> 1318 if (!delayed_node)
>>> 1319 return;
>>> 1320
>>> 1321 BTRFS_I(inode)->delayed_node = NULL;
>>> 1322 btrfs_release_delayed_node(delayed_node);
>>> 1323 }
>>>
>>>
>>>    87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct
>>> inode
>>> *inode)
>>>    88 {
>>>    89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>>>    90 struct btrfs_root *root = btrfs_inode->root;
>>>    91 u64 ino = btrfs_ino(inode);
>>>    92 struct btrfs_delayed_node *node;
>>>    93
>>>    94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
>>>    95 if (node) {
>>>    96 atomic_inc(&node->refs);
>>>    97 return node;
>>>    98 }
>>>
>>>
>>> Thanks,
>>>
>>> Sunny
>>>
>>>
>>> PS:
>>>
>>> 
>>>
>>> panic information
>>>
>>> PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
>>>   #0 [88130404f940] machine_kexec at 8105ec10
>>>   #1 [88130404f9b0] crash_kexec at 811145b8
>>>   #2 [88130404fa80] oops_end at 8101a868
>>>   #3 [88130404fab0] no_context at 8106ea91
>>>   #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
>>>   #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
>>>   #6 [88130404fb60] __do_page_fault at 8106f328
>>>   #7 [88130404fbd0] do_page_fault at 8106f637
>>>   #8 [88130404fc10] page_fault at 816f6308
>>>  [exception RIP: kmem_cache_alloc+121]
>>>  RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
>>>  RAX:   RBX:   RCX: 01c32b76
>>>  RDX: 01c32b75  RSI:   RDI: 000224b0
>>>  RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
>>>  R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
>>>  R13: 2026  R14: 887e3e230a00  R15: a01abf49
>>>  ORIG_RAX:   CS: 0010  SS: 0018
>>>   #9 [88130404fd10] btrfs_get_or_create_delayed_node at
>>> a01abf49
>>> [btrfs]
>>> #10 [88130404fd60] btrfs_delayed_update_inode at fff

Re: btrfs panic problem

2018-09-25 Thread sunny.s.zhang



On 2018-09-20 00:12, Nikolay Borisov wrote:

On 19.09.2018 02:53, sunny.s.zhang wrote:

Hi Duncan,

Thank you for your advice. I understand what you mean, but I have
reviewed the latest btrfs code and I think the issue still exists.

If btrfs_get_delayed_node is preempted right after running line 71,
another process can run through line 1282 and release the delayed node
at the end.

When we switch back to btrfs_get_delayed_node, it finds the node is not
NULL and uses it as normal. That means we are using freed memory, and at
some point this memory will be freed again.

The latest code is below.

1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
1279 {
1280         struct btrfs_delayed_node *delayed_node;
1281
1282         delayed_node = READ_ONCE(inode->delayed_node);
1283         if (!delayed_node)
1284                 return;
1285
1286         inode->delayed_node = NULL;
1287         btrfs_release_delayed_node(delayed_node);
1288 }


   64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
   65                 struct btrfs_inode *btrfs_inode)
   66 {
   67         struct btrfs_root *root = btrfs_inode->root;
   68         u64 ino = btrfs_ino(btrfs_inode);
   69         struct btrfs_delayed_node *node;
   70
   71         node = READ_ONCE(btrfs_inode->delayed_node);
   72         if (node) {
   73                 refcount_inc(&node->refs);
   74                 return node;
   75         }
   76
   77         spin_lock(&root->inode_lock);
   78         node = radix_tree_lookup(&root->delayed_nodes_tree, ino);



Your analysis is correct; however, it's missing one crucial point:
btrfs_remove_delayed_node is called only from btrfs_evict_inode, and
inodes are evicted only after all other references have been dropped. Check
the code in evict_inodes() - inodes are added to the dispose list when
their i_count is 0, at which point there should be no references to this
inode. This invalidates your analysis...

Thanks.
Yes, I know this, and I know that another process cannot use this inode
if the inode is in the I_FREEING state.
But Chris has fixed a similar bug that was found in production, which
means this can happen under some conditions:


btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4


On 2018-09-18 13:05, Duncan wrote:

sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:


My OS(4.1.12) panic in kmem_cache_alloc, which is called by
btrfs_get_or_create_delayed_node.

I found that the freelist of the slub is wrong.

[Not a dev, just a btrfs list regular and user, myself.  But here's a
general btrfs list recommendations reply...]

You appear to mean kernel 4.1.12 -- confirmed by the version reported in
the posted dump:  4.1.12-112.14.13.el6uek.x86_64

OK, so from the perspective of this forward-development-focused list,
kernel 4.1 is pretty ancient history, but you do have a number of
options.

First let's consider the general situation.  Most people choose an
enterprise distro for supported stability, and that's certainly a valid
thing to want.  However, btrfs, while now reaching early maturity for the
basics (single device in single or dup mode, and multi-device in single/
raid0/1/10 modes, note that raid56 mode is newer and less mature),
remains under quite heavy development, and keeping reasonably current is
recommended for that reason.

So you chose an enterprise distro presumably to lock in supported
stability for several years, but you chose a filesystem, btrfs, that's
still under heavy development, with reasonably current kernels and
userspace recommended as tending to have the known bugs fixed.  There's a
bit of a conflict there, and the /general/ recommendation would thus be
to consider whether one or the other of those choices are inappropriate
for your use-case, because it's really quite likely that if you really
want the stability of an enterprise distro and kernel, that btrfs isn't
as stable a filesystem as you're likely to want to match with it.
Alternatively, if you want something newer to match the still under heavy
development btrfs, you very likely want a distro that's not focused on
years-old stability just for the sake of it.  One or the other is likely
to be a poor match for your needs, and choosing something else that's a
better match is likely to be a much better experience for you.

But perhaps you do have reason to want to run the newer and not quite to
traditional enterprise-distro level stability btrfs, on an otherwise
older and very stable enterprise distro.  That's fine, provided you know
what you're getting yourself into, and are pre

Re: btrfs panic problem

2018-09-25 Thread sunny.s.zhang



On 2018-09-20 02:36, Liu Bo wrote:

On Mon, Sep 17, 2018 at 5:28 PM, sunny.s.zhang wrote:

Hi All,

My OS (kernel 4.1.12) panicked in kmem_cache_alloc, which was called from
btrfs_get_or_create_delayed_node.

I found that the freelist of the SLUB cache is corrupted.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
   freelist = 0x2026,   <<< the value is id of one inode
   tid = 29567861,
   page = 0xea0132168d00,
   partial = 0x0
}

And I found two different btrfs inodes pointing to the same delayed_node,
which means the same slab object is in use twice.

I think this object was freed twice, so its freelist next pointer points to
itself and the allocator hands out the same object twice.

When that object is used again, it corrupts the freelist.

The following interleaving would make the delayed node be freed twice, but
I haven't found the exact path that triggers it.

Process A (btrfs_evict_inode)              Process B

call btrfs_remove_delayed_node             call btrfs_get_delayed_node

                                           node = ACCESS_ONCE(btrfs_inode->delayed_node);

BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);

                                           if (node) {
                                                   atomic_inc(&node->refs);
                                                   return node;
                                           }

                                           ..

                                           btrfs_release_delayed_node(delayed_node);


By looking at the race, it seems the following commit has addressed it.

btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4

thanks,
liubo


I don't think so.
That patch fixed the radix_tree_lookup path. I don't think it resolves my
problem, where the race occurs after ACCESS_ONCE(btrfs_inode->delayed_node):
if ACCESS_ONCE(btrfs_inode->delayed_node) returns the node, then
btrfs_get_delayed_node returns immediately and never reaches the lookup.


Thanks,
Sunny




1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315 struct btrfs_delayed_node *delayed_node;
1316
1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318 if (!delayed_node)
1319 return;
1320
1321 BTRFS_I(inode)->delayed_node = NULL;
1322 btrfs_release_delayed_node(delayed_node);
1323 }


   87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
   88 {
   89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
   90 struct btrfs_root *root = btrfs_inode->root;
   91 u64 ino = btrfs_ino(inode);
   92 struct btrfs_delayed_node *node;
   93
   94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
   95 if (node) {
   96 atomic_inc(&node->refs);
   97 return node;
   98 }


Thanks,

Sunny


PS:



panic information

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
  #0 [88130404f940] machine_kexec at 8105ec10
  #1 [88130404f9b0] crash_kexec at 811145b8
  #2 [88130404fa80] oops_end at 8101a868
  #3 [88130404fab0] no_context at 8106ea91
  #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
  #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
  #6 [88130404fb60] __do_page_fault at 8106f328
  #7 [88130404fbd0] do_page_fault at 8106f637
  #8 [88130404fc10] page_fault at 816f6308
 [exception RIP: kmem_cache_alloc+121]
 RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
 RAX:   RBX:   RCX: 01c32b76
 RDX: 01c32b75  RSI:   RDI: 000224b0
 RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
 R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
 R13: 2026  R14: 887e3e230a00  R15: a01abf49
 ORIG_RAX:   CS: 0010  SS: 0018
  #9 [88130404fd10] btrfs_get_or_create_delayed_node at a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12 [btrfs]
#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#15 [88130404fe90] iterate_dir at 81221929
#16 [88130404fee0] sys_getdents64 at 81221a19
#17 [88130404ff50] system_call_fastpath at 816f2594
 RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
 RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
 RDX: 1000  RSI: 00c83da14000  RDI: 0011
 RBP:    R8:    R9: 

Re: btrfs panic problem

2018-09-19 Thread Liu Bo
On Mon, Sep 17, 2018 at 5:28 PM, sunny.s.zhang wrote:
> Hi All,
>
> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
> btrfs_get_or_create_delayed_node.
>
> I found that the freelist of the slub is wrong.
>
> crash> struct kmem_cache_cpu 887e7d7a24b0
>
> struct kmem_cache_cpu {
>   freelist = 0x2026,   <<< the value is id of one inode
>   tid = 29567861,
>   page = 0xea0132168d00,
>   partial = 0x0
> }
>
> And I found two different btrfs inodes pointing to the same delayed_node,
> which means the same slab object is in use twice.
>
> I think this object was freed twice, so its freelist next pointer points
> to itself and the allocator hands out the same object twice.
>
> When that object is used again, it corrupts the freelist.
>
> The following interleaving would make the delayed node be freed twice,
> but I haven't found the exact path that triggers it.
>
> Process A (btrfs_evict_inode)             Process B
>
> call btrfs_remove_delayed_node            call btrfs_get_delayed_node
>
>                                           node = ACCESS_ONCE(btrfs_inode->delayed_node);
>
> BTRFS_I(inode)->delayed_node = NULL;
> btrfs_release_delayed_node(delayed_node);
>
>                                           if (node) {
>                                                   atomic_inc(&node->refs);
>                                                   return node;
>                                           }
>
>                                           ..
>
>                                           btrfs_release_delayed_node(delayed_node);
>

By looking at the race, it seems the following commit has addressed it.

btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4

thanks,
liubo


>
> 1313 void btrfs_remove_delayed_node(struct inode *inode)
> 1314 {
> 1315 struct btrfs_delayed_node *delayed_node;
> 1316
> 1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
> 1318 if (!delayed_node)
> 1319 return;
> 1320
> 1321 BTRFS_I(inode)->delayed_node = NULL;
> 1322 btrfs_release_delayed_node(delayed_node);
> 1323 }
>
>
>   87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode
> *inode)
>   88 {
>   89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>   90 struct btrfs_root *root = btrfs_inode->root;
>   91 u64 ino = btrfs_ino(inode);
>   92 struct btrfs_delayed_node *node;
>   93
>   94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
>   95 if (node) {
>   96 atomic_inc(&node->refs);
>   97 return node;
>   98 }
>
>
> Thanks,
>
> Sunny
>
>
> PS:
>
> 
>
> panic information
>
> PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
>  #0 [88130404f940] machine_kexec at 8105ec10
>  #1 [88130404f9b0] crash_kexec at 811145b8
>  #2 [88130404fa80] oops_end at 8101a868
>  #3 [88130404fab0] no_context at 8106ea91
>  #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
>  #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
>  #6 [88130404fb60] __do_page_fault at 8106f328
>  #7 [88130404fbd0] do_page_fault at 8106f637
>  #8 [88130404fc10] page_fault at 816f6308
> [exception RIP: kmem_cache_alloc+121]
> RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
> RAX:   RBX:   RCX: 01c32b76
> RDX: 01c32b75  RSI:   RDI: 000224b0
> RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
> R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
> R13: 2026  R14: 887e3e230a00  R15: a01abf49
> ORIG_RAX:   CS: 0010  SS: 0018
>  #9 [88130404fd10] btrfs_get_or_create_delayed_node at a01abf49
> [btrfs]
> #10 [88130404fd60] btrfs_delayed_update_inode at a01aea12
> [btrfs]
> #11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
> #12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
> #13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
> #14 [88130404fe50] touch_atime at 812286d3
> #15 [88130404fe90] iterate_dir at 81221929
> #16 [88130404fee0] sys_getdents64 at 81221a19
> #17 [88130404ff50] system_call_fastpath at 816f2594
> RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
> RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
> RDX: 1000  RSI: 00c83da14000  RDI: 0011
> RBP:    R8:    R9: 
> R10:   R11: 0246  R12: 00c7
> R13: 02174e74  R14: 0555  R15: 0038
> ORIG_RAX: 00d9  CS: 0033  SS: 002b
>
>
> We also see list double-add warnings, covering both n_list and p_list:
>
> [8642921.110568] [ cut here ]
> [8642921.167929] WARNING: CPU: 38 PID: 73638 at lib/list_debug.c:33
> __list_add+0xbe/0xd0()
> [8642921.263780] lis

Re: btrfs panic problem

2018-09-19 Thread Nikolay Borisov



On 19.09.2018 02:53, sunny.s.zhang wrote:
> Hi Duncan,
> 
> Thank you for your advice. I understand what you mean.  But i have
> reviewed the latest btrfs code, and i think the issue is exist still.
> 
> At 71 line, if the function of btrfs_get_delayed_node run over this
> line, then switch to other process, which run over the 1282 and release
> the delayed node at the end.
> 
> And then, switch back to the  btrfs_get_delayed_node. find that the node
> is not null, and use it as normal. that mean we used a freed memory.
> 
> at some time, this memory will be freed again.
> 
> latest code as below.
> 
> 1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
> 1279 {
> 1280 struct btrfs_delayed_node *delayed_node;
> 1281
> 1282 delayed_node = READ_ONCE(inode->delayed_node);
> 1283 if (!delayed_node)
> 1284 return;
> 1285
> 1286 inode->delayed_node = NULL;
> 1287 btrfs_release_delayed_node(delayed_node);
> 1288 }
> 
> 
>   64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
>   65 struct btrfs_inode *btrfs_inode)
>   66 {
>   67 struct btrfs_root *root = btrfs_inode->root;
>   68 u64 ino = btrfs_ino(btrfs_inode);
>   69 struct btrfs_delayed_node *node;
>   70
>   71 node = READ_ONCE(btrfs_inode->delayed_node);
>   72 if (node) {
>   73 refcount_inc(&node->refs);
>   74 return node;
>   75 }
>   76
>   77 spin_lock(&root->inode_lock);
>   78 node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
> 
> 

Your analysis is correct; however, it's missing one crucial point:
btrfs_remove_delayed_node is called only from btrfs_evict_inode, and
inodes are evicted only after all other references have been dropped. Check
the code in evict_inodes() - inodes are added to the dispose list when
their i_count is 0, at which point there should be no references to this
inode. This invalidates your analysis...

> On 2018-09-18 13:05, Duncan wrote:
>> sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:
>>
>>> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
>>> btrfs_get_or_create_delayed_node.
>>>
>>> I found that the freelist of the slub is wrong.
>> [Not a dev, just a btrfs list regular and user, myself.  But here's a
>> general btrfs list recommendations reply...]
>>
>> You appear to mean kernel 4.1.12 -- confirmed by the version reported in
>> the posted dump:  4.1.12-112.14.13.el6uek.x86_64
>>
>> OK, so from the perspective of this forward-development-focused list,
>> kernel 4.1 is pretty ancient history, but you do have a number of
>> options.
>>
>> First let's consider the general situation.  Most people choose an
>> enterprise distro for supported stability, and that's certainly a valid
>> thing to want.  However, btrfs, while now reaching early maturity for the
>> basics (single device in single or dup mode, and multi-device in single/
>> raid0/1/10 modes, note that raid56 mode is newer and less mature),
>> remains under quite heavy development, and keeping reasonably current is
>> recommended for that reason.
>>
>> So you chose an enterprise distro presumably to lock in supported
>> stability for several years, but you chose a filesystem, btrfs, that's
>> still under heavy development, with reasonably current kernels and
>> userspace recommended as tending to have the known bugs fixed.  There's a
>> bit of a conflict there, and the /general/ recommendation would thus be
>> to consider whether one or the other of those choices are inappropriate
>> for your use-case, because it's really quite likely that if you really
>> want the stability of an enterprise distro and kernel, that btrfs isn't
>> as stable a filesystem as you're likely to want to match with it.
>> Alternatively, if you want something newer to match the still under heavy
>> development btrfs, you very likely want a distro that's not focused on
>> years-old stability just for the sake of it.  One or the other is likely
>> to be a poor match for your needs, and choosing something else that's a
>> better match is likely to be a much better experience for you.
>>
>> But perhaps you do have reason to want to run the newer and not quite to
>> traditional enterprise-distro level stability btrfs, on an otherwise
>> older and very stable enterprise distro.  That's fine, provided you know
>> what you're getting yourself into, and are prepared to deal with it.
>>
>> In that case, for best support from the list, we'd recommend running one
>> of the latest two kernels in either the current or mainline LTS tracks.
>>
>> For current track, With 4.18 being the latest kernel, that'd be 4.18 or
>> 4.17, as available on kernel.org (tho 4.17 is already EOL, no further
>> releases, at 4.17.19).
>>
>> For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series
>> kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18
>> and it's just not ou

Re: btrfs panic problem

2018-09-18 Thread Qu Wenruo


On 2018-09-19 08:35, sunny.s.zhang wrote:
> 
> On 2018-09-19 08:05, Qu Wenruo wrote:
>>
>> On 2018-09-18 08:28, sunny.s.zhang wrote:
>>> Hi All,
>>>
>>> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
>>> btrfs_get_or_create_delayed_node.
>> Any reproducer?
>>
>> Anyway we need a reproducer as a testcase.
> 
> I have given it a try, but could not reproduce it yet.

Since it's just one hit in a production environment, I'm afraid we need to
inject some sleep or delay into this code and try bombing it with fsstress.

Beyond that, I have no good idea for reproducing it.

Thanks,
Qu
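The sleep-injection idea could look like this (a hypothetical debugging patch against the 4.1-era code quoted in this thread; the mdelay() call and its 100 ms value are arbitrary choices to widen the suspected window, not a proposed fix):

```diff
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ void btrfs_remove_delayed_node(struct inode *inode)
 	delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
 	if (!delayed_node)
 		return;
 
+	/* debug only: widen the window between the getter's ACCESS_ONCE
+	 * and its atomic_inc before we clear the pointer and release */
+	mdelay(100);
+
 	BTRFS_I(inode)->delayed_node = NULL;
 	btrfs_release_delayed_node(delayed_node);
```

One could then run fsstress from fstests against the patched kernel (e.g. `fsstress -d /mnt/scratch -p 16 -n 100000`) while repeatedly forcing inode eviction, for example with `echo 2 > /proc/sys/vm/drop_caches` in a loop.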

> 
> Any advice to reproduce it?
> 
>>
>> The code looks
>>
>>> I found that the freelist of the slub is wrong.
>>>
>>> crash> struct kmem_cache_cpu 887e7d7a24b0
>>>
>>> struct kmem_cache_cpu {
>>>    freelist = 0x2026,   <<< the value is id of one inode
>>>    tid = 29567861,
>>>    page = 0xea0132168d00,
>>>    partial = 0x0
>>> }
>>>
>>> And I found two different btrfs inodes pointing to the same
>>> delayed_node, which means the same slab object is in use twice.
>>>
>>> I think this object was freed twice, so its freelist next pointer
>>> points to itself and the allocator hands out the same object twice.
>>>
>>> When that object is used again, it corrupts the freelist.
>>>
>>> The following interleaving would make the delayed node be freed twice,
>>> but I haven't found the exact path that triggers it.
>>>
>>> Process A (btrfs_evict_inode)           Process B
>>>
>>> call btrfs_remove_delayed_node          call btrfs_get_delayed_node
>>>
>>>                                         node = ACCESS_ONCE(btrfs_inode->delayed_node);
>>>
>>> BTRFS_I(inode)->delayed_node = NULL;
>>> btrfs_release_delayed_node(delayed_node);
>>>
>>>                                         if (node) {
>>>                                                 atomic_inc(&node->refs);
>>>                                                 return node;
>>>                                         }
>>>
>>>                                         ..
>>>
>>>                                         btrfs_release_delayed_node(delayed_node);
>>>
>>>
>>> 1313 void btrfs_remove_delayed_node(struct inode *inode)
>>> 1314 {
>>> 1315 struct btrfs_delayed_node *delayed_node;
>>> 1316
>>> 1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
>>> 1318 if (!delayed_node)
>>> 1319 return;
>>> 1320
>>> 1321 BTRFS_I(inode)->delayed_node = NULL;
>>> 1322 btrfs_release_delayed_node(delayed_node);
>>> 1323 }
>>>
>>>
>>>    87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct
>>> inode *inode)
>>>    88 {
>>>    89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>>>    90 struct btrfs_root *root = btrfs_inode->root;
>>>    91 u64 ino = btrfs_ino(inode);
>>>    92 struct btrfs_delayed_node *node;
>>>    93
>>>    94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
>>>    95 if (node) {
>>>    96 atomic_inc(&node->refs);
>>>    97 return node;
>>>    98 }
>>>
>> The analysis looks valid.
>> Can be fixed by adding a spinlock.
>>
>> Just wondering why we didn't hit it.
> 
> It just appeared once in our production environment.
> 
> Thanks,
> Sunny
>>
>> Thanks,
>> Qu
>>
>>> Thanks,
>>>
>>> Sunny
>>>
>>>
>>> PS:
>>>
>>> 
>>>
>>> panic information
>>>
>>> PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
>>>   #0 [88130404f940] machine_kexec at 8105ec10
>>>   #1 [88130404f9b0] crash_kexec at 811145b8
>>>   #2 [88130404fa80] oops_end at 8101a868
>>>   #3 [88130404fab0] no_context at 8106ea91
>>>   #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
>>>   #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
>>>   #6 [88130404fb60] __do_page_fault at 8106f328
>>>   #7 [88130404fbd0] do_page_fault at 8106f637
>>>   #8 [88130404fc10] page_fault at 816f6308
>>>  [exception RIP: kmem_cache_alloc+121]
>>>  RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
>>>  RAX:   RBX:   RCX: 01c32b76
>>>  RDX: 01c32b75  RSI:   RDI: 000224b0
>>>  RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
>>>  R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
>>>  R13: 2026  R14: 887e3e230a00  R15: a01abf49
>>>  ORIG_RAX:   CS: 0010  SS: 0018
>>>   #9 [88130404fd10] btrfs_get_or_create_delayed_node at
>>> a01abf49 [btrfs]
>>> #10 [88130404fd60] btrfs_delayed_update_inode at a01aea12
>>> [btrfs]
>>> #11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
>>> #12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
>>> #13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
>>> #14 [88130404fe50] touch_atime at 812286d3
>>> #15 [88130404fe90] iterate_dir at 81221929
>>> #16 [88130404fee0] sys_getdents64 at 81221a19
>>> #17 [88130404ff50] system_call_fastpath at 816f2594
>>>  RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
>>> 

Re: btrfs panic problem

2018-09-18 Thread sunny.s.zhang



On 2018-09-19 08:05, Qu Wenruo wrote:


On 2018-09-18 08:28, sunny.s.zhang wrote:

Hi All,

My OS(4.1.12) panic in kmem_cache_alloc, which is called by
btrfs_get_or_create_delayed_node.

Any reproducer?

Anyway we need a reproducer as a testcase.


I have given it a try, but could not reproduce it yet.

Any advice on how to reproduce it?



The code looks


I found that the freelist of the slub is wrong.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
   freelist = 0x2026,   <<< this value is the inode number of an inode
   tid = 29567861,
   page = 0xea0132168d00,
   partial = 0x0
}

And I found two different btrfs inodes pointing to the same delayed_node.
That means the same slab object is in use twice.

I think this object was freed twice, so its freelist next pointer ends up
pointing to the object itself and we get the same object from two
consecutive allocations.

When that object is allocated again, the freelist is corrupted.
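The freelist corruption can be modeled in userspace (a sketch of a generic free list, not SLUB itself): with no double-free detection, the second free of an object sets its next pointer to itself, so two consecutive allocations return the same object.

```c
#include <stddef.h>

/* Minimal free-list model (illustration only, not SLUB itself):
 * freeing an object pushes it onto a singly linked list through its
 * first word, with no double-free detection -- like SLUB's fast path. */
struct obj { struct obj *next; };

static struct obj *freelist;

static void cache_free(struct obj *o) {
    o->next = freelist;
    freelist = o;
}

static struct obj *cache_alloc(void) {
    struct obj *o = freelist;
    if (o)
        freelist = o->next;
    return o;
}

/* Returns 1 when a double free makes two consecutive allocations hand
 * out the same object: the second free sets obj->next = obj. */
int double_free_aliases(void) {
    static struct obj a;
    freelist = NULL;
    cache_free(&a);
    struct obj *x = cache_alloc();   /* x == &a, freelist now empty */
    cache_free(x);                   /* legitimate free */
    cache_free(x);                   /* double free: a.next = &a */
    struct obj *p = cache_alloc();   /* &a, freelist still points to &a */
    struct obj *q = cache_alloc();   /* &a again */
    return p == q;
}
```

This matches the crash dump: the corrupted freelist pointer (0x2026) is whatever the reused object's first word was overwritten with, here an inode number.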

The following code path can free the delayed node twice, but I have not
yet found which processes actually race:

Process A (btrfs_evict_inode)                Process B (btrfs_get_delayed_node)

call btrfs_remove_delayed_node
                                             node = ACCESS_ONCE(btrfs_inode->delayed_node);
BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);
                                             if (node) {
                                                     atomic_inc(&node->refs);
                                                     return node;
                                             }
                                             ...
                                             btrfs_release_delayed_node(delayed_node);


1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315 struct btrfs_delayed_node *delayed_node;
1316
1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318 if (!delayed_node)
1319 return;
1320
1321 BTRFS_I(inode)->delayed_node = NULL;
1322 btrfs_release_delayed_node(delayed_node);
1323 }


   87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct
inode *inode)
   88 {
   89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
   90 struct btrfs_root *root = btrfs_inode->root;
   91 u64 ino = btrfs_ino(inode);
   92 struct btrfs_delayed_node *node;
   93
   94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
   95 if (node) {
   96 atomic_inc(&node->refs);
   97 return node;
   98 }


The analysis looks valid.
It can be fixed by adding a spinlock.

Just wondering why we didn't hit it.
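The suggested spinlock fix can be modeled in userspace (a sketch with pthreads and C11 atomics; names mirror the btrfs code but this is not the actual kernel patch). One lock guards both the pointer load plus refcount increment and the unpublish, closing the window in which the last reference can be dropped:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace model of the suggested fix: a single lock protects both the
 * getter's load + refcount increment and the remover's unpublish, so a
 * concurrent remove can no longer drop the last reference between the
 * getter's load and its increment. */
struct delayed_node { atomic_int refs; };

struct inode_model {
    pthread_mutex_t lock;
    struct delayed_node *delayed_node;
};

struct delayed_node *get_delayed_node(struct inode_model *i) {
    pthread_mutex_lock(&i->lock);
    struct delayed_node *n = i->delayed_node;
    if (n)
        atomic_fetch_add(&n->refs, 1);       /* inc while still published */
    pthread_mutex_unlock(&i->lock);
    return n;
}

void release_delayed_node(struct delayed_node *n) {
    if (n && atomic_fetch_sub(&n->refs, 1) == 1)
        free(n);                             /* last reference dropped */
}

void remove_delayed_node(struct inode_model *i) {
    pthread_mutex_lock(&i->lock);
    struct delayed_node *n = i->delayed_node;
    i->delayed_node = NULL;                  /* unpublish under the lock */
    pthread_mutex_unlock(&i->lock);
    release_delayed_node(n);
}

/* After remove, a getter observes NULL instead of a node being freed. */
int get_after_remove_is_null(void) {
    struct inode_model im = { PTHREAD_MUTEX_INITIALIZER, NULL };
    im.delayed_node = malloc(sizeof(*im.delayed_node));
    atomic_init(&im.delayed_node->refs, 1);
    remove_delayed_node(&im);
    return get_delayed_node(&im) == NULL;
}
```

The trade-off is taking a lock on every lookup of a field that is read far more often than it is cleared, which may be why a lockless scheme was preferred in practice.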


It just appeared once in our production environment.

Thanks,
Sunny


Thanks,
Qu


Thanks,

Sunny


PS:



panic informations

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
  #0 [88130404f940] machine_kexec at 8105ec10
  #1 [88130404f9b0] crash_kexec at 811145b8
  #2 [88130404fa80] oops_end at 8101a868
  #3 [88130404fab0] no_context at 8106ea91
  #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
  #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
  #6 [88130404fb60] __do_page_fault at 8106f328
  #7 [88130404fbd0] do_page_fault at 8106f637
  #8 [88130404fc10] page_fault at 816f6308
     [exception RIP: kmem_cache_alloc+121]
     RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
     RAX:   RBX:   RCX: 01c32b76
     RDX: 01c32b75  RSI:   RDI: 000224b0
     RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
     R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
     R13: 2026  R14: 887e3e230a00  R15: a01abf49
     ORIG_RAX:   CS: 0010  SS: 0018
  #9 [88130404fd10] btrfs_get_or_create_delayed_node at
a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12
[btrfs]
#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#15 [88130404fe90] iterate_dir at 81221929
#16 [88130404fee0] sys_getdents64 at 81221a19
#17 [88130404ff50] system_call_fastpath at 816f2594
     RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
     RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
     RDX: 1000  RSI: 00c83da14000  RDI: 0011
     RBP:    R8:    R9: 
     R10:   R11: 0246  R12: 00c7
     R13: 02174e74  R14: 0555  R15: 0038
     ORIG_RAX: 00d9  CS: 0033  SS: 002b


We also saw list double-add warnings, for both n_list and p_list:

[8642921.110568] [ cut here ]
[8642921.167929] WARNING: CPU: 38 PID: 73638 at lib/list_debug.c:33
__list_add+0xbe/0xd0()
[8642921.263780] list_add corruption. prev->next should be next
(887e40fa5368), but was ffff884c85a362

Re: btrfs panic problem

2018-09-18 Thread Qu Wenruo


On 2018/9/18 上午8:28, sunny.s.zhang wrote:
> Hi All,
> 
> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
> btrfs_get_or_create_delayed_node.

Any reproducer?

Anyway we need a reproducer as a testcase.

The code looks


The analysis looks valid.
It can be fixed by adding a spinlock.

Just wondering why we didn't hit it.

Thanks,
Qu


Re: btrfs panic problem

2018-09-18 Thread sunny.s.zhang

Hi Duncan,

Thank you for your advice; I understand what you mean. But I have
reviewed the latest btrfs code, and I think the issue still exists.

If btrfs_get_delayed_node is preempted just after line 71, another
process can run through btrfs_remove_delayed_node (line 1282 onward) and
release the delayed node at the end.

When we switch back to btrfs_get_delayed_node, it finds the node
non-NULL and uses it as normal. That means we are using freed memory.

At some point this memory will be freed again.

The latest code is as below.

1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
1279 {
1280 struct btrfs_delayed_node *delayed_node;
1281
1282 delayed_node = READ_ONCE(inode->delayed_node);
1283 if (!delayed_node)
1284 return;
1285
1286 inode->delayed_node = NULL;
1287 btrfs_release_delayed_node(delayed_node);
1288 }


  64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
  65 struct btrfs_inode *btrfs_inode)
  66 {
  67 struct btrfs_root *root = btrfs_inode->root;
  68 u64 ino = btrfs_ino(btrfs_inode);
  69 struct btrfs_delayed_node *node;
  70
  71 node = READ_ONCE(btrfs_inode->delayed_node);
  72 if (node) {
  73 refcount_inc(&node->refs);
  74 return node;
  75 }
  76
  77 spin_lock(&root->inode_lock);
  78 node = radix_tree_lookup(&root->delayed_nodes_tree, ino);


On 2018-09-18 13:05, Duncan wrote:

sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:


My OS(4.1.12) panic in kmem_cache_alloc, which is called by
btrfs_get_or_create_delayed_node.

I found that the freelist of the slub is wrong.


Re: btrfs panic problem

2018-09-18 Thread sunny.s.zhang

Add Junxiao


On 2018-09-18 13:05, Duncan wrote:

sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:


My OS(4.1.12) panic in kmem_cache_alloc, which is called by
btrfs_get_or_create_delayed_node.

I found that the freelist of the slub is wrong.


Re: btrfs panic problem

2018-09-17 Thread Duncan
sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:

> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
> btrfs_get_or_create_delayed_node.
> 
> I found that the freelist of the slub is wrong.

[Not a dev, just a btrfs list regular and user, myself.  But here's a 
general btrfs list recommendations reply...]

You appear to mean kernel 4.1.12 -- confirmed by the version reported in 
the posted dump:  4.1.12-112.14.13.el6uek.x86_64

OK, so from the perspective of this forward-development-focused list, 
kernel 4.1 is pretty ancient history, but you do have a number of options.

First let's consider the general situation.  Most people choose an 
enterprise distro for supported stability, and that's certainly a valid 
thing to want.  However, btrfs, while now reaching early maturity for the 
basics (single device in single or dup mode, and multi-device in single/
raid0/1/10 modes, note that raid56 mode is newer and less mature), 
remains under quite heavy development, and keeping reasonably current is 
recommended for that reason.

So you chose an enterprise distro presumably to lock in supported 
stability for several years, but you chose a filesystem, btrfs, that's 
still under heavy development, with reasonably current kernels and 
userspace recommended as tending to have the known bugs fixed.  There's a 
bit of a conflict there, and the /general/ recommendation would thus be 
to consider whether one or the other of those choices are inappropriate 
for your use-case, because it's really quite likely that if you really 
want the stability of an enterprise distro and kernel, that btrfs isn't 
as stable a filesystem as you're likely to want to match with it.  
Alternatively, if you want something newer to match the still under heavy 
development btrfs, you very likely want a distro that's not focused on 
years-old stability just for the sake of it.  One or the other is likely 
to be a poor match for your needs, and choosing something else that's a 
better match is likely to be a much better experience for you.

But perhaps you do have reason to want to run the newer and not quite to 
traditional enterprise-distro level stability btrfs, on an otherwise 
older and very stable enterprise distro.  That's fine, provided you know 
what you're getting yourself into, and are prepared to deal with it.

In that case, for best support from the list, we'd recommend running one 
of the latest two kernels in either the current or mainline LTS tracks. 

For the current track, with 4.18 being the latest kernel, that'd be 4.18 or 
4.17, as available on kernel.org (tho 4.17 is already EOL, no further 
releases, at 4.17.19).

For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series 
kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18 
and it's just not out of normal stable range yet so not yet marked LTS?), 
so it'll be coming up soon and 4.9 will then be dropping to third LTS 
series and thus out of our best recommended range.  4.4 was the previous 
LTS and while still in LTS support, is outside the two newest LTS series 
that this list recommends.

And of course 4.1 is older than 4.4, so as I said, in btrfs development 
terms, it's quite ancient indeed... quite out of practical support range 
here, tho of course we'll still try, but in many cases the first question 
when any problem's reported is going to be whether it's reproducible on 
something closer to current.

But... you ARE on an enterprise kernel, likely on an enterprise distro, 
and very possibly actually paying /them/ for support.  So you're not 
without options if you prefer to stay with your supported enterprise 
kernel.  If you're paying them for support, you might as well use it, and 
of course of the very many fixes since 4.1, they know what they've 
backported and what they haven't, so they're far better placed to provide 
that support in any case.

Or, given what you posted, you appear to be reasonably able to do at 
least limited kernel-dev-level analysis yourself.  Given that, you're 
already reasonably well placed to simply decide to stick with what you 
have and take the support you can get, diving into things yourself if 
necessary.


So those are your kernel options.  What about userspace btrfs-progs?

Generally speaking, while the filesystem's running, it's the kernel code 
doing most of the work.  If you have old userspace, it simply means you 
can't take advantage of some of the newer features as the old userspace 
doesn't know how to call for them.

But the situation changes as soon as you have problems and can't mount, 
because it's userspace code that runs to try to fix that sort of problem, 
or failing that, it's userspace code that btrfs restore runs to try to 
grab what files can be grabbed off of the unmountable filesystem.

So for routine operation, it's no big deal if userspace is a bit old, at 
least as long as it's new enough to have all the newer command formats, 
etc, that you n

Re: btrfs panic problem

2018-09-17 Thread sunny.s.zhang

Sorry, let me correct some errors:

Process A (btrfs_evict_inode)                Process B (btrfs_get_delayed_node)

call btrfs_remove_delayed_node
                                             node = ACCESS_ONCE(btrfs_inode->delayed_node);
BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);
                                             if (node) {
                                                     atomic_inc(&node->refs);
                                                     return node;
                                             }
                                             ...
                                             btrfs_release_delayed_node(delayed_node);
On 2018-09-18 08:28, sunny.s.zhang wrote:


btrfs panic problem

2018-09-17 Thread sunny.s.zhang

Hi All,

My OS (kernel 4.1.12) panics in kmem_cache_alloc, which is called by 
btrfs_get_or_create_delayed_node.


I found that the freelist of the slub is wrong.

crash> struct kmem_cache_cpu 887e7d7a24b0

struct kmem_cache_cpu {
  freelist = 0x2026,   <<< this value is the inode number of an inode
  tid = 29567861,
  page = 0xea0132168d00,
  partial = 0x0
}

And I found two different btrfs inodes pointing to the same delayed_node.
That means the same slab object is in use twice.


I think this object was freed twice, so its freelist next pointer ends up
pointing to the object itself and we get the same object from two
consecutive allocations.


When that object is allocated again, the freelist is corrupted.

The following code path can free the delayed node twice, but I have not
yet found which processes actually race:


Process A (btrfs_evict_inode)                Process B (btrfs_get_delayed_node)

call btrfs_remove_delayed_node
                                             node = ACCESS_ONCE(btrfs_inode->delayed_node);
BTRFS_I(inode)->delayed_node = NULL;
btrfs_release_delayed_node(delayed_node);
                                             if (node) {
                                                     atomic_inc(&node->refs);
                                                     return node;
                                             }
                                             ...
                                             btrfs_release_delayed_node(delayed_node);


1313 void btrfs_remove_delayed_node(struct inode *inode)
1314 {
1315 struct btrfs_delayed_node *delayed_node;
1316
1317 delayed_node = ACCESS_ONCE(BTRFS_I(inode)->delayed_node);
1318 if (!delayed_node)
1319 return;
1320
1321 BTRFS_I(inode)->delayed_node = NULL;
1322 btrfs_release_delayed_node(delayed_node);
1323 }


  87 static struct btrfs_delayed_node *btrfs_get_delayed_node(struct 
inode *inode)

  88 {
  89 struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
  90 struct btrfs_root *root = btrfs_inode->root;
  91 u64 ino = btrfs_ino(inode);
  92 struct btrfs_delayed_node *node;
  93
  94 node = ACCESS_ONCE(btrfs_inode->delayed_node);
  95 if (node) {
  96 atomic_inc(&node->refs);
  97 return node;
  98 }


Thanks,

Sunny


PS:



panic information

PID: 73638  TASK: 887deb586200  CPU: 38  COMMAND: "dockerd"
 #0 [88130404f940] machine_kexec at 8105ec10
 #1 [88130404f9b0] crash_kexec at 811145b8
 #2 [88130404fa80] oops_end at 8101a868
 #3 [88130404fab0] no_context at 8106ea91
 #4 [88130404fb00] __bad_area_nosemaphore at 8106ec8d
 #5 [88130404fb50] bad_area_nosemaphore at 8106eda3
 #6 [88130404fb60] __do_page_fault at 8106f328
 #7 [88130404fbd0] do_page_fault at 8106f637
 #8 [88130404fc10] page_fault at 816f6308
    [exception RIP: kmem_cache_alloc+121]
    RIP: 811ef019  RSP: 88130404fcc8  RFLAGS: 00010286
    RAX:   RBX:   RCX: 01c32b76
    RDX: 01c32b75  RSI:   RDI: 000224b0
    RBP: 88130404fd08   R8: 887e7d7a24b0   R9: 
    R10: 8802668b6618  R11: 0002  R12: 887e3e230a00
    R13: 2026  R14: 887e3e230a00  R15: a01abf49
    ORIG_RAX:   CS: 0010  SS: 0018
 #9 [88130404fd10] btrfs_get_or_create_delayed_node at 
a01abf49 [btrfs]
#10 [88130404fd60] btrfs_delayed_update_inode at a01aea12 
[btrfs]

#11 [88130404fdb0] btrfs_update_inode at a015b199 [btrfs]
#12 [88130404fdf0] btrfs_dirty_inode at a015cd11 [btrfs]
#13 [88130404fe20] btrfs_update_time at a015fa25 [btrfs]
#14 [88130404fe50] touch_atime at 812286d3
#15 [88130404fe90] iterate_dir at 81221929
#16 [88130404fee0] sys_getdents64 at 81221a19
#17 [88130404ff50] system_call_fastpath at 816f2594
    RIP: 006b68e4  RSP: 00c866259080  RFLAGS: 0246
    RAX: ffda  RBX: 00c828dbbe00  RCX: 006b68e4
    RDX: 1000  RSI: 00c83da14000  RDI: 0011
    RBP:    R8:    R9: 
    R10:   R11: 0246  R12: 00c7
    R13: 02174e74  R14: 0555  R15: 0038
    ORIG_RAX: 00d9  CS: 0033  SS: 002b


We also saw list double-add warnings, for both n_list and p_list:

[8642921.110568] [ cut here ]
[8642921.167929] WARNING: CPU: 38 PID: 73638 at lib/list_debug.c:33 
__list_add+0xbe/0xd0()
[8642921.263780] list_add corruption. prev->next should be next 
(887e40fa5368), but was ffff884c85a36288. (prev=884c85a36288).
[8642921.405490] Modules linked in: ipt_MASQUERADE 
nf_nat_masquerade_ipv4 xt_conntrack iptable_filter arc4 ecb ppp_mppe 
ppp_async crc_ccitt ppp_generic slhc nfsv3 nfs_acl rpcsec_gss_krb5 
auth_rpcgss nfsv4 nfs fscache lockd sunrpc grace veth xt_nat xt_addrtype 
br_netfilter bridge tcp_diag inet_diag oracleacfs(POE) oracleadvm(POE) 
oracleoks(POE) oracleasm autofs4 dm_queue_length cpufreq_powersave 
be2iscsi iscsi_bo

Re: btrfs panic in 3.5.0

2012-08-08 Thread Jan Schmidt
On Thu, August 09, 2012 at 08:42 (+0200), Arne Jansen wrote:
> On 09.08.2012 04:52, Marc MERLIN wrote:
>> On Tue, Aug 07, 2012 at 11:47:36AM -0700, Marc MERLIN wrote:
>>> On Tue, Aug 07, 2012 at 08:14:23PM +0200, Arne Jansen wrote:
 On 08/07/2012 07:40 PM, Marc MERLIN wrote:
> Unfortunately I only have a screenshot.
>
> Apparently the panic was in 
> btrfs_set_lock_blocking_rw
> with a RIP in btrfs_cow_block

 Can you please resolve btrfs_cow_block+0x3b to a line number?

 gdb btrfs.ko
 (gdb) info line *btrfs_cow_block+0x3b
>>>
>>> So, I'm not very good at this, sorry if I'm doing it wrong:
>>> gandalfthegreat:~# gdb 
>>> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko
>>> Reading symbols from 
>>> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko...(no
>>>  debugging symbols found)...done.
>>> (gdb) info line *btrfs_cow_block+0x3b
>>> No line number information available for address 0x9a6e
>>>
>>> Mmmh, it seems that I'm missing a kernel option that adds symbols in 
>>> modules?
>>>
>>> I can add it for my next kernel compile. Do you have the config option name
>>> off hand?
>>>
>>> I put my module here if that helps:
>>> http://marc.merlins.org/tmp/btrfs.ko
>>
>> I felt bad for having a kernel without debug symbols it seems, so I looked
>> at my kernel config and I do have:
>> CONFIG_DEBUG_BUGVERBOSE=y
>> CONFIG_DEBUG_INFO=y
>> # CONFIG_DEBUG_INFO_REDUCED is not set
>>
>> Any idea what else I'm missing to provide better debug info if I have a
>> problem again?
>>
>> And is it reasonably easy to take the .ko apparently without line numbers,
>> like the one I gave you, and infer the line of code for a function offset?
> 
> The .ko is fine. It crashes here:
> 
> noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
> struct btrfs_root *root, struct extent_buffer *buf,
> struct extent_buffer *parent, int parent_slot,
> struct extent_buffer **cow_ret)
> {
> u64 search_start;
> int ret;
> 
> if (trans->transaction != root->fs_info->running_transaction) {
> printk(KERN_CRIT "trans %llu running %llu\n",
>(unsigned long long)trans->transid,
>(unsigned long long)
>root->fs_info->running_transaction->transid);
>   ^^
> 
> WARN_ON(1);
> }
> 
> fs_info->running_transaction is probably NULL.

Agreed. Which means that we probably came through btrfs_cleanup_transaction,
which explicitly sets it to NULL.

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs panic in 3.5.0

2012-08-08 Thread Arne Jansen
On 09.08.2012 04:52, Marc MERLIN wrote:
> On Tue, Aug 07, 2012 at 11:47:36AM -0700, Marc MERLIN wrote:
>> On Tue, Aug 07, 2012 at 08:14:23PM +0200, Arne Jansen wrote:
>>> On 08/07/2012 07:40 PM, Marc MERLIN wrote:
 Unfortunately I only have a screenshot.

 Apparently the panic was in 
 btrfs_set_lock_blocking_rw
 with a RIP in btrfs_cow_block
>>>
>>> Can you please resolve btrfs_cow_block+0x3b to a line number?
>>>
>>> gdb btrfs.ko
>>> (gdb) info line *btrfs_cow_block+0x3b
>>
>> So, I'm not very good at this, sorry if I'm doing it wrong:
>> gandalfthegreat:~# gdb 
>> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko
>> Reading symbols from 
>> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko...(no
>>  debugging symbols found)...done.
>> (gdb) info line *btrfs_cow_block+0x3b
>> No line number information available for address 0x9a6e
>>
>> Mmmh, it seems that I'm missing a kernel option that adds symbols in modules?
>>
>> I can add it for my next kernel compile. Do you have the config option name
>> off hand?
>>
>> I put my module here if that helps:
>> http://marc.merlins.org/tmp/btrfs.ko
> 
> I felt bad for having a kernel without debug symbols it seems, so I looked
> at my kernel config and I do have:
> CONFIG_DEBUG_BUGVERBOSE=y
> CONFIG_DEBUG_INFO=y
> # CONFIG_DEBUG_INFO_REDUCED is not set
> 
> Any idea what else I'm missing to provide better debug info if I have a
> problem again?
> 
> And is it reasonably easy to take the .ko apparently without line numbers,
> like the one I gave you, and infer the line of code for a function offset?

The .ko is fine. It crashes here:

noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct extent_buffer *buf,
struct extent_buffer *parent, int parent_slot,
struct extent_buffer **cow_ret)
{
u64 search_start;
int ret;

if (trans->transaction != root->fs_info->running_transaction) {
printk(KERN_CRIT "trans %llu running %llu\n",
   (unsigned long long)trans->transid,
   (unsigned long long)
   root->fs_info->running_transaction->transid);
  ^^

WARN_ON(1);
}

fs_info->running_transaction is probably NULL.


> 
> Thanks,
> Marc



Re: btrfs panic in 3.5.0

2012-08-08 Thread Marc MERLIN
On Tue, Aug 07, 2012 at 11:47:36AM -0700, Marc MERLIN wrote:
> On Tue, Aug 07, 2012 at 08:14:23PM +0200, Arne Jansen wrote:
> > On 08/07/2012 07:40 PM, Marc MERLIN wrote:
> > > Unfortunately I only have a screenshot.
> > > 
> > > Apparently the panic was in 
> > > btrfs_set_lock_blocking_rw
> > > with a RIP in btrfs_cow_block
> > 
> > Can you please resolve btrfs_cow_block+0x3b to a line number?
> > 
> > gdb btrfs.ko
> > (gdb) info line *btrfs_cow_block+0x3b
> 
> So, I'm not very good at this, sorry if I'm doing it wrong:
> gandalfthegreat:~# gdb 
> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko
> Reading symbols from 
> /lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko...(no
>  debugging symbols found)...done.
> (gdb) info line *btrfs_cow_block+0x3b
> No line number information available for address 0x9a6e
> 
> Mmmh, it seems that I'm missing a kernel option that adds symbols in modules?
> 
> I can add it for my next kernel compile. Do you have the config option name
> off hand?
> 
> I put my module here if that helps:
> http://marc.merlins.org/tmp/btrfs.ko

I felt bad for having a kernel without debug symbols it seems, so I looked
at my kernel config and I do have:
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set

Any idea what else I'm missing to provide better debug info if I have a
problem again?

And is it reasonably easy to take the .ko apparently without line numbers,
like the one I gave you, and infer the line of code for a function offset?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: btrfs panic in 3.5.0

2012-08-07 Thread Marc MERLIN
On Tue, Aug 07, 2012 at 10:38:28PM -0400, Jérôme Poulin wrote:
> On Tue, Aug 7, 2012 at 1:40 PM, Marc MERLIN  wrote:
> >
> > System rebooted ok.
> 
> I just want to be sure that you are aware that your hard drive is
> currently killing itself. Those READ FPDMA QUEUED errors mean that your hard
> disk is relocating bad sectors and has problems reading those.

Yeah, I saw that, so it's actually an SSD (the wretched samsung one I've
been posting about), and I'm just about to return it.

What's interesting is that smart shows no such error:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       132
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       19
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       29
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   051   040   000    Old_age   Always       -       49
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   253   253   000    Old_age   Always       -       2
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       6
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       627681656

I'm not saying nothing is wrong with the drive, but that it's not a magnetic
bad sector.

Either way, I'm going to get a different SSD soon, although I guess this
failure mode was useful in finding a bug in the btrfs code in the meantime :)

Thanks for the heads up
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: btrfs panic in 3.5.0

2012-08-07 Thread Jérôme Poulin
On Tue, Aug 7, 2012 at 1:40 PM, Marc MERLIN  wrote:
>
> System rebooted ok.

I just want to be sure that you are aware that your hard drive is
currently killing itself. Those READ FPDMA QUEUED errors mean that your hard
disk is relocating bad sectors and has problems reading those.


Re: btrfs panic in 3.5.0

2012-08-07 Thread Marc MERLIN
On Tue, Aug 07, 2012 at 08:14:23PM +0200, Arne Jansen wrote:
> On 08/07/2012 07:40 PM, Marc MERLIN wrote:
> > Unfortunately I only have a screenshot.
> > 
> > Apparently the panic was in 
> > btrfs_set_lock_blocking_rw
> > with a RIP in btrfs_cow_block
> 
> Can you please resolve btrfs_cow_block+0x3b to a line number?
> 
> gdb btrfs.ko
> (gdb) info line *btrfs_cow_block+0x3b

So, I'm not very good at this, sorry if I'm doing it wrong:
gandalfthegreat:~# gdb 
/lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko
Reading symbols from 
/lib/modules/3.5.0-amd64-preempt-noide-20120410/kernel/fs/btrfs/btrfs.ko...(no 
debugging symbols found)...done.
(gdb) info line *btrfs_cow_block+0x3b
No line number information available for address 0x9a6e

Mmmh, it seems that I'm missing a kernel option that adds symbols in modules?

I can add it for my next kernel compile. Do you have the config option name
off hand?

I put my module here if that helps:
http://marc.merlins.org/tmp/btrfs.ko

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: btrfs panic in 3.5.0

2012-08-07 Thread Arne Jansen
On 08/07/2012 07:40 PM, Marc MERLIN wrote:
> Unfortunately I only have a screenshot.
> 
> Apparently the panic was in 
> btrfs_set_lock_blocking_rw
> with a RIP in btrfs_cow_block
> 

Can you please resolve btrfs_cow_block+0x3b to a line number?

gdb btrfs.ko
(gdb) info line *btrfs_cow_block+0x3b

Thanks,
Arne

> Screenshot here:
> http://marc.merlins.org/tmp/btrfs_oops.jpg
> 
> Because the display looks a bit messed up, I can't tell if the ata error
> happened before or after the oops.
> 
> System rebooted ok.
> 
> Was there a better way to get this oops if I didn't have serial console?
> 
> Marc
> 



btrfs panic in 3.5.0

2012-08-07 Thread Marc MERLIN
Unfortunately I only have a screenshot.

Apparently the panic was in 
btrfs_set_lock_blocking_rw
with a RIP in btrfs_cow_block

Screenshot here:
http://marc.merlins.org/tmp/btrfs_oops.jpg

Because the display looks a bit messed up, I can't tell if the ata error
happened before or after the oops.

System rebooted ok.

Was there a better way to get this oops if I didn't have serial console?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


[patch 03/35] btrfs: Panic on bad rbtree operations

2012-03-21 Thread Jeff Mahoney
The ordered data and relocation trees have BUG_ONs to protect against
bad tree operations.

This patch replaces them with a panic that will report the problem.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ordered-data.c |   12 ++--
 fs/btrfs/relocation.c   |   36 +---
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index a1c9404..2857f28 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struct rb_root *root, u64 
file_offset,
return NULL;
 }
 
+static void ordered_data_tree_panic(struct inode *inode, int errno,
+  u64 offset)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+   btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
+   "%llu\n", (unsigned long long)offset);
+}
+
 /*
  * look for a given offset in the tree, and if it can't be found return the
  * first lesser offset
@@ -207,7 +215,8 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
spin_lock(&tree->lock);
node = tree_insert(&tree->tree, file_offset,
   &entry->rb_node);
-   BUG_ON(node);
+   if (node)
+   ordered_data_tree_panic(inode, -EEXIST, file_offset);
spin_unlock(&tree->lock);
 
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
@@ -215,7 +224,6 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
  &BTRFS_I(inode)->root->fs_info->ordered_extents);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
 
-   BUG_ON(node);
return 0;
 }
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 8c1aae2..e5996ff 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -326,6 +326,19 @@ static struct rb_node *tree_search(struct rb_root *root, 
u64 bytenr)
return NULL;
 }
 
+void backref_tree_panic(struct rb_node *rb_node, int errno,
+ u64 bytenr)
+{
+
+   struct btrfs_fs_info *fs_info = NULL;
+   struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
+ rb_node);
+   if (bnode->root)
+   fs_info = bnode->root->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
+   "found at offset %llu\n", (unsigned long long)bytenr);
+}
+
 /*
  * walk up backref nodes until reach node presents tree root
  */
@@ -452,7 +465,8 @@ static void update_backref_node(struct backref_cache *cache,
rb_erase(&node->rb_node, &cache->rb_root);
node->bytenr = bytenr;
rb_node = tree_insert(&cache->rb_root, node->bytenr, &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, bytenr);
 }
 
 /*
@@ -999,7 +1013,8 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, node->bytenr,
  &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
list_add_tail(&node->lower, &cache->leaves);
}
 
@@ -1034,7 +1049,9 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, upper->bytenr,
  &upper->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST,
+  upper->bytenr);
}
 
list_add_tail(&edge->list[UPPER], &upper->lower);
@@ -1180,7 +1197,8 @@ static int clone_backref_node(struct btrfs_trans_handle 
*trans,
 
rb_node = tree_insert(&cache->rb_root, new_node->bytenr,
  &new_node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, new_node->bytenr);
 
if (!new_node->lowest) {
list_for_each_entry(new_edge, &new_node->lower, list[UPPER]) {
@@ -1252,7 +1270,8 @@ static int __update_reloc_root(struct btrfs_root *root, 
int del)
rb_node = tree_insert(&rc->reloc_root_tree.rb_root,
  node->bytenr, &node->rb_node);
spin_unlock(&rc->reloc_root_tree.lock);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
} else {
list_del_init(&root->root_list);
kfree(node);
@@ -3154,7 +3173,8 @@ static int add_tree_block(struct reloc_control *rc,
block->key_ready = 0;
 
rb_node = tree_insert(bl

Re: [patch 03/99] btrfs: Panic on bad rbtree operations

2011-11-24 Thread Jeff Mahoney
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/24/2011 06:41 PM, David Sterba wrote:
> On Wed, Nov 23, 2011 at 07:35:36PM -0500, Jeff Mahoney wrote:
>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>> index a1c9404..5a53d94 100644
>> --- a/fs/btrfs/ordered-data.c
>> +++ b/fs/btrfs/ordered-data.c
>> @@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struct rb_root *root, u64 file_offset,
>>  return NULL;
>>  }
>>  
>> +NORET_TYPE static void ordered_data_tree_panic(struct inode *inode, int errno,
>> +u64 offset)
>> +{
>> + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb)->fs_info;
>> + btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
>> + "%llu\n", offset);
> 
> this will need a cast to (unsigned long long)
> 
>> +}
>> +
>>  /*
>>   * look for a given offset in the tree, and if it can't be found return the
>>   * first lesser offset
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -326,6 +326,19 @@ static struct rb_node *tree_search(struct rb_root *root, u64 bytenr)
>>  return NULL;
>>  }
>>  
>> +NORET_TYPE static void backref_tree_panic(struct rb_node *rb_node, int errno,
>> +   u64 bytenr)
>> +{
>> +
>> + struct btrfs_fs_info *fs_info = NULL;
>> + struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
>> +   rb_node);
>> + if (bnode->root)
>> + fs_info = bnode->root->fs_info;
>> + btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
>> + "found at offset %llu\n", bytenr);
> 
> same here
> 
>> +}
>> +


Thanks. Fixed.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJOzvm5AAoJEB57S2MheeWyQfcQAJXzeVYbLwLoJ0EqUSZBRzSZ
L5s4qWNTgdf6XVl4ZE8WgDJNDq1gMbdsjCug40QtQR8/f9btqbcz7oPnNeQDgfD8
PxZCiarOsm4fAiWDkchm/JDah9YTQCRzvV7Pg/362FnJJl7+a2muecvEMuXXgPdD
otE5BgYz7mJ5imZxpg3JnGwGXUhSiQD4tsprorY8A5I64QUSfGDBkdHNqRe2sVWn
PdN95UZb1z1wz3KZokslczJFsiOQkiOGurvnO+2J8L/+HH6pItKymT7j2F9q3EzQ
vtFP7tFFINfgdJUJyhpDRanhETfuAfwAuSqKVDFmujsPM38zdglSk3nXhh6yIucz
k067pYzHBA2gSJ2ZjRUgMlSMfcbiiYLuXhgFSMZosemoKBpn9RNW8hfxvX3kvBuh
w+oPmaOobRnwQV+ImPQlug2k7a1XpZUbrnJHoflbzEs2APrsmL863B4xHhb8vp+C
7SnlbGmW1Fk2vmsDfTWZHz7/Eb8atTZSdz3m/8lO6S420oBJ3xh7NIWq3sQLUnvg
+kDUfn3FjSRUwq4J/ETAf8fxarCuhDLpUo9MU11oaJ2qz50QUQI5W21bgrvBhbq7
fkQTZoAGirBIK3KIeV1cqeVDFkc5WilOc/moECX1Vicf1TUaUYe/A+Jko+6ObE5K
RH+fOkuY34cb34ccxV9r
=jxMY
-END PGP SIGNATURE-


Re: [patch 03/99] btrfs: Panic on bad rbtree operations

2011-11-24 Thread David Sterba
On Wed, Nov 23, 2011 at 07:35:36PM -0500, Jeff Mahoney wrote:
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index a1c9404..5a53d94 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struct rb_root *root, 
> u64 file_offset,
>   return NULL;
>  }
>  
> +NORET_TYPE static void ordered_data_tree_panic(struct inode *inode, int 
> errno,
> +u64 offset)
> +{
> + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb)->fs_info;
> + btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
> + "%llu\n", offset);
 
this will need a cast to (unsigned long long)

> +}
> +
>  /*
>   * look for a given offset in the tree, and if it can't be found return the
>   * first lesser offset
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -326,6 +326,19 @@ static struct rb_node *tree_search(struct rb_root *root, 
> u64 bytenr)
>   return NULL;
>  }
>  
> +NORET_TYPE static void backref_tree_panic(struct rb_node *rb_node, int errno,
> +   u64 bytenr)
> +{
> +
> + struct btrfs_fs_info *fs_info = NULL;
> + struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
> +   rb_node);
> + if (bnode->root)
> + fs_info = bnode->root->fs_info;
> + btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
> + "found at offset %llu\n", bytenr);

same here

> +}
> +


[patch 03/99] btrfs: Panic on bad rbtree operations

2011-11-23 Thread Jeff Mahoney
 The ordered data and relocation trees have BUG_ONs to protect against
 bad tree operations.

 This patch replaces them with a panic that will report the problem.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ordered-data.c |   12 ++--
 fs/btrfs/relocation.c   |   36 +---
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index a1c9404..5a53d94 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struct rb_root *root, u64 
file_offset,
return NULL;
 }
 
+NORET_TYPE static void ordered_data_tree_panic(struct inode *inode, int errno,
+  u64 offset)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb)->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
+   "%llu\n", offset);
+}
+
 /*
  * look for a given offset in the tree, and if it can't be found return the
  * first lesser offset
@@ -207,7 +215,8 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
spin_lock(&tree->lock);
node = tree_insert(&tree->tree, file_offset,
   &entry->rb_node);
-   BUG_ON(node);
+   if (node)
+   ordered_data_tree_panic(inode, -EEXIST, file_offset);
spin_unlock(&tree->lock);
 
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
@@ -215,7 +224,6 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
  &BTRFS_I(inode)->root->fs_info->ordered_extents);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
 
-   BUG_ON(node);
return 0;
 }
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 7fa090f..a222957 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -326,6 +326,19 @@ static struct rb_node *tree_search(struct rb_root *root, 
u64 bytenr)
return NULL;
 }
 
+NORET_TYPE static void backref_tree_panic(struct rb_node *rb_node, int errno,
+ u64 bytenr)
+{
+
+   struct btrfs_fs_info *fs_info = NULL;
+   struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
+ rb_node);
+   if (bnode->root)
+   fs_info = bnode->root->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
+   "found at offset %llu\n", bytenr);
+}
+
 /*
  * walk up backref nodes until reach node presents tree root
  */
@@ -452,7 +465,8 @@ static void update_backref_node(struct backref_cache *cache,
rb_erase(&node->rb_node, &cache->rb_root);
node->bytenr = bytenr;
rb_node = tree_insert(&cache->rb_root, node->bytenr, &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, bytenr);
 }
 
 /*
@@ -999,7 +1013,8 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, node->bytenr,
  &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
list_add_tail(&node->lower, &cache->leaves);
}
 
@@ -1034,7 +1049,9 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, upper->bytenr,
  &upper->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST,
+  upper->bytenr);
}
 
list_add_tail(&edge->list[UPPER], &upper->lower);
@@ -1178,7 +1195,8 @@ static int clone_backref_node(struct btrfs_trans_handle 
*trans,
 
rb_node = tree_insert(&cache->rb_root, new_node->bytenr,
  &new_node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, new_node->bytenr);
 
if (!new_node->lowest) {
list_for_each_entry(new_edge, &new_node->lower, list[UPPER]) {
@@ -1250,7 +1268,8 @@ static int __update_reloc_root(struct btrfs_root *root, 
int del)
rb_node = tree_insert(&rc->reloc_root_tree.rb_root,
  node->bytenr, &node->rb_node);
spin_unlock(&rc->reloc_root_tree.lock);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
} else {
list_del_init(&root->root_list);
kfree(node);
@@ -3151,7 +3170,8 @@ static int add_tree_block(struct reloc_control *rc,
block->key_ready = 0;
 
rb_node = tree_insert(b

[patch 03/66] btrfs: Panic on bad rbtree operations

2011-10-24 Thread Jeff Mahoney
 The ordered data and relocation trees have BUG_ONs to protect against
 bad tree operations.

 This patch replaces them with a panic that will report the problem.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ordered-data.c |   12 ++--
 fs/btrfs/relocation.c   |   36 +---
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index a1c9404..5a53d94 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struct rb_root *root, u64 
file_offset,
return NULL;
 }
 
+NORET_TYPE static void ordered_data_tree_panic(struct inode *inode, int errno,
+  u64 offset)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb)->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
+   "%llu\n", offset);
+}
+
 /*
  * look for a given offset in the tree, and if it can't be found return the
  * first lesser offset
@@ -207,7 +215,8 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
spin_lock(&tree->lock);
node = tree_insert(&tree->tree, file_offset,
   &entry->rb_node);
-   BUG_ON(node);
+   if (node)
+   ordered_data_tree_panic(inode, -EEXIST, file_offset);
spin_unlock(&tree->lock);
 
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
@@ -215,7 +224,6 @@ static int __btrfs_add_ordered_extent(struct inode *inode, 
u64 file_offset,
  &BTRFS_I(inode)->root->fs_info->ordered_extents);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
 
-   BUG_ON(node);
return 0;
 }
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 7fa090f..a222957 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -326,6 +326,19 @@ static struct rb_node *tree_search(struct rb_root *root, 
u64 bytenr)
return NULL;
 }
 
+NORET_TYPE static void backref_tree_panic(struct rb_node *rb_node, int errno,
+ u64 bytenr)
+{
+
+   struct btrfs_fs_info *fs_info = NULL;
+   struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
+ rb_node);
+   if (bnode->root)
+   fs_info = bnode->root->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
+   "found at offset %llu\n", bytenr);
+}
+
 /*
  * walk up backref nodes until reach node presents tree root
  */
@@ -452,7 +465,8 @@ static void update_backref_node(struct backref_cache *cache,
rb_erase(&node->rb_node, &cache->rb_root);
node->bytenr = bytenr;
rb_node = tree_insert(&cache->rb_root, node->bytenr, &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, bytenr);
 }
 
 /*
@@ -999,7 +1013,8 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, node->bytenr,
  &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
list_add_tail(&node->lower, &cache->leaves);
}
 
@@ -1034,7 +1049,9 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, upper->bytenr,
  &upper->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST,
+  upper->bytenr);
}
 
list_add_tail(&edge->list[UPPER], &upper->lower);
@@ -1178,7 +1195,8 @@ static int clone_backref_node(struct btrfs_trans_handle 
*trans,
 
rb_node = tree_insert(&cache->rb_root, new_node->bytenr,
  &new_node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, new_node->bytenr);
 
if (!new_node->lowest) {
list_for_each_entry(new_edge, &new_node->lower, list[UPPER]) {
@@ -1250,7 +1268,8 @@ static int __update_reloc_root(struct btrfs_root *root, 
int del)
rb_node = tree_insert(&rc->reloc_root_tree.rb_root,
  node->bytenr, &node->rb_node);
spin_unlock(&rc->reloc_root_tree.lock);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
} else {
list_del_init(&root->root_list);
kfree(node);
@@ -3151,7 +3170,8 @@ static int add_tree_block(struct reloc_control *rc,
block->key_ready = 0;
 
rb_node = tree_insert(b

[patch 03/65] btrfs: Panic on bad rbtree operations

2011-10-04 Thread Jeff Mahoney
 The ordered data and relocation trees have BUG_ONs to protect against
 bad tree operations.

 This patch replaces them with a panic that will report the problem.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ordered-data.c |   12 ++--
 fs/btrfs/relocation.c   |   36 +---
 2 files changed, 39 insertions(+), 9 deletions(-)

--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -59,6 +59,14 @@ static struct rb_node *tree_insert(struc
return NULL;
 }
 
+NORET_TYPE static void ordered_data_tree_panic(struct inode *inode, int errno,
+  u64 offset)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb)->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in ordered tree at offset "
+   "%llu\n", offset);
+}
+
 /*
  * look for a given offset in the tree, and if it can't be found return the
  * first lesser offset
@@ -207,7 +215,8 @@ static int __btrfs_add_ordered_extent(st
spin_lock(&tree->lock);
node = tree_insert(&tree->tree, file_offset,
   &entry->rb_node);
-   BUG_ON(node);
+   if (node)
+   ordered_data_tree_panic(inode, -EEXIST, file_offset);
spin_unlock(&tree->lock);
 
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
@@ -215,7 +224,6 @@ static int __btrfs_add_ordered_extent(st
  &BTRFS_I(inode)->root->fs_info->ordered_extents);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
 
-   BUG_ON(node);
return 0;
 }
 
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -326,6 +326,19 @@ static struct rb_node *tree_search(struc
return NULL;
 }
 
+NORET_TYPE static void backref_tree_panic(struct rb_node *rb_node, int errno,
+ u64 bytenr)
+{
+
+   struct btrfs_fs_info *fs_info = NULL;
+   struct backref_node *bnode = rb_entry(rb_node, struct backref_node,
+ rb_node);
+   if (bnode->root)
+   fs_info = bnode->root->fs_info;
+   btrfs_panic(fs_info, errno, "Inconsistency in backref cache "
+   "found at offset %llu\n", bytenr);
+}
+
 /*
  * walk up backref nodes until reach node presents tree root
  */
@@ -452,7 +465,8 @@ static void update_backref_node(struct b
rb_erase(&node->rb_node, &cache->rb_root);
node->bytenr = bytenr;
rb_node = tree_insert(&cache->rb_root, node->bytenr, &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, bytenr);
 }
 
 /*
@@ -999,7 +1013,8 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, node->bytenr,
  &node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
list_add_tail(&node->lower, &cache->leaves);
}
 
@@ -1034,7 +1049,9 @@ next:
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, upper->bytenr,
  &upper->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST,
+  upper->bytenr);
}
 
list_add_tail(&edge->list[UPPER], &upper->lower);
@@ -1178,7 +1195,8 @@ static int clone_backref_node(struct btr
 
rb_node = tree_insert(&cache->rb_root, new_node->bytenr,
  &new_node->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, new_node->bytenr);
 
if (!new_node->lowest) {
list_for_each_entry(new_edge, &new_node->lower, list[UPPER]) {
@@ -1254,7 +1272,8 @@ static int __update_reloc_root(struct bt
rb_node = tree_insert(&rc->reloc_root_tree.rb_root,
  node->bytenr, &node->rb_node);
spin_unlock(&rc->reloc_root_tree.lock);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, node->bytenr);
} else {
list_del_init(&root->root_list);
kfree(node);
@@ -3168,7 +3187,8 @@ static int add_tree_block(struct reloc_c
block->key_ready = 0;
 
rb_node = tree_insert(blocks, block->bytenr, &block->rb_node);
-   BUG_ON(rb_node);
+   if (rb_node)
+   backref_tree_panic(rb_node, -EEXIST, block->bytenr);
 
return 0;
 }
@@ -3437,7 +3457,9 @@ static int find_data_references(struct r
block->key_ready = 1;
rb_node = tree_insert(blocks, block->bytenr,
 

btrfs panic

2011-07-14 Thread Xiao Guangrong
When xfstests test 224 was running, the box panicked, and I got this message:

[ 1998.327235] =
[ 1998.329940] [ INFO: possible recursive locking detected ]
[ 1998.329940] 2.6.39+ #3
[ 1998.329940] -
[ 1998.329940] dd/25718 is trying to acquire lock:
[ 1998.329940]  (&(&eb->lock)->rlock){+.+...}, at: [] 
btrfs_try_spin_lock+0x2a/0x89 [btrfs]
[ 1998.329940] 
[ 1998.329940] but task is already holding lock:
[ 1998.329940]  (&(&eb->lock)->rlock){+.+...}, at: [] 
btrfs_clear_lock_blocking+0x22/0x2b [btrfs]
[ 1998.478275] 
[ 1998.478275] other info that might help us debug this:
[ 1998.478275] 2 locks held by dd/25718:
[ 1998.478275]  #0:  (&sb->s_type->i_mutex_key#13){+.+.+.}, at: 
[] btrfs_file_aio_write+0xdc/0x49a [btrfs]
[ 1998.478275]  #1:  (&(&eb->lock)->rlock){+.+...}, at: [] 
btrfs_clear_lock_blocking+0x22/0x2b [btrfs]
[ 1998.478275] 
[ 1998.478275] stack backtrace:
[ 1998.478275] Pid: 25718, comm: dd Not tainted 2.6.39+ #3
[ 1998.478275] Call Trace:
[ 1998.478275]  [] __lock_acquire+0xd47/0xdcf
[ 1998.478275]  [] ? sched_clock+0x9/0xd
[ 1998.478275]  [] ? sched_clock_local+0x12/0x75
[ 1998.478275]  [] ? btrfs_clear_lock_blocking+0x22/0x2b 
[btrfs]
[ 1998.478275]  [] ? btrfs_try_spin_lock+0x2a/0x89 [btrfs]
[ 1998.478275]  [] lock_acquire+0xd1/0xfb
[ 1998.478275]  [] ? btrfs_try_spin_lock+0x2a/0x89 [btrfs]
[ 1998.478275]  [] _raw_spin_lock+0x36/0x69
[ 1998.478275]  [] ? btrfs_try_spin_lock+0x2a/0x89 [btrfs]
[ 1998.478275]  [] btrfs_try_spin_lock+0x2a/0x89 [btrfs]
[ 1998.478275]  [] btrfs_search_slot+0x39c/0x4c0 [btrfs]
[ 1998.478275]  [] btrfs_lookup_xattr+0x76/0xd7 [btrfs]
[ 1998.478275]  [] ? btrfs_alloc_path+0x1a/0x1c [btrfs]
[ 1998.478275]  [] ? kmem_cache_alloc+0x57/0xfc
[ 1998.478275]  [] ? btrfs_file_aio_write+0x45/0x49a [btrfs]
[ 1998.478275]  [] __btrfs_getxattr+0x86/0x11c [btrfs]
[ 1998.478275]  [] btrfs_getxattr+0x77/0x82 [btrfs]
[ 1998.478275]  [] cap_inode_need_killpriv+0x2d/0x37
[ 1998.478275]  [] file_remove_suid+0x27/0x64
[ 1998.478275]  [] btrfs_file_aio_write+0x159/0x49a [btrfs]
[ 1998.478275]  [] ? trace_hardirqs_off+0xd/0xf
[ 1998.478275]  [] ? local_clock+0x36/0x4d
[ 1998.478275]  [] ? lock_release_non_nested+0xdb/0x263
[ 1998.478275]  [] do_sync_write+0xcb/0x108
[ 1998.478275]  [] ? might_fault+0x5c/0xac
[ 1998.478275]  [] ? lock_is_held+0x8d/0x98
[ 1998.478275]  [] vfs_write+0xaf/0x102
[ 1998.478275]  [] ? fget_light+0x3a/0xa1
[ 1998.478275]  [] sys_write+0x4d/0x74
[ 1998.478275]  [] system_call_fastpath+0x16/0x1b

[ 2160.937580] INFO: task xfs_io:22734 blocked for more than 120 seconds.
[ 2160.953899] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 2160.978494] xfs_io  D  0 22734  21963 0x
[ 2160.996597]  88000ac8dc68 0046 88000ac8dc08 

[ 2161.107976]  001d3ec0 001d3ec0 001d3ec0 
882423a0
[ 2161.117511]  001d3ec0 88000ac8dfd8 001d3ec0 
001d3ec0
[ 2161.127543] Call Trace:
[ 2161.131247]  [] ? do_last+0x1d2/0x59d
[ 2161.136678]  [] ? do_last+0x1d2/0x59d
[ 2161.142181]  [] __mutex_lock_common+0x22b/0x35b
[ 2161.148104]  [] ? do_last+0x1d2/0x59d
[ 2161.153578]  [] mutex_lock_nested+0x3e/0x43
[ 2161.159300]  [] do_last+0x1d2/0x59d
[ 2161.164589]  [] path_openat+0xcb/0x33a
[ 2161.170358]  [] ? sched_clock+0x9/0xd
[ 2161.175941]  [] ? sched_clock_local+0x12/0x75
[ 2161.182033]  [] do_filp_open+0x3d/0x89
[ 2161.187301]  [] ? _raw_spin_unlock+0x2b/0x2f
[ 2161.192937]  [] ? alloc_fd+0x181/0x193
[ 2161.198541]  [] do_sys_open+0x74/0x106
[ 2161.204058]  [] sys_open+0x20/0x22
[ 2161.209488]  [] system_call_fastpath+0x16/0x1b
[ 2161.215279] INFO: lockdep is turned off.
[ 2161.219841] Kernel panic - not syncing: hung_task: blocked tasks
[ 2161.225647] Pid: 42, comm: khungtaskd Not tainted 2.6.39+ #3
[ 2161.231535] Call Trace:
[ 2161.235146]  [] panic+0x91/0x1a9
[ 2161.240266]  [] watchdog+0x1ae/0x219
[ 2161.244863]  [] ? rcu_read_unlock+0x23/0x23
[ 2161.250816]  [] kthread+0xa0/0xa8
[ 2161.255995]  [] ? trace_hardirqs_on_caller+0x13f/0x172
[ 2161.262333]  [] kernel_thread_helper+0x4/0x10
[ 2161.268211]  [] ? retint_restore_args+0x13/0x13
[ 2161.274113]  [] ? __init_kthread_worker+0x5b/0x5b
[ 2161.280135]  [] ? gs_change+0x13/0x13


I am happy to answer any questions.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-10 Thread Mingming Cao
On Mon, 2008-06-09 at 20:47 -0400, Chris Mason wrote:
> On Mon, 2008-06-09 at 17:10 -0700, Mingming Cao wrote:
> > On Sun, 2008-06-08 at 22:37 -0400, Chris Mason wrote:
> > > On Thu, 05 Jun 2008 13:43:48 -0400
> > > Ric Wheeler <[EMAIL PROTECTED]> wrote:
> > > 
> > > > Chris Mason wrote:
> > > > > On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
> > > > >   
> > > > >> I can reliably get btrfs to panic by running my fs_mark code on a
> > > > >> newly created file system with lots of threads on an 8-way box. If
> > > > >> this is too aggressive, let me know ;-)
> > > > >>
> > > > >> Here is a summary of the panic:
> > > > >> 
> > > > >
> > > > > BTW, exactly how are you running fs_mark?  Mingming reminded me that
> > > > > strictly speaking this patch shouldn't be required, so there might
> > > > > be other related problems.
> > > > >
> > > > > -chris
> > > > >
> > > > >   
> > > > It still crashes, Mingming is clearly correct ;-)
> > > > 
> > > 
> > > Grin, I never should have doubted her.
> > > 
> > :) 
> > 
> > > So, the actual fix should be below.  It looks like the problem is that 
> > > I've got
> > > a race in setting the pointer to a new transaction, which makes the
> > > data=ordered code take a spin lock that hasn't yet been setup.
> > > 
> > 
> > Just to be clear, so the data=ordered code(btrfs_del_ordered_inode())
> > takes a spin lock (new_trans_lock) and assume the new transaction has
> > been setup, that races with join_transaction resetting the current
> > running transaction()? 
> > 
> Yes
> 
> > I also see the btrfs_commit_transaction() could reset the
> > root->fs_info->running_transaction to be NULL, but we did not check NULL
> > pointer in the data=ordered mode code, is this a potential Bug? Or it is
> > covered somewhere else?
> > 
> 
> Thanks for double checking these.
> 
> We don't check it in btrfs_add_ordered_inode because that must be called
> with the transaction running.
> 
Thanks for clarifying, I missed this.

> btrfs_ordered_throttle is safe because it doesn't actually deref the
> pointer, it just checks for changes to it.  The important part of
> ordered_throttle is the writeback count.
> 
> So, the others should be safe, but please let me know if you see any
> holes there.
> 

Looks pretty safe to me now, I should not doubt you earlier:)

Mingming



Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-09 Thread Chris Mason
On Mon, 2008-06-09 at 17:10 -0700, Mingming Cao wrote:
> On Sun, 2008-06-08 at 22:37 -0400, Chris Mason wrote:
> > On Thu, 05 Jun 2008 13:43:48 -0400
> > Ric Wheeler <[EMAIL PROTECTED]> wrote:
> > 
> > > Chris Mason wrote:
> > > > On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
> > > >   
> > > >> I can reliably get btrfs to panic by running my fs_mark code on a
> > > >> newly created file system with lots of threads on an 8-way box. If
> > > >> this is too aggressive, let me know ;-)
> > > >>
> > > >> Here is a summary of the panic:
> > > >> 
> > > >
> > > > BTW, exactly how are you running fs_mark?  Mingming reminded me that
> > > > strictly speaking this patch shouldn't be required, so there might
> > > > be other related problems.
> > > >
> > > > -chris
> > > >
> > > >   
> > > It still crashes, Mingming is clearly correct ;-)
> > > 
> > 
> > Grin, I never should have doubted her.
> > 
> :) 
> 
> > So, the actual fix should be below.  It looks like the problem is that I've 
> > got
> > a race in setting the pointer to a new transaction, which makes the
> > data=ordered code take a spin lock that hasn't yet been setup.
> > 
> 
> Just to be clear, so the data=ordered code(btrfs_del_ordered_inode())
> takes a spin lock (new_trans_lock) and assume the new transaction has
> been setup, that races with join_transaction resetting the current
> running transaction()? 
> 
Yes

> I also see the btrfs_commit_transaction() could reset the
> root->fs_info->running_transaction to be NULL, but we did not check NULL
> pointer in the data=ordered mode code, is this a potential Bug? Or it is
> covered somewhere else?
> 

Thanks for double checking these.

We don't check it in btrfs_add_ordered_inode because that must be called
with the transaction running.

btrfs_ordered_throttle is safe because it doesn't actually deref the
pointer, it just checks for changes to it.  The important part of
ordered_throttle is the writeback count.

So, the others should be safe, but please let me know if you see any
holes there.

-chris




Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-09 Thread Mingming Cao
On Sun, 2008-06-08 at 22:37 -0400, Chris Mason wrote:
> On Thu, 05 Jun 2008 13:43:48 -0400
> Ric Wheeler <[EMAIL PROTECTED]> wrote:
> 
> > Chris Mason wrote:
> > > On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
> > >   
> > >> I can reliably get btrfs to panic by running my fs_mark code on a
> > >> newly created file system with lots of threads on an 8-way box. If
> > >> this is too aggressive, let me know ;-)
> > >>
> > >> Here is a summary of the panic:
> > >> 
> > >
> > > BTW, exactly how are you running fs_mark?  Mingming reminded me that
> > > strictly speaking this patch shouldn't be required, so there might
> > > be other related problems.
> > >
> > > -chris
> > >
> > >   
> > It still crashes, Mingming is clearly correct ;-)
> > 
> 
> Grin, I never should have doubted her.
> 
:) 

> So, the actual fix should be below.  It looks like the problem is that I've 
> got
> a race in setting the pointer to a new transaction, which makes the
> data=ordered code take a spin lock that hasn't yet been setup.
> 

Just to be clear, so the data=ordered code(btrfs_del_ordered_inode())
takes a spin lock (new_trans_lock) and assume the new transaction has
been setup, that races with join_transaction resetting the current
running transaction()? 

I also see the btrfs_commit_transaction() could reset the
root->fs_info->running_transaction to be NULL, but we did not check NULL
pointer in the data=ordered mode code, is this a potential Bug? Or it is
covered somewhere else?

Mingming
> Before this patch my test box got into an infinite loop with fs_mark.  Now it
> seems to run to completion.
> 
> -chris
> 
> diff -r 0b4ab489ffe1 transaction.c
> --- a/transaction.c   Tue May 27 10:55:43 2008 -0400
> +++ b/transaction.c   Sun Jun 08 22:23:50 2008 -0400
> @@ -56,7 +56,6 @@ static noinline int join_transaction(str
>   total_trans++;
>   BUG_ON(!cur_trans);
>   root->fs_info->generation++;
> - root->fs_info->running_transaction = cur_trans;
>   root->fs_info->last_alloc = 0;
>   root->fs_info->last_data_alloc = 0;
>   cur_trans->num_writers = 1;
> @@ -74,6 +73,9 @@ static noinline int join_transaction(str
>   extent_io_tree_init(&cur_trans->dirty_pages,
>root->fs_info->btree_inode->i_mapping,
>GFP_NOFS);
> + spin_lock(&root->fs_info->new_trans_lock);
> + root->fs_info->running_transaction = cur_trans;
> + spin_unlock(&root->fs_info->new_trans_lock);
>   } else {
>   cur_trans->num_writers++;
>   cur_trans->num_joined++;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-09 Thread Ric Wheeler

Chris Mason wrote:

On Thu, 05 Jun 2008 13:43:48 -0400
Ric Wheeler <[EMAIL PROTECTED]> wrote:

  

Chris Mason wrote:


On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
  
  

I can reliably get btrfs to panic by running my fs_mark code on a
newly created file system with lots of threads on an 8-way box. If
this is too aggressive, let me know ;-)

Here is a summary of the panic:



BTW, exactly how are you running fs_mark?  Mingming reminded me that
strictly speaking this patch shouldn't be required, so there might
be other related problems.

-chris

  
  

It still crashes, Mingming is clearly correct ;-)




Grin, I never should have doubted her.

So, the actual fix should be below.  It looks like the problem is that I've got
a race in setting the pointer to a new transaction, which makes the
data=ordered code take a spin lock that hasn't yet been setup.

Before this patch my test box got into an infinite loop with fs_mark.  Now it
seems to run to completion.

-chris
  


Thanks Chris - this patch works for me as well,

ric


diff -r 0b4ab489ffe1 transaction.c
--- a/transaction.c Tue May 27 10:55:43 2008 -0400
+++ b/transaction.c Sun Jun 08 22:23:50 2008 -0400
@@ -56,7 +56,6 @@ static noinline int join_transaction(str
total_trans++;
BUG_ON(!cur_trans);
root->fs_info->generation++;
-   root->fs_info->running_transaction = cur_trans;
root->fs_info->last_alloc = 0;
root->fs_info->last_data_alloc = 0;
cur_trans->num_writers = 1;
@@ -74,6 +73,9 @@ static noinline int join_transaction(str
extent_io_tree_init(&cur_trans->dirty_pages,
 root->fs_info->btree_inode->i_mapping,
 GFP_NOFS);
+   spin_lock(&root->fs_info->new_trans_lock);
+   root->fs_info->running_transaction = cur_trans;
+   spin_unlock(&root->fs_info->new_trans_lock);
} else {
cur_trans->num_writers++;
cur_trans->num_joined++;
  




Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-08 Thread Chris Mason
On Thu, 05 Jun 2008 13:43:48 -0400
Ric Wheeler <[EMAIL PROTECTED]> wrote:

> Chris Mason wrote:
> > On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
> >   
> >> I can reliably get btrfs to panic by running my fs_mark code on a
> >> newly created file system with lots of threads on an 8-way box. If
> >> this is too aggressive, let me know ;-)
> >>
> >> Here is a summary of the panic:
> >> 
> >
> > BTW, exactly how are you running fs_mark?  Mingming reminded me that
> > strictly speaking this patch shouldn't be required, so there might
> > be other related problems.
> >
> > -chris
> >
> >   
> It still crashes, Mingming is clearly correct ;-)
> 

Grin, I never should have doubted her.

So, the actual fix should be below.  It looks like the problem is that I've got
a race in setting the pointer to a new transaction, which makes the
data=ordered code take a spin lock that hasn't yet been setup.

Before this patch my test box got into an infinite loop with fs_mark.  Now it
seems to run to completion.

-chris

diff -r 0b4ab489ffe1 transaction.c
--- a/transaction.c Tue May 27 10:55:43 2008 -0400
+++ b/transaction.c Sun Jun 08 22:23:50 2008 -0400
@@ -56,7 +56,6 @@ static noinline int join_transaction(str
total_trans++;
BUG_ON(!cur_trans);
root->fs_info->generation++;
-   root->fs_info->running_transaction = cur_trans;
root->fs_info->last_alloc = 0;
root->fs_info->last_data_alloc = 0;
cur_trans->num_writers = 1;
@@ -74,6 +73,9 @@ static noinline int join_transaction(str
extent_io_tree_init(&cur_trans->dirty_pages,
 root->fs_info->btree_inode->i_mapping,
 GFP_NOFS);
+   spin_lock(&root->fs_info->new_trans_lock);
+   root->fs_info->running_transaction = cur_trans;
+   spin_unlock(&root->fs_info->new_trans_lock);
} else {
cur_trans->num_writers++;
cur_trans->num_joined++;


Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-05 Thread Ric Wheeler

Chris Mason wrote:

On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
  
I can reliably get btrfs to panic by running my fs_mark code on a newly  
created file system with lots of threads on an 8-way box. If this is too  
aggressive, let me know ;-)


Here is a summary of the panic:



BTW, exactly how are you running fs_mark?  Mingming reminded me that
strictly speaking this patch shouldn't be required, so there might be
other related problems.

-chris

  

This was the actual command:

./fs_mark -d /mnt/test -D 512 -t 16 -s 409600 -F

I will give your patch a spin right after lunch ;-)

ric




Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-05 Thread Chris Mason
On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
>
> I can reliably get btrfs to panic by running my fs_mark code on a newly  
> created file system with lots of threads on an 8-way box. If this is too  
> aggressive, let me know ;-)
>
> Here is a summary of the panic:

BTW, exactly how are you running fs_mark?  Mingming reminded me that
strictly speaking this patch shouldn't be required, so there might be
other related problems.

-chris



Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-04 Thread Chris Mason
On Wed, Jun 04, 2008 at 03:46:10PM -0400, Ric Wheeler wrote:
> Chris Mason wrote:
>> On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
>>   
>>> I can reliably get btrfs to panic by running my fs_mark code on a 
>>> newly  created file system with lots of threads on an 8-way box. If 
>>> this is too  aggressive, let me know ;-)
>>>
>>> Here is a summary of the panic:
>>> 
>>
>> I think this is due to a corruption on the data=ordered list.  I'm
>> testing a patch out here.
>>
>> -chris
>>   
> I can test it tomorrow if you send it on... Thanks!

Patch is below, but I don't have access to my test rig so I haven't
hammered on it yet.  I'm willing to corrupt Ric's test box, but everyone
else may want to wait ;)

-chris

diff -r 0b4ab489ffe1 file.c
--- a/file.cTue May 27 10:55:43 2008 -0400
+++ b/file.cWed Jun 04 16:10:40 2008 -0400
@@ -980,7 +980,7 @@ out_nolock:
 
 static int btrfs_release_file (struct inode * inode, struct file * filp)
 {
-   btrfs_del_ordered_inode(inode);
+   btrfs_del_ordered_inode(inode, 0);
return 0;
 }
 
diff -r 0b4ab489ffe1 inode.c
--- a/inode.c   Tue May 27 10:55:43 2008 -0400
+++ b/inode.c   Wed Jun 04 16:10:40 2008 -0400
@@ -861,7 +861,7 @@ static int btrfs_unlink(struct inode *di
 * we don't need to worry about
 * data=ordered
 */
-   btrfs_del_ordered_inode(inode);
+   btrfs_del_ordered_inode(inode, 0);
}
 
btrfs_end_transaction(trans, root);
@@ -3352,6 +3352,7 @@ void btrfs_destroy_inode(struct inode *i
WARN_ON(!list_empty(&inode->i_dentry));
WARN_ON(inode->i_data.nrpages);
 
+   btrfs_del_ordered_inode(inode, 1);
btrfs_drop_extent_cache(inode, 0, (u64)-1);
kmem_cache_free(btrfs_inode_cachep, BTRFS_I(inode));
 }
diff -r 0b4ab489ffe1 ordered-data.c
--- a/ordered-data.cTue May 27 10:55:43 2008 -0400
+++ b/ordered-data.cWed Jun 04 16:10:40 2008 -0400
@@ -254,7 +254,7 @@ static void __btrfs_del_ordered_inode(st
return;
 }
 
-void btrfs_del_ordered_inode(struct inode *inode)
+void btrfs_del_ordered_inode(struct inode *inode, int force)
 {
struct btrfs_root *root = BTRFS_I(inode)->root;
u64 root_objectid = root->root_key.objectid;
@@ -263,8 +263,8 @@ void btrfs_del_ordered_inode(struct inod
return;
}
 
-   if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY) ||
-   mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK))
+   if (!force && (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY) ||
+   mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK)))
return;
 
spin_lock(&root->fs_info->new_trans_lock);
diff -r 0b4ab489ffe1 ordered-data.h
--- a/ordered-data.hTue May 27 10:55:43 2008 -0400
+++ b/ordered-data.hWed Jun 04 16:10:40 2008 -0400
@@ -38,6 +38,6 @@ int btrfs_find_first_ordered_inode(struc
 int btrfs_find_first_ordered_inode(struct btrfs_ordered_inode_tree *tree,
   u64 *root_objectid, u64 *objectid,
   struct inode **inode);
-void btrfs_del_ordered_inode(struct inode *inode);
+void btrfs_del_ordered_inode(struct inode *inode, int force);
 int btrfs_ordered_throttle(struct btrfs_root *root, struct inode *inode);
 #endif


Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-04 Thread Ric Wheeler

Chris Mason wrote:

On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
  
I can reliably get btrfs to panic by running my fs_mark code on a newly  
created file system with lots of threads on an 8-way box. If this is too  
aggressive, let me know ;-)


Here is a summary of the panic:



I think this is due to a corruption on the data=ordered list.  I'm
testing a patch out here.

-chris
  

I can test it tomorrow if you send it on... Thanks!

ric



Re: btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-03 Thread Chris Mason
On Mon, Jun 02, 2008 at 01:52:47PM -0400, Ric Wheeler wrote:
>
> I can reliably get btrfs to panic by running my fs_mark code on a newly  
> created file system with lots of threads on an 8-way box. If this is too  
> aggressive, let me know ;-)
>
> Here is a summary of the panic:

I think this is due to a corruption on the data=ordered list.  I'm
testing a patch out here.

-chris


btrfs panic - BUG: soft lockup - CPU#0 stuck for 61s! [fs_mark:4573]

2008-06-02 Thread Ric Wheeler


I can reliably get btrfs to panic by running my fs_mark code on a newly 
created file system with lots of threads on an 8-way box. If this is too 
aggressive, let me know ;-)


Here is a summary of the panic:

EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
device fsid 814e6131acbfcbec-7a2a40df880929bb devid 1 transid 495 /dev/sdb1
BUG: soft lockup - CPU#1 stuck for 61s! [fs_mark:4572]
CPU 1:
Modules linked in: btrfs libcrc32c ipt_MASQUERADE iptable_nat nf_nat 
bridge bnep rfcomm l2cap bluetooth ib_iser rdma_cm ib_cm iw_cm ib_sa 
ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi fuse 
sunrpc ipt_REJECT nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT 
xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter 
ip6_tables x_tables ipv6 dm_mirror dm_multipath dm_mod kvm_intel kvm 
snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss iTCO_wdt snd_mixer_oss iTCO_vendor_support 
nvidia(P) pata_acpi button ata_piix snd_pcm ppdev firewire_ohci i2c_i801 
ata_generic firewire_core pcspkr dcdbas snd_timer sr_mod cdrom 
snd_page_alloc snd_hwdep snd tg3 serio_raw i2c_core shpchp crc_itu_t sg 
parport_pc soundcore parport floppy ahci libata sd_mod scsi_mod ext3 jbd 
mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]

Pid: 4572, comm: fs_mark Tainted: P 2.6.25.3-18.fc9.x86_64 #1
RIP: 0010:[]  [] 
__write_lock_failed+0x9/0x20

RSP: 0018:81000c529e60  EFLAGS: 0206
RAX: 810015c0e000 RBX: 81000c529e68 RCX: 0016
RDX: 81003d019e00 RSI: 0001 RDI: 8100100e24f0
RBP: 8100189bef00 R08:  R09: 0016
R10: 12750e57 R11: 0246 R12: 0202
R13: 81000c529de8 R14: 11dc R15: 81000c529f58
FS:  0159b850(0063) GS:81003f802680() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f55301940a8 CR3: 0c513000 CR4: 26e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
[] ? _write_lock+0x12/0x14
[] ? :btrfs:btrfs_del_ordered_inode+0xc0/0x13f
[] ? :btrfs:btrfs_release_file+0x9/0xd
[] ? __fput+0xca/0x189
[] ? fput+0x14/0x16
[] ? filp_close+0x66/0x71
[] ? sys_close+0x99/0xd2
[] ? tracesys+0xd5/0xda


ric




Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.25.3-18.fc9.x86_64 (mockbuild@) (gcc version 4.3.0 20080428 
(Red Hat 4.3.0-8) (GCC) ) #1 SMP Tue May 13 04:54:47 EDT 2008
Command line: ro root=UUID=1b44ce19-eab7-43ce-ba66-510fd2e3ef5b rhgb quiet
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009ec00 (usable)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 3fe0ac00 (usable)
 BIOS-e820: 3fe0ac00 - 3fe5cc00 (ACPI NVS)
 BIOS-e820: 3fe5cc00 - 3fe5ec00 (ACPI data)
 BIOS-e820: 3fe5ec00 - 4000 (reserved)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fe00 - ff00 (reserved)
 BIOS-e820: ffb0 - 0001 (reserved)
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 261642) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.5 present.
ACPI: RSDP 000FEBF0, 0024 (r2 DELL  )
ACPI: XSDT 000FCD54, 0074 (r1 DELLB9K   15 ASL61)
ACPI: FACP 000FCE84, 00F4 (r3 DELLB9K   15 ASL61)
ACPI: DSDT FFF62D72, 3A12 (r1   DELLdt_ex 1000 INTL 20050624)
ACPI: FACS 3FE0AC00, 0040
ACPI: SSDT FFF668A5, 00AC (r1   DELLst_ex 1000 INTL 20050624)
ACPI: APIC 000FCF78, 00AA (r1 DELLB9K   15 ASL61)
ACPI: BOOT 000FD022, 0028 (r1 DELLB9K   15 ASL61)
ACPI: ASF! 000FD04A, 0096 (r32 DELLB9K   15 ASL61)
ACPI: MCFG 000FD0E0, 003E (r1 DELLB9K   15 ASL61)
ACPI: HPET 000FD11E, 0038 (r1 DELLB9K   15 ASL61)
ACPI: TCPA 000FD37A, 0032 (r1 DELLB9K   15 ASL61)
ACPI:  000FD3AC, 0030 (r1 DELLB9K   15 ASL61)
ACPI: SLIC 000FD156, 00C0 (r1 DELLB9K   15 ASL61)
No NUMA configuration found
Faking a node at -3fe0a000
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 261642) 1 entries of 3200 used
Bootmem setup node 0 -3fe0a000
  NODE_DATA [b000 - 00012fff]
  bootmap [00013000 -  0001afc7] pages 8
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [20-75db3b] TEXT DATA BSS
early res: 3 [37d19000-37fef182] RAMDISK
early res: 4 [9ec00-a0bff] EBDA
early res: 5 [8000-afff] PGTABLE
 [e200-e21f] PMD ->81000120 on node 0
 [e220