RE: A race condition between debugfs and seq_file operation

2015-06-10 Thread Lisa Du


> -Original Message-
> From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
> Sent: 2015年6月10日 13:20
> To: Lisa Du
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: A race condition between debugfs and seq_file operation
> 
> On Wed, Jun 10, 2015 at 05:00:03AM +, Lisa Du wrote:
> > > -Original Message-
> > > From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
> > > Sent: 2015年6月10日 5:12
> > > To: Lisa Du
> > > Cc: linux-kernel@vger.kernel.org
> > > Subject: Re: A race condition between debugfs and seq_file operation
> > >
> > > On Mon, Jun 08, 2015 at 04:28:10AM +, Lisa Du wrote:
> > > > Hi, All
> > > > Recently I met one race condition related to debugfs.
> > > >
> > > > Take an example from ion.c in kernel3.14:
> > > > static int ion_debug_client_open(struct inode *inode, struct file
> > > > *file) {
> > > > return single_open(file, ion_debug_client_show,
> > > > inode->i_private); }
> > > >
> > > > static const struct file_operations debug_client_fops = {
> > > > .open = ion_debug_client_open,
> > > > .read = seq_read,
> > > > .llseek = seq_lseek,
> > > > .release = single_release,
> > > > };
> > > > client->debug_root = debugfs_create_file(client->display_name,
> > > > client->0664,
> > > > dev->clients_debug_root,
> > > > client, _client_fops);
> > > >
> > > > I find during I read the debugfs node, driver can do
> > > > debugfs_remove_recursive(dentry); Is it expected?
> > >
> > > Yes.  Well, not "expected", but a mess, yes.
> > >
> > > Removing debugfs files are known to have lots of races, this isn't
> > > the only one :(
> > Thanks for the reply!
> > Not sure if there is any plan to resolve such races in the future?
> 
> Yes, I have "plans", but it's on my very long todo list behind lots of
> other things...
> 
> If you want to look into it, please, that would be wonderful.
Ok, I see, thanks!
> 
> thanks,
> 
> greg k-h


RE: A race condition between debugfs and seq_file operation

2015-06-10 Thread Lisa Du


 -Original Message-
 From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
 Sent: 2015年6月10日 13:20
 To: Lisa Du
 Cc: linux-kernel@vger.kernel.org
 Subject: Re: A race condition between debugfs and seq_file operation
 
 On Wed, Jun 10, 2015 at 05:00:03AM +, Lisa Du wrote:
   -Original Message-
   From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
   Sent: 2015年6月10日 5:12
   To: Lisa Du
   Cc: linux-kernel@vger.kernel.org
   Subject: Re: A race condition between debugfs and seq_file operation
  
   On Mon, Jun 08, 2015 at 04:28:10AM +, Lisa Du wrote:
Hi, All
Recently I met one race condition related to debugfs.
   
Take an example from ion.c in kernel3.14:
static int ion_debug_client_open(struct inode *inode, struct file
*file) {
return single_open(file, ion_debug_client_show,
inode-i_private); }
   
static const struct file_operations debug_client_fops = {
.open = ion_debug_client_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
client-debug_root = debugfs_create_file(client-display_name,
client-0664,
dev-clients_debug_root,
client, debug_client_fops);
   
I find during I read the debugfs node, driver can do
debugfs_remove_recursive(dentry); Is it expected?
  
   Yes.  Well, not expected, but a mess, yes.
  
   Removing debugfs files are known to have lots of races, this isn't
   the only one :(
  Thanks for the reply!
  Not sure if there is any plan to resolve such races in the future?
 
 Yes, I have plans, but it's on my very long todo list behind lots of
 other things...
 
 If you want to look into it, please, that would be wonderful.
Ok, I see, thanks!
 
 thanks,
 
 greg k-h


RE: A race condition between debugfs and seq_file operation

2015-06-09 Thread Lisa Du
> -Original Message-
> From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
> Sent: 2015年6月10日 5:12
> To: Lisa Du
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: A race condition between debugfs and seq_file operation
> 
> On Mon, Jun 08, 2015 at 04:28:10AM +, Lisa Du wrote:
> > Hi, All
> > Recently I met one race condition related to debugfs.
> >
> > Take an example from ion.c in kernel3.14:
> > static int ion_debug_client_open(struct inode *inode, struct file
> > *file) {
> > return single_open(file, ion_debug_client_show, inode->i_private); }
> >
> > static const struct file_operations debug_client_fops = {
> > .open = ion_debug_client_open,
> > .read = seq_read,
> > .llseek = seq_lseek,
> > .release = single_release,
> > };
> > client->debug_root = debugfs_create_file(client->display_name, 0664,
> > dev->clients_debug_root,
> > client, _client_fops);
> >
> > I find during I read the debugfs node, driver can do
> > debugfs_remove_recursive(dentry); Is it expected?
> 
> Yes.  Well, not "expected", but a mess, yes.
> 
> Removing debugfs files are known to have lots of races, this isn't the only
> one :(
Thanks for the reply! 
Not sure if there is any plan to resolve such races in the future?
> 
> thanks,
> 
> greg k-h
N�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ��&�)撷f��^j谦y�m��@A�a囤�
0鹅h���i

RE: A race condition between debugfs and seq_file operation

2015-06-09 Thread Lisa Du
 -Original Message-
 From: gre...@linuxfoundation.org [mailto:gre...@linuxfoundation.org]
 Sent: 2015年6月10日 5:12
 To: Lisa Du
 Cc: linux-kernel@vger.kernel.org
 Subject: Re: A race condition between debugfs and seq_file operation
 
 On Mon, Jun 08, 2015 at 04:28:10AM +, Lisa Du wrote:
  Hi, All
  Recently I met one race condition related to debugfs.
 
  Take an example from ion.c in kernel3.14:
  static int ion_debug_client_open(struct inode *inode, struct file
  *file) {
  return single_open(file, ion_debug_client_show, inode-i_private); }
 
  static const struct file_operations debug_client_fops = {
  .open = ion_debug_client_open,
  .read = seq_read,
  .llseek = seq_lseek,
  .release = single_release,
  };
  client-debug_root = debugfs_create_file(client-display_name, 0664,
  dev-clients_debug_root,
  client, debug_client_fops);
 
  I find during I read the debugfs node, driver can do
  debugfs_remove_recursive(dentry); Is it expected?
 
 Yes.  Well, not expected, but a mess, yes.
 
 Removing debugfs files are known to have lots of races, this isn't the only
 one :(
Thanks for the reply! 
Not sure if there is any plan to resolve such races in the future?
 
 thanks,
 
 greg k-h
N�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�j:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ���)撷f��^j谦y�m��@A�a囤�
0鹅h���i

A race condition between debugfs and seq_file operation

2015-06-07 Thread Lisa Du
Hi, All
Recently I met one race condition related to debugfs.

Take an example from ion.c in kernel3.14:
static int ion_debug_client_open(struct inode *inode, struct file *file)
{
return single_open(file, ion_debug_client_show, inode->i_private);
}

static const struct file_operations debug_client_fops = {
.open = ion_debug_client_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
client->debug_root = debugfs_create_file(client->display_name, 0664,
dev->clients_debug_root,
client, _client_fops);

I find during I read the debugfs node, driver can do 
debugfs_remove_recursive(dentry);
Is it expected?  

In this case, when do the seq_file read, it grabs the seq_file->lock;
While in debugfs_remove_recursive(), it graps "parent->d_inode->i_mutex".
So there seems no protection between the ion_debug_client_show() and 
debugfs_remove_recursive().

Please let me know if I didn't describe the issue clear.
Would you help to comment if there's method to avoid such issue?

Best Regards.
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a���
0��h���i

A race condition between debugfs and seq_file operation

2015-06-07 Thread Lisa Du
Hi, All
Recently I met one race condition related to debugfs.

Take an example from ion.c in kernel3.14:
static int ion_debug_client_open(struct inode *inode, struct file *file)
{
return single_open(file, ion_debug_client_show, inode-i_private);
}

static const struct file_operations debug_client_fops = {
.open = ion_debug_client_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
client-debug_root = debugfs_create_file(client-display_name, 0664,
dev-clients_debug_root,
client, debug_client_fops);

I find during I read the debugfs node, driver can do 
debugfs_remove_recursive(dentry);
Is it expected?  

In this case, when do the seq_file read, it grabs the seq_file-lock;
While in debugfs_remove_recursive(), it graps parent-d_inode-i_mutex.
So there seems no protection between the ion_debug_client_show() and 
debugfs_remove_recursive().

Please let me know if I didn't describe the issue clear.
Would you help to comment if there's method to avoid such issue?

Best Regards.
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf��^jǫy�m��@A�a���
0��h���i

RE: 6e543d5780e fixed a boot hang

2013-10-09 Thread Lisa Du
>-Original Message-
>From: Fengguang Wu [mailto:fengguang...@intel.com]
>Sent: 2013年10月9日 22:12
>To: Lisa Du
>Cc: KOSAKI Motohiro; linux...@kvack.org; linux-kernel@vger.kernel.org
>Subject: 6e543d5780e fixed a boot hang
>
>Greetings,
>
>FYI, this commit seem to fix a boot hang problem here.
>
>commit 6e543d5780e36ff5ee56c44d7e2e30db3457a7ed
>Author: Lisa Du 
>Date:   Wed Sep 11 14:22:36 2013 -0700
>
>mm: vmscan: fix do_try_to_free_pages() livelock
>
>
>[1.394871] pci :00:02.0: Boot video device
>[1.395883] PCI: CLS 0 bytes, default 64
>
>In parent commit, it will hang right here.
>
>With this commit, it will continue to emit the below OOM messages (which is 
>not a surprise to me because the boot test runs in a small
>memory KVM and the kconfig builds in lots of drivers).
I think you may meet the same issue as mine.
Direct reclaim loop forever with zone->all_unreclaimable = 0(as kswapd sleeps 
forever).
And at the boot stage, no one detect and terminate it, so you see the boot 
hang. 
After apply this patch, you see there's oom-killer invoked as direct reclaim 
would break when zone was unreclaimable.
>
>[1.631892] swapper/0 invoked oom-killer: gfp_mask=0x2000d0, 
> order=1, oom_score_adj=0
>[1.633549] swapper/0 cpuset=/ mems_allowed=0
>[1.634443] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
> 3.12.0-rc4-00019-g8b5ede6 #126
>[1.635982] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>[1.637088]  0002 88001dd41b28 82c8d78f 
> 88001ef7c040
>[1.638955]  88001dd41ba8 82c8395f 83c54680 
> 88001dd41b60
>[1.640830]  810f3f06 1eb4 0246 
> 88001dd41b98
>[1.642687] Call Trace:
>[1.643313]  [] dump_stack+0x54/0x74
>[1.644331]  [] dump_header.isra.10+0x7a/0x1ba
>[1.645443]  [] ? 
> lock_release_holdtime.part.27+0x4c/0x50
>[1.646685]  [] ? lock_release+0x189/0x1d1
>[1.647744]  [] out_of_memory+0x39e/0x3ee
>[1.648882]  [] __alloc_pages_nodemask+0x668/0x7de
>[1.650385]  [] kmem_getpages+0x75/0x16c
>[1.651429]  [] fallback_alloc+0x12c/0x1ea
>[1.652528]  [] ? trace_hardirqs_off+0xd/0xf
>[1.653627]  [] cache_alloc_node+0x14a/0x159
>[1.654783]  [] ? dma_debug_init+0x1ef/0x29a
>[1.655928]  [] kmem_cache_alloc_trace+0x83/0x11a
>[1.657108]  [] dma_debug_init+0x1ef/0x29a
>[1.658182]  [] pci_iommu_init+0x16/0x52
>[1.659263]  [] ? iommu_setup+0x27d/0x27d
>[1.660342]  [] do_one_initcall+0x93/0x137
>[1.661415]  [] ? param_set_charp+0x92/0xd8
>[1.662503]  [] ? parse_args+0x189/0x247
>[1.663555]  [] kernel_init_freeable+0x15e/0x1df
>[1.664724]  [] ? do_early_param+0x88/0x88
>[1.665814]  [] ? rest_init+0xdb/0xdb
>[1.666824]  [] kernel_init+0xe/0xdb
>[1.667824]  [] ret_from_fork+0x7c/0xb0
>[1.668911]  [] ? rest_init+0xdb/0xdb
>[1.669925] Mem-Info:
>[1.670508] Node 0 DMA per-cpu:
>
>Thanks,
>Fengguang
N�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ��&�)撷f��^j谦y�m��@A�a囤�
0鹅h���i

RE: 6e543d5780e fixed a boot hang

2013-10-09 Thread Lisa Du
-Original Message-
From: Fengguang Wu [mailto:fengguang...@intel.com]
Sent: 2013年10月9日 22:12
To: Lisa Du
Cc: KOSAKI Motohiro; linux...@kvack.org; linux-kernel@vger.kernel.org
Subject: 6e543d5780e fixed a boot hang

Greetings,

FYI, this commit seem to fix a boot hang problem here.

commit 6e543d5780e36ff5ee56c44d7e2e30db3457a7ed
Author: Lisa Du c...@marvell.com
Date:   Wed Sep 11 14:22:36 2013 -0700

mm: vmscan: fix do_try_to_free_pages() livelock


[1.394871] pci :00:02.0: Boot video device
[1.395883] PCI: CLS 0 bytes, default 64

In parent commit, it will hang right here.

With this commit, it will continue to emit the below OOM messages (which is 
not a surprise to me because the boot test runs in a small
memory KVM and the kconfig builds in lots of drivers).
I think you may meet the same issue as mine.
Direct reclaim loop forever with zone-all_unreclaimable = 0(as kswapd sleeps 
forever).
And at the boot stage, no one detect and terminate it, so you see the boot 
hang. 
After apply this patch, you see there's oom-killer invoked as direct reclaim 
would break when zone was unreclaimable.

[1.631892] swapper/0 invoked oom-killer: gfp_mask=0x2000d0, 
 order=1, oom_score_adj=0
[1.633549] swapper/0 cpuset=/ mems_allowed=0
[1.634443] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 
 3.12.0-rc4-00019-g8b5ede6 #126
[1.635982] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[1.637088]  0002 88001dd41b28 82c8d78f 
 88001ef7c040
[1.638955]  88001dd41ba8 82c8395f 83c54680 
 88001dd41b60
[1.640830]  810f3f06 1eb4 0246 
 88001dd41b98
[1.642687] Call Trace:
[1.643313]  [82c8d78f] dump_stack+0x54/0x74
[1.644331]  [82c8395f] dump_header.isra.10+0x7a/0x1ba
[1.645443]  [810f3f06] ? 
 lock_release_holdtime.part.27+0x4c/0x50
[1.646685]  [810f795a] ? lock_release+0x189/0x1d1
[1.647744]  [811530a8] out_of_memory+0x39e/0x3ee
[1.648882]  [811579f5] __alloc_pages_nodemask+0x668/0x7de
[1.650385]  [8118eb53] kmem_getpages+0x75/0x16c
[1.651429]  [81190d20] fallback_alloc+0x12c/0x1ea
[1.652528]  [810f38e8] ? trace_hardirqs_off+0xd/0xf
[1.653627]  [81190be5] cache_alloc_node+0x14a/0x159
[1.654783]  [817059fb] ? dma_debug_init+0x1ef/0x29a
[1.655928]  [8119162c] kmem_cache_alloc_trace+0x83/0x11a
[1.657108]  [817059fb] dma_debug_init+0x1ef/0x29a
[1.658182]  [841ac38b] pci_iommu_init+0x16/0x52
[1.659263]  [841ac375] ? iommu_setup+0x27d/0x27d
[1.660342]  [810020d2] do_one_initcall+0x93/0x137
[1.661415]  [810bd300] ? param_set_charp+0x92/0xd8
[1.662503]  [810bd52e] ? parse_args+0x189/0x247
[1.663555]  [8419fed1] kernel_init_freeable+0x15e/0x1df
[1.664724]  [8419f729] ? do_early_param+0x88/0x88
[1.665814]  [82c77867] ? rest_init+0xdb/0xdb
[1.666824]  [82c77875] kernel_init+0xe/0xdb
[1.667824]  [82cbc57c] ret_from_fork+0x7c/0xb0
[1.668911]  [82c77867] ? rest_init+0xdb/0xdb
[1.669925] Mem-Info:
[1.670508] Node 0 DMA per-cpu:

Thanks,
Fengguang
N�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�j:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ���)撷f��^j谦y�m��@A�a囤�
0鹅h���i