A problem with the load_elf_interp() in fs/binfmt_elf

2007-02-14 Thread shic

Hi, all

While doing some NFS mount operations, I found a problem with load_elf_interp() in fs/binfmt_elf.c.


I mounted a number of NFS exports in a loop; after roughly 1000~3000 consecutive 
mount operations, mount.nfs4 died with a fatal "Segmentation fault".

Using strace, I found that mount.nfs4 fails right at execve(). The strace
log is as follows:
execve("/sbin/mount.nfs4", ["mount.nfs4", "192.168.236.8:/", 
"/mnt/nfspt/num_mount_014847/nfs-"...], [/* 24 vars */]) = -1 EINVAL (Invalid argument)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

I have investigate it, and found the problem exists in the load_elf_interp() when the 
"Segmentation fault" happened.

When mount.nfs4 is executed, the execve() path in the kernel runs as follows:

sys_execve()
  do_execve()
    search_binary_handler()          /* selects linux_binfmt elf_format */
      load_elf_binary()              /* via elf_format->load_binary */
        elf_entry = load_elf_interp(...);
        if (BAD_ADDR(elf_entry)) {
            force_sig(SIGSEGV, current);
            retval = -EINVAL;
        }


In do_execve(), after setting up some data structures, the kernel invokes
search_binary_handler() to find the ELF binary loader for mount.nfs4 and to
read the ELF executable image into memory.
In my test, when the segmentation fault of mount.nfs4 happened, the address
elf_entry of the interpreter segment returned from load_elf_interp() was judged
a BAD_ADDR inside load_elf_binary(); the kernel then sent a forcible SIGSEGV to
the mount.nfs4 process and exited with retval -EINVAL.
After debugging the kernel, I found the cause to be the address map_addr in load_elf_interp().
When the problem happens, the map_addr returned from elf_map() passes the BAD_ADDR() check in load_elf_interp(), but unluckily the address elf_entry derived from "map_addr - ELF_PAGESTART(eppnt->p_vaddr)" fails the BAD_ADDR() check, and the failure then surfaces in load_elf_binary().
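
To make the arithmetic concrete, here is a small standalone demo of how a
map_addr that passes BAD_ADDR() can still yield an elf_entry that fails it,
through unsigned wraparound in the subtraction. The addresses and the
TASK_SIZE value are made-up illustrative numbers (a typical i386 layout),
not values captured from my machine, and BAD_ADDR() is modeled on its usual
definition against TASK_SIZE:

/* bad_addr_demo.c - hypothetical userspace illustration, not kernel code. */
#include <stdio.h>

#define TASK_SIZE        0xC0000000UL   /* typical i386 user/kernel split */
#define ELF_MIN_ALIGN    4096UL
#define ELF_PAGESTART(v) ((v) & ~(ELF_MIN_ALIGN - 1))
#define BAD_ADDR(x)      ((unsigned long)(x) >= TASK_SIZE)

int main(void)
{
        unsigned long map_addr = 0x00a01000UL;  /* assumed elf_map() result */
        unsigned long p_vaddr  = 0x00c04000UL;  /* assumed interp segment p_vaddr */

        /* The same expression used to derive the interpreter load bias;
         * if map_addr < ELF_PAGESTART(p_vaddr), the unsigned subtraction
         * wraps around and the result lands above TASK_SIZE. */
        unsigned long elf_entry = map_addr - ELF_PAGESTART(p_vaddr);

        printf("map_addr  = %#lx, BAD_ADDR = %d\n", map_addr, BAD_ADDR(map_addr));
        printf("elf_entry = %#lx, BAD_ADDR = %d\n", elf_entry, BAD_ADDR(elf_entry));
        return 0;
}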


By now, I've still not got a good method to resolve this problem in the 
load_elf_interp().
Any good ideas?

Thanks.

Best Regards.
Shi Chao


Re: [Patch][NFSv4]: Kernel panic (handle kernel NULL pointer) occurred in NFSv4 by the nfs_update_inode()

2007-01-28 Thread fj-shic

Hi, all

  Does anyone have a good idea about this issue?
  I really think this is a problem in NFSv4 that should be resolved
in the latest kernel.

  Thank you for any information on this issue.


Hi, all

 I have been working on evaluating the robustness of NFS.
 When I ran a file-creation test on NFSv4, a kernel panic (NULL
pointer dereference) happened during the test while
/proc/sys/sunrpc/nfs_debug was set to 65535.

The test I ran is:
  Mount:
  mount -t nfs4 192.168.0.21:/ /mnt
  Test script (a minimal sketch of the test program follows below):
1. A process creates and opens 1024 files on the NFSv4 mount.
2. The process writes a test message to each file.
3. The process closes all the open files and exits.
4. As soon as the process exits, the test umounts the NFSv4 mount
immediately.
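
For concreteness, a minimal sketch of such a test program; the mount point
and file names here are hypothetical and error handling is trimmed:

/* nfs_close_test.c - hypothetical reconstruction of the test described above. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NFILES 1024

int main(void)
{
        static int fds[NFILES];
        const char msg[] = "test message\n";
        char path[64];
        int i;

        for (i = 0; i < NFILES; i++) {          /* create and open 1024 files */
                snprintf(path, sizeof(path), "/mnt/testfile.%04d", i);
                fds[i] = open(path, O_CREAT | O_WRONLY, 0644);
                if (fds[i] < 0) {
                        perror("open");
                        exit(1);
                }
        }
        for (i = 0; i < NFILES; i++)            /* write a test message to each */
                if (write(fds[i], msg, sizeof(msg) - 1) < 0)
                        perror("write");
        for (i = 0; i < NFILES; i++)            /* close all the open files */
                close(fds[i]);
        return 0;                               /* the driver umounts right after */
}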

In the normal case I did not hit the problem in this test; but when
/proc/sys/sunrpc/nfs_debug is set to debug mode (65535), the panic
usually happens.

The kernel panic log is as follows:

NFS: nfs_update_inode(0:18/2049980 ct=1 info=0xe)
BUG: unable to handle kernel NULL pointer dereference at virtual address
000c
printing eip: deb82605
*pde = 001f3067
Oops:  [#1]
SMP
last sysfs file: /block/hda/removable
Modules linked in: nfs fscache nfsd ...
CPU:0
EIP:0060:[<deb82605>]    Not tainted VLI
EFLAGS: 00010246   (2.6.18-1.2747.el5 #1)
EIP is at nfs_update_inode+0xb0/0x692 [nfs]
...skip...
Process rpciod/0 (pid: 1865, ti=dd3ac000 task=ddd47550 task.ti=dd3ac000)
Stack: deba0609 ... ...
Call Trace:
 [<deb82c1f>] nfs_refresh_inode+0x38/0x1b0 [nfs]
 [<dea9f602>] rpc_exit_task+0x1e/0x6c [sunrpc]
 [<dea9f314>] __rpc_execute+0x82/0x1b3 [sunrpc]
 [<c0433899>] run_workqueue+0x83/0xc5
 [<c0434171>] worker_thread+0xd9/0x10c
 [<c0436620>] kthread+0xc0/0xec
 [<c0404d63>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10

I investigated the problem and found that the cause was a NULL
i_sb->s_root pointer in nfs_update_inode() when the panic
happened.

 In the kernel, at the end of the file-closing operation,
nfs_file_release() is invoked.

 I found the call sequence in the kernel to be as follows:

 nfs_file_release()
   |-- NFS_PROTO(inode)->file_release()
         |
         nfs_file_clear_open_context()
           |
           put_nfs_open_context()
             |-- nfs4_close_state()
             |     |-- nfs4_close_ops (asynchronous RPC)
             |           |
             |           nfs4_do_close()
             |             |
             |             nfs_update_inode()
             |               |-- inode == inode->i_sb->s_root->d_inode
             |
             |-- mntput(ctx->vfsmnt)
                   |
                   atomic_dec_and_lock(&mnt->mnt_count, &vfsmount_lock)
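
The ordering that matters is visible in put_nfs_open_context(); roughly, in
the 2.6.19-era fs/nfs/inode.c (an abbreviated sketch from memory of that
era's source, not a verbatim copy):

/* Abbreviated sketch of 2.6.19-era put_nfs_open_context(). The point is
 * only the ordering: the NFSv4 CLOSE is issued as an asynchronous RPC
 * before mntput() drops the mount reference, so nothing pins the mount
 * while nfs4_close_ops is still updating the inode. */
void put_nfs_open_context(struct nfs_open_context *ctx)
{
        if (atomic_dec_and_test(&ctx->count)) {
                if (ctx->state != NULL)
                        nfs4_close_state(ctx->state, ctx->mode); /* async CLOSE */
                if (ctx->cred != NULL)
                        put_rpccred(ctx->cred);
                dput(ctx->dentry);
                mntput(ctx->vfsmnt);    /* mnt->mnt_count decremented here */
                kfree(ctx);
        }
}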

After the asynchronous RPC call "nfs4_close_ops" is issued in
put_nfs_open_context(), the kernel invokes mntput(), and mnt->mnt_count
is decremented. In my test, sys_umount() was executed immediately after
the file-closing operation. In the normal case the asynchronous RPC call
"nfs4_close_ops" completes quickly, so it rarely happens that sys_umount()
runs before the nfs_update_inode() operation has finished. But when
sunrpc/nfs_debug is set to debug mode, a lot of printk operations are
performed in NFS. Because of all those printk operations, the RPC call
"nfs4_close_ops" is easily delayed, and it then becomes possible for
sys_umount() to run before the nfs_update_inode() operation ends. In
do_umount() (which can succeed because mnt->mnt_count has already been
decremented), sb->s_root is set to NULL in shrink_dcache_for_umount(),
which is invoked by nfs_kill_super().
The kernel therefore panics on a NULL pointer access when
nfs_update_inode() dereferences inode->i_sb->s_root.

Because sb->s_root of a super_block can be set to NULL by umount while
nfs4_close_ops() has not yet finished in NFS, it is really necessary to
check inode->i_sb->s_root for NULL in nfs_update_inode().


This problem still exists in the latest Linux kernel.
To resolve it, I have made the patch below against linux-2.6.19.2.
With the patch applied, the problem no longer occurs.

Signed-off-by: Shi Chao <[EMAIL PROTECTED]>

--- old/fs/nfs/inode.c 2007-01-17 17:19:48.994020288 -0500
+++ new/fs/nfs/inode.c 2007-01-17 17:20:26.570307824 -0500
@@ -914,7 +914,7 @@

 	server = NFS_SERVER(inode);
 	/* Update the fsid if and only if this is the root directory */
-	if (inode == inode->i_sb->s_root->d_inode
+	if (inode->i_sb->s_root && inode == inode->i_sb->s_root->d_inode
 			&& !nfs_fsid_equal(&server->fsid, &fattr->fsid))
 		server->fsid = fattr->fsid;