Re: [Ocfs2-users] null pointer dereference

2012-08-21 Thread Sunil Mushran
You may want to run a full fsck on the fs.

fsck.ocfs2 -fy /dev/

On Tue, Aug 21, 2012 at 12:49 AM, Pawel  wrote:

> Hi,
> After upgrading ocfs2, my cluster is unstable.
>
> At least once per week I see:
> kernel panic: Null pointer dereference  at 00048
> o2dlm_blocking_ast_wrapper + 0x8/0x20 [ocfs2_stack_o2cb]
> stack:
> dlm_do_local_bast [ocfs2_dlm]
> dlm_lookup_lockers [ocfs2_dlm]
> dlm_proxy_ast_handler
> add_timer
> ..
>
> After that, a deadlock sometimes occurs on other nodes. Restarting the
> entire cluster resolves the issue.
> I see this in the log:
> (dlm_thread,7227,3):dlm_send_proxy_ast_msg:484 ERROR:
> ECB9442E19A94EAC896641BFADD55E4B: res M0001f411c9,
> error -107 send AST to node 4
> (dlm_thread,7227,3):dlm_flush_asts:605 ERROR: status = -107
> o2net: No connection established with node 4 after 10.0 seconds, giving up.
> o2net: No connection established with node 4 after 10.0 seconds, giving up.
> o2net: No connection established with node 4 after 10.0 seconds, giving up.
> (dlm_thread,7227,4):dlm_send_proxy_ast_msg:484 ERROR:
> ECB9442E19A94EAC896641BFADD55E4B: res M0001f411c9,
> error -107 send AST to node 4
> (dlm_thread,7227,4):dlm_flush_asts:605 ERROR: status = -107
> o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
> o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
> o2dlm: Begin recovery on domain ECB9442E19A94EAC896641BFADD55E4B for node 4
> o2dlm: Node 5 (he) is the Recovery Master for the dead node 4 in domain
> ECB9442E19A94EAC896641BFADD55E4B
> o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B
>
>
> Additionally, about 4 times per day I see:
>
> ocfs2_check_dir_for_entry:2119 ERROR: status = -17
> ocfs2_mknod:459 ERROR: status = -17
> ocfs2_create:629 ERROR: status = -17
>
>
> I currently use kernel 3.4.2.
> My filesystem was created with:
> -N 8 -b 4096 -C 32768 --fs-features
> backup-super,strict-journal-super,sparse,extended-slotmap,inline-data,metaecc,xattr,indexed-dirs,refcount,discontig-bg,unwritten,usrquota,grpquota
>
> Could you tell me what could be making my system unstable? Which feature?
>
> Thanks for any help
>
> Pawel
>
>
> ___
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>

[Ocfs2-users] null pointer dereference

2012-08-21 Thread Pawel
Hi,
After upgrading ocfs2, my cluster is unstable.

At least once per week I see:
kernel panic: Null pointer dereference  at 00048
o2dlm_blocking_ast_wrapper + 0x8/0x20 [ocfs2_stack_o2cb]
stack:
dlm_do_local_bast [ocfs2_dlm]
dlm_lookup_lockers [ocfs2_dlm]
dlm_proxy_ast_handler
add_timer
..

After that, a deadlock sometimes occurs on other nodes. Restarting the
entire cluster resolves the issue.
I see this in the log:
(dlm_thread,7227,3):dlm_send_proxy_ast_msg:484 ERROR: 
ECB9442E19A94EAC896641BFADD55E4B: res M0001f411c9, 
error -107 send AST to node 4
(dlm_thread,7227,3):dlm_flush_asts:605 ERROR: status = -107
o2net: No connection established with node 4 after 10.0 seconds, giving up.
o2net: No connection established with node 4 after 10.0 seconds, giving up.
o2net: No connection established with node 4 after 10.0 seconds, giving up.
(dlm_thread,7227,4):dlm_send_proxy_ast_msg:484 ERROR: 
ECB9442E19A94EAC896641BFADD55E4B: res M0001f411c9, 
error -107 send AST to node 4
(dlm_thread,7227,4):dlm_flush_asts:605 ERROR: status = -107
o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
o2cb: o2dlm has evicted node 4 from domain ECB9442E19A94EAC896641BFADD55E4B
o2dlm: Begin recovery on domain ECB9442E19A94EAC896641BFADD55E4B for node 4
o2dlm: Node 5 (he) is the Recovery Master for the dead node 4 in domain 
ECB9442E19A94EAC896641BFADD55E4B
o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B


Additionally, about 4 times per day I see:

ocfs2_check_dir_for_entry:2119 ERROR: status = -17
ocfs2_mknod:459 ERROR: status = -17
ocfs2_create:629 ERROR: status = -17
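
For reference when reading these logs: the status values are negated Linux
errno codes, so -17 is EEXIST ("File exists") and the -107 in the dlm
messages is ENOTCONN ("Transport endpoint is not connected"). A minimal
sketch of decoding them follows. The table, the regular expression, and the
decode_status helper are illustrative assumptions, not part of ocfs2 or its
tools, and the table covers only the two codes seen in this thread.

```python
import re

# Linux errno values for the two codes that appear in this thread.
# Hard-coded (an assumption for portability) rather than read from the
# local platform, because errno numbering differs between operating systems.
LINUX_ERRNO = {
    17: ("EEXIST", "File exists"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Matches "status = -17" as well as "error -107 send AST to node 4".
STATUS_RE = re.compile(r"(?:status = |error )-(\d+)")


def decode_status(line):
    """Return (name, description) for the negated errno in a log line.

    Returns None when the line carries no status value, and a placeholder
    tuple when the code is not in the small table above.
    """
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return LINUX_ERRNO.get(code, ("UNKNOWN", "code %d not in table" % code))


if __name__ == "__main__":
    samples = [
        "ocfs2_mknod:459 ERROR: status = -17",
        "error -107 send AST to node 4",
        "o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B",
    ]
    for sample in samples:
        print(sample, "->", decode_status(sample))
```

Read this way, the -107 lines are consistent with the o2net messages about
losing the connection to node 4, and the -17 lines indicate a create call
that found the name already present in the directory.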


I currently use kernel 3.4.2.
My filesystem was created with:
-N 8 -b 4096 -C 32768 --fs-features
backup-super,strict-journal-super,sparse,extended-slotmap,inline-data,metaecc,xattr,indexed-dirs,refcount,discontig-bg,unwritten,usrquota,grpquota

Could you tell me what could be making my system unstable? Which feature?

Thanks for any help

Pawel

