Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

2011-12-22 Thread Marek Królikowski
Hello,
After 24 hours I saw TEST-MAIL2 reboot (possible kernel panic), but 
TEST-MAIL1 has this in dmesg:
TEST-MAIL1 ~ # dmesg
[cut]
o2net: accepted connection from node TEST-MAIL2 (num 1) at 172.17.1.252:
o2dlm: Node 1 joins domain B24C4493BBC74FEAA3371E2534BB3611
o2dlm: Nodes in domain B24C4493BBC74FEAA3371E2534BB3611: 0 1
o2net: connection to node TEST-MAIL2 (num 1) at 172.17.1.252: has been 
idle for 60.0 seconds, shutting it down.
(swapper,0,0):o2net_idle_timer:1562 Here are some times that might help 
debug the situation: (Timer: 33127732045, Now 33187808090, DataReady 
33127732039, Advance 33127732051-33127732051, Key 0xebb9cd47, Func 506, 
FuncTime 33127732045-33127732048)
o2net: no longer connected to node TEST-MAIL2 (num 1) at 172.17.1.252:
(du,5099,12):dlm_do_master_request:1324 ERROR: link to 1 went down!
(du,5099,12):dlm_get_lock_resource:907 ERROR: status = -112
(dlm_thread,14321,1):dlm_send_proxy_ast_msg:484 ERROR: 
B24C4493BBC74FEAA3371E2534BB3611: res M0cf023ef70, 
error -112 send AST to node 1
(dlm_thread,14321,1):dlm_flush_asts:605 ERROR: status = -112
(dlm_thread,14321,1):dlm_send_proxy_ast_msg:484 ERROR: 
B24C4493BBC74FEAA3371E2534BB3611: res P00, 
error -107 send AST to node 1
(dlm_thread,14321,1):dlm_flush_asts:605 ERROR: status = -107
(kworker/u:3,5071,0):o2net_connect_expired:1724 ERROR: no connection 
established with node 1 after 60.0 seconds, giving up and returning errors.
(o2hb-B24C4493BB,14310,0):o2dlm_eviction_cb:267 o2dlm has evicted node 1 
from group B24C4493BBC74FEAA3371E2534BB3611
(ocfs2rec,5504,6):dlm_get_lock_resource:834 
B24C4493BBC74FEAA3371E2534BB3611:M15f023ef70: at least 
one node (1) to recover before lock mastery can begin
(ocfs2rec,5504,6):dlm_get_lock_resource:888 
B24C4493BBC74FEAA3371E2534BB3611:M15f023ef70: at least 
one node (1) to recover before lock mastery can begin
(du,5099,12):dlm_restart_lock_mastery:1213 ERROR: node down! 1
(du,5099,12):dlm_wait_for_lock_mastery:1030 ERROR: status = -11
(du,5099,12):dlm_get_lock_resource:888 
B24C4493BBC74FEAA3371E2534BB3611:N0020924f: at least one node (1) to 
recover before lock mastery can begin
(dlm_reco_thread,14322,0):dlm_get_lock_resource:834 
B24C4493BBC74FEAA3371E2534BB3611:$RECOVERY: at least one node (1) to recover 
before lock mastery can begin
(dlm_reco_thread,14322,0):dlm_get_lock_resource:868 
B24C4493BBC74FEAA3371E2534BB3611: recovery map is not empty, but must master 
$RECOVERY lock now
(dlm_reco_thread,14322,0):dlm_do_recovery:523 (14322) Node 0 is the Recovery 
Master for the Dead Node 1 for Domain B24C4493BBC74FEAA3371E2534BB3611
(ocfs2rec,5504,6):ocfs2_replay_journal:1549 Recovering node 1 from slot 1 on 
device (253,0)
(ocfs2rec,5504,6):ocfs2_begin_quota_recovery:407 Beginning quota recovery in 
slot 1
(kworker/u:0,2909,0):ocfs2_finish_quota_recovery:599 Finishing quota 
recovery in slot 1

And I tried these commands:
debugfs.ocfs2 -l ENTRY EXIT DLM_GLUE QUOTA INODE DISK_ALLOC EXTENT_MAP allow
debugfs.ocfs2: Unable to write log mask ENTRY: No such file or directory
debugfs.ocfs2 -l ENTRY EXIT DLM_GLUE QUOTA INODE DISK_ALLOC EXTENT_MAP off
debugfs.ocfs2: Unable to write log mask ENTRY: No such file or directory

But they are not working.
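
One possible cause (an assumption, not confirmed in this thread): kernels 
from the 3.0 era dropped the ENTRY and EXIT log bits, so those mask files 
simply no longer exist. Checking what the running kernel actually exposes 
before setting masks would look roughly like this (the logmask directory 
may be /sys/o2cb/logmask on older kernels):

# List the mask files the kernel provides:
ls /sys/fs/o2cb/logmask/
# With -l and no mask arguments, debugfs.ocfs2 should print the current settings:
debugfs.ocfs2 -l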


-----Original Message----- 
From: Srinivas Eeda
Sent: Wednesday, December 21, 2011 8:43 PM
To: Marek Królikowski
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

Those numbers look good. Basically, with the fixes backed out and another
fix I gave, you are not seeing that many orphans hanging around and
hence not seeing the stuck-process kernel stacks. You can run the test
longer, or, if you are satisfied, please enable quotas and re-run the test
with the modified kernel. You might see a deadlock which needs to be
fixed (I was not able to reproduce this yet). If the system hangs, please
capture the following and provide me the output:

1. echo t > /proc/sysrq-trigger
2. debugfs.ocfs2 -l ENTRY EXIT DLM_GLUE QUOTA INODE DISK_ALLOC EXTENT_MAP 
allow
3. wait for 10 minutes
4. debugfs.ocfs2 -l ENTRY EXIT DLM_GLUE QUOTA INODE DISK_ALLOC EXTENT_MAP 
off
5. echo t > /proc/sysrq-trigger
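
Each "echo t" dumps every task's stack into the kernel ring buffer, which 
can wrap on a busy box, so it is worth saving right after each dump (the 
file name is arbitrary):

echo t > /proc/sysrq-trigger
dmesg > sysrq-t-before.log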


___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


[Ocfs2-users] One node, two clusters?

2011-12-22 Thread Kushnir, Michael (NIH/NLM/LHC) [C]
Is it possible to have one machine be part of two different ocfs2 clusters with 
two different SANs? Kind of to serve as a bridge for moving data between two 
clusters, but without actually fully combining the two clusters?

Thanks,
Michael

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
You don't need to have two clusters for this. This can be accomplished
with one cluster with the default local heartbeat.

Create one cluster.conf with all the nodes. All nodes, except the one
machine, will mount from just one SAN. The common node will mount from
both SANs.

If you look at the cluster membership, other than the common node,
all nodes will be interacting (network connection, etc.) with nodes that
they can see on the SAN.
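
For illustration, a single /etc/ocfs2/cluster.conf covering all the nodes 
might look roughly like this (names, addresses, and the node count below 
are made up; the key = value lines are conventionally tab-indented):

cluster:
	node_count = 3
	name = mycluster

node:
	ip_port = 7777
	ip_address = 172.16.1.10
	number = 0
	name = san1-node
	cluster = mycluster

node:
	ip_port = 7777
	ip_address = 172.16.2.10
	number = 1
	name = san2-node
	cluster = mycluster

node:
	ip_port = 7777
	ip_address = 172.16.1.20
	number = 2
	name = bridge-node
	cluster = mycluster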

On 12/22/2011 09:40 AM, Werner Flamme wrote:
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1

 Kushnir, Michael (NIH/NLM/LHC) [C] [22.12.2011 18:20]:
 Is it possible to have one machine be part of two different ocfs2
 clusters with two different SANs? Kind of to serve as a bridge for
 moving data between two clusters, but without actually fully
 combining the two clusters?

 Thanks, Michael
 Michael,

 I asked this two years ago and the answer was no.

 When I look at /etc/ocfs2/cluster.conf, I do not see a possibility to
 configure a second cluster. Though the nodes must be assigned to a
 cluster (and to exactly one cluster, at that), there is only one
 cluster: entry in the file, and so there is no way to define a second one.

 We synced via rsync :-(

 HTH
 Werner

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

 iEYEARECAAYFAk7za4EACgkQk33Krq8b42MvSwCfQAXzqVQRPyhOdFrKM8PCPqbf
 g0cAn20CV4rjzXNrTa/YGaUeNlO3+rmc
 =CBmQ
 -----END PGP SIGNATURE-----



___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Kushnir, Michael (NIH/NLM/LHC) [C]
Is there a separate DLM instance for each ocfs2 volume?

I have two sub-clusters in the same cluster... a 10-node Hadoop cluster 
sharing a SATA RAID10, and a two-node web server cluster sharing an SSD RAID0. 
One server mounts both volumes to move data between them as necessary. 

This morning I got the following error (see end of message), and all nodes lost 
access to all storage. I'm trying to mitigate the risk of this happening again. 

My Hadoop nodes are used to generate search engine indexes, so they can go 
down. But my web servers provide the search engine service, so I need them 
not to be tied to my Hadoop nodes. I just feel safer that way. At the same time, 
I need a bridge node to move data between the two. I can do it via NFS or 
SCP, but I figured it'd be worthwhile to ask if one node can be in two 
different clusters. 

Dec 22 09:15:42 lhce-imed-web1 kernel: 
(updatedb,1832,1):dlm_get_lock_resource:898 
042F68B6AF134E5C9A9EDF4D7BD7BE99:O0013d2ef94: at least one 
node (11) to recover before lock mastery can begin

Thanks,
Mike


-----Original Message-----
From: Sunil Mushran [mailto:sunil.mush...@oracle.com] 
Sent: Thursday, December 22, 2011 1:21 PM
To: Werner Flamme
Cc: ocfs2-users ML
Subject: Re: [Ocfs2-users] One node, two clusters?

 [quoted message snipped]


___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
On 12/22/2011 10:39 AM, Kushnir, Michael (NIH/NLM/LHC) [C] wrote:
 [snip]

 Dec 22 09:15:42 lhce-imed-web1 kernel: 
 (updatedb,1832,1):dlm_get_lock_resource:898 
 042F68B6AF134E5C9A9EDF4D7BD7BE99:O0013d2ef94: at least 
 one node (11) to recover before lock mastery can begin


You should add ocfs2 to PRUNEFS in /etc/updatedb.conf. updatedb generates
a lot of I/O and network traffic, and it will happen at around the same time on 
all nodes.
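
For example, the line in /etc/updatedb.conf would end up looking something 
like this (the exact default list varies by distribution; the point is just 
to append ocfs2):

PRUNEFS="NFS nfs nfs4 proc smbfs autofs iso9660 udf tmpfs sysfs devpts ocfs2"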

Yes, each volume has a different dlm domain (instance).
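
(One way to see this, assuming debugfs is mounted at /sys/kernel/debug: each 
mounted volume's dlm domain, named by the volume UUID, gets its own directory 
there.)

ls /sys/kernel/debug/o2dlm/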

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

2011-12-22 Thread srinivas eeda
We need to know what happened to node 2. Was the node rebooted because 
of a network timeout or a kernel panic? Can you please configure 
netconsole and a serial console, and rerun the test?
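
A minimal netconsole setup sketch (the interface, addresses, MAC, and ports 
below are placeholders for your environment) that streams the dying node's 
kernel messages to another machine:

# On TEST-MAIL2: send kernel messages over UDP to a log host.
modprobe netconsole netconsole=6665@172.17.1.252/eth0,6666@172.17.1.251/00:11:22:33:44:55

# On the log host: capture whatever arrives (flags vary by netcat flavor).
nc -u -l 6666 | tee test-mail2-console.log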

On 12/22/2011 8:08 AM, Marek Królikowski wrote:
 [earlier message quoted in full; snipped]


___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

2011-12-22 Thread Marek Królikowski
OK, I reconfigured the server and am running the test again. I hope it dies 
again tomorrow, because I see in the log that it crashed after 10 hours of 
working with no problem.
Thanks


-----Original Message----- 
From: srinivas eeda
Sent: Thursday, December 22, 2011 9:12 PM
To: Marek Królikowski
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

We need to know what happened to node 2. Was the node rebooted because
of a network timeout or kernel panic? can you please configure
netconsole, serial console and rerun the test?

On 12/22/2011 8:08 AM, Marek Królikowski wrote:
 [earlier message quoted in full; snipped]


___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users