As another note, the process that's trying to read the file is in a VERY busy wait state: it's taking all the CPU it can get. strace doesn't show any output when I attach to the process.
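For anyone who wants to check the same thing on their own nodes, here is a minimal sketch that reads /proc/<pid>/status and reports the scheduler state of a process. It is not an OCFS2 tool, and the function names `parse_proc_status` and `describe_state` are my own, purely for illustration. A state of "R" together with a silent strace would be consistent with a process spinning without entering or leaving system calls, but that is only a guess on my part.

```python
import sys


def parse_proc_status(text):
    """Return a dict of the 'Key: value' fields found in /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_state(fields):
    """Return the single-letter scheduler state (e.g. 'R', 'S', 'D'), or None."""
    state = fields.get("State", "")
    return state.split()[0] if state else None


# Hypothetical usage: python procstate.py <pid>
if __name__ == "__main__" and len(sys.argv) > 1:
    pid = sys.argv[1]
    with open("/proc/%s/status" % pid) as f:
        fields = parse_proc_status(f.read())
    print(pid, fields.get("Name"), fields.get("State"))
```

Run it with the PID of one of the hung proftpd processes. The script only reads from /proc, so it should be safe to run against a stuck process.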
--Jason

On Fri, Apr 2, 2010 at 12:44 PM, Jason Price <japr...@gmail.com> wrote:
>
> To add further information:
>
> 1) Node A:
> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
> Domain: 6D419D86AE8A4DB1940788EDDA27027B  Key: 0xc955c1d5
> Thread Pid: 3869  Node: 1  State: JOINED
> Number of Joins: 1  Joining Node: 255
> Domain Map: 1 2
> Live Map: 1 2
> Lock Resources: 70731 (442210)
> MLEs: 0 (1048380)
>   Blocking: 0 (647669)
>   Mastery: 0 (400711)
>   Migration: 0 (0)
> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
> Purge Count: 0  Refs: 70732
> Dead Node: 255
> Recovery Pid: 3870  Master: 255  State: INACTIVE
> Recovery Map:
> Recovery Node State:
>
> Node B:
> # cat /sys/kernel/debug/o2dlm/6D419D86AE8A4DB1940788EDDA27027B/dlm_state
> Domain: 6D419D86AE8A4DB1940788EDDA27027B  Key: 0xc955c1d5
> Thread Pid: 3757  Node: 2  State: JOINED
> Number of Joins: 1  Joining Node: 255
> Domain Map: 1 2
> Live Map: 1 2
> Lock Resources: 48113 (50521)
> MLEs: 0 (85510)
>   Blocking: 0 (35121)
>   Mastery: 0 (50389)
>   Migration: 0 (0)
> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
> Purge Count: 0  Refs: 48114
> Dead Node: 255
> Recovery Pid: 3758  Master: 255  State: INACTIVE
> Recovery Map:
> Recovery Node State:
>
> There are apparently no busy locks, as shown by
>
> # debugfs.ocfs2 -R "fs_locks -B" /dev/sda1
> #
>
> I am unable to kill any of these processes, even with kill -9.
>
> # cat /etc/ocfs2/cluster.conf
> cluster:
>         node_count = 2
>         name = ocfs2ftpcluster
>
> node:
>         ip_port = 7777
>         ip_address = 192.168.0.1
>         number = 1
>         name = prtftp01
>         cluster = ocfs2ftpcluster
>
> node:
>         ip_port = 7777
>         ip_address = 192.168.0.2
>         number = 2
>         name = prtftp02
>         cluster = ocfs2ftpcluster
>
> If you'd like the full output of:
>
> # debugfs.ocfs2 -R "fs_locks" /dev/sda1 | wc -l
> 768681
>
> I can give it, but it's a lot of output.
>
> --Jason
>
> On Fri, Apr 2, 2010 at 11:38 AM, Jason Price <japr...@gmail.com> wrote:
>>
>> I'm setting up an HA ftp server (amongst other services).
>>
>> When two connections happen simultaneously, and (more specifically) the same
>> user from two IPs attempts to access the same file (one for reading, and one
>> for writing), both processes hang, and all subsequent attempts to
>> either read or write the file fail.
>>
>> The two processes that seem to have caused the lock:
>> user 24139 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.0.253: RETR prim_wo_img_dom.obs
>> user 24142 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs - ::ffff:xxx.yyy.103.208: STOR prim_wo_img_dom.obs
>>
>> (There are 49 other processes trying to do the same things, but these are the
>> first ones.)
>>
>> I'm more than happy to provide any information needed on this issue:
>>
>> OS:
>> CentOS release 5.4 (Final)
>>
>> uname -a:
>> Linux prtftp01<omitted> 2.6.18-164.11.1.el5 #1 SMP Wed Jan 20 07:32:21 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>>
>> ocfs2 version 1.4.4
>>
>> At the moment, only one host is actively serving FTP at any time. I can
>> fail the services back and forth as needed.
>>
>> --Jason
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users