For reference, my 'reader' process that's doing the spinlocking is in an R state. The 'writer' process is in a D state, as is every other proftpd process that's attempting to get to that file.
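
In case it helps anyone collect the same R/D breakdown on their own nodes, here is a rough sketch that groups `ps` output by process state. It reuses the column layout from the `ps` command quoted further down in this thread; the helper names (`group_by_state`, `snapshot`) are made up for illustration, and it assumes command names contain no spaces.

```python
import subprocess
from collections import defaultdict

# Column layout borrowed from the ps command quoted later in this thread.
PS_CMD = ["ps", "-e", "-o", "pid,stat,comm,wchan=WIDE-WCHAN-COLUMN"]


def group_by_state(ps_output, name_filter=""):
    """Group ps rows by the first letter of STAT (R, D, S, ...).

    Returns {state_letter: [(pid, comm, wchan), ...]}.
    Assumes whitespace-separated columns and no spaces in command names.
    """
    groups = defaultdict(list)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed row
        pid, stat, comm = fields[0], fields[1], fields[2]
        wchan = fields[3] if len(fields) > 3 else "-"
        if name_filter in comm:
            groups[stat[0]].append((int(pid), comm, wchan))
    return dict(groups)


def snapshot(name_filter="proftpd"):
    """Run ps once and group the matching processes by state."""
    result = subprocess.run(PS_CMD, capture_output=True, text=True, check=True)
    return group_by_state(result.stdout, name_filter)
```

Calling `snapshot()` on each node and comparing the "R" and "D" entries should show the same picture as above: one runnable reader and the rest stuck in uninterruptible sleep. The wchan column is only a hint about where a D-state process is waiting.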

--Jason

On Fri, Apr 2, 2010 at 3:12 PM, David Johle <djo...@industrialinfo.com> wrote:
>
> FWIW, I have seen a similar problem here on occasion, but with vsftpd
> instead.
>
> When I run `ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN` I usually see
> one node with a single vsftpd in D (uninterruptable I/O) state, and multiple
> vsftpd processes on the other node, presumably waiting for the resource.
>
> I also believe this when multiple processes are trying to read & write the
> same file via FTP. And if left alone for a bit, other programs that may
> read the same file will get hung waiting as well. Mine are typically not
> busy waits though, but I have seen a couple that were.
>
> Sometimes I will find that all is cleared and back to normal after a short
> while (a timeout somewhere perhaps?). Usually the only solution is to
> reboot one or both nodes, which I have to instigate via kernel panic/self
> fence because a normal shutdown also gets caught up by the non-killable
> processes.
>
>
> I need to get a netconsole set up to capture some stuff for the next time so
> that I can add it to the bugzilla as well.
>
>
> At 10:52 AM 4/2/2010, Jason Price wrote:
>>
>> Message: 1
>> Date: Fri, 2 Apr 2010 11:38:24 -0400
>> From: Jason Price <japr...@gmail.com>
>> Subject: [Ocfs2-users] Ftp server... single file seems locked
>> To: ocfs2-users@oss.oracle.com
>> Message-ID:
>> <p2r83f15e31004020838o961f478cg19ae4f403631...@mail.gmail.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> I'm setting up an HA ftp server (amongst other services).
>>
>> When two connections happen simultaneously, and (more specifically) the same
>> user from two IP's attempt to access the same file (one for reading, and one
>> for writing), the processes both hang. And all subsequent attempts to
>> either read or write the file fail.
>>
>> The two processes that seem to have caused the lock:
>> user 24139 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs -
>> ::ffff:xxx.yyy.0.253: RETR prim_wo_img_dom.obs
>> user 24142 1657 Thu Apr 1 18:25:01 2010 proftpd: cbs -
>> ::ffff:xxx.yyy.103.208: STOR prim_wo_img_dom.obs
>>
>> (there are 49 other process trying to do the same things, but these are the
>> first ones.)
>
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users