It looks like this was caused by an issue with 2.9.0 that we recently discovered. A fix is in testing. If you could send the log from the server running on crill-013, that would help confirm whether this is, indeed, the known problem.
Thanks,
Elaine

On Fri, Feb 6, 2015 at 3:25 PM, Becky Ligon <[email protected]> wrote:
> Jyothi:
>
> Can you also send us your config file?
>
> QUESTION: Do you get these problems when using a clean filesystem? I think
> the answer is yes based on your email, but please verify.
>
> It looks like you are having problems with removing files and/or
> directories. Can you give me some idea of what your code is doing in terms
> of creating and/or deleting files and directories?
>
> Becky
>
> On Fri, Feb 6, 2015 at 2:41 PM, Mangala Jyothi Bhaskar
> <[email protected]> wrote:
>>
>> Hi,
>>
>> We have a 16-node cluster with PVFS installed, with the nodes configured
>> as I/O and metadata servers. We used PVFS 2.8.2 for many years without
>> any major problems. Recently, we upgraded the cluster software and at the
>> same time upgraded PVFS to OrangeFS 2.9.0. The cluster is now running
>> kernel 3.11.
>>
>> We have been running some parallel I/O tests using OpenMPI and have
>> observed an issue when writing with higher numbers of processes. (Up to
>> about 100 processes we do not see this issue.)
>>
>> We are running a case of MPI Tile I/O with 100 processes. This leads to a
>> file system crash as one of the servers fails. We did not observe this
>> issue with PVFS 2.8.2.
>>
>> Also, after restoring the file system, we see a lot of locktest files
>> (like those shown below) in the directory, which makes the directory
>> unusable for any further work. These files cannot be deleted unless the
>> metadata is deleted and the storage is re-created.
>>
>> Please find the attached server log from the server that crashed for more
>> details.
>>
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.104
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.100
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.1
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.230
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.68
>> -rw-r--r-- 1 mjbhaskar users 0 Feb 5 17:27 output_256_16_16_2048_1600_64_testpvfs.txt.locktest.138
>>
>> We would appreciate it if you could let us know what might have caused
>> this, or how to debug the problem.
>>
>> Thanks,
>>
>> Jyothi
>>
>> _______________________________________________
>> Pvfs2-users mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
