Thanks to our HPC support team, who solved my problem.
They did something to my account, and now I am able to run large replica 
exchange jobs.
I don't know whether this was really related to disk quota or to a special 
permission.
But for anyone who encounters this type of problem,
you can try running the job from your home directory.
If that works, then you need to contact the administrator to remove the 
limitation on scratch space.
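
Before contacting the administrator, it can help to compare the two locations 
directly. The script below is only a minimal sketch, not what mdrun does 
internally: it writes a small file, renames it, and reports free space in each 
directory. The SCRATCH environment variable and the function names are 
assumptions made for illustration; substitute your own paths. Note that free 
space on a shared filesystem is not the same as a per-user quota, and since 
only one or two nodes complained in my case, it is worth running this on the 
affected compute node rather than on the login node.

```python
import os
import shutil
import tempfile


def check_write_and_rename(directory):
    """Write a small file in `directory`, rename it, then delete it.

    Returns (ok, message). This only loosely imitates a write-then-rename
    step; it is an illustrative check, not GROMACS's own checkpoint logic.
    """
    created = []
    try:
        fd, tmp_path = tempfile.mkstemp(prefix="cpt_test_", dir=directory)
        created.append(tmp_path)
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"x" * 1024)
            handle.flush()
            os.fsync(handle.fileno())  # force the data out to the filesystem
        final_path = tmp_path + ".renamed"
        os.rename(tmp_path, final_path)
        created = [final_path]
        return True, f"{directory}: write and rename succeeded"
    except OSError as err:
        return False, f"{directory}: {err}"
    finally:
        # Remove whatever test file is left behind, ignoring cleanup errors.
        for path in created:
            try:
                os.remove(path)
            except OSError:
                pass


def free_space_gib(directory):
    """Free space on the filesystem holding `directory` (not a user quota)."""
    return shutil.disk_usage(directory).free / 2**30


if __name__ == "__main__":
    # Hypothetical locations: replace with your own home and scratch paths.
    home = os.path.expanduser("~")
    scratch = os.environ.get("SCRATCH", tempfile.gettempdir())
    for location in (home, scratch):
        ok, message = check_write_and_rename(location)
        if ok:
            message += f" ({free_space_gib(location):.1f} GiB free)"
        print(message)
```

If the check passes in the home directory but fails on scratch, that points to 
a limit or permission on the scratch space rather than to GROMACS itself.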


Chuanyin Shi
Department of Chemistry & Biochemistry
University of Oklahoma
Email: c...@ou.edu
________________________________________
From: gmx-users-boun...@gromacs.org [gmx-users-boun...@gromacs.org] on behalf 
of Shi, Chuanyin [c...@ou.edu]
Sent: Monday, October 25, 2010 11:37 AM
To: Discussion list for GROMACS users
Subject: RE: [gmx-users] Job crash: checkpoint file

I wish the error did come from the quota.
But the disk quota is fine; we still have 2.0TB on the scratch space, which is 
shared by everyone.
I also have other jobs running with no problem writing outputs.
I've tried smaller REMD jobs on the cluster using only 1 node (8 CPUs), and 
there seems to be no problem.
But when using 7 nodes, one (or two) of the nodes complains about it.
Several weird files are also generated: mdrun_mpi.80s-12939,v002.local.btr, 
mdrun_mpi.80s-12940,v002.local.btr, ....
v002 is the name of the node.


Chuanyin Shi
Department of Chemistry & Biochemistry
University of Oklahoma
Email: c...@ou.edu
________________________________________
From: gmx-users-boun...@gromacs.org [gmx-users-boun...@gromacs.org] on behalf 
of David van der Spoel [sp...@xray.bmc.uu.se]
Sent: Monday, October 25, 2010 11:21 AM
To: Discussion list for GROMACS users
Subject: Re: [gmx-users] Job crash: checkpoint file

On 2010-10-25 17.58, Shi, Chuanyin wrote:
> I have been having exactly the same problem recently.
> The replica exchange job stops at around 11000 steps.
> When I switch to another cluster, the job runs fine.
> I wonder how often you've seen this type of crash, and whether there are any 
> solutions for it?
> Thanks.

Have you checked your quota?
I had the same problem recently, and I was indeed out of quota.
>
>
>
> Chuanyin Shi
> Department of Chemistry & Biochemistry
> University of Oklahoma
> Email: c...@ou.edu
> ________________________________________
> From: gmx-users-boun...@gromacs.org [gmx-users-boun...@gromacs.org] on behalf 
> of Justin A. Lemkul [jalem...@vt.edu]
> Sent: Friday, October 08, 2010 1:19 PM
> To: Discussion list for GROMACS users
> Subject: Re: [gmx-users] Job crash: checkpoint file
>
> Jianhui Tian wrote:
>> Dear GMX users,
>>
>> I am running a replica simulation and the job crashed with the following
>> message:
>>
>> File input/output error:
>> Cannot rename checkpoint file; maybe you are out of quota?
>>
>> From the mailing list, I see this might be a permission problem.
>> However, I checked the file permissions and noticed nothing wrong.
>> If I rerun the crashed simulation, it goes through the second time. This
>> seems strange. Any suggestions are welcome.
>>
>
> I've seen this happen when our filesystem blips.  It seems like you're able to
> run your job, so I don't think there's anything to do about it, except perhaps
> inquire with your sysadmins about the stability of the filesystem, and whether
> or not you can expect to have this happen frequently.
>
> -Justin
>
>> JH
>>
>
> --
> ========================================
>
> Justin A. Lemkul
> Ph.D. Candidate
> ICTAS Doctoral Scholar
> MILES-IGERT Trainee
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>
> ========================================
> --
> gmx-users mailing list    gmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-users-requ...@gromacs.org.
> Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


--
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:  +46184714205.
sp...@xray.bmc.uu.se    http://folding.bmc.uu.se