On Thu, Jun 16, 2016 at 2:32 PM, Mark Abraham <mark.j.abra...@gmail.com>
wrote:

> Hi,
>
> On Thu, Jun 16, 2016 at 9:30 AM Husen R <hus...@gmail.com> wrote:
>
> > Hi,
> >
> > Thank you for your reply !
> >
> > md_test.xtc exists and is writable.
> >
>
> OK, but it needs to be seen that way from the set of compute nodes you are
> using, and organizing that is up to you and your job scheduler, etc.
>
>
> > I tried restarting from the checkpoint file while excluding a node other
> > than compute-node, and it works.
> >
>
> Go do that, then :-)
>

I'm building a simple system that can respond to node failure. If a failure
occurs on node A, then the application has to be restarted with that node
excluded.
This should apply to every node, including this 'compute-node'.
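
A minimal sketch of the resubmission step described above, assuming the
failed node names are already known from some monitoring step. The helper
names join_nodes and build_restart_cmd and the script name restart.sh are
hypothetical. The only Slurm behaviour relied on is that --exclude accepts a
comma-separated node list and that options given on the sbatch command line
take precedence over #SBATCH lines in the script.

```shell
#!/bin/bash
# Hypothetical helper: join node names with commas, the list format
# accepted by Slurm's --exclude option.
join_nodes() {
    local IFS=,
    echo "$*"
}

# Hypothetical helper: print (but do not run) the sbatch command that
# resubmits the given script while excluding the listed nodes.
build_restart_cmd() {
    local script=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "sbatch $script"
    else
        echo "sbatch --exclude=$(join_nodes "$@") $script"
    fi
}

# Example: build_restart_cmd restart.sh compute-node
```

Printing the command instead of running it makes it easy to check the
exclude list before anything is submitted.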

>
>
> > Only '--exclude=compute-node' produces this error.
> >
>
> Then there's something about that node that is special with respect to the
> file system - there's nothing about any particular node that GROMACS cares
> about.
>
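
One way to test the file-system point above is to look at the output file
from every node in the allocation before mdrun starts. This is only a sketch:
the function name check_file is hypothetical, and it assumes the job runs in
the directory that holds md_test.xtc.

```shell
#!/bin/bash
# Hypothetical helper: report whether a file exists and is writable as
# seen from the host this runs on. Returns 1 if either check fails.
check_file() {
    local f=$1
    local host
    host=$(uname -n)
    if [ ! -e "$f" ]; then
        echo "$host: $f: missing"
        return 1
    elif [ ! -w "$f" ]; then
        echo "$host: $f: not writable"
        return 1
    fi
    echo "$host: $f: ok"
}

# Inside a Slurm job script, one task per allocated node could run it:
#   srun -N "$SLURM_JOB_NUM_NODES" -n "$SLURM_JOB_NUM_NODES" \
#       bash -c "$(declare -f check_file); check_file md_test.xtc"
```

If a node reports the file as missing, the working directory is probably not
on storage shared by all nodes, which would fit the truncation failure
appearing only when compute-node is excluded.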

> Mark
>
>
> > Is this the same issue as in this thread?
> > http://comments.gmane.org/gmane.science.biology.gromacs.user/40984
> >
> > regards,
> >
> > Husen
> >
> > On Thu, Jun 16, 2016 at 2:20 PM, Mark Abraham <mark.j.abra...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > The stuff about different nodes or numbers of nodes doesn't matter -
> > > it's merely an advisory note from mdrun. mdrun failed when it tried to
> > > operate upon md_test.xtc, so perhaps you need to consider whether the
> > > file exists, is writable, etc.
> > >
> > > Mark
> > >
> > > On Thu, Jun 16, 2016 at 6:48 AM Husen R <hus...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I got the following error message when I tried to restart a GROMACS
> > > > simulation from a checkpoint file.
> > > > I restarted the simulation using fewer nodes and processes, and I also
> > > > excluded one node using the '--exclude=' option (in Slurm) for
> > > > experimental purposes.
> > > >
> > > > I'm sure that using fewer nodes and processes is not the cause of this
> > > > error, as I have already tested that.
> > > > I have checked that the cause of this error is the use of '--exclude='.
> > > > I excluded one node named 'compute-node' when restarting from the
> > > > checkpoint (in the first run, I used all nodes, including
> > > > 'compute-node').
> > > >
> > > >
> > > > It seems that in the first run, the job submission script was built on
> > > > compute-node. So, at restart, the build user mismatch appeared because
> > > > compute-node was not found (excluded).
> > > >
> > > > Am I right? Is this behavior normal?
> > > > Or is there a way to avoid this, so that I can freely restart from a
> > > > checkpoint using any nodes without limitation?
> > > >
> > > > Thank you in advance.
> > > >
> > > > Regards,
> > > >
> > > >
> > > > Husen
> > > >
> > > > ==========================restart script=================
> > > > #!/bin/bash
> > > > #SBATCH -J ayo
> > > > #SBATCH -o md%j.out
> > > > #SBATCH -A necis
> > > > #SBATCH -N 2
> > > > #SBATCH -n 16
> > > > #SBATCH --exclude=compute-node
> > > > #SBATCH --time=144:00:00
> > > > #SBATCH --mail-user=hus...@gmail.com
> > > > #SBATCH --mail-type=begin
> > > > #SBATCH --mail-type=end
> > > >
> > > > mpirun gmx_mpi mdrun -cpi md_test.cpt -deffnm md_test
> > > > =====================================================
> > > >
> > > >
> > > >
> > > >
> > > > ==========================output error===================
> > > > Reading checkpoint file md_test.cpt generated: Wed Jun 15 16:30:44 2016
> > > >
> > > >
> > > >   Build time mismatch,
> > > >     current program: Sel Apr  5 13:37:32 WIB 2016
> > > >     checkpoint file: Rab Apr  6 09:44:51 WIB 2016
> > > >
> > > >   Build user mismatch,
> > > >     current program: pro@head-node [CMAKE]
> > > >     checkpoint file: pro@compute-node [CMAKE]
> > > >
> > > >   #ranks mismatch,
> > > >     current program: 16
> > > >     checkpoint file: 24
> > > >
> > > >   #PME-ranks mismatch,
> > > >     current program: -1
> > > >     checkpoint file: 6
> > > >
> > > > GROMACS patchlevel, binary or parallel settings differ from previous run.
> > > > Continuation is exact, but not guaranteed to be binary identical.
> > > >
> > > >
> > > > -------------------------------------------------------
> > > > Program gmx mdrun, VERSION 5.1.2
> > > > Source code file:
> > > > /home/pro/gromacs-5.1.2/src/gromacs/gmxlib/checkpoint.cpp, line: 2216
> > > >
> > > > Fatal error:
> > > > Truncation of file md_test.xtc failed. Cannot do appending because of this
> > > > failure.
> > > > For more information and tips for troubleshooting, please check the GROMACS
> > > > website at http://www.gromacs.org/Documentation/Errors
> > > > -------------------------------------------------------
> > > > ================================================================
> > > > --
> > > > Gromacs Users mailing list
> > > >
> > > > * Please search the archive at
> > > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > > posting!
> > > >
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > >
> > > > * For (un)subscribe requests visit
> > > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > > send a mail to gmx-users-requ...@gromacs.org.
> > > >