On Fri, 20 May 2011 18:30:16 -0700, "[email protected]" <[email protected]> 
wrote:
> Take a look at your slurmctld and slurmd log files. My _guess_
> is that the clock on one or more of your nodes is out of sync
> and that is preventing message authentication from occurring.
> As I recall Munge credentials have a five minute period of
> being valid. If any of your nodes have a clock more than that
> far out of sync, messages will get discarded. Although SLURM
> does have some recovery mechanisms, long delays like this will
> occur.
> 
> Quoting Paul Thirumalai <[email protected]>:
> 
> > Hi All
> > So I am trying to launch a script using sbatch, but it just seems to be
> > taking too long to complete. (Sbatch takes 2-3 seconds to complete)
> >
> > The command I am using is
> > /usr/bin/sbatch --output=/dev/null --error=/dev/null  --begin=now
> > <script_name>
> >
> > This command takes around 3.5 seconds to complete. I am not sure why it's
> > taking so long. Earlier I had changed the config to use select/linear
> > instead of select/cons_res, and after that all the issues started. I
> > reverted back the config, but to no avail.

An easy first step is sometimes to just run the command with
multiple -v's and see if there is an obvious "pause" between
two sections of output. That can help narrow down where time
is being spent. Sometimes the following perl one-liner is
useful for timestamping output:

 sbatch -vvvv --output=/dev/null --error=/dev/null --begin=now test.sh 2>&1 \
  | perl -MTime::HiRes=time -pe 'BEGIN {$s=time} printf "%09.4f ", time-$s'
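
If eyeballing the timestamps gets tedious, the same idea can be
sketched in Python so that the largest pause is reported for you.
This is only a rough sketch: the file name stamp_gaps.py and the
function names stamp and largest_gap are made up for illustration,
and it measures when each line arrives on the pipe, which is only an
approximation of when sbatch actually wrote it.

```python
import sys
import time


def stamp(lines, clock=time.monotonic):
    """Yield (elapsed_seconds, text) for each line as it is read.

    `clock` is injectable so the function can be exercised without waiting.
    """
    start = clock()
    for line in lines:
        yield clock() - start, line.rstrip("\n")


def largest_gap(stamped):
    """Return (gap_seconds, line_before, line_after) for the biggest pause.

    `stamped` is a list of (elapsed_seconds, text) pairs in arrival order.
    Returns None when there are fewer than two lines to compare.
    """
    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


def main():
    stamped = []
    for elapsed, text in stamp(sys.stdin):
        stamped.append((elapsed, text))
        # Echo each line with its arrival time, zero-padded to four decimals.
        print(f"{elapsed:09.4f} {text}", flush=True)
    gap = largest_gap(stamped)
    if gap is not None:
        seconds, before, after = gap
        print(f"longest pause: {seconds:.4f}s", file=sys.stderr)
        print(f"  after:  {before}", file=sys.stderr)
        print(f"  before: {after}", file=sys.stderr)


# Only read stdin when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Assuming it is saved as stamp_gaps.py, it drops in where the perl
one-liner was:

 sbatch -vvvv --output=/dev/null --error=/dev/null --begin=now test.sh 2>&1 \
  | python3 stamp_gaps.py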

mark

