Using the exact script below, ssh output:
cv-hpcf1
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1032015
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
_=/bin/env
CVS_RSH=ssh
G_BROKEN_FILENAMES=1
HOME=/u/mcolonno
KRB5CCNAME=FILE:/tmp/krb5cc_1050163475_PhEns23756
LANG=en_US.UTF-8
LESSOPEN=|/usr/bin/lesspipe.sh %s
LOGNAME=mcolonno
MAIL=/var/mail/mcolonno
PATH=/usr/local/apps/NASTRAN/NX/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin
PWD=/u/mcolonno
QTDIR=/usr/lib64/qt-3.3
QTINC=/usr/lib64/qt-3.3/include
QTLIB=/usr/lib64/qt-3.3/lib
SHELL=/bin/bash
SHLVL=2
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SSH_CLIENT=192.168.101.220 35086 22
SSH_CONNECTION=192.168.101.220 35086 192.168.230.33 22
USER=mcolonno
done: Fri Jan 25 13:59:46 PST 2013
Using srun:
cv-hpcf1
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1032015
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
}
_=/bin/env
CVS_RSH=ssh
G_BROKEN_FILENAMES=1
HISTCONTROL=ignoredups
HISTSIZE=1000
HOME=/u/mcolonno
HOSTNAME=cv-hpcq
LANG=en_US.UTF-8
LESSOPEN=|/usr/bin/lesspipe.sh %s
LOADEDMODULES=
LOGNAME=mcolonno
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
MAIL=/var/spool/mail/mcolonno
module=() { eval `/usr/bin/modulecmd bash $*`
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
MODULESHOME=/usr/share/Modules
PATH=/usr/local/apps/NASTRAN/NX/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/u/mcolonno/bin
PWD=/u/mcolonno/NXNASTRAN
QTDIR=/usr/lib64/qt-3.3
QTINC=/usr/lib64/qt-3.3/include
QTLIB=/usr/lib64/qt-3.3/lib
SHELL=/bin/bash
SHLVL=2
SLURM_CHECKPOINT_IMAGE_DIR=/u/mcolonno/NXNASTRAN
SLURM_CPUS_ON_NODE=16
SLURM_DISTRIBUTION=cyclic
SLURMD_NODENAME=cv-hpcf1
SLURM_GTIDS=0
SLURM_JOB_CPUS_PER_NODE=16
SLURM_JOB_ID=199
SLURM_JOBID=199
SLURM_JOB_NAME=/u/mcolonno/NXNASTRAN/test-env.sh
SLURM_LAUNCH_NODE_IPADDR=192.168.101.220
SLURM_LOCALID=0
SLURM_NNODES=1
SLURM_NODEID=0
SLURM_NODELIST=cv-hpcf1
SLURM_NPROCS=1
SLURM_NTASKS=1
SLURM_PRIO_PROCESS=0
SLURM_PROCID=0
SLURM_SRUN_COMM_HOST=192.168.101.220
SLURM_SRUN_COMM_PORT=33121
SLURM_STEP_ID=0
SLURM_STEPID=0
SLURM_STEP_LAUNCHER_PORT=33121
SLURM_STEP_NODELIST=cv-hpcf1
SLURM_STEP_NUM_NODES=1
SLURM_STEP_NUM_TASKS=1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_SUBMIT_DIR=/u/mcolonno/NXNASTRAN
SLURM_TASK_PID=27327
SLURM_TASKS_PER_NODE=1
SLURM_TOPOLOGY_ADDR=cv-hpcf1
SLURM_TOPOLOGY_ADDR_PATTERN=node
SRUN_DEBUG=3
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SSH_CLIENT=127.0.0.1 50820 22
SSH_CONNECTION=127.0.0.1 50820 127.0.0.1 22
SSH_TTY=/dev/pts/1
TERM=xterm
TMPDIR=/tmp
USER=mcolonno
done: Fri Jan 25 14:02:22 PST 2013
Nothing jumps out at me that would change the behavior of a bash
script on the node.
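
One way to double-check this mechanically is to list the variable names that appear in only one of the two logs. This is a rough sketch, not anything SLURM provides. The helper names var_names and names_only_in are made up for illustration, LOG.ssh and LOG.srun are the file names from David's example, and mktemp is assumed to be available on the node. In the dumps above, for example, KRB5CCNAME shows up only under ssh, while HOSTNAME, TERM, TMPDIR and the module function show up only under srun.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of SLURM or the test script.

# Print the NAME part of every NAME=value line in a log, sorted and unique.
# Lines without a leading NAME= (hostname, ulimit output, "done:") are skipped.
var_names() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | LC_ALL=C sort -u
}

# Print variable names that are set in the first log but not in the second.
names_only_in() {
    noi_a=$(mktemp) || return 2
    noi_b=$(mktemp) || { rm -f "$noi_a"; return 2; }
    var_names "$1" > "$noi_a"
    var_names "$2" > "$noi_b"
    # comm -23 keeps lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$noi_a" "$noi_b"
    rm -f "$noi_a" "$noi_b"
}
```

Running "names_only_in LOG.ssh LOG.srun" and then "names_only_in LOG.srun LOG.ssh" gives both directions. Variables that exist in both logs with different values (PATH, PWD and MAIL above) will not show up here.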
Thanks,
~Mike C.
-----Original Message-----
From: David Bigagli [mailto:[email protected]]
Sent: Friday, January 25, 2013 6:04 AM
To: slurm-dev
Subject: [slurm-dev] RE: not executing script(?)
I think the idea is that given a script like this one:
---------------------------------
cat myenv
#!/bin/sh
hostname
ulimit -a
env|sort
echo "done: `date`"
---------------------------------
run it as:
ssh myhost myenv > LOG.ssh
and as
srun -p mypartition -w myhost myenv > LOG.srun
then compare the logs line by line.
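
The comparison step could look something like the sketch below. The function name compare_logs is made up for illustration, and the filter pattern is only an assumption about which lines are expected to differ on every run (SLURM's own variables, per-connection SSH details, the timestamp). mktemp is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of SLURM.
# Usage: compare_logs FIRST_LOG SECOND_LOG
compare_logs() {
    # Lines assumed to differ between any two runs, so not worth reporting.
    cl_filter='^(SLURM|SLURMD|SRUN|SSH)_[A-Za-z0-9_]*=|^done: '

    cl_a=$(mktemp) || return 2
    cl_b=$(mktemp) || { rm -f "$cl_a"; return 2; }

    # Strip the noisy lines from each log before comparing.
    grep -v -E "$cl_filter" "$1" > "$cl_a"
    grep -v -E "$cl_filter" "$2" > "$cl_b"

    # diff exits 0 if the filtered logs match, 1 if they differ.
    cl_status=0
    diff "$cl_a" "$cl_b" || cl_status=$?

    rm -f "$cl_a" "$cl_b"
    return $cl_status
}
```

With the logs from above this would be called as "compare_logs LOG.ssh LOG.srun". Lines starting with "<" come from the first log and lines starting with ">" from the second.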
/David
On 01/24/2013 01:55 AM, Michael Colonno wrote:
>
> Updating this thread: I ran additional experiments submitting the
> job from the node it executes on - same behavior so I think this rules out
> system config limits. It seems like the application runs scripts that run
> other scripts and somehow SLURM's mode of execution confuses this. Anything
> else I can test?
>
> Thanks,
> ~Mike C.
>
> -----Original Message-----
> From: Moe Jette [mailto:[email protected]]
> Sent: Tuesday, January 22, 2013 7:49 PM
> To: slurm-dev; Michael Colonno
> Subject: Re: [slurm-dev] not executing script(?)
>
> Compare limits and environment variables for the two different modes of
> operation.
>
> Quoting Michael Colonno <[email protected]>:
>
>>
>> Hi ~
>>
>> Getting some odd behavior with SLURM I haven't seen before (2.5.0 on
>> CentOS 6.3 x64 though I don't think any of that matters for this
>> issue). I'm trying to run a code which launches from a bash script
>> (commercial code, we didn't write it). If I ssh to a node and launch
>> the code, everything works fine. Syntax looks like this:
>>
>> >> launch_script input_file
>>
>> If I paste the exact same command at the end of an srun command the
>> job "runs" and I get a copy of the bash script that was supposed to
>> have been executed in the directory I launched from (even with
>> executable properties) in a file labeled input_file.[bunch of letters
>> and numbers]. Syntax looks like:
>>
>> >>srun -n1 -p whatever launch_script input_file
>>
>> Scratching my head on this one. Clearly it finds the correct script
>> to launch on the correct node but I can't explain the difference in
>> behavior between the interactive and SLURM versions. Test cases like
>> "hostname" all work fine. Probably not relevant but the parallel
>> codes I've compiled into SLURM also launch and run great.
>>
>> Thanks,
>> ~Mike C.
>>
>
>