On 01.09.2011 at 15:28, Gowtham wrote:

> This belongs to "keeping our users from directly
> SSHing into compute nodes" category.
>
>
> On a test cluster (pauli), I have the following set up:
>
> 0. 1 Front end and 2 compute nodes
>
>    Each compute node has 4 cpu cores
>
>
> 1. Rocks 5.4 (service pack 2) - all rolls except
>    bio, condor and xen; runs SGE queuing system
>    6.2u5
>
>    [root@pauli ~]# rocks list roll
>    NAME          VERSION ARCH   ENABLED
>    area51:       5.4     x86_64 yes
>    base:         5.4     x86_64 yes
>    ganglia:      5.4     x86_64 yes
>    hpc:          5.4     x86_64 yes
>    kernel:       5.4     x86_64 yes
>    os:           5.4     x86_64 yes
>    sge:          5.4     x86_64 yes
>    web-server:   5.4     x86_64 yes
>    service-pack: 5.4.2   x86_64 yes
>
>
> 2. MPICH2 (1.4), compiled with GCC 4.1.1, is in
>
>    /share/apps/mpich2/1.4/gcc/4.1.2
>
>    Configure & make/make install commands were as
>    follows
>
>    export CC="/usr/bin/gcc"
>    export CXX="/usr/bin/g++"
>    export FC="/usr/bin/gfortran"
>    export F77="/usr/bin/gfortran"
>
>    ./configure --prefix=/share/apps/mpich2/1.4/gcc/4.1.2
>    make
>    make install
>
>    I compiled a simple 'hello, world' C program
>
>    mpicc -g -Wall hello_world.c -o hello_world.x
>
>    and 'hello_world.x' runs fine.
>
>
> 3. There are two groups on this cluster
>
>    pauli-users  : all users belong to this group
>    pauli-admins : only administrators belong to this one,
>                   in addition to being part of pauli-users
>
>    I created 3 user accounts (all belonging to
>    pauli-users) and one more account that belongs to
>    pauli-users & pauli-admins
>
>    These groups & users were created before any compute
>    node was added to the cluster
>
>
> 4. The extend-compute.xml had the following lines in
>    <post> section
>
>    <file name="/etc/ssh/sshd_config" mode="append">
>
>    # Block non-root, non-pauli-admins users from directly
>    # accessing this compute node
>    AllowGroups root pauli-admins
>    </file>
>
>    xmllint -noout extend-compute.xml was run and
>    no errors were found.
>
>    rocks distribution was rebuilt and the compute
>    nodes were added via the usual insert-ethers
>
>    I ran 'rocks sync users'
>
>    When I check the '/etc/ssh/sshd_config' file
>    on the compute nodes, I do see the line
>
>    AllowGroups root pauli-admins
>
>    The '/etc/group' file on the compute nodes has lines
>    corresponding to 'pauli-users' and 'pauli-admins'
>
>    pauli-users:x:500:
>    pauli-admins:x:501:john
>
>
> 5. Attempts by 'john' to SSH into compute nodes get through
>    while those by 'greg' (just a pauli-user) are blocked
>
>
> 6. Now comes SGE
>
>    I run 'hello_world.x' with 8 processors (spanning
>    both compute nodes) via an SGE script - sge_test.sh
>
>
>    #! /bin/bash
>    # #$ -cwd
>    #$ -j y
>    #$ -S /bin/bash
>    #$ -pe mpich 8
>    #
>    # Run 'Hello, World!'
>    /share/apps/mpich2/1.4/gcc/4.1.2/bin/mpirun -n $NSLOTS \
>      -f $TMP/machines /share/apps/bin/hello_world.x
>
>
>    It produces desired output when I run this as 'john'
>    (a pauli-admin user)
>
>    When I run it as 'greg' (just a pauli-user), it hangs
>    in 'r' state. 'sge_test.sh.po12' contains
>
>
>    -catch_rsh
>    /opt/gridengine/default/spool/compute-0-0/active_jobs/12.1/pe_hostfile
>    compute-0-0
>    compute-0-0
>    compute-0-0
>    compute-0-0
>    compute-0-1
>    compute-0-1
>    compute-0-1
>    compute-0-1
>
>
>    'sge_test.sh.o12' contains
>
>
>    Permission denied, please try again.
>    Permission denied, please try again.
>    Permission denied (publickey,gssapi-with-mic,password).

What is the setting of:

qrsh_command
qrsh_daemon

in `qconf -sconf`?

-- Reuti


> Can someone please help me if I am doing something wrong
> or missing something?
>
> Thanks,
> g
>
> --
> Gowtham
> Advanced IT Research Support
> Michigan Technological University
>
> (906) 487/3593
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
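
One possible way to pull the requested settings out of the cluster configuration is sketched below. It assumes the question refers to the remote-startup entries that sge_conf(5) documents as `rsh_command`, `rsh_daemon`, `rlogin_command`, `rlogin_daemon`, `qlogin_command` and `qlogin_daemon`. The function name `filter_remote_startup` is made up for this example and is not part of SGE.

```shell
# Hypothetical helper (not an SGE command): reads `qconf -sconf` output on
# stdin and prints only the remote-startup parameters, one per line.
# Assumption: each parameter name starts at column 0 and is followed by
# whitespace, as in the usual `qconf -sconf` layout.
filter_remote_startup() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# Usage on a host where SGE is installed (not executed here):
#   qconf -sconf | filter_remote_startup
```

If these entries point at ssh/sshd, the parallel job's remote startup would go through the same sshd that now enforces `AllowGroups`, which would be consistent with the "Permission denied" lines in the job output.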
