This belongs to the "keeping our users from directly
SSHing into compute nodes" category.


On a test cluster (pauli), I have the following set up:

   0. 1 Front end and 2 compute nodes

      Each compute node has 4 cpu cores


   1. Rocks 5.4 (service pack 2) - all rolls except
      bio, condor and xen; runs SGE queuing system
      6.2u5

[root@pauli ~]# rocks list roll
NAME          VERSION ARCH   ENABLED
area51:       5.4     x86_64 yes
base:         5.4     x86_64 yes
ganglia:      5.4     x86_64 yes
hpc:          5.4     x86_64 yes
kernel:       5.4     x86_64 yes
os:           5.4     x86_64 yes
sge:          5.4     x86_64 yes
web-server:   5.4     x86_64 yes
service-pack: 5.4.2   x86_64 yes


   2. MPICH2 (1.4), compiled with GCC 4.1.2, is in

      /share/apps/mpich2/1.4/gcc/4.1.2

     Configure & make/make install commands were as
     follows

export CC="/usr/bin/gcc"
export CXX="/usr/bin/g++"
export FC="/usr/bin/gfortran"
export F77="/usr/bin/gfortran"

./configure --prefix=/share/apps/mpich2/1.4/gcc/4.1.2
make
make install

      I compiled a simple 'hello, world' C program

      mpicc -g -Wall hello_world.c -o hello_world.x

      and 'hello_world.x' runs fine.


   3. There are two groups on this cluster

      pauli-users  : all users belong to this group
      pauli-admins : only administrators belong to this one,
                     in addition to being part of pauli-users

      I created 3 user accounts (all belonging to
      pauli-users) and one more account that belongs to
      pauli-users & pauli-admins

      These groups & users were created before any compute
      node was added to the cluster


   4. The extend-compute.xml had the following lines in
      <post> section

<file name="/etc/ssh/sshd_config" mode="append">

# Block non-root, non-pauli-admins users from directly
# accessing this compute node
AllowGroups root pauli-admins
</file>

       xmllint -noout extend-compute.xml was run and
       no errors were found.

       rocks distribution was rebuilt and the compute
       nodes were added via the usual insert-ethers

       I ran 'rocks sync users'

       When I check the '/etc/ssh/sshd_config' file
       in compute nodes, I do see the line

AllowGroups root pauli-admins

       The '/etc/group' file on the compute nodes has lines
       corresponding to 'pauli-users' and 'pauli-admins'

pauli-users:x:500:
pauli-admins:x:501:john


   5. SSH attempts by 'john' into compute nodes get through,
      while those by 'greg' (just a pauli-users member) are blocked
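
      The behaviour in step 5 matches what AllowGroups is documented
      to do: a login is accepted only if one of the user's groups
      (primary or supplementary) is on the list. Below is a minimal
      sketch of that check, for comparing a user's groups against the
      AllowGroups line by hand. The function name
      allowed_by_allowgroups is made up for illustration. It only does
      exact name matches (sshd also accepts patterns), and it reports
      "blocked" when there is no AllowGroups line at all, which is not
      how sshd behaves.

```shell
# Return 0 if any of the given groups is named on an AllowGroups line
# of the given sshd_config-style file, 1 otherwise.
# Assumptions: exact group names only (no sshd patterns); a file
# without any AllowGroups line yields 1, unlike real sshd.
# Usage: allowed_by_allowgroups CONFIG_FILE GROUP [GROUP...]
allowed_by_allowgroups() {
  local config="$1"; shift
  local allowed g a
  # Collect every name that follows the (case-insensitive) keyword.
  allowed=$(awk 'tolower($1) == "allowgroups" { for (i = 2; i <= NF; i++) print $i }' "$config")
  for g in "$@"; do
    for a in $allowed; do
      [ "$g" = "$a" ] && return 0
    done
  done
  return 1
}
```

      On a compute node this could be used as, for example,
      allowed_by_allowgroups /etc/ssh/sshd_config $(id -Gn greg)
      where 'id -Gn' prints the user's group names.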


   6. Now comes SGE

      I run 'hello_world.x' on 8 processors (spanning
      both compute nodes) via an SGE script, sge_test.sh:


#! /bin/bash
# #$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe mpich 8
#
# Run 'Hello, World!'
/share/apps/mpich2/1.4/gcc/4.1.2/bin/mpirun -n $NSLOTS \
  -f $TMP/machines /share/apps/bin/hello_world.x


     It produces the desired output when I run this as 'john'
     (a pauli-admins user)

     When I run it as a regular pauli-users member (e.g. 'greg'),
     the job hangs in 'r' state. 'sge_test.sh.po12' contains


-catch_rsh /opt/gridengine/default/spool/compute-0-0/active_jobs/12.1/pe_hostfile
compute-0-0
compute-0-0
compute-0-0
compute-0-0
compute-0-1
compute-0-1
compute-0-1
compute-0-1
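
     The host list above appears to be the per-slot expansion of
     SGE's pe_hostfile, presumably what ends up in $TMP/machines.
     The '-catch_rsh' line suggests the PE's stock start script wrote
     it, but that is an assumption. Below is a sketch of that
     expansion, assuming the usual pe_hostfile columns (host, slots,
     queue, processor range). expand_pe_hostfile is a hypothetical
     name, not an SGE tool.

```shell
# Expand an SGE pe_hostfile into a machine list with one line per slot.
# Assumed input columns: host, slot count, queue name, processor range.
# Usage: expand_pe_hostfile PE_HOSTFILE > machines
expand_pe_hostfile() {
  # Print the host name (column 1) once for every slot (column 2).
  awk '{ for (i = 0; i < $2; i++) print $1 }' "$1"
}
```

     Inside a running job, expand_pe_hostfile "$PE_HOSTFILE" should
     reproduce a list like the one above, which makes it easy to
     compare against the file handed to mpirun with -f.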


     'sge_test.sh.o12' contains


Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,gssapi-with-mic,password).
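
     The 'Permission denied' lines look like ssh refusing the job
     owner on a remote node, presumably because mpirun starts its
     remote processes over ssh. The output does not prove this. A
     minimal sketch for reproducing the refusal by hand, outside SGE,
     is given below. check_node_ssh and the SSH_BIN override are
     hypothetical names introduced only for this example.

```shell
# Try a non-interactive ssh to each host and report which ones fail.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# which is closer to what a launcher inside a batch job experiences.
# SSH_BIN is a hypothetical hook to swap in another command for testing.
# Returns 0 only if every host accepted the connection.
check_node_ssh() {
  local ssh_bin="${SSH_BIN:-ssh}"
  local host rc=0
  for host in "$@"; do
    if "$ssh_bin" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
         </dev/null >/dev/null 2>&1; then
      echo "$host: ok"
    else
      echo "$host: failed"
      rc=1
    fi
  done
  return "$rc"
}
```

     Run as the job owner from the node where the job starts, e.g.
     check_node_ssh compute-0-0 compute-0-1. If 'john' gets "ok" for
     both hosts and 'greg' gets "failed", the AllowGroups restriction
     is the likely cause of the hang.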


Can someone please tell me whether I am doing something wrong or missing something?

Thanks,
g

--
Gowtham
Advanced IT Research Support
Michigan Technological University

(906) 487/3593
