On Thu, May 07, 2020 at 06:29:05PM +0000, Mun Johl wrote:
> Hi William, et al.,
> 
> Thank you kindly for your response and insight.
> Please see my comments below.
> 
> > On Wed, May 06, 2020 at 11:10:40PM +0000, Mun Johl wrote:
> > > [Mun] In order to use ssh -X for our jobs that require an X11 window to
> > > be pushed to a user's VNC display, I am planning on the following
> > > changes.  But please let me know if I have missed something (or
> > > everything).
> > >
> > > 1. Update the global configuration with the following parameters:
> > >
> > >      rsh_command         /usr/bin/ssh -X
> > >      rsh_daemon          /usr/sbin/sshd -i
> > 
> > As you are using pam_sge-qrsh-setup.so, you will need to set
> > rsh_daemon to point to the rshd-wrapper, which you should find in
> > $SGE_ROOT/util/resources/wrappers. For example, if $SGE_ROOT is /opt/sge:
> > 
> > rsh_command /usr/bin/ssh -X
> > rsh_daemon /opt/sge/util/resources/wrappers/rshd-wrapper
> 
> [Mun] Thanks for pointing out my mistake!
> 
> > > 2. Use a PAM module to attach an additional group ID to sshd.  The
> > > following line will be added to /etc/pam.d/sshd on all SGE hosts:
> > >
> > >       auth required /opt/sge/lib/lx-amd64/pam_sge-qrsh-setup.so
> > >
> > > 3. Do I need to restart all of the SGE daemons at this point?
> > 
> > No, it should be fine without a restart.
> > 
> > >
> > > 4. In order to run our GUI app, launch it thusly:
> > >
> > >       $ qrsh -now no wrapper.tcl
> > 
> > That looks fine, assuming sensible default resource requests, although
> > obviously I don't know the details of the wrapper or application.
> 
> [Mun] After making the above changes, I'm still experiencing problems.  
> First, let me point out that I should have more accurately represented how 
> qrsh will be used:
> 
> $ qrsh -now no <other qrsh options> tclsh wrapper.tcl <options for wrapper script>
> 
> Now for the issues:
> 
> I first added the pam_sge-qrsh-setup.so at the top of the /etc/pam.d/sshd 
> file.  When I did that the qrsh job was launched but quickly terminated with 
> the following error from the tool I was attempting to launch:
> 
> ncsim/STRPIN =
>       The connection to SimVision could not be established due to an error
>       in SimVision. Check your DISPLAY environment variable,
>       which may be one of the reasons for this error.
> 
> I am not explicitly setting DISPLAY, as that is how I normally use
> 'ssh -X'.  Nor have I done anything to open any additional ports, again
> because 'ssh -X' is working for us.  As a reminder, there is no way for me
> to know what to set DISPLAY to even if I wanted to set it.

If you can get it back to the state where the job actually launches, then
running qrsh -now n /bin/env to list the environment the job is getting
might help with debugging.
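
To narrow that output down, here is a minimal sketch of a filter. The
function name show_x11_env and the list of variables are my own choices for
illustration, not anything SGE provides. If DISPLAY is missing from the
job's environment, sshd most likely did not set up X11 forwarding for that
session.

```shell
#!/bin/sh
# Hypothetical helper: keep only the environment lines that matter for
# X11 forwarding over ssh. Reads "NAME=value" lines on stdin.
show_x11_env() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '^(DISPLAY|XAUTHORITY|SSH_CONNECTION|SSH_CLIENT|SSH_TTY)=' || true
}

# Intended use on a submit host (needs a working qrsh, so left commented out):
#   qrsh -now n /bin/env | show_x11_env
```

An empty result would suggest the job is not running inside an ssh session
at all, which would point back at the rsh_command/rsh_daemon settings.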

> 
> Now, the /etc/pam.d/sshd update caused an ssh issue: Users could no longer
> ssh into our servers :(  I didn't realize the order of the lines in the sshd
> file is significant.
> 
> Therefore, I moved the pam_sge-qrsh-setup.so entry below the other "auth"
> lines.  However, that resulted in the following error when I tried the qrsh
> command again:
> 
>      Your "qrsh" request could not be scheduled, try again later.

Did you remember the -now no option?  That looks like the sort of
message you might get if you forgot it.

> 
> One final note is that we have "selinux" enabled on our servers.  I don't 
> know if that makes any difference, but I thought I'd throw it out there.

That depends on how it is configured, I guess.  Which Linux distribution are
you using?
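
If SELinux is in enforcing mode, one way to see whether it is getting in the
way is to look for AVC denials involving sshd in the audit log. A rough
sketch, assuming auditd writes to /var/log/audit/audit.log (the usual
location on Red Hat style systems, adjust for yours). The function name
avc_denials_for is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print SELinux AVC denial records for one command name.
# Reads audit log lines on stdin; $1 is the value of the comm="..." field.
avc_denials_for() {
    # "|| true" keeps the exit status at 0 when there are no denials.
    grep -E 'type=AVC .*avc: +denied' | grep -F "comm=\"$1\"" || true
}

# Intended use (needs root and an audit log, so left commented out):
#   getenforce        # prints Enforcing, Permissive or Disabled
#   avc_denials_for sshd < /var/log/audit/audit.log
```

No output for sshd around the time of a failed qrsh would make SELinux a
less likely culprit, though it would not rule it out entirely.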


William
