On Thu, May 07, 2020 at 06:29:05PM +0000, Mun Johl wrote:
> Hi William, et al.,
>
> Thank you kindly for your response and insight.
> Please see my comments below.
>
> > On Wed, May 06, 2020 at 11:10:40PM +0000, Mun Johl wrote:
> > > [Mun] In order to use ssh -X for our jobs that require an X11 window to
> > > be pushed to a user's VNC display, I am planning on the following
> > > changes. But please let me know if I have missed something (or
> > > everything).
> > >
> > > 1. Update the global configuration with the following parameters:
> > >
> > > rsh_command /usr/bin/ssh -X
> > > rsh_daemon /usr/sbin/sshd -i
> >
> > As you are using the pam_sge-qrsh-setup.so you will need to set
> > rsh_daemon to point to a rshd-wrapper which you should find in
> > $SGE_ROOT/util/resources/wrappers eg if $SGE_ROOT is /opt/sge
> >
> > rsh_command /usr/bin/ssh -X
> > rsh_daemon /opt/sge/util/resources/wrappers/rshd-wrapper
>
> [Mun] Thanks for pointing out my mistake!
>
> > > 2. Use a PAM module to attach an additional group ID to sshd. The
> > > following line will be added to /etc/pam.d/sshd on all SGE hosts:
> > >
> > > auth required /opt/sge/lib/lx-amd64/pam_sge-qrsh-setup.so
> > >
> > > 3. Do I need to restart all of the SGE daemons at this point?
> >
> > No it should be fine without a restart
> >
> > > 4. In order to run our GUI app, launch it thusly:
> > >
> > > $ qrsh -now no wrapper.tcl
> >
> > That looks fine, assuming sensible default resource requests, although
> > obviously I don't know the details of the wrapper or application.
>
> [Mun] After making the above changes, I'm still experiencing problems.
> First, let me point out that I should have more accurately represented how
> qrsh will be used:
>
> $ qrsh -now no <other qrsh options> tclsh wrapper.tcl <options for wrapper script>
>
> Now for the issues:
>
> I first added the pam_sge-qrsh-setup.so at the top of the /etc/pam.d/sshd
> file. When I did that the qrsh job was launched but quickly terminated with
> the following error from the tool I was attempting to launch:
>
> ncsim/STRPIN =
> The connection to SimVision could not be established due to an error
> in SimVision. Check your DISPLAY environment variable,
> which may be one of the reasons for this error.
>
> I am not explicitly setting the DISPLAY--as that is how I normally use 'ssh
> -X'. Nor have I done anything to open any additional ports. Again, since
> 'ssh -X' is working for us. As a reminder, there is no way for me to know
> what to set DISPLAY to even if I wanted to set it.

If you can get it back to the state where the job actually launches, then
running qrsh -now n /bin/env to list the environment you are getting might
help debug.

> Now, the /etc/pam.d/sshd update caused an ssh issue: Users could no longer
> ssh into our servers :( I didn't realize the order of the lines in the sshd
> is significant.
>
> Therefore, I moved the pam_sge-qrsh-setup.so entry below the other "auth"
> lines. Although, that resulted in the following error when I tried the qrsh
> command again:
>
> Your "qrsh" request could not be scheduled, try again later.

Did you remember the -now no option? That looks like the sort of message
one might get if you forgot it.

> One final note is that we have "selinux" enabled on our servers. I don't
> know if that makes any difference, but I thought I'd throw it out there.

Depends how it is configured, I guess. Which Linux distro are you using?

William
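
For reference, the environment check suggested above could be sketched roughly as follows. This is only an illustration: check_display is a hypothetical helper name (not part of SGE), and it assumes the job's environment arrives as plain NAME=value lines, as printed by env.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): reads env-style NAME=value lines
# on stdin and reports whether DISPLAY is present and non-empty.
check_display() {
    # Extract the value of the first DISPLAY= line, if any.
    display=$(sed -n 's/^DISPLAY=//p' | head -n 1)
    if [ -n "$display" ]; then
        echo "DISPLAY is set to: $display"
    else
        echo "DISPLAY is not set"
        return 1
    fi
}

# On the cluster (needs a working qrsh, so it is left commented out here):
#   qrsh -now no /bin/env | check_display
# Local sanity check against the current shell's environment:
env | check_display || true
```

If the qrsh run reports that DISPLAY is not set, that would suggest the X11 forwarding is not being established for the job's ssh session, which would be consistent with the SimVision error quoted above.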
_______________________________________________ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users