Hi Jim,

Thanks for the reply.  Unfortunately, the answer doesn't seem to be that
simple. I do have the ssh setup worked out (believe me, I've googled the
heck out of this thing!); the qsub test wouldn't work without it.  I can scp
between the two nodes in all combinations of user "globus" or "labkey",
logged into either node, and in either direction.
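
For reference, here is a minimal sketch of how those checks could be repeated non-interactively, which is closer to how a batch job would invoke ssh. The host names are placeholders, not my real ones, and the idea that the job might use a different name form than the one I typed by hand is only an assumption. BatchMode=yes suppresses prompts and StrictHostKeyChecking=yes makes an unknown host key fail rather than ask.

```shell
#!/bin/sh
# Placeholder host names (assumption): replace with every name form
# (short name, FQDN, internal name) by which the nodes may refer to each other.
HOSTS="${HOSTS:-globus-node mom-node}"

# Print one non-interactive ssh command per host.
# -n keeps ssh from reading stdin, so the list can safely be piped to sh.
list_checks() {
    for host in $HOSTS; do
        echo "ssh -n -o BatchMode=yes -o StrictHostKeyChecking=yes $host true"
    done
}

list_checks
```

Saved under a hypothetical name such as check_hostkeys.sh, the output can be piped to a shell (sh check_hostkeys.sh | sh -x) as both "globus" and "labkey" on each node. Any "Host key verification failed" line then names the account and host name whose key is missing.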

Thanks,

Brian

On Thu, Dec 3, 2009 at 1:33 PM, Jim Basney <jbas...@ncsa.uiuc.edu> wrote:

> Hi Brian,
>
> "Host key verification failed" is an ssh client-side error. The top hit
> from Google for this error message is
> <http://www.securityfocus.com/infocus/1806> which looks like a good
> reference on the topic. I suspect you need to populate and distribute
> /etc/ssh_known_hosts files between your nodes.
>
> -Jim
>
> Brian Pratt wrote:
> > Actually more of a logging question - I don't expect anyone to solve the
> > problem by remote control, but I'm having a bit of trouble figuring out
> > which node (server or client) the error is coming from.
> >
> > Here's the scenario: a node running globus/ws-gram/pbs_server/pbs_sched
> > and one running pbs_mom. Using the globus simple ca.  Job-submitting user
> > is "labkey" on the globus node, and there's a labkey user on the client
> > node too.
> >
> > I can watch decrypted SSL traffic on the client node with ssldump and
> > simpleca private key and can see the job script being handed to the
> > pbs_mom node.
> >
> > Passwordless ssh/scp is configured between the two nodes.
> >
> > The job-submitting user's .globus directory is shared via NFS with the mom
> > node.  UIDs agree on both nodes.  The globus user can write to it.
> >
> >  Jobs submitted with qsub are fine. "qsub -o
> > ~labkey/globus_test/qsubtest_output.txt -e
> > ~labkey/globus_test/qsubtest_err.txt qsubtest"
> >  cat qsubtest
> >    #!/bin/bash
> >    date
> >    env
> >    logger "hello from qsubtest, I am $(whoami)"
> > and indeed it executes on the pbs_mom client node.
> >
> > Jobs submitted with fork are fine.
> > "globusrun-ws -submit -f gramtest_fork"
> >  cat gramtest_fork
> > <job>
> >   <executable>/mnt/userdata/gramtest_fork.sh</executable>
> >   <stdout>globus_test/gramtest_fork_stdout</stdout>
> >   <stderr>globus_test/gramtest_fork_stderr</stderr>
> > </job>
> > but those run local to the globus node, of course.
> >
> > But a job submitted as
> > globusrun-ws -submit -f gramtest_pbs -Ft PBS
> >
> > cat gramtest_pbs
> > <job>
> >   <executable>/usr/bin/env</executable>
> >   <stdout>gramtest_pbs_stdout</stdout>
> >   <stderr>gramtest_pbs_stderr</stderr>
> > </job>
> >
> > Gives this: cat globusrun-ws -submit -f gramtest_pbs -Ft PBS
> > Host key verification failed.
> > /bin/touch: cannot touch
> > `/home/labkey/.globus/c5acdc30-e04c-11de-9567-d32d83561bbd/exit.0': No
> > such file or directory
> > /var/spool/torque/mom_priv/jobs/1.domu-12-31-38-00-b4-b5.compute-1.internal.SC:
> > 59: cannot open
> > /home/labkey/.globus/c5acdc30-e04c-11de-9567-d32d83561bbd/exit.0: No such
> > file
> > [: 59: !=: unexpected operator
> >
> > I'm stumped - what piece of the authentication picture am I missing?  And
> > how can I identify the actor that emitted that failure message?
> >
> > Thanks,
> >
> > Brian Pratt
>
