Hi

I'm using release 0.1.41 of JSch, via Ant, which is being executed via
Gant (which will explain the following stack traces).

I'm using JSch in this case for a straightforward connectivity test
between two machines--basically, as part of a larger process, I want
to verify that the end user has configured the remote host and SSH
keys correctly. In this configuration, the end user supplies the
hostname or IP, plus a local (private) key which is used to run a
command remotely via SSH. In this case I'm just running ls -1, though
it could be any command.

In the case where the NPE is being thrown, both the local and the remote
systems are running Linux; the local machine is on Ubuntu 8.10, and
the remote is on a Gentoo release (looks to be kernel 2.6.20).

After testing the same script against 3 different remote servers, I've
only seen it fail against 1 of them, roughly 20% of the time. The
behavior is that the connection is established, the remote command is
run, I see the output of the directory listing, then I get an NPE.

com.jcraft.jsch.JSchException: java.lang.NullPointerException
        at com.jcraft.jsch.Channel.connect(Channel.java:206)
        at com.jcraft.jsch.Channel.connect(Channel.java:144)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.executeCommand(SSHExec.java:209)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.execute(SSHExec.java:162)

At line 206 in the Channel code, the exception is swallowed (i.e. only
the message is captured), so I rebuilt the release with debug symbols,
stopped in the debugger, and was then able to output the stack trace
from the origin of the NPE:
java.lang.NullPointerException
        at com.jcraft.jsch.ChannelExec.start(ChannelExec.java:52)
        at com.jcraft.jsch.Channel.connect(Channel.java:200)
        at com.jcraft.jsch.Channel.connect(Channel.java:144)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.executeCommand(SSHExec.java:209)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.execute(SSHExec.java:162)
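
To make "swallowed" concrete, here is a minimal, self-contained sketch of
the pattern as I understand it. This is not JSch's code: WrapDemo, start,
connectLossy and connectChained are made-up names for illustration. It only
shows that wrapping an exception by its toString() drops the original stack
trace, whereas passing the original along as the cause keeps it.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for the wrapping pattern; NOT JSch source.
class WrapDemo {
    // Simulates the deep failure (the real one is in ChannelExec.start).
    static void start() {
        throw new NullPointerException();
    }

    // Keeps only the text of the original exception: its stack trace is lost.
    static Exception connectLossy() {
        try {
            start();
            return null;
        } catch (Exception e) {
            return new Exception(e.toString());
        }
    }

    // Same wrapper, but chains the original exception as the cause.
    static Exception connectChained() {
        try {
            start();
            return null;
        } catch (Exception e) {
            return new Exception(e.toString(), e);
        }
    }

    // Renders a full stack trace (including any "Caused by" section) as a string.
    static String trace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(connectLossy()));   // no frame for start()
        System.out.println(trace(connectChained())); // includes "Caused by: ..."
    }
}
```

With the chained form, the frame where the NPE originated shows up in the
trace without having to rebuild the library and attach a debugger.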

Again, most of the time the call completes with no error, so it
seems like a timing issue somehow. I was able to catch this in the
debugger--what looks to be happening is that Channel.io is being set
to null in some cases while ChannelExec is still running. The check in
ChannelExec, line 52, then throws the NPE when it tries to dereference io.
Is it safe to just check if(io != null && io.in != null) at this
point in the code?
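
For what it's worth, here is a small model of the race as I understand it.
Io and Chan are hypothetical stand-ins, not the JSch classes, and the method
names are invented. One caveat it illustrates: a guard that tests the io
field and then dereferences it again reads the field twice, so if another
thread can null it out in between, copying the field into a local variable
first is the safer form of the same check.

```java
import java.io.InputStream;

// Hypothetical stand-in for the object held in Channel.io; NOT JSch source.
class Io {
    InputStream in;
}

// Hypothetical stand-in for a channel whose io can be cleared by another thread.
class Chan {
    volatile Io io = new Io();

    // Mirrors the failing check: dereferences io directly, so it throws
    // NullPointerException if io has already been set to null.
    boolean hasInputUnsafe() {
        return io.in != null;
    }

    // Reads the field exactly once into a local, then checks the local.
    // A concurrent disconnect() cannot turn `local` into null afterwards.
    boolean hasInputSafe() {
        Io local = io;
        return local != null && local.in != null;
    }

    // Simulates whatever is clearing io while the exec channel is starting.
    void disconnect() {
        io = null;
    }
}
```

Either way, a guard like this only avoids the NPE. Whether it is correct to
carry on quietly when io has already gone away is a separate question that I
can't answer from the outside.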

The only thing that is special about the remote host where this fails
is that it's a (Gentoo) Linux system running on top of OpenVZ, where
the host is an Ubuntu system running within VMware. Though obviously
that's somewhat nasty, it's necessary in this case to allow our
developers to use a legacy OS configuration to host a test
environment. There is thus bridging going on, and maybe there's a
network burp somewhere causing this to fail occasionally. I have
successfully run the script many times against a similar
configuration, but where the physical box was hosting Ubuntu, which
was hosting the Gentoo instance via OpenVZ. So I'm wondering if the
VMware host could be causing the problem.

TIA!
Patrick
