Hi,

I'm using release 0.1.41 of JSch via Ant, which is in turn being executed via Gant (which explains the stack traces below).
I'm using JSch for a straightforward connectivity test between two machines. As part of a larger process, I want to verify that the end user has configured the remote host and SSH keys correctly. The end user supplies the hostname or IP, plus a local (private) key, which is used to run a command remotely via SSH. In this case I'm just running ls -1, though it could be any command.

In the case where the NPE is thrown, both the local and the remote systems are running Linux: the local machine is on Ubuntu 8.10, and the remote is on a Gentoo release (looks to be kernel 2.6.20). After testing the same script against 3 different remote servers, I've found it fails against only 1 of them, in around 20% of runs. The behavior is that the connection is established, the remote command is run, I see the output of the directory listing, and then I get an NPE:

    com.jcraft.jsch.JSchException: java.lang.NullPointerException
        at com.jcraft.jsch.Channel.connect(Channel.java:206)
        at com.jcraft.jsch.Channel.connect(Channel.java:144)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.executeCommand(SSHExec.java:209)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.execute(SSHExec.java:162)

At line 206 in the Channel code the original exception is swallowed (i.e. only its message is captured), so I rebuilt the release with debug symbols, stopped in the debugger, and was then able to output the stack trace back to the origin of the NPE:

    java.lang.NullPointerException
        at com.jcraft.jsch.ChannelExec.start(ChannelExec.java:52)
        at com.jcraft.jsch.Channel.connect(Channel.java:200)
        at com.jcraft.jsch.Channel.connect(Channel.java:144)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.executeCommand(SSHExec.java:209)
        at org.apache.tools.ant.taskdefs.optional.ssh.SSHExec.execute(SSHExec.java:162)

Again, most of the time the call completes with no error, so it looks like a timing issue of some kind.
I was able to catch this in the debugger. What looks to be happening is that Channel.io is being set to null in some cases while ChannelExec is still running. The check in ChannelExec, line 52, throws the NPE when it tries to dereference io. Is it safe to just check for if(io != null && io.in != null) at this point in the code?

The only thing that is special about the remote host where this fails is that it's a (Gentoo) Linux system running on top of OpenVZ, where the OpenVZ host is an Ubuntu system running within VMware. Though obviously that's somewhat nasty, it's necessary in this case to allow our developers to use a legacy OS configuration to host a test environment. There is thus bridging going on, and maybe a network hiccup somewhere is causing this to fail occasionally. I have successfully run the script many times against a similar configuration in which Ubuntu ran directly on the physical box and hosted the Gentoo instance via OpenVZ. So I'm wondering if the VMware layer could be causing the problem.

TIA!
Patrick
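
P.S. To make the question about the null check concrete, here is a minimal sketch of the kind of guard I have in mind. It does not use the actual JSch source: IoLike, ChannelLike and hasInput are hypothetical stand-ins, and the volatile modifier is my own assumption, not something taken from Channel. The one point it tries to illustrate is that if another thread can clear the field, reading it once into a local variable avoids a second read returning null between the two halves of the check.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class IoGuardSketch {
    // Hypothetical stand-in for the object held in Channel.io; not the real JSch class.
    static class IoLike {
        InputStream in;
    }

    // Hypothetical stand-in for a channel whose io field may be cleared
    // by another thread (for example while disconnecting).
    static class ChannelLike {
        volatile IoLike io; // volatile is an assumption made for this sketch

        boolean hasInput() {
            // Read the shared field exactly once. Writing
            // "io != null && io.in != null" reads io twice, and the
            // second read could still see null if io is cleared in between.
            IoLike local = io;
            return local != null && local.in != null;
        }
    }

    public static void main(String[] args) {
        ChannelLike ch = new ChannelLike();
        System.out.println(ch.hasInput()); // false: io is null

        ch.io = new IoLike();
        System.out.println(ch.hasInput()); // false: io is set, in is null

        ch.io.in = new ByteArrayInputStream(new byte[0]);
        System.out.println(ch.hasInput()); // true: both are set

        ch.io = null;
        System.out.println(ch.hasInput()); // false: io was cleared again
    }
}
```

This only avoids the NPE at the point of the check. It would not explain why io becomes null while the exec channel is still starting, which is the part I'd still like to understand.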