My setup:
- Server (DSL/WLAN router running Dropbear sshd v2012.55)
$ uname -mrsp
Linux 2.6.19.2 mips unknown
$ ps | grep dropbear | grep -v grep
1673 root 1352 S dropbear -i -R -a
14826 root 1396 S dropbear -i -R -a
- Client
OS: Windows XP Pro 64-bit
SSH Client: Bitvise
As you can see, my Dropbear runs as root in inetd mode, permitting root logins
(which is what I use) and accepting connections to forwarded ports from other
hosts. I need this because my connection basically creates a few forward
tunnels (client to server) to other machines behind the router as well as
backward tunnels (server to client) to a few services on the client's network.
So far, so good.
The DSL router gets disconnected once every night, reconnects within seconds
and gets a new IP address, which is the usual thing in Germany for
consumer-type ISP connections. What I expect to happen is that the dropbear
process goes down, but on about 4 out of 7 days this does not happen. The main
symptom is that the auto-reconnect for the SSH connection to the dynamic host
name fails because ports on the router cannot be bound because they are already
in use. When I check with netstat I can see that indeed all the listening ports
for the reverse tunnels are still in use by the old Dropbear process which has
not terminated. On a few days a week it works, but I do not know the
circumstances or race conditions which cause this behaviour. So what I end up
doing most of the time is log on to the router without the tunnels and kill the
non-terminated Dropbear process blocking the listening ports. A few seconds
later, the full-blown SSH connection with forward and reverse tunnels
automatically reconnects and everything is fine for another 24 hours.
Now this obviously is ugly and unstable. Is there a way to make Dropbear
understand it should terminate when the DSL connection is gone? Or is there at
least a workaround by which I can check if the SSH process is still alive? I
thought that maybe I could try and connect to one of the stale reverse tunnels
(localhost:someport on the router) in order to see if it is still functional,
then kill the process otherwise, but I had difficulty doing so because a wget
test does not work (I only have Busybox wget which does not have a time-out
parameter).
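To make the idea concrete, here is a rough sketch of such a check. It is
untested on the router and rests on several assumptions: the Busybox build
provides awk and a netstat that supports -p, the service behind the reverse
tunnel speaks HTTP, and plain sh job control (background job plus kill) is an
acceptable substitute for the missing wget time-out. The function names
run_with_timeout, pid_for_port and check_tunnel are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Dropbear/Busybox.

# Run a command, killing it if it has not finished after $1 seconds.
# Returns the command's exit status (non-zero if the watchdog killed it).
run_with_timeout() {
    secs=$1; shift
    "$@" &
    cmd_pid=$!
    # Watchdog: detached from stdout/stderr so it never blocks the caller.
    ( sleep "$secs"; kill "$cmd_pid" ) >/dev/null 2>&1 &
    dog_pid=$!
    status=0
    wait "$cmd_pid" || status=$?
    kill "$dog_pid" 2>/dev/null || true
    return "$status"
}

# Read `netstat -tlnp`-style output on stdin and print the PID of the
# process listening on TCP port $1 (assumes the PID/Program column exists).
pid_for_port() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, a, "/"); print a[1]; exit
        }'
}

# Probe a reverse-tunnel port via HTTP; if the probe fails or hangs,
# kill whichever process still holds the listening port.
# $1 = port, $2 = time-out in seconds (default 5).
check_tunnel() {
    port=$1
    run_with_timeout "${2:-5}" wget -q -O /dev/null "http://127.0.0.1:$port/" && return 0
    pid=$(netstat -tlnp 2>/dev/null | pid_for_port "$port")
    [ -n "$pid" ] && kill "$pid"
}
```

Calling something like "check_tunnel 8022 5" every few minutes (8022 being a
placeholder for one of the reverse-tunnel ports) would then kill a stale
Dropbear process. The probe also fails when the client is simply offline, but
in that case no process should be holding the port, so nothing gets killed.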
Please ask for more information if this was too unspecific.
--
Alexander Kriegisch
http://scrum-master.de