Yesterday I upgraded my last production box (remote) from 4.3 to 4.4
without any hitch, rebooted, and so forth.
Last night, at some innocuous time, it stopped accepting incoming mail
(postfix). This morning, courier-imap still worked well, until I used an
existing ssh session like this:
# pwd
/usr/src/usr.sbin/httpd
# cd /var/log/
# /usr/local/sbin/post
postalias        postfix          postkick         postqueue
postcat          postfix-disable  postlock         postsuper
postconf         postfix-enable   postlog
postdrop         postfix-install  postmap
# /usr/local/sbin/postfix status
^C^Z
Now it has been stuck like this for an hour or so. It still takes
keyboard input, though.
Courier-imap also does not respond any longer. But nmap is still
somewhat okay:
$ nmap -sV 172.16.0.4
Starting Nmap 4.68 ( http://nmap.org ) at 2009-01-10 08:58 SGT
Interesting ports on 172.16.0.4:
Not shown: 1707 closed ports
PORT    STATE SERVICE VERSION
13/tcp  open  daytime
22/tcp  open  ssh?
25/tcp  open  smtp?
37/tcp  open  time    (32 bits)
53/tcp  open  domain?
80/tcp  open  http    Apache httpd
110/tcp open  pop3?
993/tcp open  imaps?
daytime works fine and http works very well, but domain, pop3 and smtp
time out or, worse, get stuck like this:
$ telnet 172.16.0.4 25
Trying 172.16.0.4...
Connected to 172.16.0.4.
Escape character is '^]'.
helo
mail from:[email protected]
quit
^C^Z
$ telnet 172.16.0.4 110
Trying 172.16.0.4...
Connected to 172.16.0.4.
Escape character is '^]'.
user udippel
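
The same check can be scripted so that the client never hangs. The following is only a minimal sketch in Python, not something that was run against this box: the function name probe_banner and the three state labels are made up for illustration. It relies on the fact that ssh, smtp and pop3 servers normally send a greeting first, so a port that accepts the connection but stays silent is stuck in the same way as the telnet sessions above.

```python
import socket


def probe_banner(host, port, timeout=5.0):
    """Connect to host:port and wait up to `timeout` seconds for a greeting.

    Returns a (state, banner) tuple. The state labels are hypothetical:
      'unreachable' - the TCP connection could not be set up
      'silent'      - connected, but the server sent nothing in time
      'banner'      - the server greeted us; banner holds the text
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, timed out while connecting, no route, and so on.
        return ("unreachable", "")
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(512)
        except socket.timeout:
            # The kernel accepted the connection, but the daemon never spoke.
            return ("silent", "")
        except OSError:
            return ("unreachable", "")
    if not data:
        # The peer closed the connection without sending a greeting.
        return ("unreachable", "")
    return ("banner", data.decode("ascii", "replace").strip())
```

Calling probe_banner("172.16.0.4", 25) or probe_banner("172.16.0.4", 110) would then return after at most the timeout. A 'silent' result would suggest that the kernel still completes the TCP handshake while the daemon behind the port is blocked, which would match nmap reporting the ports as open.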
Why am I writing in:
1. I have no access. It is a remote production server. If only I could
stop that 'hanging' postfix, I might be able to issue a 'reboot'.
2. Any further attempt to ssh into it also gets stuck like this:
$ ssh -v 172.16.0.4
OpenSSH_5.1, OpenSSL 0.9.7j 04 May 2006
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to 172.16.0.4 [172.16.0.4] port 22.
debug1: Connection established.
debug1: identity file /home/users/udippel/.ssh/identity type -1
debug1: identity file /home/users/udippel/.ssh/id_rsa type -1
debug1: identity file /home/users/udippel/.ssh/id_dsa type -1
after which I can only leave by killing the session on the client.
3. Even if I went there with a huge effort, and some time delay, how can
I debug the problem, so that it won't occur again?
Thanks for all ideas,
Uwe