Hi all,
I've found the solution to some hard problems, but this one has me
absolutely stumped. First of all, a quick network diagram:
Provisioning Server                         Cable Modem Router
    10.0.177.1 <---------Ethernet---------> 10.0.177.2
                                            10.0.180.254
                                              | | | |
                                              | | | |
                                              | | | |  4xT1
                                              | | | |
                                              | | | |
                                        Internet (SprintLink)
                                                 |
                                          Internet (ATT)
                                                 |
                          Netscreen Firewall (DMZ) ---> Mail Server
What is happening is absolutely weird. Certain customers started
calling to say that their OE clients were timing out with POP3 errors.
Nothing I could do would shake it loose. Telnetting into POP3 from my
network worked fine. Here's what I know.
1) It is only affecting certain accounts. I have deleted and recreated
these accounts, and the problem still exists.
2) If I telnet into the mail server from anywhere except the 10.0.180.x
network (or use www.mail2web.com), the mail account works just fine.
3) If I telnet into the cable modem router, and then telnet to the mail
server on port 110, I get this conversation:
CiscoUBR>telnet 10.0.188.85 110
Trying 10.0.188.85, 110 ... Open
+OK VopMail POP3 Server 5.3.232.0 Ready <[EMAIL PROTECTED]>
user paegesus
+OK paegesus is welcome here
pass xxxxx
It hangs there. I've left it for an hour, and the router never gets a
response back from the server. However, the server "hears" the PASS
command, because it locks the user's mailbox. If I telnet from the
provisioning server right next to the router, everything works fine.
4) The timeout only occurs if the mailbox is empty. I send a test
message to the account, telnet from the router to the server, read
the message, and DELE it. I then immediately try to log back in and get
the timeout.
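For anyone who wants to reproduce the hang in item 3 without sitting at a
telnet prompt, here is a minimal Python sketch of the same conversation
(greeting, USER, PASS, STAT) with a timeout on every reply. The function
names probe_pop3 and read_line and the 30-second default are made up for
illustration; nothing here is specific to VopMail, and it is only a rough
probe, not a full POP3 client.

```python
import socket


def read_line(sock, timeout):
    """Read one line from the server; return None if nothing arrives in time."""
    sock.settimeout(timeout)
    buf = b""
    try:
        while not buf.endswith(b"\n"):
            chunk = sock.recv(1)
            if not chunk:  # server closed the connection
                break
            buf += chunk
    except socket.timeout:
        return None  # this is the "hang" case from item 3
    return buf.decode("ascii", "replace").rstrip("\r\n")


def probe_pop3(sock, user, password, timeout=30.0):
    """Walk greeting/USER/PASS/STAT and record the reply to each step.

    Returns a list of (step, reply) tuples; reply is None where the
    server never answered. Stops at the first timeout or non-+OK reply.
    """
    results = []
    reply = read_line(sock, timeout)
    results.append(("greeting", reply))
    if reply is None:
        return results
    steps = (
        ("USER", "USER " + user),
        ("PASS", "PASS " + password),
        ("STAT", "STAT"),
    )
    for step, command in steps:
        sock.sendall(command.encode("ascii") + b"\r\n")
        reply = read_line(sock, timeout)
        results.append((step, reply))
        if reply is None or not reply.startswith("+OK"):
            break
    return results
```

To run it against the real server, pass in a socket from
socket.create_connection(("10.0.188.85", 110)) and print the returned
list. A ("PASS", None) entry from a host on the 10.0.180.x network,
alongside a normal "+OK" from a host elsewhere, would match the symptom
described above.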
I am clueless. If I were to try and cause the behavior myself on
purpose, I'd put a misconfigured echelon box on the backbone somewhere.
I've rebooted every piece of equipment that I own from endpoint to
endpoint. It can't be user account corruption, because everything works
fine from everywhere else. It can't be an IP access-list somewhere,
because some (most) accounts do work from the problem network. The whole
broken-when-empty, fine-when-not thing confuses the hell out of me.
This just started today. I haven't touched or changed a thing, and I was
gone all last week. I'm hoping that it's something that Sprint or AT&T
fixes tonight, but if anyone has any ideas I'm open.
--
Justin Ellison <[EMAIL PROTECTED]>
Systems Administrator
USA Companies, LLC.
(308) 236-1510 x13
