Well, the problem is apparently fixed but it's one of those things where I
had to apply Holmes' rule: "When you've eliminated every possible
explanation, whatever remains, however impossible, must be the truth". In
other words, I fixed it, but I don't know why what I did worked. Here's
what happened:
I was checking the rc.virtual file for anything significant about this
domain: anything that might be causing this problem and/or had been
changed recently (remember, this domain has been around for over a year
with no problems until about a week ago). Back when we were having trouble
logging, I added a domain called dummy.org that was internal to our system
(i.e. no InterNIC registration, just an entry in our nameserver) and set
up a virtual host for it on the web server to see if Apache would log for it.
It did not share an IP with unionjobs.com,
but it was the only thing I could think of that had been changed, so I
removed it and restarted everything. Lo and behold, no
errors since then! Now, here's the only thing I can think of: We currently
have approximately 30 IPs bound to one NIC. Now I've heard of
far, far more bound to one card, but... Idunno. Like I said, it's the only
thing that changed. If not a theoretical upper limit, would there be a
practical upper limit, maybe? Perhaps to the number of IPs it can
accept ARPs for without getting confused... Anyway, it's just speculation
on my part. If anyone has solid information, I'm always looking to learn.
=:)
As for the message I'm replying to...
> > 22:58:08.232440 ppp-20.internet-frontier.net.62414 > www.unionjobs.com.80: S
>1946168:1946168(0) win 8192 <mss 1460> (DF)
>
> This is a bit odd. ppp-20 tries again to initiate the connection from
> the same port, but with a different initial sequence number. Maybe it
> just didn't get the last RST.
Any possibility that this confusion (not receiving an RST) could cause
Netscape to mistakenly return the "connection reset by peer" error?
> > 22:58:08.232440 www.unionjobs.com.80 > ppp-20.internet-frontier.net.62414: R
>0:0(0) ack 1944 win 0
>
> Is the `ack 1944' a cut-and-paste error? One would think that it would
> be `ack 1944226'.
It's possible, I suppose, but I'd done everything in one chunk, and since
it's not at the end of the line, that's definitely how it looked when I cut
it (that was from within a text editor, though, so something could have
been deleted there).
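For anyone who wants to check this sort of thing mechanically, here is a
rough Python sketch that parses the two tcpdump lines quoted above (with the
wrapped halves rejoined) and tests whether an ack value looks like a
truncated copy of the expected one. The function names are made up for
illustration, the regex only covers the old-style tcpdump format seen in
this thread, and the value 1944226 is simply the one suggested in the
quoted message (the SYN it would answer is not in this excerpt).

```python
import re

# Old-style tcpdump TCP line: time, src > dst:, flags, optional seq range, optional ack.
LINE_RE = re.compile(
    r"^(?P<time>\S+) (?P<src>\S+) > (?P<dst>\S+): (?P<flags>[SFPR.]+)"
    r"(?: (?P<seq>\d+):\d+\((?P<len>\d+)\))?"
    r"(?: ack (?P<ack>\d+))?"
)

def parse_line(line):
    """Return a dict of the fields in one tcpdump line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("seq", "len", "ack"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields

def expected_rst_ack(syn):
    """A SYN consumes one sequence number, so a reset answering it acks seq + len + 1."""
    return syn["seq"] + syn["len"] + 1

def looks_truncated(ack, expected):
    """True if ack is a proper leading-digit prefix of expected (e.g. a paste cut short)."""
    return ack != expected and str(expected).startswith(str(ack))

syn = parse_line(
    "22:58:08.232440 ppp-20.internet-frontier.net.62414 > www.unionjobs.com.80: "
    "S 1946168:1946168(0) win 8192 <mss 1460> (DF)"
)
rst = parse_line(
    "22:58:08.232440 www.unionjobs.com.80 > ppp-20.internet-frontier.net.62414: "
    "R 0:0(0) ack 1944 win 0"
)

print(expected_rst_ack(syn))                  # 1946169
print(looks_truncated(rst["ack"], 1944226))   # True
```

So the `ack 1944' is at least consistent with digits having been lost
from 1944226, though that alone doesn't prove it.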
> > 22:58:08.272440 www.unionjobs.com.80 > ppp-20.internet-frontier.net.62410: S
>2790906700:2790906700(0) ack 1937196 win 32736 <mss 1460>
>
> This one packet doesn't follow the same pattern as the rest. It's
> actually acknowledging a connection (one from port 62410, which isn't
> shown above).
I had snipped a bit out. Sorry, should have specified where.
> OK, just to be really, really clear on this: You haven't assigned the
> IP address for www.unionjobs.com to another host by mistake, have you?
> (If you're running DHCP/BOOTP, ensure that dhcpd/bootpd won't give out
> this address to a client).
We don't use DHCP and every way I have of checking says that I did not
have anything else ever assigned to that IP. The only thing that had
really changed between when it worked and when it didn't was adding an
extra virtual host that had a different IP (208.196.56.166) from the
host in question (208.196.56.222).
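A quick way to double-check a list of name/address assignments for
accidental sharing is sketched below in Python. The function name is
hypothetical and the list holds only the two hosts mentioned here; note
that it only inspects the configuration you feed it and cannot tell you
whether some other machine on the wire is answering ARP for the address.

```python
from collections import defaultdict
from ipaddress import ip_address

def find_shared_addresses(assignments):
    """Return {ip: [names]} for every address claimed by more than one name."""
    by_ip = defaultdict(list)
    for name, addr in assignments:
        # ip_address() normalises the text and rejects malformed addresses.
        by_ip[ip_address(addr)].append(name)
    return {str(ip): names for ip, names in by_ip.items() if len(names) > 1}

# The two virtual hosts discussed in this message.
assignments = [
    ("www.unionjobs.com", "208.196.56.222"),
    ("dummy.org", "208.196.56.166"),
]

print(find_shared_addresses(assignments))   # {}
```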
> I take it that you are using IP-based virtual hosts (i.e. you have
> allocated multiple IP addresses to virtual interfaces on the web
> server), right?
Yes.
> Are you using BindAddress (other than `BindAddress *') in httpd.conf?
The only BindAddress entry is "BindAddress *".
Anyway, thanks as always for the help.