I work for an NT-centric web site. In an effort to make things work better and be more manageable, I am pushing more Linux into the mix. Please help if you can--I think they are beginning to suspect our favorite OS is crummy!
We basically run streaming audio servers out of a co-location cage. Low incoming bandwidth (requests), lots of outgoing bandwidth (audio packets). I wanted to protect our servers from the malicious and the unnecessary, without placing a bottleneck on outbound packets, so I set up a "triangle route" (is there a real name for this?) as shown:
 incoming         +---------------+
 ---------------> | 209.249.0.172 |  (dual-homed linux 2.0.36
                  |               |   with ipfwadm)
                  |               |
                  | 209.249.202.3 | ----------->+      +-------------------------+
                  +---------------+             |      | 209.249.202.62          |
                                                +------| (WinNT, NetShow server) |
                  +---------------+             |      | (default gw is 202.1)   |
                  | 209.249.202.1 | <-----------+      +-------------------------+
 outgoing         |               |
 <--------------- | 209.249.0.173 |  (fast BayNetworks IP
                  |               |   switch/router)
                  +---------------+
So, incoming packets are sent to our incoming firewall/gateway at 0.172, then they go to the server; the server's reply packets are sent out with a default gw of 202.1, so they go out the "fast path" instead of thru the firewall.
Now, this seems to work *MOST* of the time. It is fine for almost all applications, including audio streaming using a combination of TCP and UDP, but it only works SOMETIMES for audio streaming over HTTP. Tres weird! This happens even when I have ipfwadm set to forward ALL packets. (I haven't had a chance to try another kernel on the firewall yet.)
I had lots of trouble reproducing this problem but eventually I did
capture two tcpdumps while streaming audio over HTTP (Windows Media Player
client on 204.162.114.67). The first try worked and the second one
(immediately after!) failed. The full tcpdumps are at:
http://www.poger.com/tcpdump/worked.txt
http://www.poger.com/tcpdump/failed.txt
(they also contain extraneous (?) packets involving 202.59)
In summary, though:
In both cases all the data seems to have been transmitted from the server to the client, but for some reason the client is unhappy with the way the connection ends. In the failure case, it ends with a whole bunch of little PUSHed 8-byte TCP packets.
Even the connection that eventually WORKED actually involved two reset
TCP connections; I guess it worked on the second try. Header excerpts:
204.162.114.67.1749 > 209.249.202.62.80: . ack 16571 win 65525
209.249.202.62.80 > 204.162.114.67.1749: P 16571:17238(667) ack 248 win
204.162.114.67.1750 > 209.249.202.62.80: P 667:715(48) ack 1 win
209.249.202.62.80 > 204.162.114.67.1749: P 17238:18698(1460) ack 248 win
209.249.202.62.80 > 204.162.114.67.1749: P 18698:18703(5) ack 248 win
204.162.114.67.1749 > 209.249.202.62.80: R 46422:46422(0) win 0 (DF)
204.162.114.67.1750 > 209.249.202.62.80: FP 715:807(92) ack 1 win
209.249.202.62.80 > 204.162.114.67.1750: . ack 808 win 7954
209.249.202.62.80 > 204.162.114.67.1750: R 3216284:3216284(0) win 0 (DF)
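In case it helps anyone compare the two dumps, here is a rough Python sketch of how the header lines could be grouped per TCP connection to see which side sent each reset and how many tiny PUSHed segments show up. The function name summarize, the 8-byte threshold default, and the regular expression are my own assumptions, written only against the header format quoted above; the full dumps may contain lines it does not recognize, and those are simply skipped.

```python
import re
from collections import defaultdict

# Matches tcpdump header lines in the style quoted above, e.g.
# "204.162.114.67.1749 > 209.249.202.62.80: R 46422:46422(0) win 0 (DF)"
# Assumption: addresses are printed as a.b.c.d.port, flags come right after ": ".
LINE_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+\.\d+): "
    r"(?P<flags>[SFPR.]+)"
    r"(?: \d+:\d+\((?P<length>\d+)\))?"
)

def summarize(lines, small=8):
    """Group packets by connection; record who sent resets and count
    PUSHed segments carrying between 1 and `small` bytes of data."""
    stats = defaultdict(
        lambda: {"packets": 0, "resets_from": [], "small_pushes": 0}
    )
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a TCP header line we understand
        src, dst = m.group("src"), m.group("dst")
        # Both directions of a connection share one key.
        entry = stats[tuple(sorted((src, dst)))]
        entry["packets"] += 1
        flags = m.group("flags")
        length = int(m.group("length")) if m.group("length") else 0
        if "R" in flags:
            entry["resets_from"].append(src)
        if "P" in flags and 0 < length <= small:
            entry["small_pushes"] += 1
    return dict(stats)

# The header excerpts from this post, with wrapped lines rejoined.
SAMPLE = """\
204.162.114.67.1749 > 209.249.202.62.80: . ack 16571 win 65525
209.249.202.62.80 > 204.162.114.67.1749: P 16571:17238(667) ack 248 win
204.162.114.67.1750 > 209.249.202.62.80: P 667:715(48) ack 1 win
209.249.202.62.80 > 204.162.114.67.1749: P 17238:18698(1460) ack 248 win
209.249.202.62.80 > 204.162.114.67.1749: P 18698:18703(5) ack 248 win
204.162.114.67.1749 > 209.249.202.62.80: R 46422:46422(0) win 0 (DF)
204.162.114.67.1750 > 209.249.202.62.80: FP 715:807(92) ack 1 win
209.249.202.62.80 > 204.162.114.67.1750: . ack 808 win 7954
209.249.202.62.80 > 204.162.114.67.1750: R 3216284:3216284(0) win 0 (DF)
""".splitlines()

if __name__ == "__main__":
    for conn, info in summarize(SAMPLE).items():
        print(conn, info)
```

To run it against a whole dump, something like summarize(open("failed.txt")) should work, assuming the file has been saved locally and uses the same line format as the excerpts.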
One thing I noticed: the Windows Media Player client puts a "Connection: Keep-Alive" header in. (Real Player doesn't do that and seems to work better.) Could that be involved?
Thank you for any ideas!
-Elliot Poger
