I am having a really hard time coming up with a plausible explanation for this,
other than some kind of kernel bug in OpenIndiana...
I have two systems in the office, Dell PowerEdge SC 1435 (Embedded Broadcom
5721 NIC) and Dell PowerEdge 2950 (Embedded Broadcom 5708 NIC), both running OI
151a5 or newer.
Inside the office, everything works fine. But when I go home and VPN into the
office, I ssh or vnc to these two boxes, and I get packet garbling and
retransmissions and dropped connections, but *only* on these two machines, and
*only* from the vpn connection, and *only* for certain specific types of
traffic. Here's an example:
I'm on an ssh prompt. I can type in commands all day and night, it always
works fine when I'm typing. (One character at a time, typing via keyboard, I
can hold down a key and completely fill the screen, 4320 keystrokes no
problem.) But I'm following a procedure, so I'm also pasting commands.
Sometimes when I paste commands, I get "PuTTY Fatal Error: Incoming packet was
garbled on decryption" and the session is disconnected.
It's not an MTU thing. First of all, I checked all the MTUs looking for any
problems. But a better clue is that I can paste the same command over and over
(obviously the same packet size each time), and it only fails after the
Nth repetition. For testing purposes, I ssh into the box and paste this command:

echo "hello there buddy, whatcha doing" > /dev/null

Obviously nowhere near the MTU size. I keep pasting it over and over until the
connection fails, and count how many times I can successfully paste it before
failure. Repeat. My results were: 5, 0, 12, 0, 9, 0. Deterministic inputs,
nondeterministic outputs. (Well, probably deterministic, but not determined by
the inputs that I'm controlling).
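
The manual paste-and-count test above could be automated along these lines. This is only a sketch: count_until_failure and make_flaky_sender are hypothetical names, and the flaky sender is a simulated stand-in for a real ssh session, which would have to supply its own send function that raises an OSError when the connection drops.

```python
from typing import Callable

# The exact command pasted in the test, plus the newline that executes it.
COMMAND = 'echo "hello there buddy, whatcha doing" > /dev/null\n'


def count_until_failure(send: Callable[[bytes], None],
                        payload: bytes,
                        limit: int = 1000) -> int:
    """Call send(payload) repeatedly and return how many calls succeeded
    before the first OSError, or limit if no call failed."""
    for done in range(limit):
        try:
            send(payload)
        except OSError:
            return done
    return limit


def make_flaky_sender(fail_on_call: int) -> Callable[[bytes], None]:
    """Simulated transport (an assumption, not a real ssh client):
    raises ConnectionResetError on the given 1-based call number."""
    calls = 0

    def send(payload: bytes) -> None:
        nonlocal calls
        calls += 1
        if calls == fail_on_call:
            raise ConnectionResetError("simulated dropped session")

    return send


if __name__ == "__main__":
    payload = COMMAND.encode("ascii")
    # Confirms the payload is far below a typical 1500-byte Ethernet MTU.
    print(f"payload is {len(payload)} bytes")
    for fail_on in (6, 1, 13):
        n = count_until_failure(make_flaky_sender(fail_on), payload)
        print(f"session ended after {n} successful pastes")
```

With a real transport plugged in, running this many times per path (direct over the VPN, and via the intermediate machine) would give a larger sample than six manual runs.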
I have a workaround. I ssh into some other machine in the network, and then
ssh to the machine in question. Infinite success. Paste the above command
until my fingers are tired and I'm satisfied that there's no problem. The
problem *only* happens when I ssh (or vnc or whatever) directly to the machine
from the vpn client. And obviously, it doesn't happen when I ssh to some other
machine from the vpn client (and then ssh to the machine in question).
The only difference between the LAN traffic which works perfectly, and the VPN
traffic that's having a problem, is the fact that the VPN traffic needs to go
through a router. It's not the router that's messing up the traffic, or else I
would expect to see the same problem on a different machine.
It's hard for me to imagine a driver problem that will only affect traffic that
requires a router. But maybe. Maybe there's a Broadcom driver problem that
doesn't affect LAN traffic but does affect traffic going through a router.
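
From the server's point of view, the difference between the two cases is whether the peer is on-link (frames go straight to the peer) or off-subnet (frames go to the next-hop router). A minimal sketch of that classification follows; the function name is_on_link and all addresses are made up for illustration and are not taken from the actual office network.

```python
import ipaddress


def is_on_link(local_interface: str, peer: str) -> bool:
    """Return True if peer falls inside the subnet configured on the
    local interface (given as 'address/prefix'), so no router is needed."""
    network = ipaddress.ip_interface(local_interface).network
    return ipaddress.ip_address(peer) in network


if __name__ == "__main__":
    server = "192.168.10.20/24"      # hypothetical address of the OI box
    lan_client = "192.168.10.77"     # hypothetical office workstation
    vpn_client = "10.8.0.6"          # hypothetical VPN-assigned address

    for name, addr in (("LAN client", lan_client), ("VPN client", vpn_client)):
        path = "direct" if is_on_link(server, addr) else "via router"
        print(f"{name} {addr}: {path}")
```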
Anyway, I'm at a loss for how to debug further. I suppose I could create a
dummy network with a really simple router in between, and see if the problem
persists, using a different router and no VPN. Also, if I do that, I'll be
able to wireshark both sides, to see what happens. For now, on my VPN, I can
only wireshark the OI side of the equation; can't wireshark the traffic at my
VPN endpoint.
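
Once captures exist from both sides, one way to compare them is to export the raw TCP payload of the same connection from each capture and look for the first byte where they differ. The sketch below assumes that export has already been done (for example, by saving the raw stream from each capture to a file); first_divergence and hex_context are hypothetical helper names, and the sample bytes are invented.

```python
from typing import Optional


def first_divergence(sent: bytes, received: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the two
    streams are identical. If one stream is a strict prefix of the other,
    the length of the shorter stream is returned."""
    for offset, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return offset
    if len(sent) != len(received):
        return min(len(sent), len(received))
    return None


def hex_context(data: bytes, offset: int, context: int = 8) -> str:
    """Hex-format the bytes surrounding offset for eyeballing."""
    start = max(0, offset - context)
    return " ".join(f"{b:02x}" for b in data[start:offset + context])


if __name__ == "__main__":
    # Invented sample data standing in for the two exported streams.
    sender_side = bytes(range(32))
    receiver_side = bytes(range(20)) + b"\xff" + bytes(range(21, 32))

    where = first_divergence(sender_side, receiver_side)
    if where is None:
        print("streams match")
    else:
        print(f"streams differ at byte offset {where}")
        print("sender:  ", hex_context(sender_side, where))
        print("receiver:", hex_context(receiver_side, where))
```

If the streams match on the wire at both ends, the corruption would have to be happening above the capture point on one of the hosts; if they differ, the offset narrows down which packet to inspect.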
I also have one Intel NIC I can stick into one of the machines.
_______________________________________________
OpenIndiana-discuss mailing list
http://openindiana.org/mailman/listinfo/openindiana-discuss