On 25/08/09 10:32 AM, Peter Memishian wrote:
> > The simplest way to fix the CR is let snoop to check the LSO
> > information with the packet and avoid warning it. But I'm not sure
> > it's enough. Since LSO packets will be segmented by the NIC hardware
> > so on the wire there are only MTU-size packets. Are there users who
> > expect to have an option for snoop to see the "expected packets on the
> > wire"? For example, `snoop -?` to parse the LSO packet header to
> > multiple regular headers that are expected to be seen on the wire?
>
> I think snoop should only report what it really sees. For physical
> devices, on the tx side, snoop can never see post-segmentation packets
> on the wire. I don't think snoop should make a guess and report what it
> has imagined to the users.
>
> > It gets a little bit complex when we implement LSO on top of VNICs, as
> > is still being discussed. When snooping a VNIC created on e1000g,
> > should the snoops be seeing original LSO packets as sent to e1000g or
> > post-segmentation packets as seen on the wire? Any thoughts?
>
> If "snoop -d vnic0", I think it should report original LSO packets. If
> the real traffic passes through e1000g, "snoop -d e1000g0" should report
> the real packets that are passed to e1000g (if e1000g LSO is disabled,
> it should show regular Ethernet frames to users).
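
For readers less familiar with LSO, the gap between the single large packet that snoop sees on the tx side and the MTU-size frames that reach the wire can be sketched as below. This is only an illustration, not anything snoop or a driver actually runs: the function name `segment_lso` is hypothetical, and it assumes IPv4 and TCP headers of 20 bytes each with no options.

```python
def segment_lso(payload_len, mtu=1500, ip_hdr=20, tcp_hdr=20):
    """Return the TCP payload size of each frame after segmentation.

    Assumes fixed 20-byte IPv4 and TCP headers (no options); real
    hardware also has to rewrite sequence numbers, lengths and checksums.
    """
    mss = mtu - ip_hdr - tcp_hdr  # payload bytes that fit in one frame
    if mss <= 0:
        raise ValueError("MTU too small for the headers")
    full, rest = divmod(payload_len, mss)
    sizes = [mss] * full
    if rest:
        sizes.append(rest)  # final, shorter segment
    return sizes


if __name__ == "__main__":
    sizes = segment_lso(64000)
    # One 64000-byte LSO packet becomes 44 frames on the wire.
    print(len(sizes), sizes[0], sizes[-1])  # prints: 44 1460 1220
```

A capture taken above the NIC would show the one 64000-byte packet; only a capture taken on the wire would show the 44 frames.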

Yes -- and FWIW we do something similar when hardware checksums are
enabled (snoop simply reports what's in the packet, even if it's invalid).

That said, there should be some facility that allows snoop to know this is
an LSO packet and make this clear to the user examining the dump.  Of
course, we could also disable LSO in this case, but I think that would be
a mistake because it's likely the problem the user is trying to
troubleshoot is related to LSO -- and thus by disabling LSO we are only
making their life harder.  (There is intentionally no supported
administrative mechanism for disabling LSO; we need to keep it that way.)

That's not very helpful when diagnosing problems with incorrect
checksums in packets. Luckily there is a workaround:
# echo 'dohwcksum/W 0' | adb -w -k

With no way to discover what features like this are enabled on
network interfaces and no way to control them (except for the
above), diagnosing checksum issues is harder than it needs to be
on Solaris.

By contrast, on modern BSD systems all of this is available, e.g.:

nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
       capabilities=3f00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
       enabled=0
       address: 00:24:8c:5c:66:80
       media: Ethernet 100baseTX (100baseTX half-duplex)
       status: active
       inet 192.168.1.254 netmask 0xffffff00 broadcast 192.168.1.255
       inet6 fe80::224:8cff:fe5c:6680%nfe0 prefixlen 64 scopeid 0x1


Every one of those capabilities can be individually enabled or
disabled, allowing for developers and administrators to make precise
changes in the processing of packet checksums when tracking down a
fault, not to mention working around it until a fix is available.
In this instance, none of the capabilities are enabled. FWIW, the
output lists only the capabilities that this NIC supports (there are
others) and which of them are enabled (i.e. none).
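
As a rough illustration of how that output can be consumed by a script, the sketch below pulls the `capabilities=` and `enabled=` lines out of ifconfig-style text and reports each supported offload as on or off. It assumes the output format is exactly as shown above; the function name `offload_state` is made up for this example, and the numeric bitmask is ignored in favour of the names between the angle brackets.

```python
import re

# Sample text copied from the ifconfig output above (first three lines).
SAMPLE = """\
nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
       capabilities=3f00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
       enabled=0
"""

# Matches "capabilities=<hex>" or "enabled=<hex>" with an optional <A,B,...> list.
FIELD = re.compile(
    r"^\s*(capabilities|enabled)=([0-9a-fA-F]+)(?:<([^>]*)>)?", re.M
)


def offload_state(text):
    """Map each supported capability name to True (enabled) or False."""
    found = {"capabilities": [], "enabled": []}
    for key, _bits, names in FIELD.findall(text):
        found[key] = names.split(",") if names else []
    enabled = set(found["enabled"])
    return {name: name in enabled for name in found["capabilities"]}


if __name__ == "__main__":
    for name, is_on in offload_state(SAMPLE).items():
        print(f"{name:12} {'on' if is_on else 'off'}")
```

For the sample above, every one of the six capabilities is reported as off, which matches the `enabled=0` line.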

The comment about it being likely a problem related to LSO is only
guesswork until you can eliminate LSO from the processing path and
determine if the fault is then reproducible or not. One way to try
and eliminate LSO from the equation is to use dtrace to trace the
packet through the kernel - but we all know what the limitations
with that are. The other is to turn LSO off and see if the problem
persists.

Regardless of which particular option any one person feels is best,
we should not only be enabling our users to choose for themselves
but also enabling ourselves to diagnose problems by delivering
tools that make it possible to retrieve and manage all configuration
information. I suppose there's also mdb that can be used to retrieve
network interface configuration, but that's quite raw.

I haven't checked to see if AIX/HPUX allow this to be tweaked;
I suspect HPUX's heritage means it does not, but I do know that
BSD, Linux and Windows all allow it to be controlled at a finer
level than Solaris does. Strangely, I don't hear any horror
stories from people about it being possible to manage network
interfaces in that way on those platforms. Maybe because it is
not the monster that it is being made out to be?

Darren

_______________________________________________
networking-discuss mailing list