Nicolas Droux wrote:
>
> On May 6, 2009, at 5:49 AM, Andrew Gallatin wrote:
>
>> Nicolas Droux wrote:
>>> [Bcc'ed driver-discuss at opensolaris.org and
>>> networking-discuss at opensolaris.org]
>>> I am pleased to announce the availability of the first revision of
>>> the "Crossbow APIs for Device Drivers" document, available at the
>>> following location:
>>
>> I recently ported a 10GbE driver to Crossbow. My driver currently
>> has a single ring-group, and a configurable number of rings. The
>> NIC hashes received traffic to the rings in hardware.
>>
>>
>> I'm having a strange issue which I do not see in the non-crossbow
>> version of the driver. When I run TCP benchmarks, I'm seeing
>> what seems like packet loss. Specifically, netstat shows
>> tcpInUnorderBytes and tcpInDupBytes increasing at a rapid rate,
>> and bandwidth is terrible (~1Gb/s for crossbow, 7Gb/s non-crossbow
>> on the same box with the same OS revision).
>>
>> The first thing I suspected was that packets were getting dropped
>> due to my having the wrong generation number, but a dtrace probe
>> doesn't show any drops there.
>>
>> Now I'm wondering if perhaps the interrupt handler is in
>> the middle of a call to mac_rx_ring() when interrupts
>> are disabled. Am I supposed to ensure that my interrupt handler is not
>> calling mac_rx_ring() before my rx_ring_intr_disable()
>> routine returns? Or does the mac layer serialize this?
>
> Can you reproduce the problem with only one RX ring enabled?
Yes, easily.
> If so,
> something to try would be to bind the poll thread to the same CPU as the
> MSI for that single RX ring. To find the CPU the MSI is bound to, run
> ::interrupts from mdb, then assign the CPU to use for the poll thread by
> doing a "dladm set-linkprop -p cpus=<cpuid> <link>".
That helps quite a bit. For comparison, with no binding at all, it
looks like this: (~1Gb/s)
TCP tcpRtoAlgorithm = 0 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens = 0 tcpPassiveOpens = 0
tcpAttemptFails = 0 tcpEstabResets = 0
tcpCurrEstab = 5 tcpOutSegs = 17456
tcpOutDataSegs = 21 tcpOutDataBytes = 2272
tcpRetransSegs = 0 tcpRetransBytes = 0
tcpOutAck = 17435 tcpOutAckDelayed = 0
tcpOutUrg = 0 tcpOutWinUpdate = 0
tcpOutWinProbe = 0 tcpOutControl = 0
tcpOutRsts = 0 tcpOutFastRetrans = 0
tcpInSegs =124676
tcpInAckSegs = 21 tcpInAckBytes = 2272
tcpInDupAck = 412 tcpInAckUnsent = 0
tcpInInorderSegs =122654 tcpInInorderBytes =175240560
tcpInUnorderSegs = 125 tcpInUnorderBytes =152184
tcpInDupSegs = 412 tcpInDupBytes =590976
tcpInPartDupSegs = 0 tcpInPartDupBytes = 0
tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
tcpInWinProbe = 0 tcpInWinUpdate = 0
tcpInClosed = 0 tcpRttNoUpdate = 0
tcpRttUpdate = 21 tcpTimRetrans = 0
tcpTimRetransDrop = 0 tcpTimKeepalive = 0
tcpTimKeepaliveProbe= 0 tcpTimKeepaliveDrop = 0
tcpListenDrop = 0 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans = 0
After doing the binding, I'm seeing fewer out-of-order
packets. netstat -s -P tcp 1 now looks like this: (~4Gb/s)
TCP tcpRtoAlgorithm = 0 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens = 0 tcpPassiveOpens = 0
tcpAttemptFails = 0 tcpEstabResets = 0
tcpCurrEstab = 5 tcpOutSegs = 46865
tcpOutDataSegs = 3 tcpOutDataBytes = 1600
tcpRetransSegs = 0 tcpRetransBytes = 0
tcpOutAck = 46869 tcpOutAckDelayed = 0
tcpOutUrg = 0 tcpOutWinUpdate = 19
tcpOutWinProbe = 0 tcpOutControl = 0
tcpOutRsts = 0 tcpOutFastRetrans = 0
tcpInSegs =372387
tcpInAckSegs = 3 tcpInAckBytes = 1600
tcpInDupAck = 33 tcpInAckUnsent = 0
tcpInInorderSegs =372264 tcpInInorderBytes =527482971
tcpInUnorderSegs = 14 tcpInUnorderBytes = 18806
tcpInDupSegs = 33 tcpInDupBytes = 46591
tcpInPartDupSegs = 0 tcpInPartDupBytes = 0
tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
tcpInWinProbe = 0 tcpInWinUpdate = 0
tcpInClosed = 0 tcpRttNoUpdate = 0
tcpRttUpdate = 3 tcpTimRetrans = 0
tcpTimRetransDrop = 0 tcpTimKeepalive = 0
tcpTimKeepaliveProbe= 0 tcpTimKeepaliveDrop = 0
tcpListenDrop = 0 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans = 0
And the old version of the driver, which does not use the new
Crossbow interfaces, looks like this: (~7Gb/s)
TCP tcpRtoAlgorithm = 0 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens = 0 tcpPassiveOpens = 0
tcpAttemptFails = 0 tcpEstabResets = 0
tcpCurrEstab = 5 tcpOutSegs = 55231
tcpOutDataSegs = 3 tcpOutDataBytes = 1600
tcpRetransSegs = 0 tcpRetransBytes = 0
tcpOutAck = 55228 tcpOutAckDelayed = 0
tcpOutUrg = 0 tcpOutWinUpdate = 465
tcpOutWinProbe = 0 tcpOutControl = 0
tcpOutRsts = 0 tcpOutFastRetrans = 0
tcpInSegs =438394
tcpInAckSegs = 3 tcpInAckBytes = 1600
tcpInDupAck = 0 tcpInAckUnsent = 0
tcpInInorderSegs =438392 tcpInInorderBytes =617512374
tcpInUnorderSegs = 0 tcpInUnorderBytes = 0
tcpInDupSegs = 0 tcpInDupBytes = 0
tcpInPartDupSegs = 0 tcpInPartDupBytes = 0
tcpInPastWinSegs = 0 tcpInPastWinBytes = 0
tcpInWinProbe = 0 tcpInWinUpdate = 0
tcpInClosed = 0 tcpRttNoUpdate = 0
tcpRttUpdate = 3 tcpTimRetrans = 0
tcpTimRetransDrop = 0 tcpTimKeepalive = 0
tcpTimKeepaliveProbe= 0 tcpTimKeepaliveDrop = 0
tcpListenDrop = 0 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans = 0
> There might be a race between the poll thread and the thread trying to
> deliver the chain through mac_rx_ring() from interrupt context, since we
> currently don't rebind the MSIs to the same CPUs as their corresponding
> poll threads. We are planning to do the rebinding of MSIs, but we are
> depending on interrupt rebinding APIs which are still being worked on.
> The experiment above would allow us to confirm whether it's the issue seen
> here or if we need to look somewhere else.
>
> BTW, which ONNV build are you currently using?
SunOS dell1435a 5.11 snv_111a i86pc i386 i86pc
This is an OpenSolaris 2009.06 machine (updated from
snv_84).
I can try BFU'ing a different machine to a later build
(once I've BFU'ed it to 111a so as to repro it there).
I'm traveling, and wouldn't have a chance to try that
test until next week.
BTW, did you see my earlier message on networking-discuss
(http://mail.opensolaris.org/pipermail/networking-discuss/2009-April/010979.html)?
That was with the pre-crossbow version of the driver.
Cheers,
Drew