On 30.4.2012 19:56, Alexander Duyck wrote:
> On 04/30/2012 02:02 AM, Moris Bangoura wrote:
>> Hi,
>>
>> we are working with modified ixgbe drivers (packetshader - ixgbe
>> 2.0.38.2, netmap - ixgbe 3.9.15) that allow receiving/sending small
>> frames at wire speed.
>>
>> In our lab we use a 2-CPU NUMA architecture (Xeon CPUs, Intel 5520
>> chipset) with 2x dual-port 10GbE 82599 cards.
>>
>> There is a problem with receiving small frames whose length is not a
>> multiple of 64B (without or with the 4B CRC, depending on whether the
>> RDRXCTL.CRCStrip and HLREG0.RXCRCSTRP registers are set to 1 or 0).
>>
>> We suspect that the problem is somewhere in the 82599 DMA engine, the
>> Intel 5520 IOH, QPI or the CPU cache line handling.
>>
>> What we discovered:
>>
>> 1. If CRCStrip reg is set to 1:
>> - RX of 60B (+4B CRC) frames is 6.9 Mpps (PCIe TLP payload is 60B)
>> - RX of 64B (+4B CRC) frames is 14.2 Mpps (PCIe TLP payload is 64B) -> OK,
>> wire speed.
>>
>> 2. If CRCStrip reg is set to 0:
>> - RX of 60B (+4B CRC) frames is < 14.8 Mpps (PCIe TLP payload is 64B) -> OK,
>> wire speed.
>> - RX of 61B (+4B CRC) frames is < 6.9 Mpps (PCIe TLP payload is 65B)
>>
>> Is there a possible workaround so that the 82599 DMA engine always pads
>> the Memory Write Request payload length to a multiple of 64B?
>>
>> Example:
>> 0. 64B frame is received on Rx MAC with CRCStrip reg set to 1.
>> 1. The receive DMA fetches the next RX descriptor from the appropriate
>> host memory ring to be used for the next
>> received packet.
>> 2. The receive DMA posts the packet padded with 4B (so the Memory Write
>> Request payload length is a multiple of 64B) to the location indicated by
>> the RX descriptor through the PCIe interface.
>> 3. When the packet is placed into host memory, the receive DMA updates
>> all the RX descriptor(s) that were used by the
>> packet data (the real, unpadded packet length is reported via PKT_LEN).
>> 4. The receive DMA writes back the RX descriptor content along with
>> status bits that indicate the packet information
>> including what offloads were done on that packet.
>> 5. The 82599 initiates an interrupt indicating that a new packet is ready
>> in host memory. The host reads the packet data (only the PKT_LEN bytes
>> indicated by the RX descriptor).
>>
>> Maybe there is some 82599 RX DMA register/bit that is not covered by the
>> 82599 datasheet (version 2.75).
>>
>> Regards,
>>
> Moris,
>
> Are you seeing this issue with both the 2.0.38.2 and 3.9.15 drivers, or
> is this mainly with 2.0.38.2?  I just want to clarify since the 3.9.15
> driver should be significantly more optimized than 2.0.38.2 driver.
>
> The behaviour you are describing sounds like an issue with partial cache
> line writes.  This is an issue for most architectures because it
> typically requires a read/modify/write cycle to write the cache line
> instead of being just a direct write as in the case of a full cache line
> write.  The 3.9.15 driver contains several updates since the 2.0.38.2 in
> regards to partial cache line writes and will likely show much better
> performance.  Specifically it will cut the number of partial cache line
> writes in half by aligning the buffers with the start of a cache line.
>
> The hardware itself doesn't contain any workarounds for this, but I
> would recommend testing with the 3.9.15 driver instead of the 2.0.38.2
> driver as it will contain several software improvements that may help to
> improve the performance.
>
> Thanks,
>
> Alex
Hi,

Thank you for the quick answer.

Yes, the issue can be seen with both versions, with the same results.

Every address used for an 82599 DMA write is aligned: each cell of the RX
packet data buffer ring in RAM has a fixed size (2048B) and 64B alignment.
The modified driver also uses 64B-aligned memcpy and prefetching.
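
As a rough illustration of that layout (a sketch only, not code from either
driver), the check below confirms that a ring of fixed 2048B cells starting
at a 64B-aligned base keeps every cell address on a cache line boundary. The
base addresses and the helper names cell_addresses and all_aligned are
hypothetical.

```python
CACHE_LINE = 64    # cache line size in bytes
CELL_SIZE = 2048   # fixed size of one RX buffer cell in bytes


def cell_addresses(base, count, cell_size=CELL_SIZE):
    """Return the start address of each cell in a contiguous buffer ring."""
    return [base + i * cell_size for i in range(count)]


def all_aligned(addresses, alignment=CACHE_LINE):
    """True if every address is a multiple of the given alignment."""
    return all(addr % alignment == 0 for addr in addresses)


# Hypothetical 64B-aligned ring base: every cell start stays aligned,
# because the cell size (2048) is itself a multiple of 64.
print(all_aligned(cell_addresses(0x10000, 512)))

# Hypothetical misaligned base (offset by 32B): no cell start is aligned.
print(all_aligned(cell_addresses(0x10020, 512)))
```

This only covers the start address of each write; it says nothing about
whether the end of the write also falls on a cache line boundary.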

Do I understand correctly that partial cache line writes cannot occur
with a 64B-aligned address and a packet of, for example, 60B (a 64B
Ethernet frame with the 4B CRC stripped)?
See ftp://download.intel.com/design/intarch/PAPERS/321071.pdf.
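
To make the question concrete, here is a minimal sketch that counts how many
64B cache lines a single write fills completely and how many it only
partially covers. It assumes the whole TLP payload is written contiguously
starting at the buffer address, which is an assumption about the DMA
behaviour and not something taken from the datasheet; the function name
classify_write is hypothetical.

```python
CACHE_LINE = 64  # cache line size in bytes


def classify_write(addr, length, line=CACHE_LINE):
    """Return (full_lines, partial_lines) touched by one contiguous write.

    A line counts as "full" only if the write covers all of its bytes.
    """
    if length <= 0:
        return (0, 0)
    end = addr + length                # one past the last byte written
    first_line = addr // line
    last_line = (end - 1) // line
    full = 0
    partial = 0
    for n in range(first_line, last_line + 1):
        line_start = n * line
        line_end = line_start + line
        covered = min(end, line_end) - max(addr, line_start)
        if covered == line:
            full += 1
        else:
            partial += 1
    return (full, partial)


# Payload sizes from the measurements quoted above, written to a
# 64B-aligned address (0 stands in for any aligned buffer address).
for payload in (60, 64, 65):
    print(payload, classify_write(0, payload))
```

If this model is right, an aligned start address is not sufficient: a 60B or
65B payload still leaves one cache line only partially written, and only the
64B payload fills exactly one line. That would be consistent with the slow
and fast cases measured above.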

Maybe there is some register or bit for the 82599 RX DMA that is not covered
by the 82599 datasheet and that could alter how the 82599 DMA writes.

For example, IXGBE_RDRXCTL_AGGDIS: are RX DMA writes somehow aggregated?

Thank you,

-- 
Moris Bangoura
CTU FEE Prague

