Andrew Gallatin wrote:
> Nitin Hande wrote:
>> Andrew Gallatin wrote:
>>> Hi,
>>>
>>> Can somebody shed some light on how crossbow hashes outgoing
>>> packets to different transmit rings (not ring groups)?
>>>
>>> My 10GbE driver has multiple rings (and a single group). Each
>>> transmit ring shares an interrupt with a corresponding receive
>>> ring. We call a set of 1 TX ring, 1 RX ring, and interrupt handler
>>> state a "slice". Transmit completions are handled from the interrupt
>>> handler. On OSes which support multiple transmit routes,
>>> we've found that ensuring that a particular connection is always
>>> hashed to the same slice by the host and the NIC helps quite a bit
>>> with performance (improves CPU locality, reduces cache misses,
>>> decreases power consumption).
>>>
>>> Some OSes (like FreeBSD) allow a driver to assist in tagging a
>>> connection so as to ensure that it is easy to hash
>>> traffic for the same connection into the same slice in the host
>>> and the NIC. Others (Linux, S10) allow the driver to hash the
>>> outgoing packets to provide this locality.
>>>
>>> So... Where is the transmit hashing done in crossbow? Is it tunable?
>>> Is there a hook where I can provide a hash routine (like Linux)?
>>> Can I tag packets (like FreeBSD)? Is it at least something standard
>>> like Toeplitz?
>>
>> If your driver has advertised multiple tx rings, then look for
>> mac_tx_fanout_mode(), which in turn computes the hash on the fanout
>> hint passed from ip. Providing hooks for additional hash routines has
>> been suggested.
>
> I guess my best bet might be to lie, say I have only one TX
> ring, and then fan things out myself, like I used to before Crossbow.
> Is there any non-obvious disadvantage to that?
If you advertise a single ring, then the tx path will end up in
mac_tx_single_ring_mode(), the way it does for an e1000g driver. I think
in that case the entry point into the driver is the older xxx_m_tx()
routine; you may have to pay attention to that in your driver. There can
be slight differences between the two schemes. In the case of
mac_tx_single_ring_mode(), if you get backpressured by the driver on the
tx side due to a lack of descriptors, then packets will be enqueued at
the tx SRS. At that point, if there are multiple threads trying to send
additional packets, all of those packets will end up getting queued,
whereas there will be only one worker thread trying to clear the queue
build-up. At high packet rates it's difficult for this one thread to
catch up. (Additionally, look at the MAC_DROP_ON_NO_DESC flag in
mac_tx_srs_no_desc(), which can drop the packets rather than queuing
them.) In mac_tx_fanout_mode(), by contrast, each tx ring gets its own
soft ring and its own worker thread in case of backpressure.

> When looking at this, I noticed mac_tx_serializer_mode(). Am I reading
> this right, in that it serializes a single queue? That seems lacking,
> compared to the nxge_serialize stuff it replaces.

Yes. This part was done for nxge, and as far as I remember, recent
performance of this scheme was very close to that of the previous
scheme. I think Gopi can comment more on this. What part do you think is
missing here?

Nitin

> Drew
