>> At the same time, while I suspect the performance degrade because of the
>> GLDv3 layer, I will try to see how much of each part of GLDv3 affect the
>> performance (maybe by temporarily removing the lock or the loop).
>>
>
> good point. I overlooked this part earlier.

Here is what I found from the TCP_STREAM testing (the results might not be
very accurate, because they vary from run to run):

a. Current softmac bits: performance degrades by about 8%.
b. Remove the holding of mip->mi_rx_lock in mac_rx, which makes the call
   to i_dls_link_rx() a tail call: performance degrades by about 6%.
c. Then remove the loop in i_dls_link_rx() and the holding of dls_head
   (the calls to i_dls_head_hold() and i_dls_head_rele()), which makes
   di_rx() a tail call: performance degrades by about 3%.
d. Then remove the holding of dlp->dl_impl_lock in i_dls_link_rx():
   performance degrades by about 1.8%.
e. Then remove the holding of dip->di_lock in dls_accept(): performance
   degrades by about 0.9%.

So it looks like most of the performance loss originates from GLDv3. I guess
the next step is to decide whether to try to remove those locks or to try
the multiple-lower-streams approach. Note that even if we remove all the
locks, there is still some performance degradation (about 1%).

Thanks
- Cathy

> one idea I have is to have dls find out if the underlying mac is a softmac.
> if it is, then make dls_link_add() add a softmac-specific rx func. currently
> you can only add i_dls_link_rx(), i_dls_link_rx_promisc(). for softmac,
> there might be some processing that can be skipped because it is already
> done by the legacy driver. in certain cases, your softmac's rx func may
> be able to pass packets straight into dls_impl_t with minimal checks.
> I am not sure about the details of how to do this yet. I'll take a closer
> look at your softmac code.
>
>
>> We also discussed (in our I-team meeting) about another alternative approach
>> - we will try to prototype it if the former analysis proves performance
>> cannot be improved after all: that is to open multiple lower streams, each
>> corresponding to one upper stream, and the ETHERTYPE_VLAN raw stream will
>> only be open when there is VLAN or aggregation opened on this device. Not
>> sure whether it can be done yet, but it is an initial thought.
>>
>
> this may be worth trying too.
>
>
> eric
