>> At the same time, since I suspect the performance degradation is caused by 
>> the GLDv3 layer, I will try to see how much each part of GLDv3 affects the 
>> performance (maybe by temporarily removing the lock or the loop).
>>
> 
> good point. I overlooked this part earlier.

Here is what I found from the TCP_STREAM testing (the results may not be 
very accurate because they vary from run to run):

a. current softmac bits

Performance degrades by about 8%.

b. Stop holding mip->mi_rx_lock in mac_rx(), which makes the call to 
i_dls_link_rx() a tail call:

Performance degrades by about 6%.

c. Then remove the loop in i_dls_link_rx() and the hold on dls_head (the 
calls to i_dls_head_hold() and i_dls_head_rele()), which makes di_rx() a 
tail call:

Performance degrades by about 3%.

d. Then stop holding dlp->dl_impl_lock in i_dls_link_rx():

Performance degrades by about 1.8%.

e. Then stop holding dip->di_lock in dls_accept():

Performance degrades by about 0.9%.

So it looks like most of the performance loss originates from GLDv3. I guess 
the next step is to decide whether to try removing those locks or to try the 
multiple-lower-streams approach. Note that even if we remove all the locks, 
there is still some performance degradation (about 1%).

Thanks
- Cathy


> one idea I have is to have dls find out if the underlying mac is a softmac.
> if it is, then make dls_link_add() add a softmac-specific rx func. currently
> you can only add i_dls_link_rx(), i_dls_link_rx_promisc(). for softmac,
> there might be some processing that can be skipped because it is already
> done by the legacy driver. in certain cases, your softmac's rx func may
> be able to pass packets straight into dls_impl_t with minimal checks.
> I am not sure about the details of how to do this yet. I'll take a closer
> look at your softmac code.
> 
>  
>> We also discussed (in our I-team meeting) an alternative approach, which 
>> we will try to prototype if the former analysis proves that performance 
>> cannot be improved after all: open multiple lower streams, each 
>> corresponding to one upper stream, and only open the ETHERTYPE_VLAN raw 
>> stream when there is a VLAN or aggregation opened on this device. Not 
>> sure whether it can be done yet, but it is an initial thought.
>>
> 
> this may be worth trying too.
> 
>  
> eric

