Tzachi Dar wrote:
> Have you actually been able to measure a difference?
>
> I believe that your code is better, but I wonder if it really has an
> effect that we can measure.
Consider the following code sequence, before and after being modified to use
pointers (incorporating Sean's observations):

Original (array indexing):

  for( i = 0; i < MAX_SEND_WC; i++ )
      wc[i].p_next = &wc[i + 1];
  wc[MAX_SEND_WC - 1].p_next = NULL;

Modified to use pointers:

  for( p_free=wc; p_free < &wc[MAX_SEND_WC - 1]; p_free++ )
      p_free->p_next = p_free + 1;
  p_free->p_next = NULL;
If the MS WDK compiler's optimizer were truly 'good', it might reduce both
loops to essentially the same instruction sequence. I do not believe that is
the case.
The expectation of slightly faster execution rests on the observation that
using pointers skips the per-iteration array index arithmetic, reducing the
total number of instructions executed.
Since these loops live in the Tx & Rx speed paths, every little bit helps.
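For illustration only, below is a small stand-alone sketch of the two linking
styles. The node_t struct, MAX_WC value, and function names are made up for
the example (they are not the driver's ib_wc_t or MAX_SEND_WC); it simply
builds the same kind of singly-linked free list both ways and verifies the
results match. Compiling it with an assembly-listing option (e.g. cl /FAs)
makes it easy to compare the per-iteration instruction sequences of the two
loops.

/* Stand-alone illustration only -- node_t and MAX_WC are invented for
 * this example; they are not the driver's ib_wc_t or MAX_SEND_WC. */
#include <stddef.h>
#include <stdio.h>

#define MAX_WC 8

typedef struct node
{
	struct node	*p_next;
} node_t;

static node_t wc[MAX_WC];

/* Style 1: array indexing -- each iteration computes &wc[i] and &wc[i + 1]. */
static void link_by_index( void )
{
	size_t i;

	for( i = 0; i < MAX_WC; i++ )
		wc[i].p_next = &wc[i + 1];
	wc[MAX_WC - 1].p_next = NULL;
}

/* Style 2: pointer arithmetic -- the running pointer is the only induction
 * variable, so no per-iteration index-to-address scaling is required. */
static void link_by_pointer( void )
{
	node_t *p_free;

	for( p_free = wc; p_free < &wc[MAX_WC - 1]; p_free++ )
		p_free->p_next = p_free + 1;
	p_free->p_next = NULL;
}

/* Walk the list from wc[0] and count the linked nodes. */
static int count_list( void )
{
	node_t	*p;
	int		n = 0;

	for( p = wc; p != NULL; p = p->p_next )
		n++;
	return n;
}

int main( void )
{
	link_by_index();
	printf( "index linking:   %d nodes\n", count_list() );

	link_by_pointer();
	printf( "pointer linking: %d nodes\n", count_list() );

	return 0;
}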
>
> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: Smith, Stan [mailto:[email protected]]
>> Sent: Wednesday, August 04, 2010 8:43 PM
>> To: Tzachi Dar
>> Cc: [email protected]
>> Subject: [IPOIB_NDIS6_CM] enhance wc linking loop performance by
>> removing array index calculations
>>
>>
>> Hello,
>>
>> While reading the IPOIB code I noticed a minor speed enhancement for
>> the CQ callback routines. When linking WC (work completion) items
>> into a list, replacing the array index calculations with pointer
>> arithmetic lets the loop execute slightly faster.
>>
>> Worth a commit?
>>
>> stan.
>>
>> --- A/ulp/ipoib_NDIS6_CM/kernel/ipoib_endpoint.cpp	Wed Aug 04 10:30:43 2010
>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_endpoint.cpp	Wed Aug 04 10:28:59 2010
>> @@ -888,9 +888,10 @@
>>  	p_port->p_adapter->p_ifc->modify_qp( p_endpt->conn.h_send_qp, &mod_attr );
>>  	p_port->p_adapter->p_ifc->modify_qp( p_endpt->conn.h_recv_qp, &mod_attr );
>>
>> -	for( i = 0; i < MAX_RECV_WC; i++ )
>> -		wc[i].p_next = &wc[i + 1];
>> -	wc[MAX_RECV_WC - 1].p_next = NULL;
>> +	for( p_free_wc=wc; p_free_wc < &wc[MAX_RECV_WC]; p_free_wc++ )
>> +		p_free_wc->p_next = p_free_wc + 1;
>> +
>> +	(--p_free_wc)->p_next = NULL;
>>
>> do
>> {
>>
>> --- A/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	Wed Aug 04 10:29:33 2010
>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	Wed Aug 04 10:28:31 2010
>> @@ -1987,7 +1987,6 @@
>>  	ib_wc_t			wc[MAX_RECV_WC], *p_free, *p_wc;
>>  	int32_t			NBL_cnt, recv_cnt = 0, shortage, discarded;
>>  	cl_qlist_t		done_list, bad_list;
>> -	size_t			i;
>>  	ULONG			recv_complete_flags = 0;
>>  	BOOLEAN			res;
>>
>> @@ -2017,9 +2016,11 @@
>>  	cl_qlist_init( &bad_list );
>>
>>  	ipoib_port_ref( p_port, ref_recv_cb );
>> -	for( i = 0; i < MAX_RECV_WC; i++ )
>> -		wc[i].p_next = &wc[i + 1];
>> -	wc[MAX_RECV_WC - 1].p_next = NULL;
>> +
>> +	for( p_free=wc; p_free < &wc[MAX_RECV_WC]; p_free++ )
>> +		p_free->p_next = p_free + 1;
>> +
>> +	(--p_free)->p_next = NULL;
>>
>>  	/*
>>  	 * We'll be accessing the endpoint map so take a reference
>> @@ -5769,7 +5770,6 @@
>>  	cl_qlist_t			done_list;
>>  	ipoib_endpt_t		*p_endpt;
>>  	ip_stat_sel_t		type;
>> -	size_t				i;
>>  	NET_BUFFER			*p_netbuffer = NULL;
>>  	ipoib_send_NB_SG	*s_buf;
>>
>> @@ -5798,9 +5798,10 @@
>>  	//cl_qlist_check_validity(&p_port->send_mgr.pending_list);
>>  	ipoib_port_ref( p_port, ref_send_cb );
>>
>> -	for( i = 0; i < MAX_SEND_WC; i++ )
>> -		wc[i].p_next = &wc[i + 1];
>> -	wc[MAX_SEND_WC - 1].p_next = NULL;
>> +	for( p_free=wc; p_free < &wc[MAX_SEND_WC]; p_free++ )
>> +		p_free->p_next = p_free + 1;
>> +
>> +	(--p_free)->p_next = NULL;
>>
>> do
>> {
_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw