Re: [lng-odp] [PATCH v19] linux-generic: pktio: genuinely sleep in timeout receive

Github ODP bot Mon, 08 Jan 2018 08:13:19 -0800

Matias Elo(matiaselo) replied on github web page:

platform/linux-generic/odp_packet_io.c
line 59
@@ -1698,21 +1699,22 @@ int odp_pktin_recv_tmo(odp_pktin_queue_t queue, 
odp_packet_t packets[], int num,
                        if (odp_unlikely(!started)) {
                                odp_time_t t;
 
-                               t = odp_time_local_from_ns(wait * SLEEP_NSEC);
+                               t = odp_time_local_from_ns(wait * 1000);
                                started = 1;
                                t1 = odp_time_sum(odp_time_local(), t);
                        }
 
                        /* Check every SLEEP_CHECK rounds if total wait time
                         * has been exceeded. */
-                       if ((wait & (SLEEP_CHECK - 1)) == 0) {
-                               t2 = odp_time_local();
+                       for (i = 0; i < SLEEP_USEC; i++) {
+                               if ((wait & (SLEEP_CHECK - 1)) == 0) {
+                                       t2 = odp_time_local();



Comment:
Seems unnecessarily complex. Something like the following could do the same:
```
uint64_t sleep_round = 0;
...
        /* Check every SLEEP_CHECK rounds if total wait time
         * has been exceeded. */
        if ((sleep_round++ & (SLEEP_CHECK - 1)) == 0) {
                t2 = odp_time_local();

                if (odp_time_cmp(t2, t1) > 0)
                        return 0;
        }
        wait = wait > SLEEP_USEC ? wait - SLEEP_USEC : 0;
```



> Matias Elo(matiaselo) wrote:
> Comment still says nanoseconds. Should also state that the maximum value is 1 
> 000 000, so that timespec.tv_nsec is used correctly.


>> Matias Elo(matiaselo) wrote:
>> Agreed, the "divisor" can be added later if needed.


>>> Juha-Matti Tilli(jmtilli) wrote:
>>> Now that's a great idea! If somebody sees one microsecond as too big sleep 
>>> time, it is always possible to implement a "divisor": multiply the sleep 
>>> time by e.g. 4, and sleep e.g. 250 nanoseconds. No slow division operation 
>>> is needed for this "divisor", only multiplication. But I won't implement 
>>> this divisor now, as SLEEP_NSEC anyway always used to be 1000.


>>>> Matias Elo(matiaselo) wrote:
>>>> Ifdefs are not allowed.  IMO the cleanest fix would be to add a separate 
>>>> commit (before this one) to this patch set, which modifies 
>>>> odp_pktin_wait_time() to always return microseconds and converts 
>>>> SLEEP_NSEC to SLEEP_USEC (and does necessary changes to make everything 
>>>> work).
>>>> 
>>>> Or does someone oppose having a minimum sleep time of 1us between pktin 
>>>> receive calls in the old code path?


>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>> I would prefer return 0; as the trial_successful already tells not to use 
>>>>> this interrupt-driven way, but it probably doesn't matter much. I'll do 
>>>>> this (with return 0).


>>>>>> Matias Elo(matiaselo) wrote:
>>>>>> By checking both ops here you can return right away and minimize 
>>>>>> overhead for pktios not supporting interrupt driven mode.
>>>>>> E.g.
>>>>>> ```
>>>>>>          if (!entry[i]->s.ops->recv_mq_tmo && !entry[i]->s.ops->fd_set) {
>>>>>>                  *trial_successful = 0;
>>>>>>                  return -1;
>>>>>>          }
>>>>>> ```


>>>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>>>> I'm somewhat worried that somebody could adjust SLEEP_NSEC and then the 
>>>>>>> entire thing fails. What about this:
>>>>>>> 
>>>>>>> <pre>
>>>>>>> #if SLEEP_NSEC == 1000
>>>>>>>   uint64_t usecs = wait;
>>>>>>> #else
>>>>>>>   uint64_t usecs = (wait * SLEEP_NSEC + 999) / 1000;
>>>>>>> #endif
>>>>>>> </pre>
>>>>>>> 
>>>>>>> For the return value, you mean <code>odp_pktin_wait_time()</code> and 
>>>>>>> not <code>odp_pktin_recv_mq_tmo()</code>, right? Because I don't see 
>>>>>>> how the return value of the latter could depend on 
>>>>>>> <code>SLEEP_NSEC</code> definition.
>>>>>>> 
>>>>>>> Also, if we modify <code>odp_pktin_wait_time()</code> to always return 
>>>>>>> microseconds, then <code>odp_pktin_recv_mq_tmo()</code> suddenly starts 
>>>>>>> to work incorrectly with <code>SLEEP_NSEC != 1000</code> as there is a 
>>>>>>> <code>wait--</code> line in there that does this just before sleeping 
>>>>>>> one microsecond, exactly.


>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>> Same thing with the loop order than with netmap.


>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>> Couple comment lines describing the different stages of this function 
>>>>>>>>> wouldn't hurt when someone is reading this code later on. 


>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>> Same thing with the loop order than with netmap.


>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>> Since this loop is a performance optimization I would move it 
>>>>>>>>>>> before the loop which sets the read descriptors. No use to set the 
>>>>>>>>>>> descriptors if packets are already available.


>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>> The motivation for a separate `odp_pktin_wait_time()` function was 
>>>>>>>>>>>> to avoid having to do time conversions in the fast path functions. 
>>>>>>>>>>>> As `odp_pktin_wait_time()` already returns the time in 
>>>>>>>>>>>> microseconds no additional conversions are needed here. Same thing 
>>>>>>>>>>>> for `odp_pktin_recv_mq_tmo()`.
>>>>>>>>>>>> 
>>>>>>>>>>>> Currently, the value returned by `odp_pktin_recv_mq_tmo()` depends 
>>>>>>>>>>>> on `SLEEP_NSEC` definition. This should be modified so that the 
>>>>>>>>>>>> returned value is always in microseconds.


>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>> Every pktio type we currenly have uses microseconds for timeout, 
>>>>>>>>>>>>> so you could change `recv_tmo()` and `recv_mq_tmo()` to use 
>>>>>>>>>>>>> microseconds instead of nanoseconds. This removes the need for 
>>>>>>>>>>>>> per operation conversions.


>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>>> > Are you entirely sure about this? This is a function pointer, 
>>>>>>>>>>>>>> > and if two pktio devices have the same function pointer for 
>>>>>>>>>>>>>> > recv_mq_tmo(), the other function pointers including the 
>>>>>>>>>>>>>> > recv() function pointer should be identical as well. We do 
>>>>>>>>>>>>>> > pass the device (entry[i]) to the function pointed to by 
>>>>>>>>>>>>>> > recv_mq_tmo as well, so it should be able to update the 
>>>>>>>>>>>>>> > correct counters. The ops structure is const, after all...
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I stand corrected, didn't notice the pktio_entry_t array 
>>>>>>>>>>>>>> parameter. A bit confusing as `recv_mq_tmo()` behaves 
>>>>>>>>>>>>>> differently than pretty much every other pktio op.


>>>>>>>>>>>>>>> muvarov wrote
>>>>>>>>>>>>>>> just pass nsec to drivers or you will have double calculation 
>>>>>>>>>>>>>>> like in netmap code.


>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>>>>>>>>>>>>> Are you entirely sure about this? This is a function pointer, 
>>>>>>>>>>>>>>>> and if two pktio devices have the same function pointer for 
>>>>>>>>>>>>>>>> recv_mq_tmo(), the other function pointers including the 
>>>>>>>>>>>>>>>> recv() function pointer should be identical as well. We do 
>>>>>>>>>>>>>>>> pass the device (entry[i]) to the function pointed to by 
>>>>>>>>>>>>>>>> recv_mq_tmo as well, so it should be able to update the 
>>>>>>>>>>>>>>>> correct counters. The ops structure is const, after all...
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So, when receiving packets from a particular device using 
>>>>>>>>>>>>>>>> netmap, we do call netmap_recv() and pass to it the 
>>>>>>>>>>>>>>>> pktio_entry[i] and also index[i] for the queue index. This is 
>>>>>>>>>>>>>>>> the same that the odp_pktin_recv_mq_tmo() function already 
>>>>>>>>>>>>>>>> did. It calls odp_pktin_recv() in a loop, and the latter 
>>>>>>>>>>>>>>>> function calls entry->s.ops->recv(entry, queue.index, packets, 
>>>>>>>>>>>>>>>> num); which to me seems to be exactly the same thing -- you 
>>>>>>>>>>>>>>>> pass exactly the same entry and the same queue.index. Same 
>>>>>>>>>>>>>>>> function, same arguments, I can't see how the behavior would 
>>>>>>>>>>>>>>>> be different...
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Now, if somebody opposes the .recv_mq_tmo function pointer due 
>>>>>>>>>>>>>>>> to added code line count, I might perhaps agree. But it seems 
>>>>>>>>>>>>>>>> to be the only way the DPDK support for genuine sleep could be 
>>>>>>>>>>>>>>>> added, so I see some value in it.


>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>>>>>>>>>>>>>> Several options: (1) I could implement a binary search 
>>>>>>>>>>>>>>>>> algorithm that incrementally checks various usec values and 
>>>>>>>>>>>>>>>>> uses odp_pktin_wait_time() to check which one of these trial 
>>>>>>>>>>>>>>>>> attempts result in the same sleep wait value, but that would 
>>>>>>>>>>>>>>>>> be slow; (2) I could implement odp_pktin_wait_time_reverse() 
>>>>>>>>>>>>>>>>> that does the conversion in the other direction, but that 
>>>>>>>>>>>>>>>>> would be an API change; (3) I could just do the conversion 
>>>>>>>>>>>>>>>>> and pass usec values instead of sleep wait values. I 
>>>>>>>>>>>>>>>>> carefully considered each and selected (3), but do you think 
>>>>>>>>>>>>>>>>> (2) or (1) would be better?


>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>>>>>>>>>>>>>>> Will fix.


>>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote:
>>>>>>>>>>>>>>>>>>> I disagree. SLEEP_NSEC is a private #define in 
>>>>>>>>>>>>>>>>>>> odp_packet_io.c so what the wait parameter actually means 
>>>>>>>>>>>>>>>>>>> cannot be interpreted in the drivers. (Believe me, I tried 
>>>>>>>>>>>>>>>>>>> it!) Thus, the conversion from the sleep wait parameter to 
>>>>>>>>>>>>>>>>>>> microseconds is needed.


>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>>>>>>>>> As I noted in the previous comment, I think this check 
>>>>>>>>>>>>>>>>>>>> isn't enough. The input queues would have to be from the 
>>>>>>>>>>>>>>>>>>>> same pktio device for this to work.


>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>>>>>>>>>> Internal functions should not use odp_ prefix.


>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>>>>>>>>>>> You should pass the `wait` parameter to 
>>>>>>>>>>>>>>>>>>>>>> `entry->s.ops->recv_tmo()` and not do any conversions 
>>>>>>>>>>>>>>>>>>>>>> here.


>>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>>>>>>>>>>>> ODP_TIME_SEC_IN_NS would work here.


https://github.com/Linaro/odp/pull/341#discussion_r160116983
updated_at 2018-01-08 11:02:12

Re: [lng-odp] [PATCH v19] linux-generic: pktio: genuinely sleep in timeout receive

Reply via email to