Matias Elo(matiaselo) replied on github web page: platform/linux-generic/odp_packet_io.c line 59 @@ -1698,21 +1699,22 @@ int odp_pktin_recv_tmo(odp_pktin_queue_t queue, odp_packet_t packets[], int num, if (odp_unlikely(!started)) { odp_time_t t; - t = odp_time_local_from_ns(wait * SLEEP_NSEC); + t = odp_time_local_from_ns(wait * 1000); started = 1; t1 = odp_time_sum(odp_time_local(), t); } /* Check every SLEEP_CHECK rounds if total wait time * has been exceeded. */ - if ((wait & (SLEEP_CHECK - 1)) == 0) { - t2 = odp_time_local(); + for (i = 0; i < SLEEP_USEC; i++) { + if ((wait & (SLEEP_CHECK - 1)) == 0) { + t2 = odp_time_local();
Comment: Seems unnecessarily complex. Something like the following could do the same: ``` uint64_t sleep_round = 0; ... /* Check every SLEEP_CHECK rounds if total wait time * has been exceeded. */ if ((sleep_round++ & (SLEEP_CHECK - 1)) == 0) { t2 = odp_time_local(); if (odp_time_cmp(t2, t1) > 0) return 0; } wait = wait > SLEEP_USEC ? wait - SLEEP_USEC : 0; ``` > Matias Elo(matiaselo) wrote: > Comment still says nanoseconds. Should also state that the maximum value is 1 > 000 000, so that timespec.tv_nsec is used correctly. >> Matias Elo(matiaselo) wrote: >> Agreed, the "divisor" can be added later if needed. >>> Juha-Matti Tilli(jmtilli) wrote: >>> Now that's a great idea! If somebody sees one microsecond as too big sleep >>> time, it is always possible to implement a "divisor": multiply the sleep >>> time by e.g. 4, and sleep e.g. 250 nanoseconds. No slow division operation >>> is needed for this "divisor", only multiplication. But I won't implement >>> this divisor now, as SLEEP_NSEC anyway always used to be 1000. >>>> Matias Elo(matiaselo) wrote: >>>> Ifdefs are not allowed. IMO the cleanest fix would be to add a separate >>>> commit (before this one) to this patch set, which modifies >>>> odp_pktin_wait_time() to always return microseconds and converts >>>> SLEEP_NSEC to SLEEP_USEC (and does necessary changes to make everything >>>> work). >>>> >>>> Or does someone oppose having a minimum sleep time of 1us between pktin >>>> receive calls in the old code path? >>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>> I would prefer return 0; as the trial_successful already tells not to use >>>>> this interrupt-driven way, but it probably doesn't matter much. I'll do >>>>> this (with return 0). >>>>>> Matias Elo(matiaselo) wrote: >>>>>> By checking both ops here you can return right away and minimize >>>>>> overhead for pktios not supporting interrupt driven mode. >>>>>> E.g. >>>>>> ``` >>>>>> if (!entry[i]->s.ops->recv_mq_tmo && !entry[i]->s.ops->fd_set) { >>>>>> *trial_successful = 0; >>>>>> return -1; >>>>>> } >>>>>> ``` >>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>> I'm somewhat worried that somebody could adjust SLEEP_NSEC and then the >>>>>>> entire thing fails. What about this: >>>>>>> >>>>>>> <pre> >>>>>>> #if SLEEP_NSEC == 1000 >>>>>>> uint64_t usecs = wait; >>>>>>> #else >>>>>>> uint64_t usecs = (wait * SLEEP_NSEC + 999) / 1000; >>>>>>> #endif >>>>>>> </pre> >>>>>>> >>>>>>> For the return value, you mean <code>odp_pktin_wait_time()</code> and >>>>>>> not <code>odp_pktin_recv_mq_tmo()</code>, right? Because I don't see >>>>>>> how the return value of the latter could depend on >>>>>>> <code>SLEEP_NSEC</code> definition. >>>>>>> >>>>>>> Also, if we modify <code>odp_pktin_wait_time()</code> to always return >>>>>>> microseconds, then <code>odp_pktin_recv_mq_tmo()</code> suddenly starts >>>>>>> to work incorrectly with <code>SLEEP_NSEC != 1000</code> as there is a >>>>>>> <code>wait--</code> line in there that does this just before sleeping >>>>>>> one microsecond, exactly. >>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>> Same thing with the loop order than with netmap. >>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>> Couple comment lines describing the different stages of this function >>>>>>>>> wouldn't hurt when someone is reading this code later on. >>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>> Same thing with the loop order than with netmap. >>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>> Since this loop is a performance optimization I would move it >>>>>>>>>>> before the loop which sets the read descriptors. No use to set the >>>>>>>>>>> descriptors if packets are already available. >>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>> The motivation for a separate `odp_pktin_wait_time()` function was >>>>>>>>>>>> to avoid having to do time conversions in the fast path functions. >>>>>>>>>>>> As `odp_pktin_wait_time()` already returns the time in >>>>>>>>>>>> microseconds no additional conversions are needed here. Same thing >>>>>>>>>>>> for `odp_pktin_recv_mq_tmo()`. >>>>>>>>>>>> >>>>>>>>>>>> Currently, the value returned by `odp_pktin_recv_mq_tmo()` depends >>>>>>>>>>>> on `SLEEP_NSEC` definition. This should be modified so that the >>>>>>>>>>>> returned value is always in microseconds. >>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>> Every pktio type we currenly have uses microseconds for timeout, >>>>>>>>>>>>> so you could change `recv_tmo()` and `recv_mq_tmo()` to use >>>>>>>>>>>>> microseconds instead of nanoseconds. This removes the need for >>>>>>>>>>>>> per operation conversions. >>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>> > Are you entirely sure about this? This is a function pointer, >>>>>>>>>>>>>> > and if two pktio devices have the same function pointer for >>>>>>>>>>>>>> > recv_mq_tmo(), the other function pointers including the >>>>>>>>>>>>>> > recv() function pointer should be identical as well. We do >>>>>>>>>>>>>> > pass the device (entry[i]) to the function pointed to by >>>>>>>>>>>>>> > recv_mq_tmo as well, so it should be able to update the >>>>>>>>>>>>>> > correct counters. The ops structure is const, after all... >>>>>>>>>>>>>> >>>>>>>>>>>>>> I stand corrected, didn't notice the pktio_entry_t array >>>>>>>>>>>>>> parameter. A bit confusing as `recv_mq_tmo()` behaves >>>>>>>>>>>>>> differently than pretty much every other pktio op. >>>>>>>>>>>>>>> muvarov wrote >>>>>>>>>>>>>>> just pass nsec to drivers or you will have double calculation >>>>>>>>>>>>>>> like in netmap code. >>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>> Are you entirely sure about this? This is a function pointer, >>>>>>>>>>>>>>>> and if two pktio devices have the same function pointer for >>>>>>>>>>>>>>>> recv_mq_tmo(), the other function pointers including the >>>>>>>>>>>>>>>> recv() function pointer should be identical as well. We do >>>>>>>>>>>>>>>> pass the device (entry[i]) to the function pointed to by >>>>>>>>>>>>>>>> recv_mq_tmo as well, so it should be able to update the >>>>>>>>>>>>>>>> correct counters. The ops structure is const, after all... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So, when receiving packets from a particular device using >>>>>>>>>>>>>>>> netmap, we do call netmap_recv() and pass to it the >>>>>>>>>>>>>>>> pktio_entry[i] and also index[i] for the queue index. This is >>>>>>>>>>>>>>>> the same that the odp_pktin_recv_mq_tmo() function already >>>>>>>>>>>>>>>> did. It calls odp_pktin_recv() in a loop, and the latter >>>>>>>>>>>>>>>> function calls entry->s.ops->recv(entry, queue.index, packets, >>>>>>>>>>>>>>>> num); which to me seems to be exactly the same thing -- you >>>>>>>>>>>>>>>> pass exactly the same entry and the same queue.index. Same >>>>>>>>>>>>>>>> function, same arguments, I can't see how the behavior would >>>>>>>>>>>>>>>> be different... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Now, if somebody opposes the .recv_mq_tmo function pointer due >>>>>>>>>>>>>>>> to added code line count, I might perhaps agree. But it seems >>>>>>>>>>>>>>>> to be the only way the DPDK support for genuine sleep could be >>>>>>>>>>>>>>>> added, so I see some value in it. >>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>> Several options: (1) I could implement a binary search >>>>>>>>>>>>>>>>> algorithm that incrementally checks various usec values and >>>>>>>>>>>>>>>>> uses odp_pktin_wait_time() to check which one of these trial >>>>>>>>>>>>>>>>> attempts result in the same sleep wait value, but that would >>>>>>>>>>>>>>>>> be slow; (2) I could implement odp_pktin_wait_time_reverse() >>>>>>>>>>>>>>>>> that does the conversion in the other direction, but that >>>>>>>>>>>>>>>>> would be an API change; (3) I could just do the conversion >>>>>>>>>>>>>>>>> and pass usec values instead of sleep wait values. I >>>>>>>>>>>>>>>>> carefully considered each and selected (3), but do you think >>>>>>>>>>>>>>>>> (2) or (1) would be better? >>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>> Will fix. >>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>>> I disagree. SLEEP_NSEC is a private #define in >>>>>>>>>>>>>>>>>>> odp_packet_io.c so what the wait parameter actually means >>>>>>>>>>>>>>>>>>> cannot be interpreted in the drivers. (Believe me, I tried >>>>>>>>>>>>>>>>>>> it!) Thus, the conversion from the sleep wait parameter to >>>>>>>>>>>>>>>>>>> microseconds is needed. >>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>> As I noted in the previous comment, I think this check >>>>>>>>>>>>>>>>>>>> isn't enough. The input queues would have to be from the >>>>>>>>>>>>>>>>>>>> same pktio device for this to work. >>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>> Internal functions should not use odp_ prefix. >>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>> You should pass the `wait` parameter to >>>>>>>>>>>>>>>>>>>>>> `entry->s.ops->recv_tmo()` and not do any conversions >>>>>>>>>>>>>>>>>>>>>> here. >>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>>> ODP_TIME_SEC_IN_NS would work here. https://github.com/Linaro/odp/pull/341#discussion_r160116983 updated_at 2018-01-08 11:02:12