Matias Elo(matiaselo) replied on github web page: platform/linux-generic/pktio/netmap.c line 72 @@ -786,6 +824,100 @@ static int netmap_recv(pktio_entry_t *pktio_entry, int index, return num_rx; } +static int netmap_recv_tmo(pktio_entry_t *pktio_entry, int index, + odp_packet_t pkt_table[], int num, uint64_t usecs) +{ + struct timeval timeout; + int ret; + int maxfd; + fd_set readfds; + + ret = netmap_recv(pktio_entry, index, pkt_table, num); + if (ret != 0) + return ret; + + /* 32 bit architectures use 32-bit division if possible */ + if (sizeof(size_t) < 8) { + if (usecs <= UINT32_MAX) { + timeout.tv_sec = ((uint32_t)usecs) / (1000 * 1000); + timeout.tv_usec = ((uint32_t)usecs) % (1000 * 1000); + } else { + timeout.tv_sec = usecs / (1000 * 1000); + timeout.tv_usec = usecs % (1000 * 1000); + } + } else { + timeout.tv_sec = usecs / (1000 * 1000); + timeout.tv_usec = usecs % (1000 * 1000);
Comment: The second division can be avoided. ``` timeout.tv_usec = usecs - (timeout.tv_sec * 1000000); ``` > Matias Elo(matiaselo) wrote: > To me the value of this optimization seems questionable (everywhere where it > is used). System calls are used here anyway, so one division operation (next > comment) shouldn't have that much effect. > >> Matias Elo(matiaselo) wrote: >> Not needed. >>> Matias Elo(matiaselo) wrote: >>> Not needed. >>>> Matias Elo(matiaselo) wrote: >>>> Same comments than to `odp_pktin_recv_tmo()`. >>>>> Matias Elo(matiaselo) wrote: >>>>> Can roll over if SLEEP_USEC > 1. >>>>>> Matias Elo(matiaselo) wrote: >>>>>> Seems unnecessarily complex. Something like the following could do the >>>>>> same: >>>>>> ``` >>>>>> uint64_t sleep_round = 0; >>>>>> ... >>>>>> /* Check every SLEEP_CHECK rounds if total wait time >>>>>> * has been exceeded. */ >>>>>> if ((sleep_round++ & (SLEEP_CHECK - 1)) == 0) { >>>>>> t2 = odp_time_local(); >>>>>> >>>>>> if (odp_time_cmp(t2, t1) > 0) >>>>>> return 0; >>>>>> } >>>>>> wait = wait > SLEEP_USEC ? wait - SLEEP_USEC : 0; >>>>>> ``` >>>>>> >>>>>>> Matias Elo(matiaselo) wrote: >>>>>>> Comment still says nanoseconds. Should also state that the maximum >>>>>>> value is 1 000 000, so that timespec.tv_nsec is used correctly. >>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>> Agreed, the "divisor" can be added later if needed. >>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>> Now that's a great idea! If somebody sees one microsecond as too big >>>>>>>>> sleep time, it is always possible to implement a "divisor": multiply >>>>>>>>> the sleep time by e.g. 4, and sleep e.g. 250 nanoseconds. No slow >>>>>>>>> division operation is needed for this "divisor", only multiplication. >>>>>>>>> But I won't implement this divisor now, as SLEEP_NSEC anyway always >>>>>>>>> used to be 1000. >>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>> Ifdefs are not allowed. IMO the cleanest fix would be to add a >>>>>>>>>> separate commit (before this one) to this patch set, which modifies >>>>>>>>>> odp_pktin_wait_time() to always return microseconds and converts >>>>>>>>>> SLEEP_NSEC to SLEEP_USEC (and does necessary changes to make >>>>>>>>>> everything work). >>>>>>>>>> >>>>>>>>>> Or does someone oppose having a minimum sleep time of 1us between >>>>>>>>>> pktin receive calls in the old code path? >>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>> I would prefer return 0; as the trial_successful already tells not >>>>>>>>>>> to use this interrupt-driven way, but it probably doesn't matter >>>>>>>>>>> much. I'll do this (with return 0). >>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>> By checking both ops here you can return right away and minimize >>>>>>>>>>>> overhead for pktios not supporting interrupt driven mode. >>>>>>>>>>>> E.g. >>>>>>>>>>>> ``` >>>>>>>>>>>> if (!entry[i]->s.ops->recv_mq_tmo && >>>>>>>>>>>> !entry[i]->s.ops->fd_set) { >>>>>>>>>>>> *trial_successful = 0; >>>>>>>>>>>> return -1; >>>>>>>>>>>> } >>>>>>>>>>>> ``` >>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>> I'm somewhat worried that somebody could adjust SLEEP_NSEC and >>>>>>>>>>>>> then the entire thing fails. What about this: >>>>>>>>>>>>> >>>>>>>>>>>>> <pre> >>>>>>>>>>>>> #if SLEEP_NSEC == 1000 >>>>>>>>>>>>> uint64_t usecs = wait; >>>>>>>>>>>>> #else >>>>>>>>>>>>> uint64_t usecs = (wait * SLEEP_NSEC + 999) / 1000; >>>>>>>>>>>>> #endif >>>>>>>>>>>>> </pre> >>>>>>>>>>>>> >>>>>>>>>>>>> For the return value, you mean <code>odp_pktin_wait_time()</code> >>>>>>>>>>>>> and not <code>odp_pktin_recv_mq_tmo()</code>, right? Because I >>>>>>>>>>>>> don't see how the return value of the latter could depend on >>>>>>>>>>>>> <code>SLEEP_NSEC</code> definition. >>>>>>>>>>>>> >>>>>>>>>>>>> Also, if we modify <code>odp_pktin_wait_time()</code> to always >>>>>>>>>>>>> return microseconds, then <code>odp_pktin_recv_mq_tmo()</code> >>>>>>>>>>>>> suddenly starts to work incorrectly with <code>SLEEP_NSEC != >>>>>>>>>>>>> 1000</code> as there is a <code>wait--</code> line in there that >>>>>>>>>>>>> does this just before sleeping one microsecond, exactly. >>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>> Same thing with the loop order than with netmap. >>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>> Couple comment lines describing the different stages of this >>>>>>>>>>>>>>> function wouldn't hurt when someone is reading this code later >>>>>>>>>>>>>>> on. >>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>> Same thing with the loop order than with netmap. >>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>> Since this loop is a performance optimization I would move it >>>>>>>>>>>>>>>>> before the loop which sets the read descriptors. No use to >>>>>>>>>>>>>>>>> set the descriptors if packets are already available. >>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>> The motivation for a separate `odp_pktin_wait_time()` >>>>>>>>>>>>>>>>>> function was to avoid having to do time conversions in the >>>>>>>>>>>>>>>>>> fast path functions. As `odp_pktin_wait_time()` already >>>>>>>>>>>>>>>>>> returns the time in microseconds no additional conversions >>>>>>>>>>>>>>>>>> are needed here. Same thing for `odp_pktin_recv_mq_tmo()`. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Currently, the value returned by `odp_pktin_recv_mq_tmo()` >>>>>>>>>>>>>>>>>> depends on `SLEEP_NSEC` definition. This should be modified >>>>>>>>>>>>>>>>>> so that the returned value is always in microseconds. >>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>> Every pktio type we currenly have uses microseconds for >>>>>>>>>>>>>>>>>>> timeout, so you could change `recv_tmo()` and >>>>>>>>>>>>>>>>>>> `recv_mq_tmo()` to use microseconds instead of nanoseconds. >>>>>>>>>>>>>>>>>>> This removes the need for per operation conversions. >>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>> > Are you entirely sure about this? This is a function >>>>>>>>>>>>>>>>>>>> > pointer, and if two pktio devices have the same function >>>>>>>>>>>>>>>>>>>> > pointer for recv_mq_tmo(), the other function pointers >>>>>>>>>>>>>>>>>>>> > including the recv() function pointer should be >>>>>>>>>>>>>>>>>>>> > identical as well. We do pass the device (entry[i]) to >>>>>>>>>>>>>>>>>>>> > the function pointed to by recv_mq_tmo as well, so it >>>>>>>>>>>>>>>>>>>> > should be able to update the correct counters. The ops >>>>>>>>>>>>>>>>>>>> > structure is const, after all... >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I stand corrected, didn't notice the pktio_entry_t array >>>>>>>>>>>>>>>>>>>> parameter. A bit confusing as `recv_mq_tmo()` behaves >>>>>>>>>>>>>>>>>>>> differently than pretty much every other pktio op. >>>>>>>>>>>>>>>>>>>>> muvarov wrote >>>>>>>>>>>>>>>>>>>>> just pass nsec to drivers or you will have double >>>>>>>>>>>>>>>>>>>>> calculation like in netmap code. >>>>>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>>>>>> Are you entirely sure about this? This is a function >>>>>>>>>>>>>>>>>>>>>> pointer, and if two pktio devices have the same function >>>>>>>>>>>>>>>>>>>>>> pointer for recv_mq_tmo(), the other function pointers >>>>>>>>>>>>>>>>>>>>>> including the recv() function pointer should be >>>>>>>>>>>>>>>>>>>>>> identical as well. We do pass the device (entry[i]) to >>>>>>>>>>>>>>>>>>>>>> the function pointed to by recv_mq_tmo as well, so it >>>>>>>>>>>>>>>>>>>>>> should be able to update the correct counters. The ops >>>>>>>>>>>>>>>>>>>>>> structure is const, after all... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> So, when receiving packets from a particular device >>>>>>>>>>>>>>>>>>>>>> using netmap, we do call netmap_recv() and pass to it >>>>>>>>>>>>>>>>>>>>>> the pktio_entry[i] and also index[i] for the queue >>>>>>>>>>>>>>>>>>>>>> index. This is the same that the odp_pktin_recv_mq_tmo() >>>>>>>>>>>>>>>>>>>>>> function already did. It calls odp_pktin_recv() in a >>>>>>>>>>>>>>>>>>>>>> loop, and the latter function calls >>>>>>>>>>>>>>>>>>>>>> entry->s.ops->recv(entry, queue.index, packets, num); >>>>>>>>>>>>>>>>>>>>>> which to me seems to be exactly the same thing -- you >>>>>>>>>>>>>>>>>>>>>> pass exactly the same entry and the same queue.index. >>>>>>>>>>>>>>>>>>>>>> Same function, same arguments, I can't see how the >>>>>>>>>>>>>>>>>>>>>> behavior would be different... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Now, if somebody opposes the .recv_mq_tmo function >>>>>>>>>>>>>>>>>>>>>> pointer due to added code line count, I might perhaps >>>>>>>>>>>>>>>>>>>>>> agree. But it seems to be the only way the DPDK support >>>>>>>>>>>>>>>>>>>>>> for genuine sleep could be added, so I see some value in >>>>>>>>>>>>>>>>>>>>>> it. >>>>>>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>>>>>>> Several options: (1) I could implement a binary search >>>>>>>>>>>>>>>>>>>>>>> algorithm that incrementally checks various usec values >>>>>>>>>>>>>>>>>>>>>>> and uses odp_pktin_wait_time() to check which one of >>>>>>>>>>>>>>>>>>>>>>> these trial attempts result in the same sleep wait >>>>>>>>>>>>>>>>>>>>>>> value, but that would be slow; (2) I could implement >>>>>>>>>>>>>>>>>>>>>>> odp_pktin_wait_time_reverse() that does the conversion >>>>>>>>>>>>>>>>>>>>>>> in the other direction, but that would be an API >>>>>>>>>>>>>>>>>>>>>>> change; (3) I could just do the conversion and pass >>>>>>>>>>>>>>>>>>>>>>> usec values instead of sleep wait values. I carefully >>>>>>>>>>>>>>>>>>>>>>> considered each and selected (3), but do you think (2) >>>>>>>>>>>>>>>>>>>>>>> or (1) would be better? >>>>>>>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>>>>>>>> Will fix. >>>>>>>>>>>>>>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>>>>>>>>>>>>>> I disagree. SLEEP_NSEC is a private #define in >>>>>>>>>>>>>>>>>>>>>>>>> odp_packet_io.c so what the wait parameter actually >>>>>>>>>>>>>>>>>>>>>>>>> means cannot be interpreted in the drivers. (Believe >>>>>>>>>>>>>>>>>>>>>>>>> me, I tried it!) Thus, the conversion from the sleep >>>>>>>>>>>>>>>>>>>>>>>>> wait parameter to microseconds is needed. >>>>>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> As I noted in the previous comment, I think this >>>>>>>>>>>>>>>>>>>>>>>>>> check isn't enough. The input queues would have to >>>>>>>>>>>>>>>>>>>>>>>>>> be from the same pktio device for this to work. >>>>>>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> Internal functions should not use odp_ prefix. >>>>>>>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> You should pass the `wait` parameter to >>>>>>>>>>>>>>>>>>>>>>>>>>>> `entry->s.ops->recv_tmo()` and not do any >>>>>>>>>>>>>>>>>>>>>>>>>>>> conversions here. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ODP_TIME_SEC_IN_NS would work here. https://github.com/Linaro/odp/pull/341#discussion_r160123292 updated_at 2018-01-08 11:37:07