Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/odp_packet_io.c line 5

@@ -1669,6 +1669,9 @@ int odp_pktin_recv(odp_pktin_queue_t queue, odp_packet_t packets[], int num)
 		return -1;
 	}
 
+	if (entry->s.state != PKTIO_STATE_STARTED)
+		return -1;
Comment: This line may no longer be polled, as the actual waiting may happen inside the driver (search for select()).

> muvarov wrote:
> If the state is changed in the middle of the driver's recv(), this place is not quite correct. The entry is not locked, and another thread can change the state while the driver is still "reading" packets. Because recv() is very short, I think this can be OK here. odp_unlikely() might make some sense here.

>> Balasubramanian Manoharan(bala-manoharan) wrote:
>> I would appreciate it if we could deprecate the infinite wait option (ODP_PKTIN_WAIT). It is most efficient to remove this infinite wait feature rather than implement it using any of the other mechanisms explained in this thread.

>>> Stanislaw Kardach(semihalf-kardach-stanislaw) wrote:
>>> After having to implement such behavior on a hardware platform I was working on, I have to side with @psavol. Such synchronization requires `odp_pktio_stop()` or `odp_pktout_send()` or `odp_queue_enq()` to update the pktio state in the appropriate queue; otherwise each queue would have to dereference a pointer to the pktio to read its state on every send. This incurs a penalty if the pktio structure is otherwise not needed in the fast path (i.e. no calls to `odp_packet_input()`). It's one thing to check the state once at the start of the send function, completely different if we need to check it every time (and it has to be a `volatile` check).
>>> Also, I'm not entirely convinced by the argument that it is easy to implement *at least in linux-generic*. My understanding was that ODP is an API for dataplane programming, which means low-level access to the hardware with a focus on offloading functionality to the hardware. The reasoning that a feature is easy to implement in a completely abstract software environment therefore works against hardware offloading. Wouldn't it be better to always use the argument: *it is not hard / easy to implement in hardware, i.e. on vendor X, HW Y it can be implemented like {this}*?

>>>> Petri Savolainen(psavol) wrote:
>>>> It's not about the application worrying about the implementation. It's about these facts:
>>>> * implementations have more work (more overhead) to monitor three inputs (packet, timeout and stop call) vs two inputs (packet and timeout).
>>>> * any implementation (including Caterpillar) should avoid internal usage of POSIX signals, as the application may have its own signal handler.
>>>> * the application can easily monitor/synchronize its own threads during interface shutdown by using the pktin timeout and other ODP APIs.
>>>>
>>>> Stop-call-to-recv synchronization would be needed rarely, but it would cause (potentially) major synchronization effort for the implementation. In general, strict synchronization requirements between different API calls should be avoided, as those increase implementation complexity and decrease parallelism/performance.
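For illustration, muvarov's `odp_unlikely()` suggestion would amount to roughly the following sketch. `entry->s.state` and `PKTIO_STATE_STARTED` are linux-generic internals taken from the hunk above; the helper name is made up for the example, and this is a sketch rather than the merged code.

```c
/* Sketch only: the check from the diff above, with the
 * branch-prediction hint muvarov suggests. The state is read
 * without taking the entry lock, so a concurrent odp_pktio_stop()
 * can still race with a recv() already inside the driver. */
static inline int pktin_state_check(pktio_entry_t *entry)
{
	if (odp_unlikely(entry->s.state != PKTIO_STATE_STARTED))
		return -1;

	return 0;
}
```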
>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>> The reason I changed the return from `odp_pktin_recv_tmo(..., ODP_PKTIN_WAIT)` to <0 when `odp_pktio_stop()` is called is to avoid the issue @psavol raises. You'd need to check for <0 returns anyway, since these calls may fail for various reasons, so there's no additional work involved. The implementation of this behavior (in linux-generic, at least) is also pretty trivial.
>>>>>
>>>>> How an implementation implements `ODP_PKTIN_WAIT` is up to it. The fact that linux-generic currently does a poll under the covers is not important. I'd expect Caterpillar to do an actual wait of some sort and have a low-overhead signaling mechanism to get packets when they arrive, or at least some sort of adaptive wait. Again, this is something applications should not need to worry about. If the application really wants to control the timeout, it's free to do so with the existing API, so nothing changes here; but for those that don't care, this is simpler from an application standpoint and equally easy to implement.
>>>>>
>>>>> I should point out we have the same ambiguity today when `odp_pktio_send()` is called on a stopped pktio. The API doesn't specify what happens here, but in linux-generic we'll try to send the packet, which will either be transmitted (thus violating the intent of the "stop" call) or else be abandoned on a transmit queue, which makes for difficult graceful termination. Here again, returning <0 if the pktio is in any state except START makes sense, since that's the only state in which we want to be accepting new packets for transmission.

>>>>>> Petri Savolainen(psavol) wrote:
>>>>>> The biggest CON of PKTIN_WAIT is that the implementation needs to build a mechanism (polling or signal) to catch a stop call which happens almost never (e.g. once a week, or about once every 1000 billion calls). If the implementation e.g. polls a stop flag, what is an acceptable polling period (e.g. 1/10 msec, ...)? The more often the implementation polls, the more overhead is created and the less power is saved. The application, on the other hand, can easily set up the correct poll period (e.g. 1/sec) as the timeout. The implementation can easily poll at too high a rate (e.g. 100x more often than necessary).
>>>>>>
>>>>>> An ODP implementation should not use signals internally, but leave those to application use. The application's signal handler should see all signals the application has created and nothing else (ODP should not create signals).
>>>>>>
>>>>>> Let's just deprecate the PKTIN_WAIT option. The application can easily use a loop with a long timeout instead (a sketch follows below). E.g. a wake-up once every few minutes would not affect performance or power saving. During high load the implementation would not set up a timeout (or would cancel it), as there would be packets to return instead. A timeout is set only when there are no packets == no high load.
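A minimal sketch of the loop Petri describes, using only the public pktin API. `MAX_BURST`, the `app_stop` flag and the queue setup are assumptions for the example, not part of the proposal:

```c
#include <odp_api.h>

#define MAX_BURST 32

/* Set to 1 by a control thread when the application wants to stop. */
extern odp_atomic_u32_t app_stop;

static void rx_loop(odp_pktin_queue_t queue)
{
	odp_packet_t pkts[MAX_BURST];
	/* A long finite timeout instead of ODP_PKTIN_WAIT: wake up at
	 * most once per second when the interface is idle. */
	uint64_t wait = odp_pktin_wait_time(ODP_TIME_SEC_IN_NS);

	while (odp_atomic_load_u32(&app_stop) == 0) {
		int num = odp_pktin_recv_tmo(queue, pkts, MAX_BURST, wait);

		if (num < 0)
			break;	/* interface stopped or input error */

		/* ... process num received packets ... */
	}
}
```

Under load the timeout never expires, so its cost is paid only when there is nothing to receive, which matches Petri's point that a timeout is set only when there are no packets.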
>>>>>>> bogdanPricope wrote:
>>>>>>> All three possibilities for setting the "wait" parameter are useful in different application designs: ODP_PKTIN_NO_WAIT, ODP_PKTIN_WAIT, finite wait... mainly because each option has its pros/cons, and depending on your application profile and expected traffic you need to select one of them for best performance.
>>>>>>>
>>>>>>> ODP_PKTIN_NO_WAIT
>>>>>>> Pros: best performance under high core load with equally spread traffic.
>>>>>>> Cons: wildly looping when there is no traffic: considering direct RX mode with RSS, we can imagine a scenario where all traffic goes to a single core while the rest of the cores are only looping.
>>>>>>>
>>>>>>> ODP_PKTIN_WAIT
>>>>>>> Pros: no wild loop.
>>>>>>> Cons: at some point you need to restart your applications and you need to shut down resources gracefully (else, in some cases, initialization will fail and you will need to reboot the host - bigger downtime). You have no way to gracefully interrupt this function.
>>>>>>>
>>>>>>> finite wait:
>>>>>>> Pros: no wild loop, no graceful shutdown issue.
>>>>>>> Cons: you need to arm/disable a timer (or similar) for each loop. This is especially painful under high load.
>>>>>>>
>>>>>>> Having another way to stop an infinite/finite wait, on request (odp_pktio_stop()) or on error (someone tripped over the cable in the lab, NIC failure, etc.), is very useful.
>>>>>>>
>>>>>>> The select() example is not a very good "con", as select triggers on socket end-of-file (and errors); it also triggers on signals (I wonder why).
>>>>>>>
>>>>>>> See man select:
>>>>>>> "in particular, a file descriptor is also ready on end-of-file"
>>>>>>> "EINTR A signal was caught; see signal(7)."
>>>>>>>
>>>>>>> The main question is whether NXP and Cavium can implement this functionality.
>>>>>>>
>>>>>>> Otherwise, I am OK with this.
>>>>>>>
>>>>>>> Reviewed-by: Bogdan Pricope <bogdan.pric...@linaro.org>

>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> @bala-manoharan @bogdanPricope Can you comment on how `ODP_PKTIN_WAIT` is currently being used, to your knowledge?

>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>> The application can always compensate for an implementation's limitations, but why should it have to? The ODP philosophy is for applications to state their functional needs and let the implementation provide those functional needs in a platform-optimal manner. Having an indefinite wait is simpler from an application standpoint and avoids needless overhead. If I have dozens of threads that specify `ODP_PKTIN_WAIT`, the implementation is free to consolidate any internal timer management to amortize costs across all of those threads. If each is managing its own timers, that's not possible.
>>>>>>>>>
>>>>>>>>> I have no problem with having an explicit timeout as a wait option, but I'd argue that if we had to deprecate one it would be the variable timeouts, since a well-structured application shouldn't have threads that are trying to do six different things that they're constantly flipping between. The provision of timeouts in general is to accommodate older apps moving to ODP that aren't as well structured for a world where threads are cheap and can be used "wastefully".
>>>>>>>>>
>>>>>>>>> In any event, the `ODP_PKTIN_WAIT` feature has been around for some time and deprecating it would conceivably impact existing applications in significant ways, so I'd be reluctant to make such changes without careful consideration. But there is an ambiguity surrounding the intersection of this feature with `odp_pktio_stop()` behavior that this PR looks to clarify.

>>>>>>>>>> Petri Savolainen(psavol) wrote:
>>>>>>>>>> The problem is caused by the infinite wait option. I'd just deprecate the option. The implementation gets problematic when this call needs to monitor three different things: packet arrival, timeout and the user calling stop. E.g. the socket-based implementation of this uses select(), which monitors packets and timeout, but not user input. If this change were made, select could not sleep long, but would have to keep polling for a potential user call to stop (which normally would almost never happen); a sketch of this limitation follows.
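To make the select() limitation concrete, here is a sketch of the wakeup-and-recheck loop an implementation would be forced into. The 10 ms poll period and the `stopped` flag are made-up example values for illustration, not linux-generic code:

```c
#include <sys/select.h>

/* Wait for input on fd while periodically re-checking a stop flag,
 * because select() cannot wake up on odp_pktio_stop() directly. */
static int wait_for_input(int fd, volatile const int *stopped)
{
	while (!*stopped) {
		fd_set readfds;
		struct timeval tv = { .tv_sec = 0, .tv_usec = 10000 }; /* 10 ms */

		FD_ZERO(&readfds);
		FD_SET(fd, &readfds);

		int ret = select(fd + 1, &readfds, NULL, NULL, &tv);

		if (ret > 0)
			return 1;	/* packet data is ready */
		if (ret < 0)
			return -1;	/* error, e.g. EINTR */
		/* ret == 0: timeout expired; loop and poll the flag again.
		 * A shorter timeout reacts faster to stop, but burns more
		 * cycles and power while idle. */
	}

	return -1;	/* stopped while waiting */
}
```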
>>>>>>>>>> So, it's better to leave stop synchronization to application control. The application sets the timeout such that it can react fast enough to potential interface shutdown calls, but long enough to save energy. E.g. the application can decide to use a 1 sec interval for shutdown polling, whereas the implementation would need to poll more often, e.g. every 10-100 ms.

https://github.com/Linaro/odp/pull/387#discussion_r162571671
updated_at 2018-01-19 09:23:30