Thanks, Janne. That's very helpful. Nikhil, can you please offer your comments on this thread and perhaps outline how these scenarios work on your platform?
On Thu, Sep 7, 2017 at 3:39 AM, Peltonen, Janne (Nokia - FI/Espoo) <[email protected]> wrote:
> Hi,
>
> Comments below:
>
> Bill Fischofer wrote:
>> As discussed during today's ARCH call, we need to close on the
>> remaining IPsec confusion surrounding the definition of
>> odp_ipsec_sa_disable().
>>
>> The main scenario that this API is designed to address, from an
>> application standpoint, is as follows:
>>
>> - The application has one or more active SAs that are receiving
>> inline RX IPsec traffic.
>>
>> - An active SA needs to be deactivated/destroyed for some reason.
>>
>> - The application needs to be sure that "in flight" events that it
>> is not yet aware of are in a defined state.
>>
>> odp_ipsec_sa_disable() allows the application to tell the ODP
>> implementation that it wants to deactivate an SA so that it can be
>> destroyed. In response, the implementation is expected to:
>>
>> - Stop making internal use of this SA
>>
>> - Tell the application when the SA is quiesced as far as the
>> implementation is concerned
>>
>> As currently defined, this second step is communicated to the
>> application via a status event posted to the event queue associated
>> with the SA.
>>
>> The main point of the discussion is that NXP HW does not have a
>> direct analog to this API. As Nikhil explained it, the "quiescing"
>> function is assumed to be an application responsibility, such that
>> by the time an application calls odp_ipsec_sa_destroy() it already
>> knows that the SA is inactive.
>>
>> When an application is using IPsec in lookaside mode this is not an
>> unreasonable requirement: the application explicitly invokes every
>> IPsec operation, so it knows which operations are in flight.
>
> Yes, except that with fragmentation offload combined with IPsec
> offload the application cannot easily know when it has received all
> the result packets generated by a single IPsec operation. If there
> were always one result packet per operation, simple SA reference
> counting in the application would work (see the sketch after the
> list below).
>
> We discussed this with Petri when designing the API and considered
> various options, but concluded that the disable completion event
> is a simple solution with low overhead when an SA is not being
> disabled. And since event queues are a central concept of ODP and
> of its asynchronous APIs, signaling disable completion through an
> event seems natural.
>
> Other options that do not work as well:
>
> - The application inspects the result fragments to see when it has
>   got them all. This would require almost the same work as fragment
>   reassembly, and it would introduce state that would have to be
>   synchronized between multiple threads (since the result packets
>   may get scheduled to different threads).
>
> - The ODP implementation marks the last result fragment with a
>   special bit. This may be nasty for some HW implementations that
>   parallelize the processing and do not know which of the fragments
>   gets queued last. If the application reference counts the SAs
>   (as it most likely has to do to safely destroy the SA and safely
>   delete its own state for the SA), it can unref an SA when it sees
>   the marked "last" result fragment, but only after it has processed
>   the other fragments. This imposes synchronization overhead (e.g.
>   an ordered lock if the SA queue is ordered and scheduled) on the
>   normal processing path even when the SA is not being deleted.
>
> - Variations of the above (e.g. returning the total number of result
>   fragments for one IPsec operation in all the results) have similar
>   problems regarding parallelization in the implementation and
>   synchronization need in the application.
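For concreteness, here is a minimal sketch of the simple per-SA
reference counting Janne mentions above, assuming the draft type and
function names used in this thread (odp_ipsec_result_t,
odp_ipsec_sa_context()); struct my_sa_ctx and the my_* helpers are
hypothetical application code, not ODP API. The accounting balances
only if every operation produces exactly one result event, which is
precisely what fragmentation offload breaks:

#include <odp_api.h>

struct my_sa_ctx {
        odp_atomic_u32_t in_flight; /* results not yet consumed */
        odp_ipsec_sa_t sa;
};

/* Enqueue side: count one expected result per operation. */
static int my_submit(struct my_sa_ctx *ctx, odp_packet_t pkt,
                     const odp_ipsec_out_param_t *param)
{
        odp_atomic_inc_u32(&ctx->in_flight);
        return odp_ipsec_out_enq(&pkt, 1, param);
}

/* Dequeue side: one result event == one unref. With fragmentation
 * offload a single operation may yield several result packets, so
 * this accounting no longer balances -- which is Janne's point. */
static void my_handle_result(odp_packet_t pkt)
{
        odp_ipsec_result_t res;
        struct my_sa_ctx *ctx;

        if (odp_ipsec_result(&res, pkt) == 0) {
                ctx = odp_ipsec_sa_context(res.sa);
                if (odp_atomic_fetch_dec_u32(&ctx->in_flight) == 1) {
                        /* Last outstanding result: if a disable was
                         * requested, it would now be safe to free ctx
                         * and call odp_ipsec_sa_destroy(ctx->sa). */
                }
        }
        odp_packet_free(pkt);
}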
> The disable completion event also requires synchronization (e.g. in
> the form of an ordered lock), but only when the SA is being deleted.
>
>> For inline processing, however, it's less clear what the application
>> can do, since packets may be in process that the application has yet
>> to see. If an application calls odp_ipsec_sa_destroy(), the
>> implementation can pick a point after which all packets matching
>> that SA are ignored/dropped, but the issue is what to do about
>> packets that have already been processed and are sitting in event
>> queues that the application is not yet aware of.
>>
>> If an application receives an event that was processed under a
>> destroyed SA, the concern is that it not segfault or experience some
>> other anomaly trying to access any application context that it had
>> associated with that now-defunct SA.
>>
>> The solution proposed in PR #109 [1] is that odp_ipsec_sa_disable()
>> be allowed to complete synchronously rather than asynchronously,
>> which means that the NXP implementation can treat it as an effective
>> NOP since it doesn't map to any HW function. While this solves one
>> problem, there is still the problem of how to deal with events that
>> are already enqueued at the time of this call. The intent of
>> odp_ipsec_sa_disable() is that after the application is assured that
>> the SA is disabled, it may safely call odp_ipsec_sa_destroy() with
>> assurance that it doesn't have to deal with expired SA handles.
>
> Yes, but not only that. The disable sequence is also needed for
> knowing when the application can safely destroy its own state
> related to the SA. So even if the implementation side of the problem
> could be fixed through some hack, the problem of how to synchronize
> the application-side state teardown remains.
>
>> It seems to me there are two ways we can square this circle. The
>> first would be to say that in the NXP implementation
>> odp_ipsec_sa_destroy() doesn't actually destroy the SA--it just
>> marks it as destroyed. Actual cleanup/destruction would occur
>> sometime later (e.g., when odp_term_global() is called). This would
>> allow any pending events to be processed normally regardless of
>> when the application calls odp_ipsec_sa_destroy().
>
> We cannot wait until odp_term_global(), since the application may be
> long-lived and use an unlimited number of SAs during its lifetime
> (even though only a bounded number at a time).
>
> But regardless of this, how would the application in this case
> safely free its own SA state pointed to by the ODP SA context?
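To make the first of those two options concrete, a tombstone-style
destroy might look roughly like the sketch below inside an
implementation. Every name here is an invented implementation
internal, not ODP API, and the comments record Janne's objections:

/* Hypothetical implementation internals for a "mark as destroyed"
 * odp_ipsec_sa_destroy() with deferred cleanup. */
struct sa_entry {
        int defunct;            /* set by destroy, checked on lookup */
        void *app_ctx;          /* application context pointer       */
        struct sa_entry *next;  /* deferred-free list linkage        */
};

static struct sa_entry *defunct_list; /* drained only at termination */

static void sa_mark_destroyed(struct sa_entry *e)
{
        e->defunct = 1;         /* result lookups now report a stale SA */
        e->next = defunct_list;
        defunct_list = e;
        /* Janne's objections: this list grows without bound in a
         * long-lived application, and nothing here tells the
         * application when e->app_ctx may safely be freed. */
}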
>> Another approach would rely on how IPsec events themselves are
>> processed. An IPsec event appears as an ODP_EVENT_PACKET with
>> subtype ODP_EVENT_PACKET_IPSEC.
>>
>> A typical event dispatch loop for handling these events might be:
>>
>> odp_event_t ev;
>> odp_event_type_t ev_t;
>> odp_event_subtype_t ev_sub_t;
>> odp_packet_t pkt;
>> odp_ipsec_result_t ipsec_res;
>> odp_ipsec_sa_t sa;
>> struct my_sa_context *ctx; /* This is the key application need */
>>
>> while (1) {
>>     ev = odp_schedule(&queue, ODP_SCHED_WAIT);
>>     ev_t = odp_event_types(ev, &ev_sub_t);
>>
>>     switch (ev_sub_t) {
>>     case ODP_EVENT_PACKET_IPSEC:
>>         pkt = odp_ipsec_packet_from_event(ev);
>>
>>         if (odp_ipsec_result(&ipsec_res, pkt) < 0) {
>>             /* Stale event, discard */
>
> There is a race condition here if the idea is that calling
> odp_ipsec_sa_disable() (or destroy()) marks the SA stale to prevent
> further processing for that SA in the result event handling.
>
> Maybe the odp_ipsec_result() succeeds but the SA becomes stale right
> after that, before the rest of the processing is done.
>
>>             odp_event_free(ev);
>>             continue;
>>         }
>>
>>         if (ipsec_res.status.all != ODP_IPSEC_OK) {
>>             /* ...handle IPsec failure */
>>             odp_event_free(ev);
>>             continue;
>>         }
>>
>>         /* Get our context from the result */
>>         ctx = odp_ipsec_sa_context(ipsec_res.sa);
>>         /* ...process the IPsec packet */
>>         break;
>>
>>     default:
>>         /* Other cases go here */
>>         ...etc.
>>     }
>> }
>>
>> It would seem that a synchronous odp_ipsec_sa_disable() call would
>> not be a problem if the NXP implementation can guarantee that after
>> odp_ipsec_sa_destroy() is called, odp_ipsec_result() returns RC < 0
>> when called for a packet that references a destroyed SA.
>> Alternately, an odp_ipsec_result_t could be returned with a status
>> field that explicitly states that the SA has been destroyed, in
>> which case it's a simple matter for the application to handle that
>> case as well without ambiguity.
>
> This is racy and does not solve the issue of how the application
> knows when it can free the SA.
>
> With the current API, in inline mode (the async case is similar),
> IPsec event handling can be written like this (assuming an ordered
> SA queue):
>
> while (1) {
>     ev = schedule();
>     ...
>     switch (ev type) {
>
>     case ODP_EVENT_PACKET_IPSEC:
>         pkt = odp_ipsec_packet_from_event(ev);
>         odp_ipsec_result(&ipsec_res, pkt);
>         ctx = odp_ipsec_sa_context(ipsec_res.sa);
>         ...process the IPsec packet
>
>     case ODP_EVENT_IPSEC_STATUS:
>         odp_ipsec_status(&status, ev);
>         if (status.id == ODP_IPSEC_STATUS_SA_DISABLE) {
>             ctx = odp_ipsec_sa_context(status.sa);
>             odp_schedule_order_lock(0);
>             /*
>              * Now other threads are done with this SA
>              * since the disable event is the last one.
>              */
>             free(ctx);
>             odp_ipsec_sa_destroy(status.sa);
>             odp_schedule_order_unlock(0);
>             odp_event_free(ev);
>         }
>     }
> }
>
> The async case would be similar, with the addition that on the IPsec
> operation enqueue side the application needs some synchronization to
> ensure it does not call odp_ipsec_sa_disable() before it has stopped
> using that SA in IPsec enqueuing.
>
> I am afraid any change in the API regarding disable would make the
> application more complicated and would impose more synchronization
> overhead on the normal processing path. But I am willing to change
> my mind if it becomes clear how the application state can still be
> safely deleted.
>
> Maybe the NXP ODP implementation could do internal tricks to emulate
> the API? Maybe the options for that could be investigated more,
> instead of trying to map the API to the underlying HW too directly?
>
> Janne
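One conceivable "internal trick" along those lines, sketched under the
assumption that the implementation can enqueue an event behind all
in-flight results for an SA: odp_ipsec_sa_disable() quiesces the SA
and then sends the completion event down the same path the result
events take. sa_entry_of(), flush_pipeline() and
enqueue_status_event() are invented names, and struct sa_entry (from
the earlier sketch) is assumed to also carry a disabled flag:

/* Hypothetical internals; only ODP_IPSEC_STATUS_SA_DISABLE and the
 * odp_ipsec_* types are real API names. */
struct sa_entry *sa_entry_of(odp_ipsec_sa_t sa);
void flush_pipeline(struct sa_entry *e);
int enqueue_status_event(struct sa_entry *e, odp_ipsec_status_id_t id);

int odp_ipsec_sa_disable(odp_ipsec_sa_t sa)
{
        struct sa_entry *e = sa_entry_of(sa);

        e->disabled = 1;   /* stop new lookups and new operations   */
        flush_pipeline(e); /* push out partially processed packets  */

        /* Deliver the status event through the same queue the result
         * events use, so the application sees it strictly after all
         * in-flight results for this SA. */
        return enqueue_status_event(e, ODP_IPSEC_STATUS_SA_DISABLE);
}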
>> Based on what Nikhil said this morning about how NXP hardware
>> works, this approach should be possible. If so, this would seem to
>> address both NXP's and Nokia's concerns about this PR.
>>
>> Please feel free to make corrections if I've misstated anything
>> here, or if there are other potential solutions that should be
>> explored.
>>
>> Thanks.
>>
>> ---
>> [1] https://github.com/Linaro/odp/pull/109
