Thanks, Janne. That's very helpful. Nikhil, can you please offer your
comments on this thread and perhaps outline how these scenarios work
on your platform?

On Thu, Sep 7, 2017 at 3:39 AM, Peltonen, Janne (Nokia - FI/Espoo)
<[email protected]> wrote:
> Hi,
>
> Comments below:
>
> Bill Fischofer wrote:
>> As discussed during today's ARCH call, we need to close on the
>> remaining IPsec confusion surrounding the definition of
>> odp_ipsec_sa_disable().
>>
>> The main scenario that this API is designed to address, from an
>> application standpoint is as follows:
>>
>> - Application has one or more active SAs that are receiving inline RX
>> IPsec traffic.
>>
>> - An active SA needs to be deactivated / destroyed for some reason
>>
>> - Application needs to be sure that "in flight" events that it is not
>> yet aware of are in a defined state.
>>
>> odp_ipsec_sa_disable() allows the application to tell the ODP
>> implementation that it wants to deactivate an SA so that it can be
>> destroyed. In response to this the implementation is expected to:
>>
>> - Stop making internal use of this SA
>>
>> - Tell the application when the SA is quiesced as far as the
>> implementation is concerned.
>>
>> As currently defined, the way this second step is communicated to the
>> application is via a status event posted to the event queue associated
>> with the SA.
>>
>> The main point of the discussion is that NXP HW does not have a direct
>> analog to this API. As Nikhil explained it, the "quiescing" function
>> is assumed to be an application responsibility such that by the time
>> an application calls odp_ipsec_sa_destroy() it already knows that the
>> SA is inactive.
>>
>> When an application is using IPsec in lookaside mode this is not an
>> unreasonable requirement, as the application explicitly invokes every
>> IPsec operation and so knows which operations are in flight.
>
> Yes, except that with fragmentation offload combined with IPsec
> offload the application cannot easily know when it has received all
> the result packets generated by a single IPsec operation. If there
> were always one result packet per operation, simple SA reference
> counting in the application would work.
>
> We discussed this with Petri when designing the API and considered
> various options but concluded that the disable completion event
> is a simple solution with low overhead when an SA is not being
> disabled. And since event queues are a central concept of ODP and
> of its asynchronous APIs, reporting the disable completion through
> an event seems natural.
>
> Other options that do not work that well:
>
> - Application inspects the result fragments to see when it has got
>   them all. This would require almost the same work as fragment
>   reassembly and it would introduce state that would have to be
>   synchronized between multiple threads (since the result packets
>   may get scheduled to different threads).
>
> - ODP implementation marks the last result fragment with a special
>   bit. This may perhaps be nasty for some HW implementations that
>   parallelize the processing and do not know which of the fragments
>   gets queued last. If the application reference counts the
>   SAs (as it most likely has to do to safely destroy the SA and
>   safely delete its own state for the SA), it can unref an SA
>   when it sees the marked "last" result fragment, but only after
>   it has processed the other fragments. This imposes synchronization
>   overhead (e.g. ordered lock if the SA queue is ordered and
>   scheduled) for the normal processing path when the SA is not
>   being deleted.
>
> - Variations of the above (e.g. returning the total number of result
>   fragments for one IPsec operation in all the results) have
>   similar problems regarding parallelization in the implementation
>   and synchronization need in the application.
>
> The disable completion event also requires synchronization (e.g.
> in the form of an ordered lock, but only when the SA is being deleted).
>
>>
>> For inline processing, however, it's less clear what the application
>> can do since packets may be in process that the application has yet to
>> see. If an application calls odp_ipsec_sa_destroy() the implementation
>> can pick a point after which all packets matching that SA are
>> ignored/dropped, but the issue is what about packets that have been
>> processed and are sitting in event queues that the application is not
>> yet aware of?
>>
>> If an application receives an event that was processed under a
>> destroyed SA, the concern is that it not segfault or experience some
>> other anomaly trying to access any application context that it had
>> associated with that now-defunct SA.
>>
>> The solution proposed in PR #109[1] is that odp_ipsec_sa_disable() be
>> allowed to complete synchronously rather than asynchronously, which
>> means that the NXP implementation can treat it as an effective NOP
>> since it doesn't map to any HW function. While this solves one
>> problem, there is still the problem of how to deal with events that
>> are already enqueued at the time of this call. The intent of
>> odp_ipsec_sa_disable() is that after the application is assured that
>> the SA is disabled it may safely call odp_ipsec_sa_destroy() with
>> assurance that it doesn't have to deal with expired SA handles.
>
> Yes, but not only that. The disable sequence is also needed for
> knowing when the application can safely destroy its own state
> related to the SA. So even if through some hack the implementation
> side of the problem could be fixed, the problem of how to synchronize
> the application side state teardown remains.
>
>>
>> It seems to me there are two ways we can square this circle. The first
>> would be to say that in the NXP implementation odp_ipsec_sa_destroy()
>> doesn't actually destroy the SA--it just marks it as destroyed. Actual
>> cleanup/destruction would occur sometime later (e.g., when
>> odp_term_global() is called). This would allow any pending events to
>> be processed normally regardless of when the application calls
>> odp_ipsec_sa_destroy().
>
> We cannot wait until odp_term_global() since the application may
> be long-lived and use an unlimited number of SAs during its lifetime
> (even though only a bounded number at a time).
>
> But regardless of this, how would the application in this case safely
> free its own SA state pointed to by the ODP SA context?
>
>>
>> Another approach would rely on how IPsec events themselves are
>> processed. An IPsec event appears as an ODP_EVENT_PACKET with subtype
>> ODP_EVENT_PACKET_IPSEC.
>>
>> A typical event dispatch loop for handling these events might be:
>>
>> odp_event_t ev;
>> odp_event_type_t ev_t;
>> odp_event_subtype_t ev_sub_t;
>> odp_packet_t pkt;
>> odp_ipsec_result_t ipsec_res;
>> odp_ipsec_sa_t sa;
>> odp_queue_t queue;
>> struct my_sa_context *ctx; /* This is the key application need */
>>
>> while (1) {
>>          ev = odp_schedule(&queue, ODP_SCHED_WAIT);
>>          ev_t = odp_event_types(ev, &ev_sub_t);
>>
>>          switch (ev_sub_t) {
>>          case ODP_EVENT_PACKET_IPSEC:
>>                  pkt = odp_ipsec_packet_from_event(ev);
>>
>>                  if (odp_ipsec_result(&ipsec_res, pkt) < 0) {
>>                          /* Stale event, discard */
>
> There is a race condition here if the idea is that calling
> odp_ipsec_sa_disable() (or destroy()) marks the SA stale to prevent
> further processing for that SA in the result event handling.
>
> Maybe the odp_ipsec_result() succeeds but the SA becomes stale
> right after that, before the rest of the processing is done.
>
>
>>                          odp_event_free(ev);
>>                          continue;
>>                  }
>>
>>                  if (ipsec_res.status.all != ODP_IPSEC_OK) {
>>                        ...Handle IPsec failure
>>                        odp_event_free(ev);
>>                        continue;
>>                  }
>>
>>                  ctx = odp_ipsec_sa_context(ipsec_res.sa);
>>                  /* Get our context from result */
>>                  ...process the IPsec packet
>>                  break;
>>
>>          case /* Other cases go here */
>>          ...etc.
>>          }
>> }
>>
>> It would seem that a synchronous odp_ipsec_sa_disable() call would not
>> be a problem if the NXP implementation can guarantee that after
>> odp_ipsec_sa_destroy() is called, odp_ipsec_result() is guaranteed to
>> return RC < 0 when called for a packet that references a destroyed SA.
>> Alternately, an odp_ipsec_result_t could be returned with a status
>> field that explicitly states that the SA has been destroyed, in which
>> case it's a simple matter for the application to handle that case as
>> well without ambiguity.
>
> This is racy and does not solve the issue of how the application
> knows when it can free the SA.
>
> With the current API, in inline mode (but async is similar), IPsec
> event handling can be written like this (assuming ordered SA queue):
>
> while (1) {
>         ev = schedule()
>         ...
>         switch (ev type)
>
>           case ODP_EVENT_PACKET_IPSEC:
>                   pkt = odp_ipsec_packet_from_event(ev);
>                   odp_ipsec_result(&ipsec_res, pkt);
>                   ctx = odp_ipsec_sa_context(ipsec_res->sa);
>
>                   ...process the IPsec packet
>
>          case ODP_EVENT_IPSEC_STATUS:
>                 odp_ipsec_status(&status, ev)
>                 if (status.id == ODP_IPSEC_STATUS_SA_DISABLE)
>                         ctx = odp_ipsec_sa_context(status.sa);
>                         odp_schedule_order_lock();
>                         /*
>                          * Now other threads are done with this SA
>                          * since the disable event is the last one
>                          */
>                         free(ctx);
>                         odp_ipsec_sa_destroy(status.sa);
>
>
> The async case would be similar, with the addition that on the IPsec
> operation enqueue side the application needs some synchronization to
> ensure it does not call odp_ipsec_sa_disable() before it has
> stopped using that SA in IPsec enqueuing.
>
> I am afraid any change in the API regarding disable would make the
> application more complicated and would impose more synchronization
> overhead in the normal processing path. But I am willing to change
> my mind if it becomes clear how the application state can still be
> safely deleted.
>
> Maybe the NXP ODP implementation could do internal tricks to emulate the
> API? Maybe the options for that could be investigated more instead of
> trying to map the API to the underlying HW too directly?
>
>         Janne
>
>
>>
>> Based on what Nikhil said this morning about how NXP hardware works,
>> this approach should be possible. If so this would seem to address
>> both NXP's and Nokia's concerns about this PR.
>>
>> Please feel free to make corrections if I've misstated anything here
>> or if there are other potential solutions that should be explored
>> here.
>>
>> Thanks.
>>
>> ---
>> [1] https://github.com/Linaro/odp/pull/109
