On Mon, Sep 11, 2017 at 1:30 AM, Peltonen, Janne (Nokia - FI/Espoo)
<[email protected]> wrote:
>
> Bill Fischofer [mailto:[email protected]] wrote:
>> On Fri, Sep 8, 2017 at 6:10 AM, Janne Peltonen <[email protected]>
>> wrote:
>> >
>> >
>> > On Fri, 8 Sep 2017, Nikhil Agarwal wrote:
>> >> On 7 September 2017 at 14:09, Peltonen, Janne (Nokia - FI/Espoo)
>> >> <[email protected]> wrote:>
>> >> > Bill Fischofer wrote:
>> >> > > if (odp_ipsec_result(&ipsec_res, pkt) < 0) { /* Stale
>> >> > > event, discard */
>> >>
>> >> > There is a race condition here if the idea is that calling
>> >> > odp_ipsec_sa_disable() (or destroy()) marks the SA stale to prevent
>> >> > further processing for that SA in the result event handling.
>> >>
>> >> > Maybe the odp_ipsec_result() succeeds but the SA becomes stale
>> >> > right after that, before rest of the processing is done.
>> >>
>> >> Same race condition exist in the proposed API when one thread received
>> >> the last
>> >> packet of SA and still process it while the other thread on receiving
>> >> disable
>> >> event destroy the SA, application will always need to synchronize its own
>> >> thread.
>> >
>> > There is no race in the API, only in incorrect use of it. As I explained,
>> > synchronization is needed even with the disable event and it can be
>> > easily done using an ordered lock that ensures that other event handling
>> > threads are finished processing the SA (they have released the queue
>> > context through another call to schedule or explicitly). So the event
>> > handler for the disable event can destroy the SA safely without any race
>> > if it uses an ordered lock (or if the SA queue is atomic).
>> >
>> > I wrote a piece of pseudocode about this in my reply, maybe you missed it.
>> >
>> >>
>> >> Let me reiterate the solution we discussed yesterday. After the SA is
>> >> disabled
>> >> successfully, implementations will stop enqueuing any more events from
>> >> that SA
>> >> and any call to odp_ipsec_result will indicate that SA is disabled.(SA
>> >> Entry is
>> >> still valid though). Application may choose to synchronize its database
>> >> and
>> >> process that packet or may drop it. Then it will call
>> >> odp_ipsec_sa_destroy which
>> >> will delete the SA and any further call to odp_ipsec_result pertaining to
>> >> that
>> >> SA will result in invalid event. Hope this resolves the issue.
>> >
>> > That clarifies the API you are proposing but not how you intend it to
>> > be used. It is easy to say that an application just should do whatever
>> > synchronization is necessary.
>> >
>> > That said, I think your proposal could be made to work from application
>> > perspective, but with some inconvenience:
>> >
>> > After the packet event handler checks (for every IPsec packet event)
>> > that the event is not stale, it has to prevent odp_ipsec_sa_destroy()
>> > from being called in any other thread until it no longer uses the SA.
>> > While this could be done e.g. using an RCU based or epoch based
>> > synchronization scheme to delay the destroy() just long enough, it
>> > may perhaps not be easy for all application authors to do so since
>> > ODP does not really provide anything than the basic atomics for that.
>>
>> Wouldn't this happen automatically if the event queue were atomic? Or
>> if an ordered lock were used to protect odp_ipsec_result() for the
>> duration of the SA context reference? For example:
>
> That approach would work if the event handler destroyed the SA in
> the same queue context as it handles the packet events of the SA.
>
> The problem is that SA deletion starts elsewhere in some control thread
> and there would have to be a way to reliably delegate the actual destroy
> to the correct queue context by having a an event enqueued through the
> SA queue. And after the proposed API change the ODP implementation would
> not necessarily send such an event (e.g. disable complete event) and
> would not allow the application to send any event either.
>
> The approach would work if there are unhandled ipsec packets events
> for the SA in the SA queues but that may not be the case. In async case
> one could send a dummy packet after disable() to get such an event, but
> in the inline case that would not work (well, if one shared the same
> event queue between inline and async processing, then that trick could
> work).
>
> The atomic queue or ordered lock approach works with the current API
> just because there is the disable completion event that can be used
> to transfer control to the correct packet processing thread (even if
> the actual destroy would be further deferred to some control thread).
>
> A variation of this, with probably significant overhead, would be
> such that the ipsec packet event handler just enqueued the events
> to an application created normal ODP queue and the application would
> generate a suitable "disable complete" event to the same queue and
> then the actual event processing in the context of the application
> created normal queue could do the needed ordering using an ordered
> lock.
>
> So yes, things can be made to work with the proposed API change,
> but in a less straightforward manner than with the current API.
Thanks. I'm just trying to find a way forward to some sort of
resolution on this issue. We can discuss further in today's ARCH call.
>
> Janne
>
>>
>> while (1) {
>> ev = odp_schedule(&queue, ODP_SCHED_WAIT);
>> ev_t = odp_event_types(ev, &ev_sub_t);
>>
>> switch (ev_sub_t) {
>> case ODP_IPSEC_PACKET_IPSEC:
>> pkt = odp_ipsec_packet_from_event(ev);
>> odp_schedule_order_lock(0); /* Protect SA refs from destroy races */
>>
>> if (odp_ipsec_result(&ipsec_res, pkt) < 0) { /* Stale event, discard
>> */
>> odp_schedule_order_unlock(0);
>> free_event(ev);
>> continue;
>> }
>>
>> if (ipsec_res->status.all != ODP_IPSEC_OK) {
>> odp_schedule_order_unlock(0);
>> ...Handle IPsec failure
>> odp_event_free(ev);
>> continue;
>> }
>>
>> ctx = odp_ipsec_sa_context(ipsec_res->sa); /* Get our context
>> from result */
>>
>> ...process the event, referencing SA context as needed
>> If we decide to call odp_ipsec_sa_destroy() here, we're OK
>> since no other thread can be actively referencing the SA.
>>
>> odp_schedule_order_unlock(0); /* When we're done with
>> referencing the SA */
>>
>> ...continue event processing without additional SA / ctx referencing.
>> break;
>>
>> case /* Other cases go here */
>> ...etc.
>> }
>> }
>>
>>
>> >
>> > The disable completion event is, IMHO, cleaner for ODP API. It is
>> > not clear to me if it is impossible to provide it or merely difficult.
>> >
>> > Janne
>> >