Re: [lng-odp] [PATCHv2 2/3] api: odp_timer.h: updated API, lock-less implementation

Ola Liljedahl Thu, 18 Dec 2014 11:26:54 -0800

On 18 December 2014 at 16:25, Savolainen, Petri (NSN - FI/Espoo)
<[email protected]> wrote:
> After quick look into it...
>
>> -----Original Message-----
>> From: [email protected] [mailto:lng-odp-
>> [email protected]] On Behalf Of ext Ola Liljedahl
>> Sent: Wednesday, December 17, 2014 12:46 AM
>> To: [email protected]
>> Subject: [lng-odp] [PATCHv2 2/3] api: odp_timer.h: updated API, lock-less
>> implementation
>>
>> The timer API is updated. A major change is that timers are allocated and
>> freed
>> separately from timeouts being set and cancelled. The life-length of a
>> timer
>> normally corresponds to the life-length of the associated stateful flow
>> while
>> the life-length of a timeout corresponds to individual packets being
>> transmitted and received.
>> The reference timer implementation is lock-less for platforms with support
>> for 128-bit (16-byte) atomic exchange and CAS operations. Otherwise a
>> lock-based
>> implementation (using as many locks as desired, no global lock) is used
>> but
>> some operations (e.g. reset re-using existing timeout buffer) may still be
>> lock-less on some architectures.
>> Updated the example example/timer/odp_timer_test.c.
>>
>> Signed-off-by: Ola Liljedahl <[email protected]>
>> ---
>> (This document/code contribution attached is provided under the terms of
>> agreement LES-LTM-21309)
>>  example/timer/odp_timer_test.c                     |  173 +--
>>  platform/linux-generic/include/api/odp_timer.h     |  474 +++++++--
>>  .../linux-generic/include/odp_timer_internal.h     |   60 +-
>>  platform/linux-generic/odp_timer.c                 | 1098 ++++++++++++++-
>> -----
>>  4 files changed, 1324 insertions(+), 481 deletions(-)
>>
>> diff --git a/example/timer/odp_timer_test.c
>> b/example/timer/odp_timer_test.c
>> index 972bc96..33ef219 100644
>> --- a/example/timer/odp_timer_test.c
>> +++ b/example/timer/odp_timer_test.c
>> @@ -26,7 +26,7 @@
>>
>>
>>  #define MAX_WORKERS           32            /**< Max worker threads */
>> -#define MSG_POOL_SIZE         (4*1024*1024) /**< Message pool size */
>> +#define MSG_POOL_SIZE         (4*1024) /**< Message pool size */
>>
>>
>>  /** Test arguments */
>> @@ -43,67 +43,116 @@ typedef struct {
>>  /** @private Barrier for test synchronisation */
>>  static odp_barrier_t test_barrier;
>>
>> -/** @private Timer handle*/
>> -static odp_timer_t test_timer;
>> +/** @private Buffer pool handle */
>> +static odp_buffer_pool_t pool;
>>
>> +/** @private Timer pool handle */
>> +static odp_timer_pool_t tp;
>> +
>> +/** @private Timeout status ASCII strings */
>> +static const char *const status2str[] = {
>> +     "fresh", "stale", "orphaned"
>> +};
>> +
>> +/** @private Timer set status ASCII strings */
>> +static const char *timerset2str(uint64_t val)
>> +{
>> +     switch (val) {
>> +     case ODP_TICK_TOOEARLY:
>> +             return "tooearly";
>> +     case ODP_TICK_TOOLATE:
>> +             return "toolate";
>> +     case ODP_TICK_INVALID:
>> +             return "error";
>> +     default:
>> +             return "success";
>> +     }
>> +};
>> +
>> +/** @private Helper struct for timers */
>> +struct test_timer {
>> +     odp_timer_t tim;
>> +     odp_buffer_t buf;
>> +     uint64_t tick;
>> +};
>> +
>> +/** @private Array of all timer helper structs */
>> +static struct test_timer tt[256];
>>
>>  /** @private test timeout */
>>  static void test_abs_timeouts(int thr, test_args_t *args)
>>  {
>> -     uint64_t tick;
>>       uint64_t period;
>>       uint64_t period_ns;
>>       odp_queue_t queue;
>> -     odp_buffer_t buf;
>> -     int num;
>> +     int remain = args->tmo_count;
>> +     uint64_t tick;
>> +     struct test_timer *ttp;
>>
>>       EXAMPLE_DBG("  [%i] test_timeouts\n", thr);
>>
>>       queue = odp_queue_lookup("timer_queue");
>>
>>       period_ns = args->period_us*ODP_TIME_USEC;
>> -     period    = odp_timer_ns_to_tick(test_timer, period_ns);
>> +     period    = odp_timer_ns_to_tick(tp, period_ns);
>>
>>       EXAMPLE_DBG("  [%i] period %"PRIu64" ticks,  %"PRIu64" ns\n", thr,
>>                   period, period_ns);
>>
>> -     tick = odp_timer_current_tick(test_timer);
>> +     EXAMPLE_DBG("  [%i] current tick %"PRIu64"\n", thr,
>> +                 odp_timer_current_tick(tp));
>>
>> -     EXAMPLE_DBG("  [%i] current tick %"PRIu64"\n", thr, tick);
>> -
>> -     tick += period;
>> -
>> -     if (odp_timer_absolute_tmo(test_timer, tick, queue,
>> ODP_BUFFER_INVALID)
>> -         == ODP_TIMER_TMO_INVALID){
>> -             EXAMPLE_DBG("Timeout request failed\n");
>> +     ttp = &tt[thr - 1]; /* Thread starts at 1 */
>> +     ttp->tim = odp_timer_alloc(tp, queue, ttp);
>> +     if (ttp->tim == ODP_TIMER_INVALID) {
>> +             EXAMPLE_ERR("Failed to allocate timer\n");
>>               return;
>>       }
>> +     ttp->buf = odp_buffer_alloc(pool);
>> +     if (ttp->buf == ODP_BUFFER_INVALID) {
>> +             EXAMPLE_ERR("Failed to allocate buffer\n");
>> +             return;
>> +     }
>> +     tick = odp_timer_current_tick(tp);
>>
>> -     num = args->tmo_count;
>> -
>> -     while (1) {
>> -             odp_timeout_t tmo;
>> -
>> -             buf = odp_schedule_one(&queue, ODP_SCHED_WAIT);
>> -
>> -             tmo  = odp_timeout_from_buffer(buf);
>> -             tick = odp_timeout_tick(tmo);
>> -
>> -             EXAMPLE_DBG("  [%i] timeout, tick %"PRIu64"\n", thr, tick);
>> -
>> -             odp_buffer_free(buf);
>> -
>> -             num--;
>> -
>> -             if (num == 0)
>> -                     break;
>> +     while (remain != 0) {
>> +             odp_buffer_t buf;
>>
>>               tick += period;
>> +             ttp->tick = odp_timer_set_abs(ttp->tim, tick, &ttp->buf);
>> +             if (odp_unlikely(ttp->tick == ODP_TICK_TOOEARLY ||
>> +                              ttp->tick == ODP_TICK_TOOLATE ||
>> +                              ttp->tick == ODP_TICK_INVALID)) {
>> +                     /* Too early or too late timeout requested */
>> +                     EXAMPLE_ABORT("odp_timer_set_abs() failed: %s\n",
>> +                                   timerset2str(ttp->tick));
>> +             }
>>
>> -             odp_timer_absolute_tmo(test_timer, tick,
>> -                                    queue, ODP_BUFFER_INVALID);
>> +             /* Get the next expired timeout */
>> +             buf = odp_schedule_one(&queue, ODP_SCHED_WAIT);
>> +             if (odp_timer_tmo_metadata(buf, NULL, &tick, (void **)&ttp)) {
>> +                     ttp->buf = buf;
>> +                     if (odp_likely(tick != ttp->tick)) {
>
> Isn't this a potential race condition introduced by the API ? I.e. if same 
> operations run on multiple threads and returned tick value is used for 
> checking tmo validity (STALE vs FRESH).
Yes. But as described in a private email, earlier you did not approve
of the alternative design where the timeout is checked against the
timer for freshness/staleness. You have to chose either or, there is
not free lunch.


The API could handle such a race condition (by keeping all mutable
state in the timer and ensuring atomicity) but having two threads
concurrently access the same resource(s) (what about other resources
in the application?) without synchronization seems like a bad idea
anyway.

>
> Thread A:
> // HW timer is set, timer expires,
> // tmo buf is scheduled to thread B, before this function returns (due to an 
> interrupt, etc)
> ... = odp_timer_set_abs(ttp->tim, tick, &ttp->buf);
>
>
> Thread B:
> // receives and handles tmo
> buf = odp_schedule_one(&queue, ODP_SCHED_WAIT);
>                 if (odp_timer_tmo_metadata(buf, NULL, &tick, (void **)&ttp)) {
>                         ttp->buf = buf;
>
> // ttp->tick is not set yet
>                         if (odp_likely(tick != ttp->tick)) { ...
>
>
> Thread A:
>
> // odp_timer_set_abs() returns and sets ttp->tick
> ttp->tick = ...
>
>
>
>
>
>
>
>> --- a/platform/linux-generic/include/api/odp_timer.h
>> +++ b/platform/linux-generic/include/api/odp_timer.h
>> @@ -8,9 +8,211 @@
>>  /**
>>   * @file
>>   *
>> - * ODP timer
>> + * ODP timer service
>>   */
>>
>> +/** Example #1 Retransmission timer (e.g. for reliable connections)
>> + @code
>> +
>> +//Create timer pool for reliable connections
>> +#define SEC 1000000000ULL //1s expressed in nanoseconds
>> +odp_timer_pool_t tcp_tpid =
>> +    odp_timer_pool_create("TCP",
>> +                       buffer_pool,
>> +                       1000000,//resolution 1ms
>> +                       0,//min tmo
>> +                       7200 * SEC,//max tmo length 2hours
>> +                       40000,//num_timers
>> +                       true,//shared
>> +                       ODP_CLOCK_CPU
>> +                      );
>> +if (tcp_tpid == ODP_TIMER_POOL_INVALID)
>> +{
>> +     //Failed to create timer pool => fatal error
>> +}
>> +
>
> It's better to remove  all this pseudo example code from the API file and
> add one or two as real examples that can be built, run and verified
> that they actually work and are in sync with the current API version.
Except those real examples will have to include a lot of boiler plate
code (create buffer pool and queues etc) which is not important in
other to showcase how to use the timer API. But I don't see why one
precludes the other.

>
>>
>> +/**
>> + * Return values of timer set calls.
>> + */
>> +/**
>> + * Timer set operation failed, expiration too early.
>> + * Either retry with a later expiration time or process the timeout
>> + * immediately. */
>> +#define ODP_TICK_TOOEARLY 0xFFFFFFFFFFFFFFFDULL
>> +/**
>> + * Timer set operation failed, expiration too late.
>> + * Truncate the expiration time against the maximum timeout for the
>> + * timer pool. */
>> +#define ODP_TICK_TOOLATE  0xFFFFFFFFFFFFFFFEULL
>> +/**
>> + * Timer set operation failed because not timeout buffer present or
>> specified.
>> + * This value is also return from odp_timer_cancel() and
>> odp_timer_free().
>> + */
>> +#define ODP_TICK_INVALID  0xFFFFFFFFFFFFFFFFULL
>
>
> I think it's not good idea to overload tick variables with special tick 
> values. Yes, the type is uint64_t and it's very unlikely run into this values 
> when tick starts from 0 - but (on API level) it opens a door for strange 
> behavior with if the system tick is uninitialized or init with a large value.
Overloading the return value is one thing (and it is not beautiful API
design, I agree). Having problem with overflow of uint64_t values is
another. If the system tick overflows (and three reserved values at
the end does not matter in reality, three ticks later and you're dead
anyway) because of missing initialization, then you have a serious bug
which I don't think application should try to recover from. Testing of
the implementation will have to make sure that the timer tick starts
form some suitable low value (e.g. 0) and does not overflow during the
life time of the system (a 64-bit counter @ 1GHz takes >580 years to
overflow).

>
>
>>
>>  /**
>> - * Create a timer
>> + * Create a timer pool
>>   *
>> - * Creates a new timer with requested properties.
>> + * Create a new timer pool.
>>   *
>>   * @param name       Name
>> - * @param pool       Buffer pool for allocating timeout notifications
>> + * @param buf_pool   Buffer pool for allocating timeouts (and only
>> timeouts)
>>   * @param resolution Timeout resolution in nanoseconds
>> - * @param min_tmo    Minimum timeout duration in nanoseconds
>> - * @param max_tmo    Maximum timeout duration in nanoseconds
>> + * @param min_tmo    Minimum relative timeout in nanoseconds
>> + * @param max_tmo    Maximum relative timeout in nanoseconds
>> + * @param num_timers Number of supported timers (minimum)
>> + * @param shared     Shared or private timer pool.
>> + *              Operations on shared timers will include the necessary
>> + *              mutual exclusion, operations on private timers may not
>> + *              (mutual exclusion is the responsibility of the caller).
>> + * @param clk_src    Clock source to use
>> + *
>> + * @return Timer pool handle if successful, otherwise
>> ODP_TIMER_POOL_INVALID
>> + * and errno set
>> + */
>> +odp_timer_pool_t
>> +odp_timer_pool_create(const char *name,
>> +                   odp_buffer_pool_t buf_pool,
>> +                   uint64_t resolution,
>> +                   uint64_t min_tmo,
>> +                   uint64_t max_tmo,
>> +                   uint32_t num_timers,
>> +                   int shared,
>> +                   odp_timer_clk_src_t clk_src);
>
> Not a must change in this point, but the API would be cleaner if all these 
> params would be packed into a odp_timer_pool_param_t struct (all except name, 
> just like odp_buffer_pool_param_t).
I could do that. This is modeled on the old timer API. But as the
number of parameters increase, of course it starts to become unwieldy.
My biggest problem with this function is that (as Leo once wrote) you
can specify two of resolution, min timeout and max timeout and the
third will be returned. But not all three.

>
>
> Also I'd swap "int shared" to "int private". So that the default value 
> (private == 0) would select the shared timer mode (== multicore, which is the 
> default for everything else in ODP).
Good suggestion. I do have problems with the non-shared (private) mode
anyway as the timer API requires some other agent to expire the timers
and all data structure access have to be synchronized between the only
client and the timer manager itself. So I would like to drop this
parameter (I think it was Bill Mills or David Lide who originally
suggested it a long time ago).

>
>
>>  /**
>> - * Maximum timeout in timer ticks
>> + * ODP timer pool information and configuration
>> + */
>> +
>> +typedef struct odp_timer_pool_info_s {
>> +     uint64_t resolution;/**< Timer resolution (in ns) */
>> +     uint64_t min_tmo;   /**< Min supported relative timeout (in ticks)*/
>> +     uint64_t max_tmo;   /**< Max supported relative timeout (in ticks)*/
>> +     uint32_t num_timers;/**< Number of supported timers */
>> +     uint32_t cur_timers;/**< Number of currently allocated timers */
>> +     uint32_t hwm_timers;/**< Number of used timers high watermark */
>> +     int shared;         /**< Shared flag */
>> +     char name[80];      /**< Name of timer pool */
>
> Name can be a const pointer to the string.
I don't like returning pointers to internal data. Who knows how long
the application wants to keep this pointer and then unexpectedly use
it, long after some other thread might have deleted the timer pool.

>
> Params from pool_create could be packed into the odp_timer_pool_param_t (just 
> like odp_buffer_pool_info_t).
Yes, this follows from the earlier suggestion. Expect another patch.

>
>
>> +} odp_timer_pool_info_t;
>> +
>> +/**
>> + * Query timer pool information and configuration
>> + *  Timer resolution in nanoseconds
>> + *  Minimum and maximum (relative) timeouts in timer ticks
>> + *  Number of supported timers
>> + *  Nunber of timers in use
>> + *  Nunber of timers in use - high watermark
>> + *  Shared or private timer pool
>> + *  Name of timer pool.
>>   *
>> - * @param timer Timer
>> + * @param tpid Timer pool identifier
>> + * @param buf Pointer to information buffer
>> + * @param buf_size Size of information buffer
>
> Could be dropped since sizeof(odp_timer_pool_info_t) defines the size (just 
> like odp_buffer_pool_info(
> ) outputs info and returns success/failure).
I expected the time_pool_info struct to potentially grow. But perhaps
we would require a recompilation of the application in such a case.

>
>>   *
>> - * @return Maximum timeout in timer ticks
>> + * @return Actual size written
>>   */
>> -uint64_t odp_timer_maximum_tmo(odp_timer_t timer);
>> +size_t odp_timer_pool_info(odp_timer_pool_t tpid,
>> +                        odp_timer_pool_info_t *buf,
>> +                        size_t buf_size);
>>
>
>
>
>> +
>> +/**
>> + * Free a timer
>> + *
>> + * Free (destroy) a timer, freeing associated resources.
>> + * The timeout buffer for an active timer will be returned.
>> + * An expired and enqueued timeout buffer will not be freed.
>> + * It is the responsibility of the application to free this timeout when
>> it
>> + * is received.
>> + *
>> + * @param tim      Timer handle
>> + * @param tmo_buf Reference to a buffer variable which will be written
>> with
>> + * the buffer handle of any present timeout buffer (e.g. for an active
>> timer).
>> + * @return       A tick value which will not match any valid expiration
>> tick.
>>   */
>> -uint64_t odp_timer_current_tick(odp_timer_t timer);
>> +uint64_t odp_timer_free(odp_timer_t tim, odp_buffer_t *tmo_buf);
>
> Why this returns tick? And what value it actually returns?
This is just to follow the pattern of odp_timer_set_abs/rel. All those
calls return a tick value which the application is expected to save so
that it can be compared with the tick value of any received timeout.
This is then used to check if a timeout is fresh or stale. Strictly
not necessary to return a value here but I want to make the API simple
to use.

>
> The documentation of tmo_buf output is misleading. It should say that if the 
> operation outputs handle if active timer was successfully canceled, if it 
> failed on that BUFFER_INVALID is output. I think that's the intention, so 
> that user do not get handle to a buffer that may be on the way on the queue 
> and potentially double free that.
I think the description is correct. The tmo_buf is written with the
buffer handle if the timer was active (and thus associated with a
timeout buffer) and the cancel operation (which is implied by the free
operation) "succeeds". If the cancel does not succeed (because the
timer had already expired and there is no timer buffer), it does not
write any value to the tmo_buf variable.

The application might already have the tmo_buf, either because it was
received or the timer was cancelled before it had expired (and thus
the timeout buffer could be returned in the cancel call). In such a
case, odp_timer_cancel or odp_timer_free should not overwrite tmo_buf
with ODP_BUFFER_INVALID because then we would leak the buffer that was
stored here.

I was hoping that the example code snippets should show the proper
behavior here.



>
>>
>>  /**
>> - * Request timeout with an absolute timer tick
>> + * Set a timer (absolute time) with a user-provided timeout buffer
>> + *
>> + * Set (arm) the timer to expire at specific time. The timeout
>> + * buffer will be enqueued when the timer expires.
>>   *
>> - * When tick reaches tmo_tick, the timer enqueues the timeout
>> notification into
>> - * the destination queue.
>> + * Note: any invalid parameters will be treated as programming errors and
>> will
>> + * cause the application to abort.
>>   *
>> - * @param timer    Timer
>> - * @param tmo_tick Absolute timer tick value which triggers the timeout
>> - * @param queue    Destination queue for the timeout notification
>> - * @param buf      User defined timeout notification buffer. When
>> - *                 ODP_BUFFER_INVALID, default timeout notification is
>> used.
>> + * @param tim      Timer
>> + * @param abs_tck  Expiration time in absolute timer ticks
>> + * @param tmo_buf  Reference to a buffer variable that points to timeout
>> buffer
>> + * or NULL to reuse the existing timeout buffer
>>   *
>> - * @return Timeout handle if successful, otherwise ODP_TIMER_TMO_INVALID
>> + * @return The expiration tick or ODP_TICK_TOOEARLY, ODP_TICK_TOOLATE or
>> + * ODP_TICK_INVALID (timer not active and no timeout buffer to reuse).
>>   */
>> -odp_timer_tmo_t odp_timer_absolute_tmo(odp_timer_t timer, uint64_t
>> tmo_tick,
>> -                                    odp_queue_t queue, odp_buffer_t buf);
>> +uint64_t odp_timer_set_abs(odp_timer_t tim,
>> +                        uint64_t abs_tck,
>> +                        odp_buffer_t *tmo_buf);
>
> Why this returns tick and not just a status. It promotes the race condition 
> highlighted earlier (my->tick vs. tmo->tick).
>
> Can this accept also user defined buffers (not type timeout)?
Yes. The user is though responsible for the proper initialization of
such a buffer, the timer manager won't touch it (just enqueue it when
the timer expires).

>
>
>>
>>  /**
>> - * Convert buffer handle to timeout handle
>> + * Cancel a timer
>> + *
>> + * Cancel a timer, preventing future expiration and delivery. Return any
>> + * present timeout buffer.
>> + *
>> + * A timer that has already expired may be impossible to cancel and the
>> timeout
>> + * will instead be delivered to the destination queue.
>>   *
>> - * @param buf  Buffer handle
>> + * Note: any invalid parameters will be treated as programming errors and
>> will
>> + * cause the application to abort.
>>   *
>> - * @return Timeout buffer handle
>> + * @param tim     Timer
>> + * @param tmo_buf Reference to a buffer variable which will be written
>> with
>> + * the buffer handle of any present timeout buffer (e.g. for an active
>> timer).
>> + * @return       A tick value which will not match any valid expiration
>> tick.
>>   */
>> -odp_timeout_t odp_timeout_from_buffer(odp_buffer_t buf);
>> +uint64_t odp_timer_cancel(odp_timer_t tim, odp_buffer_t *tmo_buf);
>
> Same comments with the free call.
The free call is described that way because of the behavior of the cancel call.

>
>
>>
>>  /**
>> - * Return absolute timeout tick
>> + * Get metadata from system timeout buffer
>>   *
>> - * @param tmo Timeout buffer handle
>> + * @param buf A timeout buffer
>> + * @param hdl NULL or a pointer where the timer handle will be written.
>> + * @param exp_tck NULL or a pointer where the expiration tick will be
>> written.
>> + * @param user_ptr NULL or a pointer where the user pointer will be
>> written.
>>   *
>> - * @return Absolute timeout tick
>> + * @return True (1) if timeout buffer is of type ODP_BUFFER_TYPE_TIMEOUT
>> and
>> + *         metadata variables have been updated from the timeout.
>> + *         False (0) if timeout buffer is not of type
>> ODP_BUFFER_TYPE_TIMEOUT.
>>   */
>> -uint64_t odp_timeout_tick(odp_timeout_t tmo);
>> +int odp_timer_tmo_metadata(odp_buffer_t buf,
>> +             odp_timer_t *hdl,
>> +             uint64_t *exp_tck,
>> +             void **user_ptr);
>
> Normal odp_buffer_type() call should be used to check the buffer type first. 
> This generic metadata call should be broke into separate calls just like 
> packet API does.
odp_timer_tmo_metadata() does anyway check the buffer type because it
does not trust the user. Better to just check the buffer type once.
You suggest that we should not have any buffer type checks in these
calls and always require the user to check the buffer type first?

Having multiple metadata accessors will potentially cause a lot of
overhead. Converting from generic buffer to timeout buffer multiple
times and then accessing the internal buffer data, this is not free
(on e.g. linux-generic). Why is it better to have multiple accessor
functions? I need an objective argument here.

>
> odp_timer_t odp_timeout_timer(tmo)
> uint64_t odp_timeout_tick(tmo)
> void* odp_timeout_user_ptr(tmo)
>
>
> -Petri
>

_______________________________________________
lng-odp mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/lng-odp

Re: [lng-odp] [PATCHv2 2/3] api: odp_timer.h: updated API, lock-less implementation

Reply via email to