Steve, new patch included. Steven Dake wrote: > In general the patch looks very good. The issues are as follows: > > 1) > > corosync_timer_add_absolute expects the expire time to be the > nanoseconds since the epoch, or more specifically, > > from the time(7) man page: > The Epoch > Unix systems represent time in seconds since the Epoch, > which is > defined as 0:00:00 UTC on the morning of 1 January 1970. > > In this case, the expire_time may not be sufficient. We want absolute > timers to match the wall clock time rather then be based upon the offset > from the current time. If that is not the case with your patch, that > needs to be fixed. absolute timers are used by services to expire > timers when a specific wall clock time has passed. So in this case, if > ntp moves the time forward, we want that timer to expire, if its > recorded wall clock time is past the current time, the timer should be > expired.
This is biggest change. I added is_absolute_timer to timerlist_timer. If this is true -> it's absolute timer and uses old behaviour. If not, it's relative and uses new behaviour. timerlist_add_absolute sets that variable to true, and timerlist_add_duration to false. This is (I hope) how this thing should behave. > > 2) souce should be spelled source in configure.ac > > also the logic to detect monotonic clocks in configure.ac seems complex. > I'd like Jim to have a look at the configure.ac to determine if there is > a better way. Typo fixed + Jim added to CC. Jim, I don't think logic is so complex, but maybe there is really some better solution, I'm not autotools expert. > > 3) get_monotonic_timeofday used in pload.c and votequorum.c and > totemsrp.c should be in tlist.h and should be named differently. I'd > choose timerlist_timer_duration_current_get() with same semantics of the > function. > > 4) > an additional function in tlist would be helpful for calculating the > difference in times, such as timerlist_timer_duration_calculate (tv1, > tv2, tv_out) Fixed in way point 7 suggest ;) > > 5) > + nano_monotonic_hz = ts.tv_sec + 1000000000ULL / (unsigned long > long )ts. > experienced c programmers may be able to read this code if they have a > good understanding of operator precedence. I'd recommend parens to help > clarify to other maintainers the desired order of precedence. Clarified > > 6) > + timer->expire_time= timerlist_nano_monotonic_time() + nano_duration; > > missing space Missing space added > > 7) > regarding comment 3 and 4, this could be simplified throughout the code > base by making a new function in timerlist called > > timerlist_nano_current_get() which retrieves the current nanoseconds > from either the epoch or the monotonic clock source depending on os > support. > > this gets rid of all of the timersub junk throughout the code. Done. This makes some bigger changes in computing values, but I hope it makes it clearer. As bonus, I added TIMERLIST_?S_IN_?SEC macros, because writing everywhere values like 1000000000ULL, doesn't look good for me. > > 8) > after reviewing the changes, I'm not convinced the absolute timers work > as expected by the coroapi. > > good work I think now will. > > Regards > -steve > > Regards, Honza > > On Tue, 2009-07-21 at 15:42 +0200, Jan Friesse wrote: >> Patch should solve problem with ntp and corosync (what it really does) >> by using clock_gettime. >> >> I tried to find EVERY appearance of gettimeofday and where it makes >> sense (so basically, where this value is used as timer) convert to call >> clock_gettime. >> >> Patch modifies configure.ac to detect if clock_gettime and >> CLOCK_MONOTONIC exists on system. If not, gettimeofday will be used (so >> you will get old behavior). >> >> Comments are welcomed. >> >> Regards, >> Honza >> _______________________________________________ >> Openais mailing list >> Openais@lists.linux-foundation.org >> https://lists.linux-foundation.org/mailman/listinfo/openais >
commit 9f768ad4b1d6004a9a199813fd3d884554297e15 Author: Jan Friesse <jfrie...@redhat.com> Date: Tue Jul 21 15:33:52 2009 +0200 Support for monotime timer This patch should solve problems with corosync and ntp, by using clock_gettime where is available (this is why configure.ac is modified), and when using this functions make sense. diff --git a/trunk/configure.ac b/trunk/configure.ac index bff779a..a30b41c 100644 --- a/trunk/configure.ac +++ b/trunk/configure.ac @@ -107,6 +107,26 @@ AC_CHECK_FUNCS([alarm alphasort atexit bzero dup2 endgrent endpwent fcntl \ pthread_spin_unlock pthread_setschedparam \ sched_get_priority_max sched_setscheduler]) +# Check for time source to use (monotonic in function clock_gettime, or gettimeofday) +if test "x$USE_GETTIMEOFDAY" = "x" +then + AC_CHECK_FUNCS([clock_gettime], + dnl clock_gettime() found + [AC_CHECK_DECLS([CLOCK_MONOTONIC],AC_DEFINE([USE_CLOCK_MONOTONIC],,[Clock type to use]), + dnl no CLOCK_MONOTONIC + [AC_CHECK_FUNCS([gettimeofday],AC_DEFINE([USE_GETTIMEOFDAY],,[Clock type to use]),AC_MSG_ERROR([No suitable clock found]))], + dnl for CLOCK_MONOTONIC + [#include <time.h>])], + dnl clock_gettime() not found + [AC_CHECK_FUNCS([gettimeofday],AC_DEFINE([USE_GETTIMEOFDAY],,[Clock type to use]),AC_MSG_ERROR([No suitable clock found]))] + ) +else + AC_MSG_NOTICE([Forced to use gettimeofday()]) + AC_CHECK_FUNCS([gettimeofday],AC_DEFINE([USE_GETTIMEOFDAY],,[Clock type to use]),AC_MSG_ERROR([No suitable clock found])) + AC_CHECK_FUNCS([clock_gettime]) + AC_CHECK_TYPES([clockid_t]) +fi + AC_CONFIG_FILES([Makefile exec/Makefile include/Makefile diff --git a/trunk/exec/timer.c b/trunk/exec/timer.c index 69f9a95..e5a419a 100644 --- a/trunk/exec/timer.c +++ b/trunk/exec/timer.c @@ -174,7 +174,7 @@ int corosync_timer_init ( } int corosync_timer_add_absolute ( - unsigned long long nanosec_from_epoch, + unsigned long long expire_time, void *data, void (*timer_fn) (void *data), timer_handle *handle) @@ -193,7 +193,7 @@ int corosync_timer_add_absolute ( &timers_timerlist, timer_fn, data, - nanosec_from_epoch, + expire_time, handle); if (unlock) { diff --git a/trunk/exec/timer.h b/trunk/exec/timer.h index 3df9d36..9f13170 100644 --- a/trunk/exec/timer.h +++ b/trunk/exec/timer.h @@ -49,7 +49,7 @@ extern int corosync_timer_add_duration ( corosync_timer_handle *handle); extern int corosync_timer_add_absolute ( - unsigned long long nanoseconds_from_epoch, + unsigned long long expire_time, void *data, void (*timer_fn) (void *data), corosync_timer_handle *handle); diff --git a/trunk/exec/tlist.h b/trunk/exec/tlist.h index 263c8d7..2bef60a 100644 --- a/trunk/exec/tlist.h +++ b/trunk/exec/tlist.h @@ -54,6 +54,13 @@ typedef void * timer_handle; #define TIMER_HANDLE #endif +#define TIMERLIST_MS_IN_SEC 1000ULL +#define TIMERLIST_US_IN_SEC 1000000ULL +#define TIMERLIST_NS_IN_SEC 1000000000ULL +#define TIMERLIST_US_IN_MSEC 1000ULL +#define TIMERLIST_NS_IN_MSEC 1000000ULL +#define TIMERLIST_NS_IN_USEC 1000ULL + struct timerlist { struct list_head timer_head; struct list_head *timer_iter; @@ -61,7 +68,8 @@ struct timerlist { struct timerlist_timer { struct list_head list; - unsigned long long nano_from_epoch; + unsigned long long expire_time; + int is_absolute_timer; void (*timer_fn)(void *data); void *data; timer_handle handle_addr; @@ -78,10 +86,47 @@ static inline unsigned long long timerlist_nano_from_epoch (void) struct timeval time_from_epoch; gettimeofday (&time_from_epoch, 0); - nano_from_epoch = ((time_from_epoch.tv_sec * 1000000000ULL) + (time_from_epoch.tv_usec * 1000ULL)); + nano_from_epoch = ((time_from_epoch.tv_sec * TIMERLIST_NS_IN_SEC) + + (time_from_epoch.tv_usec * TIMERLIST_NS_IN_USEC)); + return (nano_from_epoch); } +#ifdef USE_CLOCK_MONOTONIC +static inline unsigned long long timerlist_nano_monotonic_time (void) +{ + unsigned long long nano_monotonic; + struct timespec ts; + + clock_gettime (CLOCK_MONOTONIC, &ts); + + nano_monotonic = (ts.tv_sec * TIMERLIST_NS_IN_SEC) + (unsigned long long )ts.tv_nsec; + return (nano_monotonic); +} + +static inline unsigned long long timerlist_nano_monotonic_hz (void) { + unsigned long long nano_monotonic_hz; + struct timespec ts; + + clock_getres (CLOCK_MONOTONIC, &ts); + + nano_monotonic_hz = ts.tv_sec + (TIMERLIST_NS_IN_SEC / ((unsigned long long )ts.tv_nsec)); + + return (nano_monotonic_hz); +} +#endif + +/* Return current time. It can be monotonic or time from epoch, depending on system you + are using. */ +static inline unsigned long long timerlist_nano_current_get (void) +{ +#ifdef USE_CLOCK_MONOTONIC + return (timerlist_nano_monotonic_time ()); +#else + return (timerlist_nano_from_epoch ()); +#endif +} + static inline void timerlist_add (struct timerlist *timerlist, struct timerlist_timer *timer) { struct list_head *timer_list = 0; @@ -95,7 +140,7 @@ static inline void timerlist_add (struct timerlist *timerlist, struct timerlist_ timer_from_list = list_entry (timer_list, struct timerlist_timer, list); - if (timer_from_list->nano_from_epoch > timer->nano_from_epoch) { + if (timer_from_list->expire_time > timer->expire_time) { list_add (&timer->list, timer_list->prev); found = 1; break; /* for timer iteration */ @@ -109,7 +154,7 @@ static inline void timerlist_add (struct timerlist *timerlist, struct timerlist_ static inline int timerlist_add_absolute (struct timerlist *timerlist, void (*timer_fn) (void *data), void *data, - unsigned long long nano_from_epoch, + unsigned long long expire_time, timer_handle *handle) { struct timerlist_timer *timer; @@ -120,7 +165,8 @@ static inline int timerlist_add_absolute (struct timerlist *timerlist, return (-1); } - timer->nano_from_epoch = nano_from_epoch; + timer->expire_time = expire_time; + timer->is_absolute_timer = 1; timer->data = data; timer->timer_fn = timer_fn; timer->handle_addr = handle; @@ -144,7 +190,12 @@ static inline int timerlist_add_duration (struct timerlist *timerlist, return (-1); } - timer->nano_from_epoch = timerlist_nano_from_epoch() + nano_duration; +#ifdef USE_CLOCK_MONOTONIC + timer->expire_time = timerlist_nano_monotonic_time() + nano_duration; +#else + timer->expire_time = timerlist_nano_from_epoch() + nano_duration; +#endif + timer->is_absolute_timer = 0; timer->data = data; timer->timer_fn = timer_fn; timer->handle_addr = handle; @@ -177,7 +228,7 @@ static inline unsigned long long timerlist_expire_time (struct timerlist *timerl { struct timerlist_timer *timer = (struct timerlist_timer *)_timer_handle; - return (timer->nano_from_epoch); + return (timer->expire_time); } static inline void timerlist_pre_dispatch (struct timerlist *timerlist, timer_handle _timer_handle) @@ -202,7 +253,7 @@ static inline void timerlist_post_dispatch (struct timerlist *timerlist, timer_h static inline unsigned long long timerlist_msec_duration_to_expire (struct timerlist *timerlist) { struct timerlist_timer *timer_from_list; - volatile unsigned long long nano_from_epoch; + volatile unsigned long long current_time; volatile unsigned long long msec_duration_to_expire; /* @@ -215,16 +266,23 @@ static inline unsigned long long timerlist_msec_duration_to_expire (struct timer timer_from_list = list_entry (timerlist->timer_head.next, struct timerlist_timer, list); - nano_from_epoch = timerlist_nano_from_epoch(); - +#ifdef USE_CLOCK_MONOTONIC + if (timer_from_list->is_absolute_timer) { + current_time = timerlist_nano_from_epoch(); + } else { + current_time = timerlist_nano_monotonic_time(); + } +#else + current_time = timerlist_nano_from_epoch(); +#endif /* * timer at head of list is expired, zero msecs required */ - if (timer_from_list->nano_from_epoch < nano_from_epoch) { + if (timer_from_list->expire_time < current_time) { return (0); } - msec_duration_to_expire = ((timer_from_list->nano_from_epoch - nano_from_epoch) / 1000000ULL) + + msec_duration_to_expire = ((timer_from_list->expire_time - current_time) / TIMERLIST_NS_IN_MSEC) + (1000 / HZ); return (msec_duration_to_expire); } @@ -235,9 +293,16 @@ static inline unsigned long long timerlist_msec_duration_to_expire (struct timer static inline void timerlist_expire (struct timerlist *timerlist) { struct timerlist_timer *timer_from_list; - unsigned long long nano_from_epoch; + unsigned long long current_time_from_epoch; +#ifdef USE_CLOCK_MONOTONIC + unsigned long long current_monotonic_time; +#endif + unsigned long long current_time; - nano_from_epoch = timerlist_nano_from_epoch(); +#ifdef USE_CLOCK_MONOTONIC + current_monotonic_time = timerlist_nano_monotonic_time(); +#endif + current_time_from_epoch = current_time = timerlist_nano_from_epoch(); for (timerlist->timer_iter = timerlist->timer_head.next; timerlist->timer_iter != &timerlist->timer_head;) { @@ -245,7 +310,10 @@ static inline void timerlist_expire (struct timerlist *timerlist) timer_from_list = list_entry (timerlist->timer_iter, struct timerlist_timer, list); - if (timer_from_list->nano_from_epoch < nano_from_epoch) { +#ifdef USE_CLOCK_MONOTONIC + current_time = (timer_from_list->is_absolute_timer ? current_time_from_epoch : current_monotonic_time); +#endif + if (timer_from_list->expire_time < current_time) { timerlist->timer_iter = timerlist->timer_iter->next; timerlist_pre_dispatch (timerlist, timer_from_list); diff --git a/trunk/exec/totemsrp.c b/trunk/exec/totemsrp.c index c5b5235..9ea2d55 100644 --- a/trunk/exec/totemsrp.c +++ b/trunk/exec/totemsrp.c @@ -89,6 +89,7 @@ #include "wthread.h" #include "crypto.h" +#include "tlist.h" #define LOCALHOST_IP inet_addr("127.0.0.1") #define QUEUE_RTR_ITEMS_SIZE_MAX 256 /* allow 256 retransmit items */ @@ -504,7 +505,7 @@ struct totemsrp_instance { unsigned int my_cbl; - struct timeval pause_timestamp; + unsigned long long int pause_timestamp; }; struct message_handlers { @@ -691,14 +692,12 @@ static unsigned int main_msgs_missing (void) static int pause_flush (struct totemsrp_instance *instance) { - struct timeval now; uint64_t now_msec; uint64_t timestamp_msec; int res = 0; - gettimeofday (&now, NULL); - now_msec = ((now.tv_sec * 1000ULL) + (now.tv_usec / 1000ULL)); - timestamp_msec = ((instance->pause_timestamp.tv_sec * 1000ULL) + (instance->pause_timestamp.tv_usec/1000ULL)); + now_msec = (timerlist_nano_current_get () / TIMERLIST_NS_IN_MSEC); + timestamp_msec = instance->pause_timestamp / TIMERLIST_NS_IN_MSEC; if ((now_msec - timestamp_msec) > (instance->totem_config->token_timeout / 2)) { log_printf (instance->totemsrp_log_level_notice, @@ -845,7 +844,7 @@ int totemsrp_initialize ( instance->totemsrp_confchg_fn = confchg_fn; instance->use_heartbeat = 1; - gettimeofday (&instance->pause_timestamp, NULL); + instance->pause_timestamp = timerlist_nano_current_get (); if ( totem_config->heartbeat_failures_allowed == 0 ) { log_printf (instance->totemsrp_log_level_debug, @@ -1524,7 +1523,7 @@ static void timer_function_pause_timeout (void *data) { struct totemsrp_instance *instance = data; - gettimeofday (&instance->pause_timestamp, NULL); + instance->pause_timestamp = timerlist_nano_current_get (); reset_pause_timeout (instance); } @@ -3288,7 +3287,7 @@ static void fcc_token_update ( * Message Handlers */ -struct timeval tv_old; +unsigned long long int tv_old; /* * message handler called when TOKEN message type received */ @@ -3308,17 +3307,15 @@ static int message_handler_orf_token ( unsigned int last_aru; #ifdef GIVEINFO - struct timeval tv_current; - struct timeval tv_diff; + unsigned long long tv_current; + unsigned long long tv_diff; - gettimeofday (&tv_current, NULL); - timersub (&tv_current, &tv_old, &tv_diff); - memcpy (&tv_old, &tv_current, sizeof (struct timeval)); + tv_current = timerlist_nano_current_get (); + tv_diff = tv_current - tv_old; + tv_old = tv_current; log_printf (instance->totemsrp_log_level_debug, - "Time since last token %0.4f ms\n", - (((float)tv_diff.tv_sec) * 1000) + ((float)tv_diff.tv_usec) - / 1000.0); + "Time since last token %0.4f ms\n", ((float)tv_diff) / 1000000.0); #endif #ifdef TEST_DROP_ORF_TOKEN_PERCENTAGE @@ -3555,12 +3552,12 @@ static int message_handler_orf_token ( token_send (instance, token, forward_token); #ifdef GIVEINFO - gettimeofday (&tv_current, NULL); - timersub (&tv_current, &tv_old, &tv_diff); - memcpy (&tv_old, &tv_current, sizeof (struct timeval)); + tv_current = timerlist_nano_current_get (); + tv_diff = tv_current - tv_old; + tv_old = tv_current; log_printf (instance->totemsrp_log_level_debug, "I held %0.4f ms\n", - ((float)tv_diff.tv_usec) / 1000.0); + ((float)tv_diff) / 1000000.0); #endif if (instance->memb_state == MEMB_STATE_OPERATIONAL) { messages_deliver_to_app (instance, 0, diff --git a/trunk/services/pload.c b/trunk/services/pload.c index cf82bbc..19bc79a 100644 --- a/trunk/services/pload.c +++ b/trunk/services/pload.c @@ -62,6 +62,8 @@ #include <corosync/list.h> #include <corosync/engine/logsys.h> +#include "../exec/tlist.h" + LOGSYS_DECLARE_SUBSYS ("PLOAD"); enum pload_exec_message_req_types { @@ -344,9 +346,9 @@ static void message_handler_req_exec_pload_start ( } while (0) #endif -struct timeval tv1; -struct timeval tv2; -struct timeval tv_elapsed; +unsigned long long int tv1; +unsigned long long int tv2; +unsigned long long int tv_elapsed; int last_msg_no = 0; static void message_handler_req_exec_pload_mcast ( @@ -357,19 +359,19 @@ static void message_handler_req_exec_pload_mcast ( last_msg_no = pload_mcast->msg_code; if (msgs_delivered == 0) { - gettimeofday (&tv1, NULL); + tv1 = timerlist_nano_current_get (); } msgs_delivered += 1; if (msgs_delivered == msgs_wanted) { - gettimeofday (&tv2, NULL); - timersub (&tv2, &tv1, &tv_elapsed); + tv2 = timerlist_nano_current_get (); + tv_elapsed = tv2 - tv1; printf ("%5d Writes ", msgs_delivered); printf ("%5d bytes per write ", msg_size); printf ("%7.3f Seconds runtime ", - (tv_elapsed.tv_sec + (tv_elapsed.tv_usec / 1000000.0))); + (tv_elapsed / 1000000000.0)); printf ("%9.3f TP/s ", - ((float)msgs_delivered) / (tv_elapsed.tv_sec + (tv_elapsed.tv_usec / 1000000.0))); + ((float)msgs_delivered) / (tv_elapsed / 1000000000.0)); printf ("%7.3f MB/s.\n", - ((float)msgs_delivered) * ((float)msg_size) / ((tv_elapsed.tv_sec + (tv_elapsed.tv_usec / 1000000.0)) * 1000000.0)); + ((float)msgs_delivered) * ((float)msg_size) / ((tv_elapsed / 1000000000.0)) * 1000000.0); } } diff --git a/trunk/services/votequorum.c b/trunk/services/votequorum.c index 79855bf..273d50e 100644 --- a/trunk/services/votequorum.c +++ b/trunk/services/votequorum.c @@ -68,6 +68,8 @@ #include <corosync/ipc_votequorum.h> #include <corosync/list.h> +#include "../exec/tlist.h" + #define VOTEQUORUM_MAJOR_VERSION 7 #define VOTEQUORUM_MINOR_VERSION 0 #define VOTEQUORUM_PATCH_VERSION 0 @@ -107,7 +109,7 @@ struct cluster_node { nodestate_t state; - struct timeval last_hello; /* Only used for quorum devices */ + unsigned long long int last_hello; /* Only used for quorum devices */ struct list_head list; }; @@ -1307,13 +1309,13 @@ static void message_handler_req_lib_votequorum_leaving (void *conn, const void * static void quorum_device_timer_fn(void *arg) { - struct timeval now; - ENTER(); if (!quorum_device || quorum_device->state == NODESTATE_DEAD) return; - gettimeofday(&now, NULL); - if (quorum_device->last_hello.tv_sec + quorumdev_poll/1000 < now.tv_sec) { + + if ( (quorum_device->last_hello / TIMERLIST_NS_IN_SEC) + quorumdev_poll/1000 < + (timerlist_nano_current_get () / TIMERLIST_NS_IN_SEC)) { + quorum_device->state = NODESTATE_DEAD; log_printf(LOGSYS_LEVEL_INFO, "lost contact with quorum device\n"); recalculate_quorum(0, 0); @@ -1395,7 +1397,7 @@ static void message_handler_req_lib_votequorum_qdisk_poll (void *conn, if (quorum_device) { if (req_lib_votequorum_qdisk_poll->state) { - gettimeofday(&quorum_device->last_hello, NULL); + quorum_device->last_hello = timerlist_nano_current_get (); if (quorum_device->state == NODESTATE_DEAD) { quorum_device->state = NODESTATE_MEMBER; recalculate_quorum(0, 0);
_______________________________________________ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais