Hello,

    The attached patch fixes a minor bug in the event loop. The fixed
code may work a tiny bit faster, especially when Squid is not heavily
loaded.

The current code uses "infinite" precision when deciding whether the
next timed event is ready but millisecond (1e-3) precision when deciding
how long to wait before the next event will be ready. This
inconsistency results in the EventScheduler engine telling the main loop
that it has 0 milliseconds to poll pending I/O, but, when asked again
(after the I/O is quickly polled), the EventScheduler engine often does
not schedule the promised event and instead tells the main loop to poll
for another 0 milliseconds. This cycling may repeat many times in a row
(until enough time is wasted for the next event to become ready at the
higher precision).

The proposed change uses the same (millisecond) precision for both
decisions. After this change, the EventScheduler engine schedules the
next event that has 0 milliseconds to wait instead of telling the main
loop to poll for 0 milliseconds.

Here is a bad-case before/after illustration:

* Before the change:

  +0.000 seconds: no events ready, poll for 0ms
  +0.000 seconds: poll returns
  +0.000 seconds: no events ready, poll for 0ms
  +0.000 seconds: poll returns
  +0.000 seconds: no events ready, poll for 0ms
  +0.000 seconds: poll returns
  +0.000 seconds: no events ready, poll for 0ms
  +0.000 seconds: poll returns
  ...
  +0.000 seconds: event(s) ready, scheduled, poll for 1000ms
  +1.000 seconds: poll returns

* After the change:

  +0.000 seconds: event(s) ready, scheduled, poll for 1000ms
  +1.000 seconds: poll returns

In lab tests, I have seen pointless iterations repeated 3 times (with
ALL,9 logging), but I did not look hard for longer sequences, and I
suspect the real number is often higher.


The only potentially bad side effect of this change I can think of is
that some events will be scheduled and fired a few microseconds before
they are due (instead of a few microseconds after they are due). I hope
that will not break any existing Squid code. If it does, we can adjust
EventScheduler to round up and, hence, fire all events at or after their
deadline.


Context: The next libecap release will support asynchronous (e.g.,
threaded) eCAP adapters. To support that frequently-requested feature in
Squid, we are adding another AsyncEngine to the existing four. I was
surprised to see the new engine called many times within a second on an
idle Squid. The investigation uncovered the bug addressed by the
attached patch.


HTH,

Alex.
Use millisecond precision when deciding whether a timed event is ready.

The current code uses "infinite" precision when deciding whether the next
timed event is ready but millisecond (1e-3) precision when deciding how long
to wait before the next event will be ready. This inconsistency results in
the EventScheduler engine telling the main loop that it has 0 milliseconds to
poll pending I/O, but, when asked again (after the I/O is quickly polled),
the EventScheduler engine often does not schedule the promised event and
instead tells the main loop to poll for another 0 milliseconds. This cycling
may repeat many times in a row (until enough time is wasted for the next
event to become ready at the higher precision).

The fixed code uses the same (millisecond) precision for both decisions.
After this change, the EventScheduler engine schedules the next event that
has 0 milliseconds to wait instead of telling the main loop to poll for 0
milliseconds.

After this change, some events will be scheduled and fired a few microseconds
before they are due (instead of a few microseconds after they are due). If
this breaks any existing Squid code, we should adjust EventScheduler to round
up and, hence, fire all events at or after their deadline.

=== modified file 'src/event.cc'
--- src/event.cc	2012-09-01 14:38:36 +0000
+++ src/event.cc	2013-09-12 15:50:46 +0000
@@ -203,93 +203,84 @@
         if (arg)
             return;
         /*
          * DPW 2007-04-12
          * Since this method may now delete multiple events (when
          * arg is NULL) it no longer returns after a deletion and
          * we have a potential NULL pointer problem.  If we just
          * deleted the last event in the list then *E is now equal
          * to NULL.  We need to break here or else we'll get a NULL
          * pointer dereference in the last clause of the for loop.
          */
         if (NULL == *E)
             break;
     }
 
     if (arg)
         debug_trap("eventDelete: event not found");
 }
 
 int
-EventScheduler::checkDelay()
+EventScheduler::timeRemaining() const
 {
-    if (!tasks)
-        return EVENT_IDLE;
-
-    int result = (int) ((tasks->when - current_dtime) * 1000);
-
-    if (result < 0)
-        return 0;
-
-    return result;
+    if (tasks) {
+       const int diff = static_cast<int>((tasks->when - current_dtime) * 1000);
+       return max(diff, 0);
+    }
+    return EVENT_IDLE;
 }
 
 int
 EventScheduler::checkEvents(int timeout)
 {
-
-    ev_entry *event = NULL;
-
-    if (NULL == tasks)
-        return checkDelay();
-
-    if (tasks->when > current_dtime)
-        return checkDelay();
+    int result = timeRemaining();
+    if (result != 0)
+        return result;
 
     PROF_start(eventRun);
 
-    debugs(41, 5, HERE << "checkEvents");
-
-    while ((event = tasks)) {
-        if (event->when > current_dtime)
-            break;
+    do {
+        ev_entry *event = tasks;
+        assert(event);
 
         /* XXX assumes event->name is static memory! */
         AsyncCall::Pointer call = asyncCall(41,5, event->name,
                                             EventDialer(event->func, event->arg, event->cbdata));
         ScheduleCallHere(call);
 
         last_event_ran = event->name; // XXX: move this to AsyncCallQueue
         const bool heavy = event->weight &&
                            (!event->cbdata || cbdataReferenceValid(event->arg));
 
         tasks = event->next;
         delete event;
 
+        result = timeRemaining();
+
         // XXX: We may be called again during the same event loop iteration.
         // Is there a point in breaking now?
         if (heavy)
             break; // do not dequeue events following a heavy event
-    }
+    } while (result == 0);
 
     PROF_stop(eventRun);
-    return checkDelay();
+    return result;
 }
 
 void
 EventScheduler::clean()
 {
     while (ev_entry * event = tasks) {
         tasks = event->next;
         delete event;
     }
 
     tasks = NULL;
 }
 
 void
 EventScheduler::dump(StoreEntry * sentry)
 {
 
     ev_entry *e = tasks;
 
     if (last_event_ran)

=== modified file 'src/event.h'
--- src/event.h	2013-05-04 11:50:26 +0000
+++ src/event.h	2013-09-12 15:35:33 +0000
@@ -63,37 +63,37 @@
 
     int weight;
     bool cbdata;
 
     ev_entry *next;
 };
 
 MEMPROXY_CLASS_INLINE(ev_entry);
 
 // manages time-based events
 class EventScheduler : public AsyncEngine
 {
 
 public:
     EventScheduler();
     ~EventScheduler();
     /* cancel a scheduled but not dispatched event */
     void cancel(EVH * func, void * arg);
     /* clean up the used memory in the scheduler */
     void clean();
-    /* how long until the next event ? */
-    int checkDelay();
+    /* either EVENT_IDLE or milliseconds remaining until the next event */
+    int timeRemaining() const;
     /* cache manager output for the event queue */
     void dump(StoreEntry *);
     /* find a scheduled event */
     bool find(EVH * func, void * arg);
     /* schedule a callback function to run in when seconds */
     void schedule(const char *name, EVH * func, void *arg, double when, int weight, bool cbdata=true);
     int checkEvents(int timeout);
     static EventScheduler *GetInstance();
 
 private:
     static EventScheduler _instance;
     ev_entry * tasks;
 };
 
 #endif /* SQUID_EVENT_H */
