On 05/03/15 16:34, Martin Lucina wrote:
     for (i= 0; i < 1000000; ++i)
     {
         usecs_end = get_usecs();
         // sched_yield();
         if (usecs_end - usecs_start > 200)
             break;
     }

[snip]

Observed behaviour:

native: Loop completes before limit, usecs_end - usecs_start == ~200.

rr-xen: Loop does not complete before limit, usecs_end - usecs_start is
zero, i.e. the time from gettimeofday is NEVER updated during the loop.

rr-baremetal: As for -xen.

rr-posix: Loop completes before limit, usecs_end - usecs_start is 10000
(100 Hz, matches the internal rump timecounter frequency).

If I uncomment the sched_yield() call, then the behaviour for -baremetal
and -xen matches -posix.

Looking at the code paths, rump_schedule() and rump_unschedule() are called
around each syscall in rump_syscall() so this should cause the clock softint to 
run *eventually* thus updating the time values inside the rump kernel. However, 
in practice that never happens.

What am I missing here? Why is the extra call to sched_yield() necessary?

rump_schedule() schedules a *rump kernel cpu*, not a host thread. sched_yield() yields the *host thread*.

The problem is that the code you posted is essentially a busy loop. The current timekeeping driver depends on the clock interrupt running, and all interrupts in rump kernels are threads. The clock interrupt therefore never runs, because the *host thread* it runs on is never scheduled while the busy loop is spinning.
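To make that concrete, here is a toy model of the situation. It is only a sketch: every name in it (toy_clock, toy_clock_intr, toy_yield and so on) is hypothetical and none of it is actual rump kernel code. It assumes that the time returned to the caller is a cached value which only the clock interrupt thread refreshes, and that the interrupt thread only gets to run when the application thread yields. It also simplifies by refreshing the time on every yield rather than at the 100 Hz tick rate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; these are hypothetical names, not rump kernel APIs. */
struct toy_clock {
    uint64_t hw_usecs;     /* "real" time, always advancing */
    uint64_t cached_usecs; /* time as last recorded by the clock interrupt */
};

/* Stand-in for the body of the clock interrupt thread. */
static void toy_clock_intr(struct toy_clock *c)
{
    c->cached_usecs = c->hw_usecs;
}

/* Stand-in for gettimeofday(): returns only the cached value. */
static uint64_t toy_get_usecs(const struct toy_clock *c)
{
    return c->cached_usecs;
}

/* Stand-in for sched_yield(): lets the interrupt thread run once. */
static void toy_yield(struct toy_clock *c)
{
    toy_clock_intr(c);
}

/*
 * The loop from the original post. Each iteration is assumed to cost
 * 1 usec of real time. Returns the iteration at which the loop exited,
 * or `limit` if the 200 usec condition was never met.
 */
static int toy_busy_loop(struct toy_clock *c, int limit, bool yield)
{
    uint64_t start = toy_get_usecs(c);
    int i;

    for (i = 0; i < limit; ++i) {
        c->hw_usecs += 1;
        if (yield)
            toy_yield(c);
        if (toy_get_usecs(c) - start > 200)
            break;
    }
    return i;
}
```

Calling toy_busy_loop() with yield set to false runs to the limit, because the cached time never changes. With yield set to true the loop exits as soon as the cached time has moved past 200 usecs.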

As discussed with Antti yesterday on IRC, a proper solution for
*accurate* timekeeping is a better timecounter driver for -xen and
-baremetal. I am not disputing that, but before I start developing one I'd
like to fully understand why code like this is not working *at all* with
the current arrangement.

You should understand what is at play here from the above.

Now, there are actually two issues to consider:

1) accurate timekeeping with finer resolution than the clock interrupt
2) detecting if interrupts are pending, and yielding the *host thread* in rump_schedule()/rump_unschedule() if so

"1" is rather simple. When I wrote this incarnation of the clock, oh, 5 years ago, accurate timekeeping didn't matter because the motivations for rump kernels were different -- we added the gettimeofday() syscall only in the past year. High-accuracy timekeeping would "accidentally" avoid the issue you are seeing, because the timekeeper would return an "out-of-band" value for the clock which could change with every invocation, not just when a clock interrupt is delivered.
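As a rough illustration of the "out-of-band" idea, the sketch below contrasts a clock that only reports the time recorded at the last clock interrupt with one that adds the delta from a free-running counter on every read. It is not the rump kernel timecounter interface. The struct, the function names and the counter frequency are all hypothetical, and the arithmetic assumes the counter delta between interrupts is small enough not to overflow 64 bits.

```c
#include <stdint.h>

/* Hypothetical state, captured each time the clock interrupt runs. */
struct toy_timecounter {
    uint64_t base_usecs; /* time recorded at the last clock interrupt */
    uint64_t base_count; /* free-running counter value at that moment */
    uint64_t counter_hz; /* assumed counter frequency in Hz */
};

/* Tick-only clock: the value changes only when an interrupt is delivered. */
static uint64_t toy_time_tick_only(const struct toy_timecounter *tc)
{
    return tc->base_usecs;
}

/*
 * Out-of-band clock: the caller passes in a fresh counter reading, so the
 * result can change on every invocation even if no interrupt has run.
 */
static uint64_t toy_time_out_of_band(const struct toy_timecounter *tc,
                                     uint64_t now_count)
{
    uint64_t delta = now_count - tc->base_count;

    return tc->base_usecs + delta * 1000000ULL / tc->counter_hz;
}
```

With this arrangement a busy loop still sees time advance, because each read consults the counter directly instead of waiting for the interrupt thread to be scheduled.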

"2" is more controversial. I can see arguments both ways. I'm not going to discuss them in this thread. If someone wants to, start another thread.
