On 05/03/15 16:34, Martin Lucina wrote:
for (i = 0; i < 1000000; ++i)
{
    usecs_end = get_usecs();
    // sched_yield();
    if (usecs_end - usecs_start > 200)
        break;
}
[snip]
Observed behaviour:
native: Loop completes before limit, usecs_end - usecs_start == ~200.
rr-xen: Loop does not complete before limit, usecs_end - usecs_start is
zero, i.e. the time from gettimeofday is NEVER updated during the loop.
rr-baremetal: As for -xen.
rr-posix: Loop completes before limit, usecs_end - usecs_start is 10000
(100 Hz, matches the internal rump timecounter frequency).
If I uncomment the sched_yield() call, then the behaviour for -baremetal
and -xen matches -posix.
Looking at the code paths, rump_schedule() and rump_unschedule() are called
around each syscall in rump_syscall() so this should cause the clock softint to
run *eventually* thus updating the time values inside the rump kernel. However,
in practice that never happens.
What am I missing here? Why is the extra call to sched_yield() necessary?
rump_schedule() schedules a *rump kernel cpu*, not a host thread.
sched_yield() yields the *host thread*.
The problem is that the code you posted is essentially a busy loop.
Since the current timekeeping driver depends on the clock interrupt
running, and since all interrupts in rump kernels are threads, the clock
interrupt never runs because the *host thread* the interrupt uses is not
scheduled in a busy loop.
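To make the point concrete, here is a minimal simulation of the situation, not actual rump kernel code. All names (sim_get_usecs, sim_yield, clock_intr_thread, busy_loop) are hypothetical, and the "interrupt thread" is modelled as a plain function that runs only when the application yields. The 100 Hz tick is taken from the rr-posix observation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TICK_USECS 10000 /* 100 Hz clock, as in the rr-posix observation */

/* Time as seen inside the simulated kernel. Only the clock interrupt
 * updates it; reading it never advances it. */
static uint64_t kernel_time_usecs;

/* The clock interrupt is modelled as a thread body: it runs only when
 * something gives it the host thread. */
static void clock_intr_thread(void)
{
    kernel_time_usecs += TICK_USECS;
}

/* Hypothetical stand-in for sched_yield(): lets the interrupt thread run. */
static void sim_yield(void)
{
    clock_intr_thread();
}

/* Hypothetical stand-in for gettimeofday(): returns the cached value. */
static uint64_t sim_get_usecs(void)
{
    return kernel_time_usecs;
}

/* The loop from the report. Returns the elapsed microseconds and stores
 * the number of completed iterations in *iterations. */
static uint64_t busy_loop(bool yield_each_iteration, int *iterations)
{
    assert(iterations != NULL);

    uint64_t start = sim_get_usecs();
    uint64_t end = start;
    int i;

    for (i = 0; i < 1000000; ++i) {
        end = sim_get_usecs();
        if (yield_each_iteration)
            sim_yield();
        if (end - start > 200)
            break;
    }
    *iterations = i;
    return end - start;
}
```

Calling busy_loop(false, &n) runs all 1000000 iterations and reports zero elapsed time, like -xen and -baremetal. Calling busy_loop(true, &n) exits after the first tick with 10000 usecs elapsed, like -posix.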
As discussed with Antti yesterday on IRC, a proper solution for
*accurate* timekeeping is a better timecounter driver for -xen and
-baremetal. I am not disputing that, but before I start developing one I'd
like to fully understand why code like this is not working *at all* with
the current arrangement.
You should understand what is at play here from the above.
Now, there are actually two issues to consider:
1) accurate timekeeping with better than clock-interrupt resolution
2) detecting whether interrupts are pending and, if so, yielding the
*host thread* in rump_schedule()/rump_unschedule()
"1" is rather simple. When I wrote this incarnation of the clock, oh, 5
years ago, accurate timekeeping didn't matter because the motivations
for rump kernels were different -- we added the gettimeofday() syscall
only in the past year. High-accuracy timekeeping would "accidentally"
avoid the issue you are seeing, because the timekeeper would return an
"out-of-band" value for the clock which could change with every
invocation, not just when a clock interrupt is delivered.
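As a rough sketch of what "out-of-band" means here, assume a free-running counter that can be read directly on every call. This is not a real timecounter driver. read_hw_counter, COUNTER_HZ and the 50-cycles-per-read step are made-up stand-ins, chosen so the example is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COUNTER_HZ 1000000000ULL /* hypothetical 1 GHz free-running counter */

static uint64_t fake_counter;

/* Hypothetical stand-in for reading a hardware counter. Each read advances
 * the counter by 50 cycles to model time passing between calls. */
static uint64_t read_hw_counter(void)
{
    fake_counter += 50;
    return fake_counter;
}

/* Time is derived from the counter on every invocation, so it advances
 * even if no clock interrupt has been delivered. */
static uint64_t hires_get_usecs(void)
{
    return read_hw_counter() / (COUNTER_HZ / 1000000ULL);
}

/* The same busy loop as in the report, without any yield. Returns the
 * elapsed microseconds and stores the iteration count in *iterations. */
static uint64_t busy_loop_hires(int *iterations)
{
    assert(iterations != NULL);

    uint64_t start = hires_get_usecs();
    uint64_t end = start;
    int i;

    for (i = 0; i < 1000000; ++i) {
        end = hires_get_usecs();
        if (end - start > 200)
            break;
    }
    *iterations = i;
    return end - start;
}
```

With this kind of clock the loop terminates before the iteration limit without ever yielding, which is why a better timecounter would hide the scheduling problem rather than fix it.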
"2" is more controversial. I can see arguments both ways. I'm not
going to discuss them in this thread. If someone wants to, start
another thread.