On 10/26/2010 01:30 AM, Jeremy Fitzhardinge wrote:
Unfortunately this is breaking Xen save/restore: if you restore on a
host which was booted more recently than the save host, causing the
system time to be smaller. The effect is that the domain's time leaps
forward to a fixed point, and stays
On Tue, 2010-10-26 at 10:14 +0200, Avi Kivity wrote:
On 10/26/2010 01:30 AM, Jeremy Fitzhardinge wrote:
Unfortunately this is breaking Xen save/restore: if you restore on a
host which was booted more recently than the save host, causing the
system time to be smaller. The effect is that the
On 10/26/2010 01:14 AM, Avi Kivity wrote:
On 10/26/2010 01:30 AM, Jeremy Fitzhardinge wrote:
Unfortunately this is breaking Xen save/restore: if you restore on a
host which was booted more recently than the save host, causing the
system time to be smaller. The effect is that the domain's
On 04/15/2010 11:37 AM, Glauber Costa wrote:
In recent stress tests, it was found that pvclock-based systems
could seriously warp in smp systems. Using ingo's time-warp-test.c,
I could trigger a scenario as bad as 1.5mi warps a minute in some systems.
(to be fair, it wasn't that bad in most
On 04/24/2010 12:31 AM, Zachary Amsden wrote:
On 04/22/2010 11:34 PM, Avi Kivity wrote:
On 04/23/2010 04:44 AM, Zachary Amsden wrote:
Or apply this patch.
time-warp.patch
diff -rup a/time-warp-test.c b/time-warp-test.c
--- a/time-warp-test.c 2010-04-15 16:30:13.955981607 -1000
+++ b/time-warp-test.c 2010-04-15 16:35:37.777982377 -1000
On 04/24/2010 12:41 AM, Zachary Amsden wrote:
rsm is not technically privileged... but not quite usable from
usermode ;)
rsm under hardware virtualization makes my head hurt
Either one independently is sufficient for me.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
On 04/23/2010 04:44 AM, Zachary Amsden wrote:
Or apply this patch.
time-warp.patch
diff -rup a/time-warp-test.c b/time-warp-test.c
--- a/time-warp-test.c 2010-04-15 16:30:13.955981607 -1000
+++ b/time-warp-test.c 2010-04-15 16:35:37.777982377 -1000
@@ -91,7 +91,7 @@ static inline unsigned
On 04/23/2010 02:34 AM, Avi Kivity wrote:
diff -rup a/time-warp-test.c b/time-warp-test.c
--- a/time-warp-test.c 2010-04-15 16:30:13.955981607 -1000
+++ b/time-warp-test.c 2010-04-15 16:35:37.777982377 -1000
@@ -91,7 +91,7 @@ static inline unsigned long long __rdtsc
{
On 04/23/2010 10:22 PM, Jeremy Fitzhardinge wrote:
On 04/23/2010 02:34 AM, Avi Kivity wrote:
diff -rup a/time-warp-test.c b/time-warp-test.c
--- a/time-warp-test.c 2010-04-15 16:30:13.955981607 -1000
+++ b/time-warp-test.c 2010-04-15 16:35:37.777982377 -1000
@@ -91,7 +91,7 @@ static
On 04/22/2010 11:34 PM, Avi Kivity wrote:
On 04/23/2010 04:44 AM, Zachary Amsden wrote:
Or apply this patch.
time-warp.patch
diff -rup a/time-warp-test.c b/time-warp-test.c
--- a/time-warp-test.c 2010-04-15 16:30:13.955981607 -1000
+++ b/time-warp-test.c 2010-04-15 16:35:37.777982377 -1000
On 04/23/2010 02:31 PM, Zachary Amsden wrote:
Does lfence / mfence actually serialize? I thought there was some
great confusion about that not being the case on all AMD processors,
and possibly not at all on Intel.
A trap, however, is a great way to serialize.
I think, there is no
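The serialization question above recurs throughout this thread; when the lfence/mfence guarantees are in doubt, one conventional fallback is to fence rdtsc with cpuid, which is architecturally serializing on both Intel and AMD. A userspace sketch of that idea (x86-only, GCC/Clang inline asm; illustrative, not code from the thread):

```c
#include <stdint.h>

/* cpuid is architecturally serializing on both Intel and AMD, unlike
 * lfence/mfence whose ordering of rdtsc has varied by vendor, so
 * issuing it first keeps rdtsc from executing early. */
static inline uint64_t rdtsc_serialized(void)
{
    uint32_t lo, hi;
    uint32_t eax = 0, ebx, ecx, edx;

    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     :: "memory");
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```

Two back-to-back calls on the same CPU should never appear to go backwards, which is exactly the property the time-warp test probes.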
On 04/23/2010 11:35 AM, Jeremy Fitzhardinge wrote:
On 04/23/2010 02:31 PM, Zachary Amsden wrote:
Does lfence / mfence actually serialize? I thought there was some
great confusion about that not being the case on all AMD processors,
and possibly not at all on Intel.
A trap, however is a
On Tue, Apr 20, 2010 at 12:42:17PM -0700, Jeremy Fitzhardinge wrote:
On 04/20/2010 11:54 AM, Avi Kivity wrote:
On 04/20/2010 09:23 PM, Jeremy Fitzhardinge wrote:
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you want this code in pvclock.c, or shall we keep it kvmclock
specific?
On 04/22/2010 03:11 AM, Glauber Costa wrote:
On Tue, Apr 20, 2010 at 12:42:17PM -0700, Jeremy Fitzhardinge wrote:
On 04/20/2010 11:54 AM, Avi Kivity wrote:
On 04/20/2010 09:23 PM, Jeremy Fitzhardinge wrote:
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you
On 04/21/2010 03:01 AM, Zachary Amsden wrote:
on this machine Glauber mentioned, or even on a multi-core Core 2 Duo),
but the delta calculation is very hard (if not impossible) to get
right.
The timewarps i've seen were in the 0-200ns range, and very rare (once
every 10 minutes or so).
On 04/21/2010 03:05 AM, Zachary Amsden wrote:
On 04/19/2010 11:39 PM, Avi Kivity wrote:
On 04/19/2010 09:35 PM, Zachary Amsden wrote:
Sockets and boards too? (IOW, how reliable is TSC_RELIABLE)?
Not sure, IIRC we clear that when the TSC sync test fails, eg when we
mark the tsc clocksource
On 04/19/2010 07:18 PM, Jeremy Fitzhardinge wrote:
On 04/19/2010 07:46 AM, Peter Zijlstra wrote:
What avi says! :-)
On a 32bit machine a 64bit read is two 32bit reads, so
last = last_value;
becomes:
last.high = last_value.high;
last.low = last_value.low;
(or the reverse of
On 04/20/2010 04:57 AM, Marcelo Tosatti wrote:
Marcelo can probably confirm it, but he has a nehalem with an apparently
very good tsc source. Even this machine warps.
It stops warping if we only write pvclock data structure once and forget it,
(which only updated tsc_timestamp once),
On 04/19/2010 09:35 PM, Zachary Amsden wrote:
Sockets and boards too? (IOW, how reliable is TSC_RELIABLE)?
Not sure, IIRC we clear that when the TSC sync test fails, eg when we
mark the tsc clocksource unusable.
Worrying. By the time we detect this the guest may already have
gotten
On Tue, Apr 20, 2010 at 12:35:19PM +0300, Avi Kivity wrote:
On 04/20/2010 04:57 AM, Marcelo Tosatti wrote:
Marcelo can probably confirm it, but he has a nehalem with an apparently
very good tsc source. Even this machine warps.
It stops warping if we only write pvclock data structure once
On 04/20/2010 03:59 PM, Glauber Costa wrote:
Might be due to NMIs or SMIs interrupting the rdtsc(); ktime_get()
operation which establishes the timeline. We could limit it by
having a loop doing rdtsc(); ktime_get(); rdtsc(); and checking for
some bound, but it isn't worthwhile (and will
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you want this code in pvclock.c, or shall we keep it kvmclock
specific?
I think its a pvclock-level fix. I'd been hoping to avoid having
something like this, but I think its ultimately necessary.
J
On 04/20/2010 09:23 PM, Jeremy Fitzhardinge wrote:
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you want this code in pvclock.c, or shall we keep it kvmclock
specific?
I think its a pvclock-level fix. I'd been hoping to avoid having
something like this, but I think its
On 04/20/2010 11:54 AM, Avi Kivity wrote:
On 04/20/2010 09:23 PM, Jeremy Fitzhardinge wrote:
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you want this code in pvclock.c, or shall we keep it kvmclock
specific?
I think its a pvclock-level fix. I'd been hoping to avoid having
On 04/19/2010 11:35 PM, Avi Kivity wrote:
On 04/20/2010 04:57 AM, Marcelo Tosatti wrote:
Marcelo can probably confirm it, but he has a nehalem with an
apparently
very good tsc source. Even this machine warps.
It stops warping if we only write pvclock data structure once and
forget it,
On 04/20/2010 09:42 AM, Jeremy Fitzhardinge wrote:
On 04/20/2010 11:54 AM, Avi Kivity wrote:
On 04/20/2010 09:23 PM, Jeremy Fitzhardinge wrote:
On 04/20/2010 02:31 AM, Avi Kivity wrote:
btw, do you want this code in pvclock.c, or shall we keep it kvmclock
specific?
On Fri, 2010-04-16 at 13:36 -0700, Jeremy Fitzhardinge wrote:
+ do {
+ last = last_value;
Does this need a barrier() to prevent the compiler from re-reading
last_value for the subsequent lines? Otherwise (ret last) and
return last could execute with different values
On Sat, 2010-04-17 at 21:49 +0300, Avi Kivity wrote:
On 04/17/2010 09:48 PM, Avi Kivity wrote:
+static u64 last_value = 0;
Needs to be atomic64_t.
+
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
{
struct pvclock_shadow_time shadow;
On Sat, 2010-04-17 at 21:48 +0300, Avi Kivity wrote:
After this patch is applied, I don't see a single warp in time during 5 days
of execution, in any of the machines I saw them before.
Please define a cpuid bit that makes this optional. When we eventually
enable it in the
On 04/19/2010 01:43 PM, Peter Zijlstra wrote:
+
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
{
struct pvclock_shadow_time shadow;
unsigned version;
cycle_t ret, offset;
+	u64 last;
+	do {
+		last = last_value;
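The hunk quoted above is the heart of the fix: clamp each pvclock read to the largest value any CPU has returned so far. A standalone C11 sketch of that loop, with atomic_compare_exchange_weak standing in for the kernel's cmpxchg64 (names are illustrative, not the kernel's):

```c
#include <stdint.h>
#include <stdatomic.h>

/* Largest time value any CPU has returned so far. */
static _Atomic uint64_t last_value;

/* Clamp a freshly computed time `ret` so reads never go backwards
 * across CPUs, mirroring the loop in the patch above. */
uint64_t pvclock_monotonic(uint64_t ret)
{
    uint64_t last = atomic_load(&last_value);

    do {
        if (ret < last)
            return last;    /* another CPU already returned a later time */
        /* On failure, the compare-exchange reloads the current
         * last_value into `last`, so no separate re-read is needed. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, ret));

    return ret;
}
```

Note the single load before the loop: as suggested later in the thread, the failure path of the compare-exchange already yields the current last_value, so the fast path costs only one LOCK'ed operation.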
On 04/19/2010 01:46 PM, Peter Zijlstra wrote:
On Sat, 2010-04-17 at 21:48 +0300, Avi Kivity wrote:
After this patch is applied, I don't see a single warp in time during 5 days
of execution, in any of the machines I saw them before.
Please define a cpuid bit that makes this
On Mon, 2010-04-19 at 12:46 +0200, Peter Zijlstra wrote:
On Sat, 2010-04-17 at 21:48 +0300, Avi Kivity wrote:
After this patch is applied, I don't see a single warp in time during 5
days
of execution, in any of the machines I saw them before.
Please define a cpuid bit
On 04/19/2010 01:39 PM, Peter Zijlstra wrote:
On Fri, 2010-04-16 at 13:36 -0700, Jeremy Fitzhardinge wrote:
+ do {
+ last = last_value;
Does this need a barrier() to prevent the compiler from re-reading
last_value for the subsequent lines? Otherwise (ret last)
On Mon, 2010-04-19 at 13:49 +0300, Avi Kivity wrote:
On 04/19/2010 01:46 PM, Peter Zijlstra wrote:
On Sat, 2010-04-17 at 21:48 +0300, Avi Kivity wrote:
After this patch is applied, I don't see a single warp in time during 5
days
of execution, in any of the machines I saw them
On 04/19/2010 01:49 PM, Peter Zijlstra wrote:
Right, so on x86 we have:
X86_FEATURE_CONSTANT_TSC, which only states that TSC is frequency
independent, not that it doesn't stop in C states and similar fun stuff.
X86_FEATURE_TSC_RELIABLE, which IIRC should indicate the TSC is constant
and
On 04/19/2010 01:51 PM, Peter Zijlstra wrote:
Right, so on x86 we have:
X86_FEATURE_CONSTANT_TSC, which only states that TSC is frequency
independent, not that it doesn't stop in C states and similar fun stuff.
X86_FEATURE_TSC_RELIABLE, which IIRC should indicate the TSC is constant
and
On Mon, 2010-04-19 at 13:47 +0300, Avi Kivity wrote:
On 04/19/2010 01:43 PM, Peter Zijlstra wrote:
+
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
{
struct pvclock_shadow_time shadow;
unsigned version;
cycle_t ret, offset;
+
On Mon, 2010-04-19 at 13:53 +0300, Avi Kivity wrote:
On 04/19/2010 01:49 PM, Peter Zijlstra wrote:
Right, so on x86 we have:
X86_FEATURE_CONSTANT_TSC, which only states that TSC is frequency
independent, not that it doesn't stop in C states and similar fun stuff.
On Mon, 2010-04-19 at 13:50 +0300, Avi Kivity wrote:
On 04/19/2010 01:39 PM, Peter Zijlstra wrote:
On Fri, 2010-04-16 at 13:36 -0700, Jeremy Fitzhardinge wrote:
+ do {
+ last = last_value;
Does this need a barrier() to prevent the compiler from re-reading
On 04/19/2010 02:05 PM, Peter Zijlstra wrote:
ACCESS_ONCE() is your friend.
I think it's implied with atomic64_read().
Yes it would be. I was merely trying to point out that
last = ACCESS_ONCE(last_value);
Is a narrower way of writing:
last = last_value;
barrier();
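Peter's point about ACCESS_ONCE() being the narrower tool can be sketched in userspace (a minimal rendition of the kernel macro; note that on a 32-bit target it still does not make a 64-bit load atomic, it only stops the compiler from re-reading):

```c
#include <stdint.h>

/* Userspace rendition of the kernel's ACCESS_ONCE(): the volatile
 * cast forces the compiler to emit exactly one load of the variable
 * here, instead of possibly re-reading it later in the function.
 * Unlike barrier(), it constrains only this one access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

uint64_t last_value;

uint64_t read_last_once(void)
{
    /* Caveat from the thread: on 32-bit this is still two 32-bit
     * loads, so it prevents re-reads, not tearing. */
    return ACCESS_ONCE(last_value);
}
```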
On 04/19/2010 01:56 PM, Peter Zijlstra wrote:
Right, do bear in mind that the x86 implementation of atomic64_read() is
terrifyingly expensive, it is better to not do that read and simply use
the result of the cmpxchg.
atomic64_read() _is_ cmpxchg64b. Are you thinking of some
On Mon, 2010-04-19 at 14:13 +0300, Avi Kivity wrote:
On 04/19/2010 01:56 PM, Peter Zijlstra wrote:
Right, do bear in mind that the x86 implementation of atomic64_read() is
terrifyingly expensive, it is better to not do that read and simply use
the result of the cmpxchg.
On 04/19/2010 01:59 PM, Peter Zijlstra wrote:
So what do we need? test for both TSC_RELIABLE and NONSTOP_TSC? IMO
TSC_RELIABLE should imply NONSTOP_TSC.
Yeah, I think RELIABLE does imply NONSTOP and CONSTANT, but NONSTOP
CONSTANT does not make RELIABLE.
The manual says:
On 04/19/2010 02:19 PM, Peter Zijlstra wrote:
Still have two cmpxchgs in the common case. The first iteration will
fail, fetching last_value, the second will work.
It will be better when we have contention, though, so it's worthwhile.
Right, another option is to put the initial read
On Mon, Apr 19, 2010 at 02:10:54PM +0300, Avi Kivity wrote:
On 04/19/2010 02:05 PM, Peter Zijlstra wrote:
ACCESS_ONCE() is your friend.
I think it's implied with atomic64_read().
Yes it would be. I was merely trying to point out that
last = ACCESS_ONCE(last_value);
Is a narrower
On Fri, Apr 16, 2010 at 01:36:34PM -0700, Jeremy Fitzhardinge wrote:
On 04/15/2010 11:37 AM, Glauber Costa wrote:
In recent stress tests, it was found that pvclock-based systems
could seriously warp in smp systems. Using ingo's time-warp-test.c,
I could trigger a scenario as bad as 1.5mi
On Mon, Apr 19, 2010 at 01:19:43PM +0200, Peter Zijlstra wrote:
On Mon, 2010-04-19 at 14:13 +0300, Avi Kivity wrote:
On 04/19/2010 01:56 PM, Peter Zijlstra wrote:
Right, do bear in mind that the x86 implementation of atomic64_read() is
terrifyingly expensive, it is better to not
On 04/19/2010 05:21 PM, Glauber Costa wrote:
Oh yes, just trying to avoid a patch with both atomic64_read() and
ACCESS_ONCE().
you're mixing the private version of the patch you saw with this one.
there isn't any atomic reads in here. I'll use a barrier then
This patch writes
On 04/19/2010 05:32 PM, Glauber Costa wrote:
Right, another option is to put the initial read outside of the loop,
that way you'll have the best of all cases, a single LOCK'ed op in the
loop, and only a single LOCK'ed op for the fast path on sensible
architectures ;-)
last =
On Mon, 2010-04-19 at 17:33 +0300, Avi Kivity wrote:
On 04/19/2010 05:21 PM, Glauber Costa wrote:
Oh yes, just trying to avoid a patch with both atomic64_read() and
ACCESS_ONCE().
you're mixing the private version of the patch you saw with this one.
there isn't any atomic reads
On 04/19/2010 07:33 AM, Avi Kivity wrote:
On 04/19/2010 05:21 PM, Glauber Costa wrote:
Oh yes, just trying to avoid a patch with both atomic64_read() and
ACCESS_ONCE().
you're mixing the private version of the patch you saw with this one.
there isn't any atomic reads in here. I'll use
On 04/19/2010 07:46 AM, Peter Zijlstra wrote:
What avi says! :-)
On a 32bit machine a 64bit read is two 32bit reads, so
last = last_value;
becomes:
last.high = last_value.high;
last.low = last_value.low;
(or the reverse of course)
Now imagine a write getting interleaved with
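The interleaving Peter describes can be played out by hand: simulate the two 32-bit halves a 32-bit CPU would load separately, and slip a write between the two reads (illustrative userspace code, not from the thread):

```c
#include <stdint.h>

/* Simulate last_value as the two 32-bit halves a 32-bit CPU reads
 * with separate instructions. */
static uint32_t val_lo, val_hi;

static void write64(uint64_t v)
{
    val_lo = (uint32_t)v;
    val_hi = (uint32_t)(v >> 32);
}

/* The reader loads the high half, then a "concurrent" writer updates
 * the value with a carry into the high word, then the reader loads
 * the low half: the assembled result is a value neither thread ever
 * wrote. */
uint64_t torn_read_demo(void)
{
    write64(0x00000001FFFFFFFFULL);
    uint32_t hi = val_hi;               /* reads 0x00000001 */
    write64(0x0000000200000000ULL);     /* interleaved increment */
    uint32_t lo = val_lo;               /* reads 0x00000000 */
    return ((uint64_t)hi << 32) | lo;   /* 0x0000000100000000 */
}
```

This torn value is exactly why the thread converges on atomic64_t / cmpxchg64 rather than a plain 64-bit load of last_value.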
On 04/19/2010 07:26 AM, Glauber Costa wrote:
Is the problem that the tscs are starting out of sync, or that they're
drifting relative to each other over time? Do the problems become worse
the longer the uptime? How large are the offsets we're talking about here?
The offsets usually
On Mon, Apr 19, 2010 at 09:19:38AM -0700, Jeremy Fitzhardinge wrote:
On 04/19/2010 07:26 AM, Glauber Costa wrote:
Is the problem that the tscs are starting out of sync, or that they're
drifting relative to each other over time? Do the problems become worse
the longer the uptime? How large
On 04/19/2010 12:54 AM, Avi Kivity wrote:
On 04/19/2010 01:51 PM, Peter Zijlstra wrote:
Right, so on x86 we have:
X86_FEATURE_CONSTANT_TSC, which only states that TSC is frequency
independent, not that it doesn't stop in C states and similar fun
stuff.
X86_FEATURE_TSC_RELIABLE, which IIRC
On Mon, Apr 19, 2010 at 03:25:43PM -0300, Glauber Costa wrote:
On Mon, Apr 19, 2010 at 09:19:38AM -0700, Jeremy Fitzhardinge wrote:
On 04/19/2010 07:26 AM, Glauber Costa wrote:
Is the problem that the tscs are starting out of sync, or that they're
drifting relative to each other over
On 04/17/2010 09:48 PM, Avi Kivity wrote:
+static u64 last_value = 0;
Needs to be atomic64_t.
+
cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
{
struct pvclock_shadow_time shadow;
unsigned version;
cycle_t ret, offset;
+	u64 last;
+	do {
On Thu, Apr 15, 2010 at 02:37:24PM -0400, Glauber Costa wrote:
In recent stress tests, it was found that pvclock-based systems
could seriously warp in smp systems. Using ingo's time-warp-test.c,
I could trigger a scenario as bad as 1.5mi warps a minute in some systems.
(to be fair, it wasn't
On 04/15/2010 11:37 AM, Glauber Costa wrote:
In recent stress tests, it was found that pvclock-based systems
could seriously warp in smp systems. Using ingo's time-warp-test.c,
I could trigger a scenario as bad as 1.5mi warps a minute in some systems.
Is that 1.5 million?
(to be fair, it
On 04/16/2010 10:36 AM, Jeremy Fitzhardinge wrote:
On 04/15/2010 11:37 AM, Glauber Costa wrote:
In recent stress tests, it was found that pvclock-based systems
could seriously warp in smp systems. Using ingo's time-warp-test.c,
I could trigger a scenario as bad as 1.5mi warps a minute in