Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-11-29 Thread Olaf Hering
Thanks Andrew and Ian for taking the time to look at this change.
In turn it took me some time to get back to this topic.

Am Mon, 1 Oct 2018 13:39:51 +0100
schrieb Andrew Cooper :

> On 07/06/18 14:08, Olaf Hering wrote:
> > Add an option to control when vTSC emulation will be activated for a
> > domU with tsc_mode=default. Without such option each TSC access from
> > domU will be emulated, which causes a significant perfomance drop for
> > workloads that make use of rdtsc.
> >
> > One option to avoid the TSC option is to run domUs with tsc_mode=native.
> > This has the drawback that migrating a domU from a "2.3GHz" class host
> > to a "2.4GHz" class host may change the rate at wich the TSC counter
> > increases, the domU may not be prepared for that.
> >
> > With the new option the host admin can decide how a domU should behave
> > when it is migrated across systems of the same class. Since there is
> > always some jitter when Xen calibrates the cpu_khz value, all hosts of
> > the same class will most likely have slightly different values. As a
> > result vTSC emulation is unavoidable. Data collected during the incident
> > which triggered this change showed a jitter of up to 200 KHz across
> > systems of the same class.  
> 
> Do you have any further details of the systems involved?  If they are
> identical systems, they should all have the same real TSC frequency, and
> its a known issue that Xen isn't very good at working out the
> frequency.  TBH, fixing that would be far better overall.

My test hosts have a E5504 cpu. The ones where the issue was reported
use "E7-8880 v3" today, the used hardware two years ago was likely older.

From what I understand the TSC frequency stored in "cpu_khz" is just an
estimated value, not the real hardware frequency. Still, it is used to
decide if two hosts tick at the same speed. The domU kernel may use the
estimated value for its timekeeping, I think it does the same estimation
as Xen itself. But, I have to dig into that.

To me it looks like domUs should run ntpd themselves if there is a plan
to migrate them at some point in the future. At least if they use TSC
for timekeeping. With ntpd the domU would detect time skew, even if it
was not yet migrated to another host. I do not know much about timekeeping,
so this is just a guess on my side.


> > Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> > The padding is sent as zero in write_tsc_info to the receving host.
> > The padding is undefined if the changed code runs as receiver.  
> I'm not sure what you mean by this final sentence.

I have removed that part, since incoming padding is in practice always zero.

> > handle_tsc_info has no code to verify that padding is indeed zero. Due
> > to the lack of a version field it is impossible to know if the sender
> > already has the newly introduced vtsc_tolerance field. In the worst
> > case the receiving domU will get an unemulated TSC.  
> 
> The lack of padding verification is deliberate, for forwards
> compatibility.  Why does the sending code matter?  One way or another,
> if the field is 0, the option wasn't present or wasn't configured. 
> Neither of these situations affect the decision-making that the
> receiving side needs to perform.

> > Signed-off-by: Olaf Hering 
> > Reviewed-by: Wei Liu  (v07/v08)
> > Reviewed-by: Jan Beulich  (v08)  
> 
> I'm still -0.5 for this patch.  I can appreciate why you want it, but it
> is a gross hack which only works when you don't skew time more than NTP
> in the guest can cope with.  My gut feeling is that there will be other
> more subtle fallout.


I will do some research regarding how much skewing a domU can handle.
As said in another reply, the expected time drift with a difference of
just 11 kHz in the cpu_khz variable on a 2.6GHz system is about 0.3 seconds
during a day.

IanJ requested clarification for how much time skew a system can handle.
Perhaps this should have been part of the initial submission for
tsc_mode=native already. I will do some research. Also some of the concerns
about missing documentation are already covered in paragraph #3 of the
commit message.

I will do some more testing with staging and send v10 next week.
The accumulated changes for a v10, so far:
v10:
 - rebase to 402411ec40
 - update write_libxc_tsc_info to handle the new parameter vtsc_tolerance_khz
   without this change, migration from xen-4.5 will fail (Andrew)
 - add newline to tsc_set_info (Andrew)
 - add measurment unit to libxc-migration-stream.pandoc (Andrew)
 - add pointer to xen-tscmode(7) in xl.cfg(5)/vtsc_tolerance_khz (Andrew)
 - reword the newly added paragraph in xen-tscmode(7) (Andrew),
   and also mention that it is about the measured/estimated TSC value
   rather than the real value. The latter is simply unknown.
 - simplify wording regarding the value of padding field in old Xen
   versions, the previous one turned out to be confusing and not helpful


Olaf


pgpqIIyvcccC2.pgp
Description: 

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-09 Thread Olaf Hering
Thanks for all the replies. I will work through them.
One remark now:

Am Mon, 1 Oct 2018 13:39:51 +0100
schrieb Andrew Cooper :

> > With the new option the host admin can decide how a domU should behave
> > when it is migrated across systems of the same class. Since there is
> > always some jitter when Xen calibrates the cpu_khz value, all hosts of
> > the same class will most likely have slightly different values. As a
> > result vTSC emulation is unavoidable. Data collected during the incident
> > which triggered this change showed a jitter of up to 200 KHz across
> > systems of the same class.  
> Do you have any further details of the systems involved?  If they are
> identical systems, they should all have the same real TSC frequency, and
> its a known issue that Xen isn't very good at working out the
> frequency.

If "identical" systems really have an identical frequency, but Xen is
unable to figure out the exact value of that frequency that it is going
to use for its calculations, why would it be a good idea in the first place
to rely on such an estimated value? If there is drift expected, shouldn't
that be handled by running ntpd inside dom0 and the domUs?

If I did that math correctly: a domU that is started with the estimated
cpu_khz value of 2666766, and is migrated to another host with the
estimated cpu_khz value of 2666755, would see 950400K fewer ticks during
a day. How many ticks actually happen, noone knows. But since that amount
of ticks represents a time span of ca. 0.3 seconds I would assume that
ntpd can handle such drift.

With larger differences, like the 200 KHz mentioned above, the drift will
be larger.


What I mean to say is: should the comparison of both hosts be based
on different metrics, instead of a plain estimated value in 'cpu_khz'?


Olaf


pgpRDgCpvgsWi.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Wei Liu
On Mon, Oct 01, 2018 at 03:25:36PM +0100, George Dunlap wrote:
> On 10/01/2018 03:00 PM, Jan Beulich wrote:
>  On 01.10.18 at 15:38,  wrote:
> >> On 10/01/2018 12:25 PM, Jan Beulich wrote:
> >>> I think the main concern
> >>> was with the way migration of the new value was implemented. But I
> >>> really have to defer to Andrew for that, irrespective of him not
> >>> having responded (on the list) to prior pings.
> >>
> >> Is Andrew really the only person who knows enough about migration to
> >> give this the thumbs-up?
> > 
> > That's not the point here, at least afaic: He had voiced _some_
> > concern on an earlier version. In such a case it is, I think, only
> > appropriate to wait with committing until there was indication
> > that the concerns were sufficiently addressed (verbally or by
> > adjustments to the code).
> 
> Right -- but it's not your job to make sure the migration stuff is
> properly addressed; it's Wei and Ian's job.  Wei's R-b was a statement
> from him that the code was good; when Andy questioned that, I think it
> was then *Wei's* job to address the question, not yours or Andy's (or
> even Olaf's).  If Wei says, "I've considered Andy's objections and I
> think the patch is fine as-is", then it can be checked in (given a
> reasonable amount of time for Andy to respond); and Wei can own whatever
> consequences there are.

This patch touched more than toolstack code, that's why Jan gave his R-b
in the first place.

The contention is not on the correctness of the code, but on if this
mechanism had unintended consequences. Both Jan and I thought the code
was correct, but we didn't feel comfortable enough to ignore objections.

Sorry Olaf.

Wei.

> 
>  -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Jan Beulich
>>> On 01.10.18 at 17:17,  wrote:
> On 01/10/18 14:29, Jan Beulich wrote:
> On 01.10.18 at 14:39,  wrote:
>>> On 07/06/18 14:08, Olaf Hering wrote:
 Add an option to control when vTSC emulation will be activated for a
 domU with tsc_mode=default. Without such option each TSC access from
 domU will be emulated, which causes a significant perfomance drop for
 workloads that make use of rdtsc.

 One option to avoid the TSC option is to run domUs with tsc_mode=native.
 This has the drawback that migrating a domU from a "2.3GHz" class host
 to a "2.4GHz" class host may change the rate at wich the TSC counter
 increases, the domU may not be prepared for that.

 With the new option the host admin can decide how a domU should behave
 when it is migrated across systems of the same class. Since there is
 always some jitter when Xen calibrates the cpu_khz value, all hosts of
 the same class will most likely have slightly different values. As a
 result vTSC emulation is unavoidable. Data collected during the incident
 which triggered this change showed a jitter of up to 200 KHz across
 systems of the same class.
>>> Do you have any further details of the systems involved?  If they are
>>> identical systems, they should all have the same real TSC frequency, and
>>> its a known issue that Xen isn't very good at working out the
>>> frequency.  TBH, fixing that would be far better overall.
>> Are you convinced all parts match their nominal frequency without
>> _any_ deviation?
> 
> That is the intent of publishing the numbers, yes.
> 
>> If that was the case, we could indeed use CPUID
>> leaves 0x15 / 0x16 output, if available.
> 
> We very much should be doing this.  There are also model-specific ways
> of getting the same data on older processors.
> 
>> But I very much doubt this.
>> As an example, here's what bare metal Linux says on my newest
>> system:
>>
>> tsc: Detected 2600.000 MHz processor
>> tsc: Refined TSC clocksource calibration: 2591.990 MHz
>>
>> Xen figures:
>>
>> (XEN) Detected 2592.107 MHz processor.
>>
>> And then after another re-boot bare metal Linux again
>>
>> tsc: Refined TSC clocksource calibration: 2592.008 MHz
> 
> What is surprising here?  The calibration loop is not 100% accurate and
> cannot be made to be perfect.
> 
> The fact that Linux and Xen agree is because they basically share the
> same calibration algorithm - not that the processor is really running at
> 2592MHz.

And I'm not claiming it is. I'm merely voicing my doubt that the
processor is running at exactly the announced 2600.000 MHz.
In which case it is simply unknown whether calibrated or
nominal values come closer to the truth.

>  For one, all calibration options will read slow by the amount
> of time it takes an interrupt to propagate through the system fabric,
> and there is basically nothing software can do to account for this.

Well, if you measure under otherwise identical conditions twice
the arrival of instances of the same signal, then its time to travel
through the fabric doesn't matter for the distance in time between
the arrival of the two instances. But of course this is as idealized
as is an assumption that humans would be able to manufacture
clocks ticking at exactly their nominal frequency.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Andrew Cooper
On 01/10/18 14:29, Jan Beulich wrote:
 On 01.10.18 at 14:39,  wrote:
>> On 07/06/18 14:08, Olaf Hering wrote:
>>> Add an option to control when vTSC emulation will be activated for a
>>> domU with tsc_mode=default. Without such option each TSC access from
>>> domU will be emulated, which causes a significant perfomance drop for
>>> workloads that make use of rdtsc.
>>>
>>> One option to avoid the TSC option is to run domUs with tsc_mode=native.
>>> This has the drawback that migrating a domU from a "2.3GHz" class host
>>> to a "2.4GHz" class host may change the rate at wich the TSC counter
>>> increases, the domU may not be prepared for that.
>>>
>>> With the new option the host admin can decide how a domU should behave
>>> when it is migrated across systems of the same class. Since there is
>>> always some jitter when Xen calibrates the cpu_khz value, all hosts of
>>> the same class will most likely have slightly different values. As a
>>> result vTSC emulation is unavoidable. Data collected during the incident
>>> which triggered this change showed a jitter of up to 200 KHz across
>>> systems of the same class.
>> Do you have any further details of the systems involved?  If they are
>> identical systems, they should all have the same real TSC frequency, and
>> its a known issue that Xen isn't very good at working out the
>> frequency.  TBH, fixing that would be far better overall.
> Are you convinced all parts match their nominal frequency without
> _any_ deviation?

That is the intent of publishing the numbers, yes.

> If that was the case, we could indeed use CPUID
> leaves 0x15 / 0x16 output, if available.

We very much should be doing this.  There are also model-specific ways
of getting the same data on older processors.

> But I very much doubt this.
> As an example, here's what bare metal Linux says on my newest
> system:
>
> tsc: Detected 2600.000 MHz processor
> tsc: Refined TSC clocksource calibration: 2591.990 MHz
>
> Xen figures:
>
> (XEN) Detected 2592.107 MHz processor.
>
> And then after another re-boot bare metal Linux again
>
> tsc: Refined TSC clocksource calibration: 2592.008 MHz

What is surprising here?  The calibration loop is not 100% accurate and
cannot be made to be perfect.

The fact that Linux and Xen agree is because they basically share the
same calibration algorithm - not that the processor is really running at
2592MHz.  For one, all calibration options will read slow by the amount
of time it takes an interrupt to propagate through the system fabric,
and there is basically nothing software can do to account for this.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread George Dunlap
On 10/01/2018 03:00 PM, Jan Beulich wrote:
 On 01.10.18 at 15:38,  wrote:
>> On 10/01/2018 12:25 PM, Jan Beulich wrote:
>>> I think the main concern
>>> was with the way migration of the new value was implemented. But I
>>> really have to defer to Andrew for that, irrespective of him not
>>> having responded (on the list) to prior pings.
>>
>> Is Andrew really the only person who knows enough about migration to
>> give this the thumbs-up?
> 
> That's not the point here, at least afaic: He had voiced _some_
> concern on an earlier version. In such a case it is, I think, only
> appropriate to wait with committing until there was indication
> that the concerns were sufficiently addressed (verbally or by
> adjustments to the code).

Right -- but it's not your job to make sure the migration stuff is
properly addressed; it's Wei and Ian's job.  Wei's R-b was a statement
from him that the code was good; when Andy questioned that, I think it
was then *Wei's* job to address the question, not yours or Andy's (or
even Olaf's).  If Wei says, "I've considered Andy's objections and I
think the patch is fine as-is", then it can be checked in (given a
reasonable amount of time for Andy to respond); and Wei can own whatever
consequences there are.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Jan Beulich
>>> On 01.10.18 at 15:38,  wrote:
> On 10/01/2018 12:25 PM, Jan Beulich wrote:
>> I think the main concern
>> was with the way migration of the new value was implemented. But I
>> really have to defer to Andrew for that, irrespective of him not
>> having responded (on the list) to prior pings.
> 
> Is Andrew really the only person who knows enough about migration to
> give this the thumbs-up?

That's not the point here, at least afaic: He had voiced _some_
concern on an earlier version. In such a case it is, I think, only
appropriate to wait with committing until there was indication
that the concerns were sufficiently addressed (verbally or by
adjustments to the code). It is instead my incomplete
knowledge on the various parts of migration that made it
undesirable to try to judge in his place, after pings lead no-
where.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]

2018-10-01 Thread Andrew Cooper
On 01/10/18 14:24, Ian Jackson wrote:
>
>>> handle_tsc_info has no code to verify that padding is indeed zero. Due
>>> to the lack of a version field it is impossible to know if the sender
>>> already has the newly introduced vtsc_tolerance field. In the worst
>>> case the receiving domU will get an unemulated TSC.
>> The lack of padding verification is deliberate, for forwards
>> compatibility.  Why does the sending code matter?  One way or another,
>> if the field is 0, the option wasn't present or wasn't configured. 
>> Neither of these situations affect the decision-making that the
>> receiving side needs to perform.
> We are talking here about an unused field that is (supposedly?)
> always sent as zero ?

The spec requires the padding to always be sent as zero, and
verify-stream-v2 does check the padding and object to non-zero values.

libxc's write_tsc_info() has always fully zeroed the structure, and
convert-legacy-stream also behaves the same way.  However, I notice this
change will break migration from pre 4.6 as write_libxc_tsc_info() now
has a mismatched number of parameters to pack and throw an exception.

>
> AIUI:
>
> In the new code the semantics of zero is "do not allow any tolerance".
> The old code handles this correctly.  The issue is with migrations
> from new code to old: the tolerance value will silently ignored.

Migrating from new to old doesn't remotely work, and whether things
explode or not is very context dependent.

I'm not sure its worth caring about this case.  We never have before. 
(Getting backwards migration working for emergency cases is on my TODO
list, but it is dependent on rewriting the hypervisor interfaces for
getting/setting guest state.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread George Dunlap
On 10/01/2018 12:25 PM, Jan Beulich wrote:
 On 01.10.18 at 12:52,  wrote:
>> Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to 
>> avoid TSC emulation"):
>>> Am Thu, 13 Sep 2018 09:39:13 +0200
>>> schrieb Olaf Hering :
 this patch was not applied yet, even after a few "pings".
>>>
>>> No reaction since months.
>>> So scrap that patch, just in case it is still part of someones to-consider 
>> queue.
>>
>> I think it would be worth exploring whether this patch could be
>> applied without an explicit ack from Andrew.
>>
>> I confess I ignored all the previous mails because they all started
>>Andrew,
>> or
>>Andrew, Lars,
>> so I assumed that you didn't want attention from other
>> maintainers/committers.
>>
>> Now that I look at the thread it is difficult for me to see the wood
>> for the trees but I don't see unanswered concerns.
> 
> Problem is - discussion around this was (iirc) happening not only on
> the list, but also on irc (including perhaps private chats). It was for
> that reason that I made my R-b conditional upon Andrew at least
> giving an informal go-ahead (otherwise, together with Wei's R-b,
> the patch could have gone in).
> 
> Besides the question of correctness from the perspective of guests
> (which imo is not a problem as the feature needs to be actively
> enabled, and I think we have no reason to keep admins from
> breaking their guests if they really mean to), 

I agree with this, BTW.

> I think the main concern
> was with the way migration of the new value was implemented. But I
> really have to defer to Andrew for that, irrespective of him not
> having responded (on the list) to prior pings.

Is Andrew really the only person who knows enough about migration to
give this the thumbs-up?  Technically migration is in the toolstack, and
so Wei should have double-checked that before giving his review; and
when a question was raised, Wei (as the relevant maintainer who had
given it an R-b) should either have asserted that the code was indeed
correct, or withdrawn his R-b and given Olaf feedback to allow him to
get it into shape.

(This is all based on the history sketched out by Jan above; if there is
more to it then of course this analysis may not be correct.)

I was only skimming the thread, and intended to weigh in at some point;
but I didn't really understand why it was blocked on Andrew in the first
place.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Jan Beulich
>>> On 01.10.18 at 14:39,  wrote:
> On 07/06/18 14:08, Olaf Hering wrote:
>> Add an option to control when vTSC emulation will be activated for a
>> domU with tsc_mode=default. Without such option each TSC access from
>> domU will be emulated, which causes a significant perfomance drop for
>> workloads that make use of rdtsc.
>>
>> One option to avoid the TSC option is to run domUs with tsc_mode=native.
>> This has the drawback that migrating a domU from a "2.3GHz" class host
>> to a "2.4GHz" class host may change the rate at wich the TSC counter
>> increases, the domU may not be prepared for that.
>>
>> With the new option the host admin can decide how a domU should behave
>> when it is migrated across systems of the same class. Since there is
>> always some jitter when Xen calibrates the cpu_khz value, all hosts of
>> the same class will most likely have slightly different values. As a
>> result vTSC emulation is unavoidable. Data collected during the incident
>> which triggered this change showed a jitter of up to 200 KHz across
>> systems of the same class.
> 
> Do you have any further details of the systems involved?  If they are
> identical systems, they should all have the same real TSC frequency, and
> its a known issue that Xen isn't very good at working out the
> frequency.  TBH, fixing that would be far better overall.

Are you convinced all parts match their nominal frequency without
_any_ deviation? If that was the case, we could indeed use CPUID
leaves 0x15 / 0x16 output, if available. But I very much doubt this.
As an example, here's what bare metal Linux says on my newest
system:

tsc: Detected 2600.000 MHz processor
tsc: Refined TSC clocksource calibration: 2591.990 MHz

Xen figures:

(XEN) Detected 2592.107 MHz processor.

And then after another re-boot bare metal Linux again

tsc: Refined TSC clocksource calibration: 2592.008 MHz

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]

2018-10-01 Thread Ian Jackson
Jan Beulich writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to 
avoid TSC emulation"):
> Problem is - discussion around this was (iirc) happening not only on
> the list, but also on irc (including perhaps private chats).

Hrm.  Well, if it didn't happen on the list, it didn't happen.  It is
often useful and productive, of course, to thrash something out on
irc, or ping there, or whatever but:

Conclusions from irc (and from in-person conversations, phone calls,
or other kinds of un-minuted discussions) must be transferred to email
as otherwise they are lost (and the effort of having them is often
wasted as the conversation has to be had again).

If properly writing up the conclusion from an irc conversation is too
hard, pasting a transcript into an email seems a bare minimum.

> It was for that reason that I made my R-b conditional upon Andrew at
> least giving an informal go-ahead (otherwise, together with Wei's
> R-b, the patch could have gone in).

Right.  Thanks for the clarification.


Andrew Cooper writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to 
avoid TSC emulation"):
> [lots of stuff]

Thanks for the review.  I hope Olaf will be able to address most of
it, but:

Andrew:
> [Olaf:]
> > Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> > The padding is sent as zero in write_tsc_info to the receving host.
> > The padding is undefined if the changed code runs as receiver.
> 
> I'm not sure what you mean by this final sentence.

Me neither.

> > handle_tsc_info has no code to verify that padding is indeed zero. Due
> > to the lack of a version field it is impossible to know if the sender
> > already has the newly introduced vtsc_tolerance field. In the worst
> > case the receiving domU will get an unemulated TSC.
> 
> The lack of padding verification is deliberate, for forwards
> compatibility.  Why does the sending code matter?  One way or another,
> if the field is 0, the option wasn't present or wasn't configured. 
> Neither of these situations affect the decision-making that the
> receiving side needs to perform.

We are talking here about an unused field that is (supposedly?)
always sent as zero ?

AIUI:

In the new code the semantics of zero is "do not allow any tolerance".
The old code handles this correctly.  The issue is with migrations
from new code to old: the tolerance value will silently ignored.

Presumably this is considered preferable to the alternative, which is
to extend the migration stream in a deliberately incompatible way so
that the migration fails.  I think this is what Olaf is trying to
say ?

> > Signed-off-by: Olaf Hering 
> > Reviewed-by: Wei Liu  (v07/v08)
> > Reviewed-by: Jan Beulich  (v08)
> 
> I'm still -0.5 for this patch.  I can appreciate why you want it, but it
> is a gross hack which only works when you don't skew time more than NTP
> in the guest can cope with.

That surely is why there is a limit to the tolerance.  I've asked Olaf
to try to quantify an appropriate limit.

>   My gut feeling is that there will be other
> more subtle fallout.

That's not particularly convincing to me.  But if you could try to be
more specific I think this could be usefull addressed in the
documentation for the feature ?


Olaf, I think the ball is now in your court.  I hope you can address
Andrew's comments, and maybe mine too.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Andrew Cooper
On 07/06/18 14:08, Olaf Hering wrote:
> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such option each TSC access from
> domU will be emulated, which causes a significant perfomance drop for
> workloads that make use of rdtsc.
>
> One option to avoid the TSC option is to run domUs with tsc_mode=native.
> This has the drawback that migrating a domU from a "2.3GHz" class host
> to a "2.4GHz" class host may change the rate at wich the TSC counter
> increases, the domU may not be prepared for that.
>
> With the new option the host admin can decide how a domU should behave
> when it is migrated across systems of the same class. Since there is
> always some jitter when Xen calibrates the cpu_khz value, all hosts of
> the same class will most likely have slightly different values. As a
> result vTSC emulation is unavoidable. Data collected during the incident
> which triggered this change showed a jitter of up to 200 KHz across
> systems of the same class.

Do you have any further details of the systems involved?  If they are
identical systems, they should all have the same real TSC frequency, and
its a known issue that Xen isn't very good at working out the
frequency.  TBH, fixing that would be far better overall.

>
> Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> The padding is sent as zero in write_tsc_info to the receving host.
> The padding is undefined if the changed code runs as receiver.

I'm not sure what you mean by this final sentence.

> handle_tsc_info has no code to verify that padding is indeed zero. Due
> to the lack of a version field it is impossible to know if the sender
> already has the newly introduced vtsc_tolerance field. In the worst
> case the receiving domU will get an unemulated TSC.

The lack of padding verification is deliberate, for forwards
compatibility.  Why does the sending code matter?  One way or another,
if the field is 0, the option wasn't present or wasn't configured. 
Neither of these situations affect the decision-making that the
receiving side needs to perform.

>
> Signed-off-by: Olaf Hering 
> Reviewed-by: Wei Liu  (v07/v08)
> Reviewed-by: Jan Beulich  (v08)

I'm still -0.5 for this patch.  I can appreciate why you want it, but it
is a gross hack which only works when you don't skew time more than NTP
in the guest can cope with.  My gut feeling is that there will be other
more subtle fallout.

As for the implementation itself, a few trivial comments.

> diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7
> index 3bbc96f201..122ae36679 100644
> --- a/docs/man/xen-tscmode.pod.7
> +++ b/docs/man/xen-tscmode.pod.7
> @@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated
>  
>  =back
>  
> +If the tsc_mode is set to "default" the decision to emulate TSC can be
> +tweaked further with the "vtsc_tolerance_khz" option.
> +
>  To understand this in more detail, the rest of this document must
>  be read.
>  
> @@ -211,6 +214,19 @@ is emulated.  Note that, though emulated, the "apparent" 
> TSC frequency
>  will be the TSC frequency of the initial physical machine, even after
>  migration.
>  
> +Since the calibration of the TSC frequency may not be 100% accurate, the
> +exact value of the frequency can change even across reboots.

It can change across reboots for other reasons, e.g. firmware settings.

I'd phrase this as "Since the calibration of the TSC frequency isn't
100% accurate, the value measured by Xen can vary across reboots".

>  This means
> +also several otherwise identical systems can have a slightly different
> +TSC frequency. As a result TSC access will be emulated if a domU is
> +migrated from one host to another, identical host. To avoid the
> +performance impact of TSC emulation a certain tolerance of the measured
> +host TSC frequency can be specified with "vtsc_tolerance_khz". If the
> +measured "cpu_khz" value is within the tolerance range, TSC access
> +remains native. Otherwise it will be emulated. This allows to migrate
> +domUs between identical hardware. If the domU will be migrated to a
> +different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system,
> +TSC will be emualted to maintain the TSC frequency expected by the domU.
> +
>  For environments where both TSC-safeness AND highest performance
>  even across migration is a requirement, application code can be specially
>  modified to use an algorithm explicitly designed into Xen for this purpose.
> diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
> index 47d88243b1..995277794f 100644
> --- a/docs/man/xl.cfg.pod.5.in
> +++ b/docs/man/xl.cfg.pod.5.in
> @@ -1898,6 +1898,16 @@ determined in a similar way to that of B TSC 
> mode.
>  
>  Please see B for more information on this option.
>  
> +=item B
> +
> +B<(x86 only, relevant only for tsc_mode=default)>
> +When a domU is started, the CPU frequency of the host is used by the domU for
> +TSC related time 

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Jan Beulich
>>> On 01.10.18 at 12:52,  wrote:
> Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to 
> avoid TSC emulation"):
>> Am Thu, 13 Sep 2018 09:39:13 +0200
>> schrieb Olaf Hering :
>> > this patch was not applied yet, even after a few "pings".
>> 
>> No reaction since months.
>> So scrap that patch, just in case it is still part of someones to-consider 
> queue.
> 
> I think it would be worth exploring whether this patch could be
> applied without an explicit ack from Andrew.
> 
> I confess I ignored all the previous mails because they all started
>Andrew,
> or
>Andrew, Lars,
> so I assumed that you didn't want attention from other
> maintainers/committers.
> 
> Now that I look at the thread it is difficult for me to see the wood
> for the trees but I don't see unanswered concerns.

Problem is - discussion around this was (iirc) happening not only on
the list, but also on irc (including perhaps private chats). It was for
that reason that I made my R-b conditional upon Andrew at least
giving an informal go-ahead (otherwise, together with Wei's R-b,
the patch could have gone in).

Besides the question of correctness from the perspective of guests
(which imo is not a problem as the feature needs to be actively
enabled, and I think we have no reason to keep admins from
breaking their guests if they really mean to), I think the main concern
was with the way migration of the new value was implemented. But I
really have to defer to Andrew for that, irrespective of him not
having responded (on the list) to prior pings.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-10-01 Thread Ian Jackson
Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to 
avoid TSC emulation"):
> Am Thu, 13 Sep 2018 09:39:13 +0200
> schrieb Olaf Hering :
> > this patch was not applied yet, even after a few "pings".
> 
> No reaction since months.
> So scrap that patch, just in case it is still part of someones to-consider 
> queue.

I think it would be worth exploring whether this patch could be
applied without an explicit ack from Andrew.

I confess I ignored all the previous mails because they all started
   Andrew,
or
   Andrew, Lars,
so I assumed that you didn't want attention from other
maintainers/committers.

Now that I look at the thread it is difficult for me to see the wood
for the trees but I don't see unanswered concerns.  I guess the patch
should maybe therefore be committed.

As committer, before I did that despite a missing ack, I would want to
(i) try to understand if possibly why the ack was missing (ii) do my
own review to satisfy any doubts (iii) give the maintainer a clear
tike interval to say "nack".


As for (i) (reasons for lack of ack), see above, and it's not clear to
me; but AFAICT there are two difficulties.  One (see (a) below) is
with the principle of the feature.  One (see (b) below) is about a
detail of the implementation.

As for (ii) (own review), see below.  That addresses (a) and (b).

As for (iii): If anyone actually objects to this patch for some
different reason, or to my proposed approach, please comment in the
next 7 days.


(a) Principle

I read the documentation for the new feature and it seems to make
sense.  But it would be nice for the documentation to explain what is
considered a likely safe and sufficient value.

For example, one might say something like this:

  Typical TSC speed variation between supposedly identical hosts is
  about X%.  A Unix guest running NTP for time synchronisation can
  cope with clock drift rates of up to about Y%.  So a
  vtsc_tolerance_khz setting between these two values is likely to be
  effective for migration between "identical" hosts, and not
  disruptive.

Olaf, please could you fill in the blanks in that text, and consider
what the exact wording should be (depending on what research you
conducted etc.), and then consider whether to put it in the
documentation, commit meesage, or just on the mailing list ?

Personally I think documentation would be ideal if we have firm
information but if firm information is hard to come by and all we have
is empirical data, then maybe a commit message describing those
experiences would be sufficient.


(b) Protocol compatibility

  Olaf:
  > > On 07.06.18 at 16:49,  wrote:
  > > > Am Thu, 07 Jun 2018 08:44:41 -0600
  > > > schrieb "Jan Beulich" :
  > > >> The re-use of the field is acceptable only if all existing
  > > >> senders reliably fill zero in there.
  > > >
  > > > How do we know all senders? I just know about write_tsc_info
  > > > from xen-4.6+.
  > >
  > > I don't think we care about senders other than ones using libxc, so
  > > by knowing whether all libxc versions conform to the requirement,
  > > we could declare we're fine.
  > 
  > Yes, I think we are fine. And if migration from pre-4.6 is supported
  > anyway is another question. Most likely the answer is no.

This comment from Olaf "Yes, I think we are fine" is not particularly
reassuring.  Olaf, can you please say something more definite about
how you verified whether your assertion is true.

And yes, we do mean also for versions from pre-4.6.  It is not
supported, but it must not go wrong in unexpected ways.  You can use
the git history to see older implementations in libxc.


Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-09-28 Thread Olaf Hering
Am Thu, 13 Sep 2018 09:39:13 +0200
schrieb Olaf Hering :

> this patch was not applied yet, even after a few "pings".

No reaction since months.

So scrap that patch, just in case it is still part of someones to-consider 
queue.


Olaf


pgpFtD6Pj968F.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-09-13 Thread Olaf Hering
Andrew, Lars,

this patch was not applied yet, even after a few "pings".

I think it needs an approval from Andrew before it can go in, unless the
bug has to be fixed in some other way.

Please take some time to review that change.

Thanks,
Olaf

Am Wed, 1 Aug 2018 18:17:47 +0200
schrieb Olaf Hering :

> Am Tue, 26 Jun 2018 19:11:13 +0200
> schrieb Olaf Hering :
> > Am Thu,  7 Jun 2018 15:08:29 +0200
> > schrieb Olaf Hering :  
> > > Add an option to control when vTSC emulation will be activated for a
> > > domU with tsc_mode=default. Without such option each TSC access from
> > > domU will be emulated, which causes a significant perfomance drop for
> > > workloads that make use of rdtsc.
> > Andrew,
> > do you have any comments on this patch/approach?  
> Andrew,
> any comments on this patch?
> https://lists.xen.org/archives/html/xen-devel/2018-06/msg00398.html


pgp5Rh4d0PxsH.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-08-01 Thread Olaf Hering
Am Tue, 26 Jun 2018 19:11:13 +0200
schrieb Olaf Hering :

> Am Thu,  7 Jun 2018 15:08:29 +0200
> schrieb Olaf Hering :
> > Add an option to control when vTSC emulation will be activated for a
> > domU with tsc_mode=default. Without such option each TSC access from
> > domU will be emulated, which causes a significant perfomance drop for
> > workloads that make use of rdtsc.  
> Andrew,
> do you have any comments on this patch/approach?


Andrew,

any comments on this patch?

https://lists.xen.org/archives/html/xen-devel/2018-06/msg00398.html

Olaf


pgpIYIYqVDtEN.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-26 Thread Olaf Hering
Am Thu,  7 Jun 2018 15:08:29 +0200
schrieb Olaf Hering :

> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such option each TSC access from
> domU will be emulated, which causes a significant perfomance drop for
> workloads that make use of rdtsc.

Andrew,

do you have any comments on this patch/approach?

Olaf


pgpclOaoS7m1Q.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-07 Thread Olaf Hering
On Thu, Jun 07, Jan Beulich wrote:

> >>> On 07.06.18 at 16:49,  wrote:
> > Am Thu, 07 Jun 2018 08:44:41 -0600
> > schrieb "Jan Beulich" :
> >> The re-use of the field is acceptable only if all existing senders reliably
> >> fill zero in there.
> > How do we know all senders? I just know about write_tsc_info from xen-4.6+.
> I don't think we care about senders other than ones using libxc, so
> by knowing whether all libxc versions conform to the requirement,
> we could declare we're fine.

Yes, I think we are fine. And if migration from pre-4.6 is supported
anyway is another question. Most likely the answer is no.

Olaf


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-07 Thread Jan Beulich
>>> On 07.06.18 at 16:49,  wrote:
> Am Thu, 07 Jun 2018 08:44:41 -0600
> schrieb "Jan Beulich" :
> 
>> The re-use of the field is acceptable only if all existing senders reliably
>> fill zero in there.
> 
> How do we know all senders? I just know about write_tsc_info from xen-4.6+.

I don't think we care about senders other than ones using libxc, so
by knowing whether all libxc versions conform to the requirement,
we could declare we're fine.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-07 Thread Olaf Hering
Am Thu, 07 Jun 2018 08:44:41 -0600
schrieb "Jan Beulich" :

> The re-use of the field is acceptable only if all existing senders reliably 
> fill zero in there.

How do we know all senders? I just know about write_tsc_info from xen-4.6+.

Olaf


pgp7WNmQ7W4Jd.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-07 Thread Jan Beulich
>>> On 07.06.18 at 15:47,  wrote:
> Am Thu, 07 Jun 2018 07:31:26 -0600
> schrieb "Jan Beulich" :
> 
>> Hmm, I find this description concerning. Is the field reliably zero when
>> coming from older Xen, or is it not?
> 
> It depends on how old.
> If the other sending side has write_tsc_info from 4.6+, then _res1 is zero.
> If it is older, the XC_SAVE_ID_TSC_INFO is eventually not handled anymore.
> At least I could not find the relevant code, even in staging-4.6.
> But then, it is unlikely that migration from 4.5 to 4.6 did not work.
> 
> Also, in general the input comes from "remote". So it is untrusted.

But isn't this exactly the concern Andrew had voiced? If so, we're not
any closer to this having a chance to go in. The re-use of the field is
acceptable only if all existing senders reliably fill zero in there.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation

2018-06-07 Thread Jan Beulich
>>> On 07.06.18 at 15:08,  wrote:
> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such option each TSC access from
> domU will be emulated, which causes a significant perfomance drop for
> workloads that make use of rdtsc.
> 
> One option to avoid the TSC option is to run domUs with tsc_mode=native.
> This has the drawback that migrating a domU from a "2.3GHz" class host
> to a "2.4GHz" class host may change the rate at wich the TSC counter
> increases, the domU may not be prepared for that.
> 
> With the new option the host admin can decide how a domU should behave
> when it is migrated across systems of the same class. Since there is
> always some jitter when Xen calibrates the cpu_khz value, all hosts of
> the same class will most likely have slightly different values. As a
> result vTSC emulation is unavoidable. Data collected during the incident
> which triggered this change showed a jitter of up to 200 KHz across
> systems of the same class.
> 
> Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> The padding is sent as zero in write_tsc_info to the receving host.
> The padding is undefined if the changed code runs as receiver.
> handle_tsc_info has no code to verify that padding is indeed zero. Due
> to the lack of a version field it is impossible to know if the sender
> already has the newly introduced vtsc_tolerance field. In the worst
> case the receiving domU will get an unemulated TSC.

Hmm, I find this description concerning. Is the field reliably zero when
coming from older Xen, or is it not?

> Signed-off-by: Olaf Hering 
> Reviewed-by: Wei Liu  (v07/v08)
> Reviewed-by: Jan Beulich  (v08)

To avoid this getting committed prematurely: I gave this R-b pending
Andrew finding his prior concerns addressed.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel