Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Thanks Andrew and Ian for taking the time to look at this change. In turn it took me some time to get back to this topic. Am Mon, 1 Oct 2018 13:39:51 +0100 schrieb Andrew Cooper : > On 07/06/18 14:08, Olaf Hering wrote: > > Add an option to control when vTSC emulation will be activated for a > > domU with tsc_mode=default. Without such option each TSC access from > > domU will be emulated, which causes a significant perfomance drop for > > workloads that make use of rdtsc. > > > > One option to avoid the TSC option is to run domUs with tsc_mode=native. > > This has the drawback that migrating a domU from a "2.3GHz" class host > > to a "2.4GHz" class host may change the rate at wich the TSC counter > > increases, the domU may not be prepared for that. > > > > With the new option the host admin can decide how a domU should behave > > when it is migrated across systems of the same class. Since there is > > always some jitter when Xen calibrates the cpu_khz value, all hosts of > > the same class will most likely have slightly different values. As a > > result vTSC emulation is unavoidable. Data collected during the incident > > which triggered this change showed a jitter of up to 200 KHz across > > systems of the same class. > > Do you have any further details of the systems involved? If they are > identical systems, they should all have the same real TSC frequency, and > its a known issue that Xen isn't very good at working out the > frequency. TBH, fixing that would be far better overall. My test hosts have a E5504 cpu. The ones where the issue was reported use "E7-8880 v3" today, the used hardware two years ago was likely older. From what I understand the TSC frequency stored in "cpu_khz" is just an estimated value, not the real hardware frequency. Still, it is used to decide if two hosts tick at the same speed. The domU kernel may use the estimated value for its timekeeping, I think it does the same estimation as Xen itself. But, I have to dig into that. To me it looks like domUs should run ntpd themselves if there is a plan to migrate them at some point in the future. At least if they use TSC for timekeeping. With ntpd the domU would detect time skew, even if it was not yet migrated to another host. I do not know much about timekeeping, so this is just a guess on my side. > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > > The padding is sent as zero in write_tsc_info to the receving host. > > The padding is undefined if the changed code runs as receiver. > I'm not sure what you mean by this final sentence. I have removed that part, since incoming padding is in practice always zero. > > handle_tsc_info has no code to verify that padding is indeed zero. Due > > to the lack of a version field it is impossible to know if the sender > > already has the newly introduced vtsc_tolerance field. In the worst > > case the receiving domU will get an unemulated TSC. > > The lack of padding verification is deliberate, for forwards > compatibility. Why does the sending code matter? One way or another, > if the field is 0, the option wasn't present or wasn't configured. > Neither of these situations affect the decision-making that the > receiving side needs to perform. > > Signed-off-by: Olaf Hering > > Reviewed-by: Wei Liu (v07/v08) > > Reviewed-by: Jan Beulich (v08) > > I'm still -0.5 for this patch. I can appreciate why you want it, but it > is a gross hack which only works when you don't skew time more than NTP > in the guest can cope with. My gut feeling is that there will be other > more subtle fallout. I will do some research regarding how much skewing a domU can handle. As said in another reply, the expected time drift with a difference of just 11 kHz in the cpu_khz variable on a 2.6GHz system is about 0.3 seconds during a day. IanJ requested clarification for how much time skew a system can handle. Perhaps this should have been part of the initial submission for tsc_mode=native already. I will do some research. Also some of the concerns about missing documentation are already covered in paragraph #3 of the commit message. I will do some more testing with staging and send v10 next week. The accumulated changes for a v10, so far: v10: - rebase to 402411ec40 - update write_libxc_tsc_info to handle the new parameter vtsc_tolerance_khz without this change, migration from xen-4.5 will fail (Andrew) - add newline to tsc_set_info (Andrew) - add measurment unit to libxc-migration-stream.pandoc (Andrew) - add pointer to xen-tscmode(7) in xl.cfg(5)/vtsc_tolerance_khz (Andrew) - reword the newly added paragraph in xen-tscmode(7) (Andrew), and also mention that it is about the measured/estimated TSC value rather than the real value. The latter is simply unknown. - simplify wording regarding the value of padding field in old Xen versions, the previous one turned out to be confusing and not helpful Olaf pgpqIIyvcccC2.pgp Description:
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Thanks for all the replies. I will work through them. One remark now: Am Mon, 1 Oct 2018 13:39:51 +0100 schrieb Andrew Cooper : > > With the new option the host admin can decide how a domU should behave > > when it is migrated across systems of the same class. Since there is > > always some jitter when Xen calibrates the cpu_khz value, all hosts of > > the same class will most likely have slightly different values. As a > > result vTSC emulation is unavoidable. Data collected during the incident > > which triggered this change showed a jitter of up to 200 KHz across > > systems of the same class. > Do you have any further details of the systems involved? If they are > identical systems, they should all have the same real TSC frequency, and > its a known issue that Xen isn't very good at working out the > frequency. If "identical" systems really have an identical frequency, but Xen is unable to figure out the exact value of that frequency that it is going to use for its calculations, why would it be a good idea in the first place to rely on such an estimated value? If there is drift expected, shouldn't that be handled by running ntpd inside dom0 and the domUs? If I did that math correctly: a domU that is started with the estimated cpu_khz value of 2666766, and is migrated to another host with the estimated cpu_khz value of 2666755, would see 950400K fewer ticks during a day. How many ticks actually happen, noone knows. But since that amount of ticks represents a time span of ca. 0.3 seconds I would assume that ntpd can handle such drift. With larger differences, like the 200 KHz mentioned above, the drift will be larger. What I mean to say is: should the comparison of both hosts be based on different metrics, instead of a plain estimated value in 'cpu_khz'? Olaf pgpRDgCpvgsWi.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On Mon, Oct 01, 2018 at 03:25:36PM +0100, George Dunlap wrote: > On 10/01/2018 03:00 PM, Jan Beulich wrote: > On 01.10.18 at 15:38, wrote: > >> On 10/01/2018 12:25 PM, Jan Beulich wrote: > >>> I think the main concern > >>> was with the way migration of the new value was implemented. But I > >>> really have to defer to Andrew for that, irrespective of him not > >>> having responded (on the list) to prior pings. > >> > >> Is Andrew really the only person who knows enough about migration to > >> give this the thumbs-up? > > > > That's not the point here, at least afaic: He had voiced _some_ > > concern on an earlier version. In such a case it is, I think, only > > appropriate to wait with committing until there was indication > > that the concerns were sufficiently addressed (verbally or by > > adjustments to the code). > > Right -- but it's not your job to make sure the migration stuff is > properly addressed; it's Wei and Ian's job. Wei's R-b was a statement > from him that the code was good; when Andy questioned that, I think it > was then *Wei's* job to address the question, not yours or Andy's (or > even Olaf's). If Wei says, "I've considered Andy's objections and I > think the patch is fine as-is", then it can be checked in (given a > reasonable amount of time for Andy to respond); and Wei can own whatever > consequences there are. This patch touched more than toolstack code, that's why Jan gave his R-b in the first place. The contention is not on the correctness of the code, but on if this mechanism had unintended consequences. Both Jan and I thought the code was correct, but we didn't feel comfortable enough to ignore objections. Sorry Olaf. Wei. > > -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 17:17, wrote: > On 01/10/18 14:29, Jan Beulich wrote: > On 01.10.18 at 14:39, wrote: >>> On 07/06/18 14:08, Olaf Hering wrote: Add an option to control when vTSC emulation will be activated for a domU with tsc_mode=default. Without such option each TSC access from domU will be emulated, which causes a significant perfomance drop for workloads that make use of rdtsc. One option to avoid the TSC option is to run domUs with tsc_mode=native. This has the drawback that migrating a domU from a "2.3GHz" class host to a "2.4GHz" class host may change the rate at wich the TSC counter increases, the domU may not be prepared for that. With the new option the host admin can decide how a domU should behave when it is migrated across systems of the same class. Since there is always some jitter when Xen calibrates the cpu_khz value, all hosts of the same class will most likely have slightly different values. As a result vTSC emulation is unavoidable. Data collected during the incident which triggered this change showed a jitter of up to 200 KHz across systems of the same class. >>> Do you have any further details of the systems involved? If they are >>> identical systems, they should all have the same real TSC frequency, and >>> its a known issue that Xen isn't very good at working out the >>> frequency. TBH, fixing that would be far better overall. >> Are you convinced all parts match their nominal frequency without >> _any_ deviation? > > That is the intent of publishing the numbers, yes. > >> If that was the case, we could indeed use CPUID >> leaves 0x15 / 0x16 output, if available. > > We very much should be doing this. There are also model-specific ways > of getting the same data on older processors. > >> But I very much doubt this. >> As an example, here's what bare metal Linux says on my newest >> system: >> >> tsc: Detected 2600.000 MHz processor >> tsc: Refined TSC clocksource calibration: 2591.990 MHz >> >> Xen figures: >> >> (XEN) Detected 2592.107 MHz processor. >> >> And then after another re-boot bare metal Linux again >> >> tsc: Refined TSC clocksource calibration: 2592.008 MHz > > What is surprising here? The calibration loop is not 100% accurate and > cannot be made to be perfect. > > The fact that Linux and Xen agree is because they basically share the > same calibration algorithm - not that the processor is really running at > 2592MHz. And I'm not claiming it is. I'm merely voicing my doubt that the processor is running at exactly the announced 2600.000 MHz. In which case it is simply unknown whether calibrated or nominal values come closer to the truth. > For one, all calibration options will read slow by the amount > of time it takes an interrupt to propagate through the system fabric, > and there is basically nothing software can do to account for this. Well, if you measure under otherwise identical conditions twice the arrival of instances of the same signal, then its time to travel through the fabric doesn't matter for the distance in time between the arrival of the two instances. But of course this is as idealized as is an assumption that humans would be able to manufacture clocks ticking at exactly their nominal frequency. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 01/10/18 14:29, Jan Beulich wrote: On 01.10.18 at 14:39, wrote: >> On 07/06/18 14:08, Olaf Hering wrote: >>> Add an option to control when vTSC emulation will be activated for a >>> domU with tsc_mode=default. Without such option each TSC access from >>> domU will be emulated, which causes a significant perfomance drop for >>> workloads that make use of rdtsc. >>> >>> One option to avoid the TSC option is to run domUs with tsc_mode=native. >>> This has the drawback that migrating a domU from a "2.3GHz" class host >>> to a "2.4GHz" class host may change the rate at wich the TSC counter >>> increases, the domU may not be prepared for that. >>> >>> With the new option the host admin can decide how a domU should behave >>> when it is migrated across systems of the same class. Since there is >>> always some jitter when Xen calibrates the cpu_khz value, all hosts of >>> the same class will most likely have slightly different values. As a >>> result vTSC emulation is unavoidable. Data collected during the incident >>> which triggered this change showed a jitter of up to 200 KHz across >>> systems of the same class. >> Do you have any further details of the systems involved? If they are >> identical systems, they should all have the same real TSC frequency, and >> its a known issue that Xen isn't very good at working out the >> frequency. TBH, fixing that would be far better overall. > Are you convinced all parts match their nominal frequency without > _any_ deviation? That is the intent of publishing the numbers, yes. > If that was the case, we could indeed use CPUID > leaves 0x15 / 0x16 output, if available. We very much should be doing this. There are also model-specific ways of getting the same data on older processors. > But I very much doubt this. > As an example, here's what bare metal Linux says on my newest > system: > > tsc: Detected 2600.000 MHz processor > tsc: Refined TSC clocksource calibration: 2591.990 MHz > > Xen figures: > > (XEN) Detected 2592.107 MHz processor. > > And then after another re-boot bare metal Linux again > > tsc: Refined TSC clocksource calibration: 2592.008 MHz What is surprising here? The calibration loop is not 100% accurate and cannot be made to be perfect. The fact that Linux and Xen agree is because they basically share the same calibration algorithm - not that the processor is really running at 2592MHz. For one, all calibration options will read slow by the amount of time it takes an interrupt to propagate through the system fabric, and there is basically nothing software can do to account for this. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 10/01/2018 03:00 PM, Jan Beulich wrote: On 01.10.18 at 15:38, wrote: >> On 10/01/2018 12:25 PM, Jan Beulich wrote: >>> I think the main concern >>> was with the way migration of the new value was implemented. But I >>> really have to defer to Andrew for that, irrespective of him not >>> having responded (on the list) to prior pings. >> >> Is Andrew really the only person who knows enough about migration to >> give this the thumbs-up? > > That's not the point here, at least afaic: He had voiced _some_ > concern on an earlier version. In such a case it is, I think, only > appropriate to wait with committing until there was indication > that the concerns were sufficiently addressed (verbally or by > adjustments to the code). Right -- but it's not your job to make sure the migration stuff is properly addressed; it's Wei and Ian's job. Wei's R-b was a statement from him that the code was good; when Andy questioned that, I think it was then *Wei's* job to address the question, not yours or Andy's (or even Olaf's). If Wei says, "I've considered Andy's objections and I think the patch is fine as-is", then it can be checked in (given a reasonable amount of time for Andy to respond); and Wei can own whatever consequences there are. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 15:38, wrote: > On 10/01/2018 12:25 PM, Jan Beulich wrote: >> I think the main concern >> was with the way migration of the new value was implemented. But I >> really have to defer to Andrew for that, irrespective of him not >> having responded (on the list) to prior pings. > > Is Andrew really the only person who knows enough about migration to > give this the thumbs-up? That's not the point here, at least afaic: He had voiced _some_ concern on an earlier version. In such a case it is, I think, only appropriate to wait with committing until there was indication that the concerns were sufficiently addressed (verbally or by adjustments to the code). It is instead my incomplete knowledge on the various parts of migration that made it undesirable to try to judge in his place, after pings lead no- where. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]
On 01/10/18 14:24, Ian Jackson wrote: > >>> handle_tsc_info has no code to verify that padding is indeed zero. Due >>> to the lack of a version field it is impossible to know if the sender >>> already has the newly introduced vtsc_tolerance field. In the worst >>> case the receiving domU will get an unemulated TSC. >> The lack of padding verification is deliberate, for forwards >> compatibility. Why does the sending code matter? One way or another, >> if the field is 0, the option wasn't present or wasn't configured. >> Neither of these situations affect the decision-making that the >> receiving side needs to perform. > We are talking here about an unused field that is (supposedly?) > always sent as zero ? The spec requires the padding to always be sent as zero, and verify-stream-v2 does check the padding and object to non-zero values. libxc's write_tsc_info() has always fully zeroed the structure, and convert-legacy-stream also behaves the same way. However, I notice this change will break migration from pre 4.6 as write_libxc_tsc_info() now has a mismatched number of parameters to pack and throw an exception. > > AIUI: > > In the new code the semantics of zero is "do not allow any tolerance". > The old code handles this correctly. The issue is with migrations > from new code to old: the tolerance value will silently ignored. Migrating from new to old doesn't remotely work, and whether things explode or not is very context dependent. I'm not sure its worth caring about this case. We never have before. (Getting backwards migration working for emergency cases is on my TODO list, but it is dependent on rewriting the hypervisor interfaces for getting/setting guest state.) ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 10/01/2018 12:25 PM, Jan Beulich wrote: On 01.10.18 at 12:52, wrote: >> Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to >> avoid TSC emulation"): >>> Am Thu, 13 Sep 2018 09:39:13 +0200 >>> schrieb Olaf Hering : this patch was not applied yet, even after a few "pings". >>> >>> No reaction since months. >>> So scrap that patch, just in case it is still part of someones to-consider >> queue. >> >> I think it would be worth exploring whether this patch could be >> applied without an explicit ack from Andrew. >> >> I confess I ignored all the previous mails because they all started >>Andrew, >> or >>Andrew, Lars, >> so I assumed that you didn't want attention from other >> maintainers/committers. >> >> Now that I look at the thread it is difficult for me to see the wood >> for the trees but I don't see unanswered concerns. > > Problem is - discussion around this was (iirc) happening not only on > the list, but also on irc (including perhaps private chats). It was for > that reason that I made my R-b conditional upon Andrew at least > giving an informal go-ahead (otherwise, together with Wei's R-b, > the patch could have gone in). > > Besides the question of correctness from the perspective of guests > (which imo is not a problem as the feature needs to be actively > enabled, and I think we have no reason to keep admins from > breaking their guests if they really mean to), I agree with this, BTW. > I think the main concern > was with the way migration of the new value was implemented. But I > really have to defer to Andrew for that, irrespective of him not > having responded (on the list) to prior pings. Is Andrew really the only person who knows enough about migration to give this the thumbs-up? Technically migration is in the toolstack, and so Wei should have double-checked that before giving his review; and when a question was raised, Wei (as the relevant maintainer who had given it an R-b) should either have asserted that the code was indeed correct, or withdrawn his R-b and given Olaf feedback to allow him to get it into shape. (This is all based on the history sketched out by Jan above; if there is more to it then of course this analysis may not be correct.) I was only skimming the thread, and intended to weigh in at some point; but I didn't really understand why it was blocked on Andrew in the first place. -George ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 14:39, wrote: > On 07/06/18 14:08, Olaf Hering wrote: >> Add an option to control when vTSC emulation will be activated for a >> domU with tsc_mode=default. Without such option each TSC access from >> domU will be emulated, which causes a significant perfomance drop for >> workloads that make use of rdtsc. >> >> One option to avoid the TSC option is to run domUs with tsc_mode=native. >> This has the drawback that migrating a domU from a "2.3GHz" class host >> to a "2.4GHz" class host may change the rate at wich the TSC counter >> increases, the domU may not be prepared for that. >> >> With the new option the host admin can decide how a domU should behave >> when it is migrated across systems of the same class. Since there is >> always some jitter when Xen calibrates the cpu_khz value, all hosts of >> the same class will most likely have slightly different values. As a >> result vTSC emulation is unavoidable. Data collected during the incident >> which triggered this change showed a jitter of up to 200 KHz across >> systems of the same class. > > Do you have any further details of the systems involved? If they are > identical systems, they should all have the same real TSC frequency, and > its a known issue that Xen isn't very good at working out the > frequency. TBH, fixing that would be far better overall. Are you convinced all parts match their nominal frequency without _any_ deviation? If that was the case, we could indeed use CPUID leaves 0x15 / 0x16 output, if available. But I very much doubt this. As an example, here's what bare metal Linux says on my newest system: tsc: Detected 2600.000 MHz processor tsc: Refined TSC clocksource calibration: 2591.990 MHz Xen figures: (XEN) Detected 2592.107 MHz processor. And then after another re-boot bare metal Linux again tsc: Refined TSC clocksource calibration: 2592.008 MHz Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation [and 1 more messages]
Jan Beulich writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation"): > Problem is - discussion around this was (iirc) happening not only on > the list, but also on irc (including perhaps private chats). Hrm. Well, if it didn't happen on the list, it didn't happen. It is often useful and productive, of course, to thrash something out on irc, or ping there, or whatever but: Conclusions from irc (and from in-person conversations, phone calls, or other kinds of un-minuted discussions) must be transferred to email as otherwise they are lost (and the effort of having them is often wasted as the conversation has to be had again). If properly writing up the conclusion from an irc conversation is too hard, pasting a transcript into an email seems a bare minimum. > It was for that reason that I made my R-b conditional upon Andrew at > least giving an informal go-ahead (otherwise, together with Wei's > R-b, the patch could have gone in). Right. Thanks for the clarification. Andrew Cooper writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation"): > [lots of stuff] Thanks for the review. I hope Olaf will be able to address most of it, but: Andrew: > [Olaf:] > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > > The padding is sent as zero in write_tsc_info to the receving host. > > The padding is undefined if the changed code runs as receiver. > > I'm not sure what you mean by this final sentence. Me neither. > > handle_tsc_info has no code to verify that padding is indeed zero. Due > > to the lack of a version field it is impossible to know if the sender > > already has the newly introduced vtsc_tolerance field. In the worst > > case the receiving domU will get an unemulated TSC. > > The lack of padding verification is deliberate, for forwards > compatibility. Why does the sending code matter? One way or another, > if the field is 0, the option wasn't present or wasn't configured. > Neither of these situations affect the decision-making that the > receiving side needs to perform. We are talking here about an unused field that is (supposedly?) always sent as zero ? AIUI: In the new code the semantics of zero is "do not allow any tolerance". The old code handles this correctly. The issue is with migrations from new code to old: the tolerance value will silently ignored. Presumably this is considered preferable to the alternative, which is to extend the migration stream in a deliberately incompatible way so that the migration fails. I think this is what Olaf is trying to say ? > > Signed-off-by: Olaf Hering > > Reviewed-by: Wei Liu (v07/v08) > > Reviewed-by: Jan Beulich (v08) > > I'm still -0.5 for this patch. I can appreciate why you want it, but it > is a gross hack which only works when you don't skew time more than NTP > in the guest can cope with. That surely is why there is a limit to the tolerance. I've asked Olaf to try to quantify an appropriate limit. > My gut feeling is that there will be other > more subtle fallout. That's not particularly convincing to me. But if you could try to be more specific I think this could be usefull addressed in the documentation for the feature ? Olaf, I think the ball is now in your court. I hope you can address Andrew's comments, and maybe mine too. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 07/06/18 14:08, Olaf Hering wrote: > Add an option to control when vTSC emulation will be activated for a > domU with tsc_mode=default. Without such option each TSC access from > domU will be emulated, which causes a significant perfomance drop for > workloads that make use of rdtsc. > > One option to avoid the TSC option is to run domUs with tsc_mode=native. > This has the drawback that migrating a domU from a "2.3GHz" class host > to a "2.4GHz" class host may change the rate at wich the TSC counter > increases, the domU may not be prepared for that. > > With the new option the host admin can decide how a domU should behave > when it is migrated across systems of the same class. Since there is > always some jitter when Xen calibrates the cpu_khz value, all hosts of > the same class will most likely have slightly different values. As a > result vTSC emulation is unavoidable. Data collected during the incident > which triggered this change showed a jitter of up to 200 KHz across > systems of the same class. Do you have any further details of the systems involved? If they are identical systems, they should all have the same real TSC frequency, and its a known issue that Xen isn't very good at working out the frequency. TBH, fixing that would be far better overall. > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > The padding is sent as zero in write_tsc_info to the receving host. > The padding is undefined if the changed code runs as receiver. I'm not sure what you mean by this final sentence. > handle_tsc_info has no code to verify that padding is indeed zero. Due > to the lack of a version field it is impossible to know if the sender > already has the newly introduced vtsc_tolerance field. In the worst > case the receiving domU will get an unemulated TSC. The lack of padding verification is deliberate, for forwards compatibility. Why does the sending code matter? One way or another, if the field is 0, the option wasn't present or wasn't configured. Neither of these situations affect the decision-making that the receiving side needs to perform. > > Signed-off-by: Olaf Hering > Reviewed-by: Wei Liu (v07/v08) > Reviewed-by: Jan Beulich (v08) I'm still -0.5 for this patch. I can appreciate why you want it, but it is a gross hack which only works when you don't skew time more than NTP in the guest can cope with. My gut feeling is that there will be other more subtle fallout. As for the implementation itself, a few trivial comments. > diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7 > index 3bbc96f201..122ae36679 100644 > --- a/docs/man/xen-tscmode.pod.7 > +++ b/docs/man/xen-tscmode.pod.7 > @@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated > > =back > > +If the tsc_mode is set to "default" the decision to emulate TSC can be > +tweaked further with the "vtsc_tolerance_khz" option. > + > To understand this in more detail, the rest of this document must > be read. > > @@ -211,6 +214,19 @@ is emulated. Note that, though emulated, the "apparent" > TSC frequency > will be the TSC frequency of the initial physical machine, even after > migration. > > +Since the calibration of the TSC frequency may not be 100% accurate, the > +exact value of the frequency can change even across reboots. It can change across reboots for other reasons, e.g. firmware settings. I'd phrase this as "Since the calibration of the TSC frequency isn't 100% accurate, the value measured by Xen can vary across reboots". > This means > +also several otherwise identical systems can have a slightly different > +TSC frequency. As a result TSC access will be emulated if a domU is > +migrated from one host to another, identical host. To avoid the > +performance impact of TSC emulation a certain tolerance of the measured > +host TSC frequency can be specified with "vtsc_tolerance_khz". If the > +measured "cpu_khz" value is within the tolerance range, TSC access > +remains native. Otherwise it will be emulated. This allows to migrate > +domUs between identical hardware. If the domU will be migrated to a > +different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system, > +TSC will be emualted to maintain the TSC frequency expected by the domU. > + > For environments where both TSC-safeness AND highest performance > even across migration is a requirement, application code can be specially > modified to use an algorithm explicitly designed into Xen for this purpose. > diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in > index 47d88243b1..995277794f 100644 > --- a/docs/man/xl.cfg.pod.5.in > +++ b/docs/man/xl.cfg.pod.5.in > @@ -1898,6 +1898,16 @@ determined in a similar way to that of B TSC > mode. > > Please see B for more information on this option. > > +=item B > + > +B<(x86 only, relevant only for tsc_mode=default)> > +When a domU is started, the CPU frequency of the host is used by the domU for > +TSC related time
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 01.10.18 at 12:52, wrote: > Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to > avoid TSC emulation"): >> Am Thu, 13 Sep 2018 09:39:13 +0200 >> schrieb Olaf Hering : >> > this patch was not applied yet, even after a few "pings". >> >> No reaction since months. >> So scrap that patch, just in case it is still part of someones to-consider > queue. > > I think it would be worth exploring whether this patch could be > applied without an explicit ack from Andrew. > > I confess I ignored all the previous mails because they all started >Andrew, > or >Andrew, Lars, > so I assumed that you didn't want attention from other > maintainers/committers. > > Now that I look at the thread it is difficult for me to see the wood > for the trees but I don't see unanswered concerns. Problem is - discussion around this was (iirc) happening not only on the list, but also on irc (including perhaps private chats). It was for that reason that I made my R-b conditional upon Andrew at least giving an informal go-ahead (otherwise, together with Wei's R-b, the patch could have gone in). Besides the question of correctness from the perspective of guests (which imo is not a problem as the feature needs to be actively enabled, and I think we have no reason to keep admins from breaking their guests if they really mean to), I think the main concern was with the way migration of the new value was implemented. But I really have to defer to Andrew for that, irrespective of him not having responded (on the list) to prior pings. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Olaf Hering writes ("Re: [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation"): > Am Thu, 13 Sep 2018 09:39:13 +0200 > schrieb Olaf Hering : > > this patch was not applied yet, even after a few "pings". > > No reaction since months. > So scrap that patch, just in case it is still part of someones to-consider > queue. I think it would be worth exploring whether this patch could be applied without an explicit ack from Andrew. I confess I ignored all the previous mails because they all started Andrew, or Andrew, Lars, so I assumed that you didn't want attention from other maintainers/committers. Now that I look at the thread it is difficult for me to see the wood for the trees but I don't see unanswered concerns. I guess the patch should maybe therefore be committed. As committer, before I did that despite a missing ack, I would want to (i) try to understand if possibly why the ack was missing (ii) do my own review to satisfy any doubts (iii) give the maintainer a clear tike interval to say "nack". As for (i) (reasons for lack of ack), see above, and it's not clear to me; but AFAICT there are two difficulties. One (see (a) below) is with the principle of the feature. One (see (b) below) is about a detail of the implementation. As for (ii) (own review), see below. That addresses (a) and (b). As for (iii): If anyone actually objects to this patch for some different reason, or to my proposed approach, please comment in the next 7 days. (a) Principle I read the documentation for the new feature and it seems to make sense. But it would be nice for the documentation to explain what is considered a likely safe and sufficient value. For example, one might say something like this: Typical TSC speed variation between supposedly identical hosts is about X%. A Unix guest running NTP for time synchronisation can cope with clock drift rates of up to about Y%. So a vtsc_tolerance_khz setting between these two values is likely to be effective for migration between "identical" hosts, and not disruptive. Olaf, please could you fill in the blanks in that text, and consider what the exact wording should be (depending on what research you conducted etc.), and then consider whether to put it in the documentation, commit meesage, or just on the mailing list ? Personally I think documentation would be ideal if we have firm information but if firm information is hard to come by and all we have is empirical data, then maybe a commit message describing those experiences would be sufficient. (b) Protocol compatibility Olaf: > > On 07.06.18 at 16:49, wrote: > > > Am Thu, 07 Jun 2018 08:44:41 -0600 > > > schrieb "Jan Beulich" : > > >> The re-use of the field is acceptable only if all existing > > >> senders reliably fill zero in there. > > > > > > How do we know all senders? I just know about write_tsc_info > > > from xen-4.6+. > > > > I don't think we care about senders other than ones using libxc, so > > by knowing whether all libxc versions conform to the requirement, > > we could declare we're fine. > > Yes, I think we are fine. And if migration from pre-4.6 is supported > anyway is another question. Most likely the answer is no. This comment from Olaf "Yes, I think we are fine" is not particularly reassuring. Olaf, can you please say something more definite about how you verified whether your assertion is true. And yes, we do mean also for versions from pre-4.6. It is not supported, but it must not go wrong in unexpected ways. You can use the git history to see older implementations in libxc. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Am Thu, 13 Sep 2018 09:39:13 +0200 schrieb Olaf Hering : > this patch was not applied yet, even after a few "pings". No reaction since months. So scrap that patch, just in case it is still part of someones to-consider queue. Olaf pgpFtD6Pj968F.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Andrew, Lars, this patch was not applied yet, even after a few "pings". I think it needs an approval from Andrew before it can go in, unless the bug has to be fixed in some other way. Please take some time to review that change. Thanks, Olaf Am Wed, 1 Aug 2018 18:17:47 +0200 schrieb Olaf Hering : > Am Tue, 26 Jun 2018 19:11:13 +0200 > schrieb Olaf Hering : > > Am Thu, 7 Jun 2018 15:08:29 +0200 > > schrieb Olaf Hering : > > > Add an option to control when vTSC emulation will be activated for a > > > domU with tsc_mode=default. Without such option each TSC access from > > > domU will be emulated, which causes a significant perfomance drop for > > > workloads that make use of rdtsc. > > Andrew, > > do you have any comments on this patch/approach? > Andrew, > any comments on this patch? > https://lists.xen.org/archives/html/xen-devel/2018-06/msg00398.html pgp5Rh4d0PxsH.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Am Tue, 26 Jun 2018 19:11:13 +0200 schrieb Olaf Hering : > Am Thu, 7 Jun 2018 15:08:29 +0200 > schrieb Olaf Hering : > > Add an option to control when vTSC emulation will be activated for a > > domU with tsc_mode=default. Without such option each TSC access from > > domU will be emulated, which causes a significant perfomance drop for > > workloads that make use of rdtsc. > Andrew, > do you have any comments on this patch/approach? Andrew, any comments on this patch? https://lists.xen.org/archives/html/xen-devel/2018-06/msg00398.html Olaf pgpIYIYqVDtEN.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Am Thu, 7 Jun 2018 15:08:29 +0200 schrieb Olaf Hering : > Add an option to control when vTSC emulation will be activated for a > domU with tsc_mode=default. Without such option each TSC access from > domU will be emulated, which causes a significant perfomance drop for > workloads that make use of rdtsc. Andrew, do you have any comments on this patch/approach? Olaf pgpclOaoS7m1Q.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On Thu, Jun 07, Jan Beulich wrote: > >>> On 07.06.18 at 16:49, wrote: > > Am Thu, 07 Jun 2018 08:44:41 -0600 > > schrieb "Jan Beulich" : > >> The re-use of the field is acceptable only if all existing senders reliably > >> fill zero in there. > > How do we know all senders? I just know about write_tsc_info from xen-4.6+. > I don't think we care about senders other than ones using libxc, so > by knowing whether all libxc versions conform to the requirement, > we could declare we're fine. Yes, I think we are fine. And if migration from pre-4.6 is supported anyway is another question. Most likely the answer is no. Olaf signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 07.06.18 at 16:49, wrote: > Am Thu, 07 Jun 2018 08:44:41 -0600 > schrieb "Jan Beulich" : > >> The re-use of the field is acceptable only if all existing senders reliably >> fill zero in there. > > How do we know all senders? I just know about write_tsc_info from xen-4.6+. I don't think we care about senders other than ones using libxc, so by knowing whether all libxc versions conform to the requirement, we could declare we're fine. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
Am Thu, 07 Jun 2018 08:44:41 -0600 schrieb "Jan Beulich" : > The re-use of the field is acceptable only if all existing senders reliably > fill zero in there. How do we know all senders? I just know about write_tsc_info from xen-4.6+. Olaf pgp7WNmQ7W4Jd.pgp Description: Digitale Signatur von OpenPGP ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 07.06.18 at 15:47, wrote: > Am Thu, 07 Jun 2018 07:31:26 -0600 > schrieb "Jan Beulich" : > >> Hmm, I find this description concerning. Is the field reliably zero when >> coming from older Xen, or is it not? > > It depends on how old. > If the other sending side has write_tsc_info from 4.6+, then _res1 is zero. > If it is older, the XC_SAVE_ID_TSC_INFO is eventually not handled anymore. > At least I could not find the relevant code, even in staging-4.6. > But then, it is unlikely that migration from 4.5 to 4.6 did not work. > > Also, in general the input comes from "remote". So it is untrusted. But isn't this exactly the concern Andrew had voiced? If so, we're not any closer to this having a chance to go in. The re-use of the field is acceptable only if all existing senders reliably fill zero in there. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
>>> On 07.06.18 at 15:08, wrote: > Add an option to control when vTSC emulation will be activated for a > domU with tsc_mode=default. Without such option each TSC access from > domU will be emulated, which causes a significant perfomance drop for > workloads that make use of rdtsc. > > One option to avoid the TSC option is to run domUs with tsc_mode=native. > This has the drawback that migrating a domU from a "2.3GHz" class host > to a "2.4GHz" class host may change the rate at wich the TSC counter > increases, the domU may not be prepared for that. > > With the new option the host admin can decide how a domU should behave > when it is migrated across systems of the same class. Since there is > always some jitter when Xen calibrates the cpu_khz value, all hosts of > the same class will most likely have slightly different values. As a > result vTSC emulation is unavoidable. Data collected during the incident > which triggered this change showed a jitter of up to 200 KHz across > systems of the same class. > > Existing padding fields are reused to store vtsc_khz_tolerance as u16. > The padding is sent as zero in write_tsc_info to the receving host. > The padding is undefined if the changed code runs as receiver. > handle_tsc_info has no code to verify that padding is indeed zero. Due > to the lack of a version field it is impossible to know if the sender > already has the newly introduced vtsc_tolerance field. In the worst > case the receiving domU will get an unemulated TSC. Hmm, I find this description concerning. Is the field reliably zero when coming from older Xen, or is it not? > Signed-off-by: Olaf Hering > Reviewed-by: Wei Liu (v07/v08) > Reviewed-by: Jan Beulich (v08) To avoid this getting committed prematurely: I gave this R-b pending Andrew finding his prior concerns addressed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel