Bug#534978: clock drift in Xen domU with clocksource=xen
Hello,

On 04.03.2010 at 17:24, Josip Rodin j...@debbugs.entuzijast.net wrote:
> On Thu, Mar 04, 2010 at 05:21:31PM +0100, Markus Hochholdinger wrote:
> > > In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.
> > Do you have any service like ntp running on these boxes? What will the app do if ntp corrects the time?
> NTP always corrects time in a very subtle manner (see its documentation). I guess it's possible for it to screw with this, yet it never has.

For my part, I try to run ntp only on the dom0 and not on the domUs. I never had a problem with this running 2.6.18 as dom0, nor with 2.6.26 as dom0.

With older xen kernels (I believe it was 2.6.16) I had the problem that the domU clock went wrong while the dom0 clock was correct, so at that time I changed independent_wallclock to 1 and set up ntp in the domU. Some of these domUs still run today with independent_wallclock 1 and ntp inside the domU without problems.

Another note about stability: since linux-image-2.6-xen-686 (2.6.26-20, October 2009) my dom0s with the lenny dom0 kernel have been stable and had no crashes (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=542250). I am now trying to use this kernel as the domU kernel, but so far (2.6.26-22) it isn't stable enough for me, see also http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=534880 .

Also, some combination of java, debian 3.1 and kernel doesn't work with 2.6.26: java segfaults, but only when started under 2.6.26 and not when started under 2.6.18. The same java version works on older and newer debian systems. So for now I will use 2.6.18 (redhat xen flavor) for my domUs.

linux-image-2.6.32-4-686 also seems very good as a domU kernel, but I only made a short test: live migration works, mem-set works, and detaching and attaching a block device works as expected.

--
greetings
eMHa
Bug#534978: clock drift in Xen domU with clocksource=xen
Hi,

On 23.02.2010 at 22:36, Moritz Muehlenhoff j...@inutil.org wrote:
> On Tue, Feb 23, 2010 at 02:06:41PM +0100, Markus Hochholdinger wrote:
> > Here is my solution to this problem, lenny xen kernel:
> > * dom0 with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
> > * domU with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
> > * ntpdate/ntp only in dom0, NOT in the domUs
> [..]
> It would be nice if you could add this information to http://wiki.debian.org/Xen

I've just added this as workaround #2. For me, there have been no problems so far.

--
greetings
eMHa
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Mar 04, 2010 at 05:21:31PM +0100, Markus Hochholdinger wrote:
> > In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.
> Do you have any service like ntp running on these boxes? What will the app do if ntp corrects the time?

NTP always corrects time in a very subtle manner (see its documentation). I guess it's possible for it to screw with this, yet it never has.

--
     2. That which causes joy or happiness.

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20100304162452.ga22...@orion.carnet.hr
Bug#534978: clock drift in Xen domU with clocksource=xen
Hi,

[..]
> In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.

Do you have any service like ntp running on these boxes? What will the app do if ntp corrects the time?

--
greetings
eMHa
Bug#534978: clock drift in Xen domU with clocksource=xen
Hi,

Markus Hochholdinger wrote:
> Here is my solution to this problem, lenny xen kernel:
> * dom0 with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
> * domU with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0

Using jiffies as a clock source is not a solution, it's a workaround, because its resolution (CONFIG_HZ^-1) is not good enough for reading microseconds, that is, time with microseconds will become just monotonic. This will cause problems for any program that wants its time readouts to be strictly increasing, as real-world time usually is :)

In other words you will get this (real example from a while ago):

  % i=0; while :; do i=$((i+1)); if [ $i = 20 ]; then break; fi; date --rfc-3339=ns; done
  2009-09-23 13:35:13.123400807+02:00
  2009-09-23 13:35:13.127400857+02:00
  2009-09-23 13:35:13.131400906+02:00
  2009-09-23 13:35:13.135400956+02:00
  2009-09-23 13:35:13.139401005+02:00
  2009-09-23 13:35:13.143401055+02:00
  2009-09-23 13:35:13.147401104+02:00
  2009-09-23 13:35:13.151401154+02:00
  2009-09-23 13:35:13.151401154+02:00
  2009-09-23 13:35:13.155401203+02:00
  2009-09-23 13:35:13.155401203+02:00
  2009-09-23 13:35:13.159401253+02:00
  2009-09-23 13:35:13.163401302+02:00
  2009-09-23 13:35:13.167401352+02:00
  2009-09-23 13:35:13.171401401+02:00
  2009-09-23 13:35:13.171401401+02:00
  2009-09-23 13:35:13.175401451+02:00
  2009-09-23 13:35:13.179401500+02:00
  2009-09-23 13:35:13.183401550+02:00

In my case this manifested itself when some PHP profiling via microtime() suddenly became useless, and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.

Having said that, if one doesn't switch to jiffies but wants to use live migration, that ends up being hampered by the "Time went backwards" problem.

--
     2. That which causes joy or happiness.
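The date loop above can also be expressed as a small Python check. This is only a sketch, not something posted in the thread: the function names sample_clock_ns and summarize are made up for illustration, and it assumes Python 3.7 or later for time.time_ns(). It counts how often two consecutive readings are identical and reports the smallest step the clock was seen to take.

```python
import time
from typing import List, Optional, Sequence, Tuple


def sample_clock_ns(count: int) -> List[int]:
    """Read the system wall clock `count` times in a tight loop (nanoseconds)."""
    return [time.time_ns() for _ in range(count)]


def summarize(readings: Sequence[int]) -> Tuple[int, Optional[int]]:
    """Return (number of repeated readings, smallest positive step seen)."""
    repeats = 0
    smallest_step: Optional[int] = None
    for previous, current in zip(readings, readings[1:]):
        step = current - previous
        if step == 0:
            repeats += 1  # the clock did not advance between two reads
        elif step > 0 and (smallest_step is None or step < smallest_step):
            smallest_step = step
    return repeats, smallest_step


if __name__ == "__main__":
    repeats, step = summarize(sample_clock_ns(1000))
    print(f"repeated readings: {repeats}, smallest step: {step} ns")
```

Applied to the 19 timestamps listed above, the same logic finds three repeated readings and steps of roughly 4 ms, which is what a tick-based clock would be expected to produce.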
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 01:33:05PM +0100, Bastian Blank wrote:
> On Thu, Feb 25, 2010 at 12:18:54PM +0100, Josip Rodin wrote:
> > Using jiffies as a clock source is not a solution, it's a workaround, because its resolution (CONFIG_HZ^-1) is not good enough for reading microseconds, that is, time with microseconds will become just monotonic. This will cause problems for any program that wants its time readouts to be strictly increasing, as real-world time usually is :)
>
> No. The time resolution is not defined and within one step it will always provide the same value.

What? :) The problem here is that a time readout function provides the same value across *two* steps. A monotonic function is one which allows for that. A strictly increasing function is one which does not. Most of the time, just monotonic is okay, but not always.

> > and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.
>
> Uh, where is it documented that this is supposed to work anyway?

The key column has a unique constraint and a default value of current timestamp. Even if two perfectly concurrent writers come in to add a new record, it's still logical to expect them to be serialized to a minimal extent, because the database itself is explicitly instructed to input all values and maintain their uniqueness. The expectation that all updates take at least one minimal unit of time is perhaps not theoretically valid, but it's certainly like that in the real world (every action takes *some* perceivable time).

--
     2. That which causes joy or happiness.
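To make the distinction between "monotonic" and "strictly increasing" concrete, here is a hedged Python sketch of an application-level wrapper that turns a merely monotonic clock into a strictly increasing one within a single process. Nobody in this thread proposed it, the class name StrictlyIncreasingClock is hypothetical, and it does nothing for independent processes or database sessions, which would still need something like a sequence to guarantee unique keys.

```python
import threading
import time
from typing import Callable, Optional


def _wall_clock_us() -> int:
    """Current wall-clock time in whole microseconds."""
    return time.time_ns() // 1000


class StrictlyIncreasingClock:
    """Hand out microsecond timestamps that never repeat or go backwards.

    If the underlying clock returns a value that is not larger than the
    previous one, the previous value plus one microsecond is returned instead.
    """

    def __init__(self, clock: Callable[[], int] = _wall_clock_us) -> None:
        self._clock = clock
        self._last: Optional[int] = None
        self._lock = threading.Lock()  # serialize callers within one process

    def now_us(self) -> int:
        with self._lock:
            reading = self._clock()
            if self._last is not None and reading <= self._last:
                reading = self._last + 1  # bump past the previous value
            self._last = reading
            return reading
```

The price is that the returned values drift ahead of the real clock whenever calls arrive faster than the clock ticks, so this trades accuracy for uniqueness.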
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 12:18:54PM +0100, Josip Rodin wrote:
> Using jiffies as a clock source is not a solution, it's a workaround, because its resolution (CONFIG_HZ^-1) is not good enough for reading microseconds, that is, time with microseconds will become just monotonic. This will cause problems for any program that wants its time readouts to be strictly increasing, as real-world time usually is :)

No. The time resolution is not defined and within one step it will always provide the same value.

> and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.

Uh, where is it documented that this is supposed to work anyway?

Bastian

--
Killing is stupid; useless!
		-- McCoy, A Private Little War, stardate 4211.8
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 02:27:55PM +0100, Josip Rodin wrote:
> > No. The time resolution is not defined and within one step it will always provide the same value.
>
> What? :) The problem here is that a time readout function provides the same value across *two* steps. A monotonic function is one which allows for that. A strictly increasing function is one which does not. Most of the time, just monotonic is okay, but not always.

No, the time is only monotone, not strictly monotone. (With discreet values, it is not possible to make it strictly monotone.)

> > > and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.
> >
> > Uh, where is it documented that this is supposed to work anyway?
>
> The key column has a unique constraint and a default value of current timestamp. Even if two perfectly concurrent writers come in to add a new record, it's still logical to expect them to be serialized to a minimal extent, because the database itself is explicitly instructed to input all values and maintain their uniqueness. The expectation that all updates take at least one minimal unit of time is perhaps not theoretically valid, but it's certainly like that in the real world (every action takes *some* perceivable time).

Wrong answer. Where is this documented as working in the postgresql documentation?

Bastian

--
Not one hundred percent efficient, of course ... but nothing ever is.
		-- Kirk, Metamorphosis, stardate 3219.8
Bug#534978: clock drift in Xen domU with clocksource=xen
On Thu, Feb 25, 2010 at 03:01:41PM +0100, Bastian Blank wrote:
> > > No. The time resolution is not defined and within one step it will always provide the same value.
> >
> > What? :) The problem here is that a time readout function provides the same value across *two* steps. A monotonic function is one which allows for that. A strictly increasing function is one which does not. Most of the time, just monotonic is okay, but not always.
>
> No, the time is only monotone, not strictly monotone. (With discreet values, it is not possible to make it strictly monotone.)

You mean discrete. It's impossible to make it strictly monotone at a resolution that is smaller than the smallest unit of time (or one that converges to zero).

But anyway, the problem isn't just the monotonicity, it's simply that e.g. with HZ of 250, a jiffy takes 4ms, so if you need to do anything with something that takes a comparable amount of time, you're shit outta luck.

> > > > and it also caused occasional PostgreSQL errors with tables that had timestamp columns as keys, since it became possible for two independent transactions to come in at the exact same time.
> > >
> > > Uh, where is it documented that this is supposed to work anyway?
> >
> > The key column has a unique constraint and a default value of current timestamp. Even if two perfectly concurrent writers come in to add a new record, it's still logical to expect them to be serialized to a minimal extent, because the database itself is explicitly instructed to input all values and maintain their uniqueness. The expectation that all updates take at least one minimal unit of time is perhaps not theoretically valid, but it's certainly like that in the real world (every action takes *some* perceivable time).
>
> Wrong answer. Where is this documented as working in the postgresql documentation?

I have no idea. Why would I need exact documentation of this use case? The unique and default key parameters, and the definition of the timestamp data type, are documented. Indeed, I just checked and the resolution of a timestamp is explicitly documented as 1 microsecond, so if the underlying system has a resolution of 4000 microseconds, that simply precludes it.

If you're trying to argue that nobody should be using anything using microseconds because they're not supported by clocksource=jiffies, well, then we might as well cease this pointless discussion.

--
     2. That which causes joy or happiness.
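The arithmetic behind the 4 ms figure can be written down as a short Python sketch. It is an illustration only: the helper names tick_us, quantize_us and collides are invented here, and the integer division assumes an HZ value that divides one million evenly (100, 250 and 1000 do). It shows that two events less than one tick apart can end up with the same microsecond timestamp once the clock only advances per tick.

```python
def tick_us(hz: int) -> int:
    """Length of one timer tick in microseconds (assumes hz divides 1_000_000)."""
    return 1_000_000 // hz


def quantize_us(t_us: int, hz: int) -> int:
    """Round a microsecond timestamp down to the start of its tick."""
    step = tick_us(hz)
    return (t_us // step) * step


def collides(a_us: int, b_us: int, hz: int) -> bool:
    """True if two distinct instants read as the same tick-based timestamp."""
    return quantize_us(a_us, hz) == quantize_us(b_us, hz)


if __name__ == "__main__":
    print(tick_us(250))                 # tick length for HZ=250
    print(collides(1_000, 3_500, 250))  # 2.5 ms apart, same tick
    print(collides(3_500, 4_100, 250))  # 0.6 ms apart, different ticks
```

Whether two nearby events collide therefore depends on where they fall relative to a tick boundary, not only on how far apart they are.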
Bug#534978: clock drift in Xen domU with clocksource=xen
Here is my solution to this problem, lenny xen kernel:
* dom0 with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
* domU with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
* ntpdate/ntp only in dom0, NOT in the domUs

I tested it the following way: changing the time in the dom0 with date and/or hwclock doesn't change the time in the domU, but changing the time in the dom0 with ntpdate/ntpd does change the time in the domU.

While ntpd is running in the dom0, I can change the time in the domU with date (hwclock --show in the domU prints nothing), but within 5 minutes the time in the domU is automatically corrected to the dom0 time. It seems ntp does this, because if I don't have ntpd running in the dom0, the changed time in the domU doesn't correct itself.

Hope this helps somebody. If this setup is stable for me for a few weeks, perhaps I'll write it up at http://wiki.debian.org/Xen#A.27clocksource.2BAC8-0.3ATimewentbackwards.27 .

I still don't understand why ntp on the dom0 can change the time in the domU while date and hwclock can't. But it works as I expect it to.

--
greetings
eMHa
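A quick way to check whether a machine matches the settings listed above is sketched below in Python. The two paths are the ones quoted in this thread; everything else (the function names, the root parameter used to make the check testable) is made up for illustration. Note that, as reported elsewhere in this bug, the independent_wallclock file only exists on some kernels, in which case the sketch simply reports no match.

```python
from pathlib import Path
from typing import Optional

# Paths as quoted in this thread, relative to the filesystem root.
CLOCKSOURCE = "sys/devices/system/clocksource/clocksource0/current_clocksource"
INDEPENDENT_WALLCLOCK = "proc/sys/xen/independent_wallclock"


def read_setting(root: Path, relative: str) -> Optional[str]:
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return (root / relative).read_text().strip()
    except OSError:
        return None


def matches_workaround(root: Path = Path("/")) -> bool:
    """True if clocksource is jiffies and independent_wallclock is 0."""
    return (
        read_setting(root, CLOCKSOURCE) == "jiffies"
        and read_setting(root, INDEPENDENT_WALLCLOCK) == "0"
    )


if __name__ == "__main__":
    root = Path("/")
    print("clocksource:", read_setting(root, CLOCKSOURCE))
    print("independent_wallclock:", read_setting(root, INDEPENDENT_WALLCLOCK))
    print("matches workaround:", matches_workaround(root))
```

The check would have to be run in the dom0 and in each domU separately, since each kernel has its own settings.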
Bug#534978: clock drift in Xen domU with clocksource=xen
On Tue, Feb 23, 2010 at 02:06:41PM +0100, Markus Hochholdinger wrote:
> Here is my solution to this problem, lenny xen kernel:
> * dom0 with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
> * domU with clocksource=jiffies and /proc/sys/xen/independent_wallclock=0
> * ntpdate/ntp only in dom0, NOT in the domUs
>
> I tested it the following way: changing the time in the dom0 with date and/or hwclock doesn't change the time in the domU, but changing the time in the dom0 with ntpdate/ntpd does change the time in the domU.
>
> While ntpd is running in the dom0, I can change the time in the domU with date (hwclock --show in the domU prints nothing), but within 5 minutes the time in the domU is automatically corrected to the dom0 time. It seems ntp does this, because if I don't have ntpd running in the dom0, the changed time in the domU doesn't correct itself.
>
> Hope this helps somebody. If this setup is stable for me for a few weeks, perhaps I'll write it up at http://wiki.debian.org/Xen#A.27clocksource.2BAC8-0.3ATimewentbackwards.27 .
>
> I still don't understand why ntp on the dom0 can change the time in the domU while date and hwclock can't. But it works as I expect it to.

It would be nice if you could add this information to http://wiki.debian.org/Xen

Cheers,
        Moritz
Bug#534978: clock drift in Xen domU with clocksource=xen
severity 534978 normal
thanks

I've made some more progress in understanding this behaviour, and have now figured out a workaround. I find the documentation at http://wiki.debian.org/Xen very misleading in several respects.

The domU kernel is receiving time info from the hypervisor as it should. My earlier suspicion that the vcpu_info wasn't being updated turned out to be both unfounded (I had missed the significance of vcpu_info placement) and irrelevant (even without the vcpu_info updates the time as computed by pvclock_read_wallclock() drifts several orders of magnitude more slowly than observed).

Linux's generic time code (in kernel/time/timekeeping.c), however, doesn't use the value from pvclock_read_wallclock() directly; instead, it uses the clocksource (xen_clocksource in this case) to compute an increment to the xtime kernel variable. For some reason I haven't fully worked out, this results in the drift I observed. I still suspect there is a bug here (the accuracy of the time calculation ought to be better than this) so I'm downgrading the severity to normal rather than wishlist.

The good news is that the NTP support code can correct for this. My workaround is therefore to run NTP in the domU. It is neither possible nor necessary to set xen.independent_wallclock=1 (that parameter is only supported by the featureset=xen kernels, which include the SuSE patch), and it is neither necessary nor desirable to change the domU clocksource to jiffies (I tried that and found that the time accuracy got much worse).

I had been led to expect (again, by the wiki page, but also by common sense) that the domU was meant to get its clock from Xen and didn't need to run NTP. This appears to only be the case when the domU kernel includes the SuSE patch, not for the pv_ops-based approach. I would very much appreciate being relieved of the need to run NTP in each and every domU: that seems wasteful.
Bug#534978: clock drift in Xen domU with clocksource=xen
I think I've made some progress towards figuring out what's going on here.

First I looked at the Xen mini-os kernel, which keeps time correctly. I added a few printk()s to gettimeofday() and saw that of the values in the HYPERVISOR_shared_info structure, the vcpu_info data change often (never more than a handful of seconds between version increments) while the wallclock timestamp is updated more rarely.

Then I hacked together a Linux kernel module that adds support for /proc/xeninfo, exposing (if I did it right) the contents of the shared_info structure. What I'm seeing is the same occasional updating of the wallclock timestamp (the values are consistent with what I see in the mini-os domU) but the vcpu_info (for virtual CPU 0; data for the other VCPUs are all zeros throughout, as I believe is normal for a single-processor VM) remains stuck at version 2.

A caveat here is that while I'm confident (based on the data; the shift value is also right, the multiplier is in the right ballpark) that I've found a shared_info structure, I'm not sure I got the right one. The kernel doesn't seem to export all the symbols needed to find its HYPERVISOR_shared_info structure, and it needs to be mapped into memory in a special way; it's conceivable that I did something wrong here, even though I tried to reuse/imitate existing kernel code as much as I could.

Anyway, the secular clock drift this bug is about seems consistent with a failure to receive updates to the vcpu_info data. Is the hypervisor somehow discriminating against Linux domUs by not updating the data, or does the domU kernel need to do something more in order to see the updates?

The problem is also reproducible with the 2.6.30-bpo.1 kernel (source code from backports.org, recompiled locally), by the way.
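For readers unfamiliar with the "shift value" and "multiplier" mentioned above, the following Python sketch shows how a paravirtualized guest is generally understood to turn a TSC delta into nanoseconds from the per-VCPU time record. It is written from memory of the Xen public interface rather than taken from this report, so the field names (tsc_timestamp, system_time, tsc_to_system_mul, tsc_shift) and the exact formula should be treated as assumptions to verify against the Xen headers.

```python
from dataclasses import dataclass


@dataclass
class VcpuTimeInfo:
    """Subset of the per-VCPU time record (field names are assumptions)."""
    version: int            # bumped by the hypervisor around each update
    tsc_timestamp: int      # TSC value when system_time was recorded
    system_time: int        # nanoseconds since boot at tsc_timestamp
    tsc_to_system_mul: int  # 32-bit fixed-point multiplier (fraction of 2**32)
    tsc_shift: int          # signed power-of-two pre-scaling of the TSC delta


def scale_delta(delta_cycles: int, mul_frac: int, shift: int) -> int:
    """Convert a TSC cycle count to nanoseconds using shift and multiplier."""
    if shift < 0:
        delta_cycles >>= -shift
    else:
        delta_cycles <<= shift
    return (delta_cycles * mul_frac) >> 32


def system_time_ns(info: VcpuTimeInfo, tsc_now: int) -> int:
    """Extrapolate the current system time from the last hypervisor update."""
    elapsed = scale_delta(
        tsc_now - info.tsc_timestamp, info.tsc_to_system_mul, info.tsc_shift
    )
    return info.system_time + elapsed
```

If the record were never refreshed, the guest would keep extrapolating from one old (tsc_timestamp, system_time) pair, so any small error in the multiplier would accumulate instead of being corrected at the next update.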
Bug#534978: clock drift in Xen domU with clocksource=xen
Package: linux-image-2.6.26-2-686-bigmem
Version: 2.6.26-15lenny3
Severity: important

I'm running this kernel in a Xen domU using the xen clocksource:

  # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  xen

The dom0 is running linux-image-2.6.26-2-xen-686 (same version 2.6.26-15lenny3), also with the xen clocksource, and an NTP client. My understanding of the documentation is that the domU's wall clock should be based on information passed (in shared memory) by the hypervisor, which in turn gets clock updates from dom0.

I'm observing that the domU's clock runs fast relative to dom0 and the rest of the world. Rebooting the domU causes its clock to be reset to the correct time.

Moreover, I've tried running Xen's mini-os.gz (not in Debian's binary packages of Xen, I built it from the extras/mini-os directory of the xen-3 source package) as another domU on the same system, and it printed correct timestamps. From this I deduce that the hypervisor's notion of time is correct, and that the problem must lie in how the domU kernel uses the information from the hypervisor.

So far I haven't observed the 'clocksource/0: Time went backwards' error message mentioned at http://wiki.debian.org/Xen . I know I could switch the domU to the jiffies clocksource and run NTP in it, but that's only a workaround.
xm info on the dom0 reports:

release                : 2.6.26-2-xen-686
version                : #1 SMP Thu May 28 18:35:28 UTC 2009
machine                : i686
nr_cpus                : 1
nr_nodes               : 1
cores_per_socket       : 1
threads_per_core       : 1
cpu_mhz                : 2399
hw_caps                : bfebfbff:::0080:0400
total_memory           : 2559
free_memory            : 32
node_to_cpu            : node0:0
xen_major              : 3
xen_minor              : 2
xen_extra              : -1
xen_caps               : xen-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xf580
xen_changeset          : unavailable
cc_compiler            : gcc version 4.3.1 (Debian 4.3.1-2)
cc_compile_by          : waldi
cc_compile_domain      : debian.org
cc_compile_date        : Sat Jun 28 15:25:00 UTC 2008
xend_config_format     : 4
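The report says the domU clock "runs fast" without giving a rate. One hedged way to put a number on such drift is to read a trusted reference clock (for example the dom0) and the domU clock at the start and end of an interval and express the difference in parts per million. The Python sketch below does just that; the function name drift_ppm is hypothetical and the figures in the example are invented, not measurements from this bug.

```python
def drift_ppm(ref_start: float, ref_end: float,
              local_start: float, local_end: float) -> float:
    """Drift of the local clock relative to the reference, in parts per million.

    Positive values mean the local clock runs fast, negative values slow.
    All four arguments are timestamps in seconds.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    local_elapsed = local_end - local_start
    return (local_elapsed - ref_elapsed) * 1e6 / ref_elapsed


if __name__ == "__main__":
    # Invented example: over one reference hour the local clock gained 1.8 s.
    print(drift_ppm(0.0, 3600.0, 0.0, 3601.8))
```

Longer measurement intervals give more meaningful figures, since the error in reading either clock becomes small compared with the accumulated drift.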