Re: time issues and ZFS
On 26 January 2013 02:15, Andriy Gapon a...@freebsd.org wrote:
> on 23/01/2013 18:20 Adrian Chadd said the following:
>> It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be a quirk of the older generation celeron hardware - in which case, we need to tell the user somehow..
>
> This is not software related at all. It's the hardware feature (or its absence). I wonder if your celerons report PBE feature.

What am I looking for?

And personally, requiring (much) more recent hardware to get sane/correct (but inefficient) behaviour out of the timekeeping framework is a little .. suboptimal. :)

Adrian
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: time issues and ZFS
on 27/01/2013 19:27 Adrian Chadd said the following:
> On 26 January 2013 02:15, Andriy Gapon a...@freebsd.org wrote:
>> on 23/01/2013 18:20 Adrian Chadd said the following:
>>> It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be a quirk of the older generation celeron hardware - in which case, we need to tell the user somehow..
>>
>> This is not software related at all. It's the hardware feature (or its absence). I wonder if your celerons report PBE feature.
>
> What am I looking for?

PBE in dmesg

> And personally, requiring (much) more recent hardware to get sane/correct (but inefficient) behaviour out of the timekeeping framework is a little .. suboptimal. :)

Well, I never knew about this issue before, but I always assumed that the reasonable behavior was the behavior. And I never encountered any evidence to the contrary.

-- Andriy Gapon
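To make "PBE in dmesg" concrete, here is a minimal sketch that scans boot-message text for a CPU feature flag. It assumes the usual `Features=0x...<FLAG,FLAG,...>` layout of the CPU identification lines; the `SAMPLE` text and the names `cpu_features` and `SAMPLE` are hypothetical and only serve the illustration.

```python
import re

def cpu_features(dmesg_text):
    """Collect the flag names listed inside <...> on 'Features' lines."""
    flags = set()
    for line in dmesg_text.splitlines():
        # Matches "Features=0x...<A,B>" as well as "Features2=0x...<C>".
        m = re.search(r"Features\d*=0x[0-9a-fA-F]+<([^>]*)>", line)
        if m:
            flags.update(m.group(1).split(","))
    return flags

# Hypothetical, heavily abbreviated excerpt; not taken from any machine
# discussed in this thread.
SAMPLE = """CPU: Example CPU (1000.00-MHz 686-class CPU)
  Features=0xbfebfbff<FPU,VME,TSC,APIC,ACPI,HTT,TM,PBE>
  Features2=0x1<SSE3>
"""

print("PBE" in cpu_features(SAMPLE))
```

On a real system the text would come from the output of dmesg (or /var/run/dmesg.boot) rather than from a literal string.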
Re: time issues and ZFS
on 23/01/2013 18:20 Adrian Chadd said the following:
> It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be a quirk of the older generation celeron hardware - in which case, we need to tell the user somehow..

This is not software related at all. It's the hardware feature (or its absence). I wonder if your celerons report PBE feature.

-- Andriy Gapon
Re: time issues and ZFS
on 22/01/2013 20:42 Adrian Chadd said the following:
> Hi!
>
> As I said before, the problem with non-HLT loops with event-timer in -9 and -head is that it calls the idle function inside a critical section (critical_enter and critical_exit) which blocks interrupts from occurring.
>
> The STI;HLT instruction pair on i386/amd64 atomically and correctly handles things from what I've been told. However, there's no atomic way to do this using ACPI sleeping, so there's a small window where an interrupt may come in but it isn't handled; waiting for the next interrupt to occur before it'll wake up and respond to that interrupt.

I don't think that this is true of x86 hardware in general. You might have hit some limitation or a quirk or a bug or an erratum for some particular hardware.

E.g. a chipset on this machine has a bit described as such: "Set to 1 to skip the C state transition if there is break event when entering C state." The bit is indeed set, and as far as I can tell the behavior matches the description.

Most modern (non-embedded) machines seem to behave this way. An attempt to enter a deeper C state while a break event is pending still incurs some overhead, but it's not as bad as waiting for the next break event.

> I kept hitting my head against this when doing network testing. :(
>
> Now - specifically for timekeeping it shouldn't matter; that's to do with whether the counters are reliable or not (and heck, are even in lock-step on CPUs.) But extra latency could show up weirdly, hence why I was asking for you to try different timer configurations and idle loops.

-- Andriy Gapon
Re: time issues and ZFS
On 23 January 2013 06:58, Andriy Gapon a...@freebsd.org wrote:
> I don't think that this is true of x86 hardware in general. You might have hit some limitation or a quirk or a bug or an erratum for some particular hardware.
>
> E.g. a chipset on this machine has a bit described as such: "Set to 1 to skip the C state transition if there is break event when entering C state." The bit is set indeed and as far as I can tell the behavior matches the description.
>
> Most modern (non-embedded) machines seem to behave this way. Attempt to enter a deeper C state while a break event is pending still incurs some overhead, but it's not as bad as waiting for the next break event.

I'll reverify the behaviour on my netbooks when I'm back home. It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be a quirk of the older generation celeron hardware - in which case, we need to tell the user somehow..

Adrian
Re: time issues and ZFS
On Jan 22, 2013, at 2:40 AM, Adrian Chadd adr...@freebsd.org wrote:
On Jan 21, 2013, at 4:33 AM, Daniel Braniss da...@cs.huji.ac.il wrote:
>> host: DELL PowerEdge R710, 16GB,

I administer a Dell PowerEdge R710 and I've been seeing the exact same thing. It's currently running FreeBSD 9.0-STABLE #0 r236355. It has a ZFS pool which sees moderate load most of the time but can be very high at times (when certain scripts run, etc.). I hadn't previously correlated the issue with ZFS load but that is very possible.

I set a cron job to restart ntpd when it dies (because the time difference exceeds the sanity check). The cron job runs every 20 minutes, but that varies greatly when the system stops counting. The time offset from ntpdate (which the script runs before restarting ntpd) varies a lot, but always in increments of 300 seconds. I've seen everything from 1200 to 23100. (Yes, that's 23 thousand seconds aka 6 hours 25 minutes that the system wasn't keeping time for.)

Sysctl kern.timecounter.hardware defaults to HPET. I experimented with setting it to ACPI-fast but the issue persisted so I put it back.

kern.timecounter.choice: TSC-low(-100) ACPI-fast(900) HPET(950) i8254(0) dummy(-100)

I first installed the box with an older 9.0-STABLE and this issue was not present. I have been tracking -STABLE on it (albeit irregularly) so I'm not sure when the issue came up.

> Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ?

I would love to resolve this and am able to do some experimenting. I've _usually_ been seeing the issue 2-3 times every 1-2 days, but I did just make some changes:

- disabling ZFS compression and deduplication on all pools
- updated to 9.1-STABLE from yesterday (r245821)

If the issue persists I will try changing some of the sysctls above and follow up with the result. If it goes away, I'll try to remember to report that too.

JN
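The arithmetic behind "increments of 300 seconds" and "6 hours 25 minutes" can be checked with a short sketch. The helper name `describe_offset` is made up for this illustration; the two offsets are the extremes quoted in the message above.

```python
def describe_offset(seconds, step=300):
    """Break an ntpdate offset into h/m/s and count the 300 s steps in it."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return {
        "steps": seconds // step,          # how many 300 s increments
        "exact": seconds % step == 0,      # True if a whole multiple
        "hms": (hours, minutes, secs),
    }

for offset in (1200, 23100):
    print(offset, describe_offset(offset))
```

Both values are whole multiples of 300 s, and 23100 s works out to 6 h 25 min, matching the figures in the message.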
Re: time issues and ZFS
On Jan 22, 2013, at 2:40 AM, Adrian Chadd adr...@freebsd.org wrote:
On Jan 21, 2013, at 4:33 AM, Daniel Braniss da...@cs.huji.ac.il wrote:
>>> host: DELL PowerEdge R710, 16GB,
>
> I administer a Dell PowerEdge R710 and I've been seeing the exact same thing.
> [...]
> If the issue persists I will try changing some of the sysctls above and follow up with the result. If it goes away, I'll try to remember to report that too.
>
> JN

set kern.eventtimer.timer=LAPIC
this solved it for me.

danny
Re: time issues and ZFS
Daniel,

Have you run tests with the machdep.idle value changed, and with kern.eventtimer.periodic / kern.eventtimer.idletick adjusted?

adrian
Re: time issues and ZFS
> Daniel,
>
> Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ?

Adrian, not yet, for several reasons:

1- as I explained, I can't really force the problem; it happens when we run some zfs scripts, like mirror, but I have to wait until enough changes have happened on the source, usually after 24 hours.
2- changing to LAPIC seems to have solved the problem.
3- I'm now learning all I can about event timers and you have not answered some of my questions :-)

danny
Re: time issues and ZFS
Thus spake Daniel Braniss da...@cs.huji.ac.il:
> In the meantime here is some info:
>
> Intel(R) Xeon(R) CPU E5645: running with no problems
> LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)
>
> Intel(R) Xeon(R) CPU X5550: this is the problematic, at least for the moment
> HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)

Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it...

Julian
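To compare the two lists side by side, the following sketch parses a kern.eventtimer.choice string into name/quality pairs and reports the entry with the highest number. That the highest-quality entry is the one used by default is an assumption here, though it is consistent with HPET being the default on the X5550; the function names `parse_choice` and `preferred` are invented for the example.

```python
import re

def parse_choice(choice):
    """Turn 'LAPIC(600) HPET(450)' into {'LAPIC': 600, 'HPET': 450}."""
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", choice)
    return {name: int(quality) for name, quality in pairs}

def preferred(choice):
    """Name with the highest quality value (assumed selection rule)."""
    table = parse_choice(choice)
    return max(table, key=table.get)

# Strings copied from the message above.
E5645 = "LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)"
X5550 = "HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)"

print("E5645:", preferred(E5645))
print("X5550:", preferred(X5550))
```

The only difference between the two machines is the LAPIC quality (600 versus 400), which is exactly what the question above is about.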
Re: time issues and ZFS
On Tue, Jan 22, 2013 at 7:27 AM, Julian Stecklina jstec...@os.inf.tu-dresden.de wrote:
> Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it...

On many processors the lapic timer does not work correctly in C-states deeper than C1. There are many processors that will automatically enter a C1E mode when the processor is idle, and in that state I have seen the lapic timer run slower than the programmed frequency, causing time to move too slowly on idle FreeBSD systems.
Re: time issues and ZFS
On 01/22/13 07:27, Julian Stecklina wrote:
> Thus spake Daniel Braniss da...@cs.huji.ac.il:
>> In the meantime here is some info:
>>
>> Intel(R) Xeon(R) CPU E5645: running with no problems
>> LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)
>>
>> Intel(R) Xeon(R) CPU X5550: this is the problematic, at least for the moment
>> HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)
>
> Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it...
>
> Julian

This may help:

The problem with the LAPIC timer is that it stops working when the CPU goes to C3 or a deeper idle state. These states are not enabled by default, so unless you enabled them explicitly, it is safe to use LAPIC. In any case, a present 9-STABLE system should prevent you from using an unsafe C-state if the LAPIC timer is used. From all other perspectives LAPIC is preferable, as it is faster and easier to operate than HPET. The latest CPUs fixed the LAPIC timer problem, so I don't think that switching to it will be pessimistic in the foreseeable future.

-- Alexander Motin
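The rule of thumb in the message above can be written down as a tiny predicate. This is only a sketch of the stated rule, not of what the kernel does: it assumes the deepest allowed idle state is given as a string of the form "C1", "C2", "C3" (as the hw.acpi.cpu.cx_lowest sysctl reports it on many systems), and the `always_running` parameter is a made-up stand-in for "newer CPU where the LAPIC timer problem is fixed".

```python
def lapic_timer_safe(cx_lowest, always_running=False):
    """Apply the rule: the LAPIC timer stops in C3 or deeper idle states.

    cx_lowest      -- deepest C-state the CPU may enter, e.g. "C1" or "C3"
                      (assumed format; other spellings are not handled)
    always_running -- hypothetical flag for CPUs without the problem
    """
    depth = int(cx_lowest.upper().lstrip("C"))
    return always_running or depth < 3

for state in ("C1", "C2", "C3"):
    print(state, lapic_timer_safe(state))
```

With the default of C1 the LAPIC timer is considered safe, which matches the statement that deeper states have to be enabled explicitly before the problem can appear.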
Re: time issues and ZFS
Hi!

As I said before, the problem with non-HLT loops with event-timer in -9 and -head is that it calls the idle function inside a critical section (critical_enter and critical_exit) which blocks interrupts from occurring.

The STI;HLT instruction pair on i386/amd64 atomically and correctly handles things from what I've been told. However, there's no atomic way to do this using ACPI sleeping, so there's a small window where an interrupt may come in but isn't handled; the CPU then waits for the next interrupt before it wakes up and responds to the first one.

I kept hitting my head against this when doing network testing. :(

Now - specifically for timekeeping it shouldn't matter; that's to do with whether the counters are reliable or not (and heck, are even in lock-step on CPUs.) But extra latency could show up weirdly, hence why I was asking for you to try different timer configurations and idle loops.

Thanks,

Adrian

On 22 January 2013 01:55, Daniel Braniss da...@cs.huji.ac.il wrote:
>> Daniel,
>>
>> Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ?
>
> Adrian, not yet, for several reasons:
> 1- as I explained, I can't really force the problem, it happens when we run some zfs scripts, like mirror, but have to wait till enough changes happened on the source, usually after 24hs.
> 2- changing to LAPIC seems to have solved the problem.
> 3- I'm now learning all I can about event timers and you have not answered some of my questions :-)
>
> danny
time issues and ZFS
After many trials (and errors), here are some facts:

host: DELL PowerEdge R710, 16GB,
mfi0: Dell PERC H700 Integrated
mfid0: 14305280MB (29297213440 sectors) RAID volume 'r5' is optimal
mfi1: Dell PERC 6
mfid1: 12393472MB (25381830656 sectors) RAID volume 'Virtual Disk 0' is optimal

We have NO problems with FreeBSD-8.3-STABLE, but with 9.1-STABLE the real-time clock slows down when doing some zfs stuff like send|receive. Typing 'date' when less than 1000s went by seems to correct the problem: ntpd kicks in and the clock is on track again.

I have a cron job just logging date every 5 minutes, and the loghost sees:

|-- local time on loghost              | time on problematic host
Jan 20 19:56:19 store-02.cs.huji.ac.il Jan 20 19:56:19 danny: Sun Jan 20 19:56:19 IST 2013 -- ok
Jan 20 20:15:00 store-02.cs.huji.ac.il Jan 20 20:15:00 danny: Sun Jan 20 20:15:00 IST 2013 -- ok
Jan 20 21:30:00 store-02.cs.huji.ac.il Jan 20 20:21:06 danny: Sun Jan 20 20:21:06 IST 2013 -- off by 1:09
Jan 20 21:33:53 store-02.cs.huji.ac.il Jan 20 20:25:00 danny: Sun Jan 20 20:25:00 IST 2013 -- off by 1:08
Jan 20 21:38:54 store-02.cs.huji.ac.il Jan 20 20:30:00 danny: Sun Jan 20 20:30:00 IST 2013 -- off by 1:09
...
Jan 20 22:03:54 store-02.cs.huji.ac.il Jan 20 20:55:00 danny: Sun Jan 20 20:55:00 IST 2013 -- diff is now constant
..
Jan 20 22:04:13 store-02.cs.huji.ac.il Jan 20 20:55:19 ntpd[1848]: time correction of 4134 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
...
Jan 20 22:58:53 store-02.cs.huji.ac.il Jan 20 21:50:00 danny: Sun Jan 20 21:50:00 IST 2013

Strangely, when running 8.3, ACPI-fast is chosen:
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
but with 9.1 TSC-low gets chosen:
kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) HPET(950) i8254(0) dummy(-100)

So I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens - unless it can't be changed after boot.

I really need help here!

thanks,
danny
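The offsets in the log can be worked out from the two timestamp columns. The sketch below, with made-up names `SAMPLES` and `offset_seconds`, subtracts the host time from the loghost time for each logged pair (all on Jan 20, so only the time of day is compared) and checks the result against the 1000 s limit that ntpd reports.

```python
from datetime import datetime

SANITY_LIMIT = 1000  # seconds, the limit named in the ntpd log line

# (time on loghost, time on problematic host), copied from the log above.
SAMPLES = [
    ("19:56:19", "19:56:19"),
    ("20:15:00", "20:15:00"),
    ("21:30:00", "20:21:06"),
    ("21:33:53", "20:25:00"),
    ("21:38:54", "20:30:00"),
    ("22:03:54", "20:55:00"),
]

def offset_seconds(loghost, host):
    """Seconds by which the host clock lags the loghost clock."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(loghost, fmt) - datetime.strptime(host, fmt)
    return int(delta.total_seconds())

for loghost, host in SAMPLES:
    off = offset_seconds(loghost, host)
    verdict = "exceeds sanity limit" if off > SANITY_LIMIT else "ok"
    print(f"host {host} is behind by {off:5d} s ({verdict})")
```

The 22:03:54 / 20:55:00 pair gives 4134 s, the same figure ntpd complains about a few seconds later. Between 20:15:00 and 21:30:00 on the loghost (4500 s) the host clock advanced only 366 s.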
Re: time issues and ZFS
Hi,

Try experimenting with kern.eventtimer.periodic and kern.eventtimer.idletick.

If this fixes it for you, please file a PR with all the relevant details.

Thanks!

Adrian

On 21 January 2013 03:33, Daniel Braniss da...@cs.huji.ac.il wrote:
> After many trials (and errors), here are some facts:
> [...]
> so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens - unless it can't be changed after boot.
>
> I really need help here!
>
> thanks,
> danny
Re: time issues and ZFS
> Hi,
>
> Try experimenting with kern.eventtimer.periodic and kern.eventtimer.idletick.

can you give/point to some info about this?

btw, I just noticed that on this hardware I get:

9.1-STABLE: vmstat -i
interrupt                    total    rate
irq3: uart1                    931       0
irq4: uart0                      5       0
irq19: ehci0                  1331       0
irq20: hpet0 uhci3         1687937    1163
irq21: uhci2 ehci1              29       0
irq23: atapci0                  48       0
irq256: bce0                 52270      36
irq260: mfi0                 14690      10
irq261: mfi1                  3088       2
Total                      1760329    1213

no cpu timer, instead irq20: hpet0 uhci3, and when

8.3-STABLE: vmstat -i
interrupt                    total    rate
irq3: uart1                   1048       0
irq4: uart0                      5       0
irq19: ehci0                280451       1
irq21: uhci2 ehci1              29       0
irq23: atapci0                  52       0
cpu0:timer               313544623    1125
irq256: bce0              30791673     110
irq260: mfi0               1372186       4
cpu1:timer                 1294093       4
...
total                    384382790    1380

is this OK?

> If this fixes it for you, please file a PR with all the relevant details.

I will!

> Thanks!
>
> Adrian
>
> On 21 January 2013 03:33, Daniel Braniss da...@cs.huji.ac.il wrote:
>> After many trials (and errors), here are some facts:
>> [...]
>> I really need help here!
>>
>> thanks,
>> danny
Re: time issues and ZFS
On Mon, 2013-01-21 at 13:33 +0200, Daniel Braniss wrote:
> After many trials (and errors), here are some facts:
> [...]
> so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens - unless it can't be changed after boot.
>
> I really need help here!
>
> thanks,
> danny

What's the output of sysctl kern.eventtimer?

Does the bad behavior change if you set kern.eventtimer.periodic=1?

-- Ian
Re: time issues and ZFS
> ...
> What's the output of sysctl kern.eventtimer?

kern.eventtimer.periodic is 0

> Does the bad behavior change if you set kern.eventtimer.periodic=1?

setting kern.eventtimer.timer=LAPIC instead of the default HPET made the missing cpu timers appear:

# vmstat -i
interrupt                    total    rate
irq3: uart1                   1695       0
irq4: uart0                      5       0
irq19: ehci0                  3875       0
irq20: hpet0 uhci3         5495755    1135
irq21: uhci2 ehci1              29       0
irq23: atapci0                  48       0
cpu0:timer                    7063       1
irq256: bce0                117073      24
irq260: mfi0                 51083      10
irq261: mfi1                  3088       0
cpu1:timer                     484       0
cpu14:timer                     36       0
cpu6:timer                     486       0
cpu8:timer                      38       0
cpu5:timer                      38       0
cpu15:timer                     38       0
cpu7:timer                      32       0
cpu12:timer                     38       0
cpu3:timer                      40       0
cpu9:timer                      36       0
cpu10:timer                     34       0
cpu11:timer                     37       0
cpu2:timer                      33       0
cpu13:timer                     40       0
cpu4:timer                      36       0
Total                      5681160    1173

is this relevant?

danny

> -- Ian
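For anyone comparing such listings between machines, here is a rough sketch that parses vmstat -i style text, checks that the per-source counts add up to the Total row, and counts the per-CPU timer rows. The column layout in `VMSTAT` is reconstructed from the message above, and `parse_vmstat` is a made-up helper, not part of any FreeBSD tool.

```python
import re

VMSTAT = """\
interrupt                    total    rate
irq3: uart1                   1695       0
irq4: uart0                      5       0
irq19: ehci0                  3875       0
irq20: hpet0 uhci3         5495755    1135
irq21: uhci2 ehci1              29       0
irq23: atapci0                  48       0
cpu0:timer                    7063       1
irq256: bce0                117073      24
irq260: mfi0                 51083      10
irq261: mfi1                  3088       0
cpu1:timer                     484       0
cpu14:timer                     36       0
cpu6:timer                     486       0
cpu8:timer                      38       0
cpu5:timer                      38       0
cpu15:timer                     38       0
cpu7:timer                      32       0
cpu12:timer                     38       0
cpu3:timer                      40       0
cpu9:timer                      36       0
cpu10:timer                     34       0
cpu11:timer                     37       0
cpu2:timer                      33       0
cpu13:timer                     40       0
cpu4:timer                      36       0
Total                      5681160    1173
"""

def parse_vmstat(text):
    """Map each interrupt source name to its (total, rate) pair."""
    rows = {}
    for line in text.splitlines():
        # Name (may contain spaces), then two integer columns.
        m = re.match(r"(.+?)\s+(\d+)\s+(\d+)\s*$", line)
        if m:
            rows[m.group(1).strip()] = (int(m.group(2)), int(m.group(3)))
    return rows

rows = parse_vmstat(VMSTAT)
total = rows.pop("Total")[0]
timers = {name: v for name, v in rows.items() if name.endswith(":timer")}

print("counts add up to Total:", sum(t for t, _ in rows.values()) == total)
print("per-CPU timer rows:", len(timers))
```

The counts are cumulative since boot, so the large hpet0 figure still reflects the period before the switch to LAPIC.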
Re: time issues and ZFS
On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
>> ...
>> What's the output of sysctl kern.eventtimer?
>
> kern.eventtimer.periodic is 0
>
>> Does the bad behavior change if you set kern.eventtimer.periodic=1?
>
> setting kern.eventtimer.timer=LAPIC instead of the default HPET made the missing cpu timers appear:
>
> # vmstat -i
> [...]
>
> is this relevant?

I'll have to let someone who knows modern x86 hardware better comment on the relative merits of hpet vs. lapic timers.

If it was using hpet in one-shot mode, and changing it to hpet in periodic mode makes the problem go away, that might be a clue that there's something wrong in the hpet eventtimer start or interrupt routines. I wonder if a single missed interrupt in one-shot mode would bring an eventtimer to a halt like that? And if so, then what is it about manually asking for the date that kicks it into running again?

-- Ian
Re: time issues and ZFS
I still firmly believe the ACPI event timer code is racy, and what we may be seeing here is the fallout from that.

It's very possible that we're missing interrupts here - the new eventtimer code that made it into 9.x puts the halt behind a critical section, with interrupts disabled. The only idle loops that correctly implement enable-interrupts-and-halt atomically are the HLT (well, and the don't-sleep-at-all) idle loops on i386/amd64. The default method is to use the ACPI sleep method, which doesn't do atomic interrupt enable / halt.

I'm still seeing odd stuff on some of my ACPI-using netbooks when doing net80211/ath development and it all goes away whenever I fiddle with the above settings.

So, play with kern.eventtimer.periodic, kern.eventtimer.idletick and machdep.idle (try setting machdep.idle to hlt, or something else listed in machdep.idle_available) - please report back what the results are.

Adrian

On 21 January 2013 07:54, Ian Lepore i...@freebsd.org wrote:
> On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
>> setting kern.eventtimer.timer=LAPIC instead of the default HPET made the missing cpu timers appear:
>> [...]
>> is this relevant?
>
> I'll have to let someone who knows modern x86 hardware better comment on the relative merits of hpet vs. lapic timers.
>
> If it was using hpet in one-shot mode, and changing it to hpet in periodic mode makes the problem go away, that might be a clue that there's something wrong in the hpet eventtimer start or interrupt routines. I wonder if a single missed interrupt in one-shot mode would bring an eventtimer to a halt like that? And if so, then what is it about manually asking for the date that kicks it into running again?
>
> -- Ian
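One way to organise the suggested experiments is to enumerate every combination of the three settings and work through them one at a time. The sketch below only prints the sysctl commands for each run and executes nothing. The `IDLE_METHODS` list is an assumption for illustration; the real candidates should be read from machdep.idle_available on the machine under test.

```python
from itertools import product

# Assumed example values; check machdep.idle_available on the real system.
IDLE_METHODS = ["hlt", "acpi"]

def settings_matrix(idle_methods=IDLE_METHODS):
    """Yield one list of sysctl commands per combination to try."""
    for periodic, idletick, idle in product((0, 1), (0, 1), idle_methods):
        yield [
            f"sysctl kern.eventtimer.periodic={periodic}",
            f"sysctl kern.eventtimer.idletick={idletick}",
            f"sysctl machdep.idle={idle}",
        ]

for run, commands in enumerate(settings_matrix(), 1):
    print(f"# run {run}")
    print("\n".join(commands))
```

Since the problem reportedly takes about a day to show up, changing one combination per day and noting whether the clock drifted keeps the results attributable to a single setting.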
Re: time issues and ZFS
> I still firmly believe the ACPI event timer code is racy, and what we may be seeing here is the fallout from that.
> [...]
> So, play with kern.eventtimer.periodic, kern.eventtimer.idletick and machdep.idle (try setting machdep.idle to hlt, or something else listed in machdep.idle_available) - please report back what the results are.
>
> Adrian

Adrian, you mention that ACPI is racy; which event timer are you talking about? How is the quality chosen?

At the moment switching kern.eventtimer.timer to LAPIC seems to have done the trick. I'll have to wait another 24 hours to make sure.

In the meantime here is some info:

Intel(R) Xeon(R) CPU E5645: running with no problems
LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)

Intel(R) Xeon(R) CPU X5550: this is the problematic one, at least for the moment
HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)

Dual-Core AMD Opteron(tm) Processor 2218: running with no problems
LAPIC(400) RTC(0)

So if someone is running 9.1 on any of the following and can provide the output of sysctl kern.eventtimer.choice, that would be nice:

Intel(R) Xeon(R) CPU E5410
Intel(R) Xeon(R) CPU E5507

btw, all the above are on server MBs.

thanks,
danny

> On 21 January 2013 07:54, Ian Lepore i...@freebsd.org wrote:
>> On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
>>> setting kern.eventtimer.timer=LAPIC instead of the default HPET made the missing cpu timers appear:
>>> [...]
>>> is this relevant?
>>
>> I'll have to let someone who knows modern x86 hardware better comment on the relative merits of hpet vs. lapic timers.
>> [...]
>> -- Ian