Re: time issues and ZFS

2013-01-27 Thread Adrian Chadd
On 26 January 2013 02:15, Andriy Gapon a...@freebsd.org wrote:
 on 23/01/2013 18:20 Adrian Chadd said the following:
 It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be
 a quirk of the older generation celeron hardware - in which case, we
 need to tell the user somehow..

 This is not software related at all.  It's the hardware feature (or its 
 absence).
 I wonder if your celerons report PBE feature.

What am I looking for?

And personally, requiring (much) more recent hardware to get
sane/correct (btu inefficient) behaviour out of the timekeeping
framework is a little .. suboptimal. :)


Adrian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-27 Thread Andriy Gapon
on 27/01/2013 19:27 Adrian Chadd said the following:
 On 26 January 2013 02:15, Andriy Gapon a...@freebsd.org wrote:
 on 23/01/2013 18:20 Adrian Chadd said the following:
 It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be
 a quirk of the older generation celeron hardware - in which case, we
 need to tell the user somehow..

 This is not software related at all.  It's the hardware feature (or its 
 absence).
 I wonder if your celerons report PBE feature.
 
 What am I looking for?

PBE in dmesg

 And personally, requiring (much) more recent hardware to get
 sane/correct (btu inefficient) behaviour out of the timekeeping
 framework is a little .. suboptimal. :)

Well, I never knew about this issue before but I always assumed that the
reasonable behavior was the behavior.  And I never encountered any evidence to
the contrary.

-- 
Andriy Gapon
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-26 Thread Andriy Gapon
on 23/01/2013 18:20 Adrian Chadd said the following:
 It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be
 a quirk of the older generation celeron hardware - in which case, we
 need to tell the user somehow..

This is not software related at all.  It's the hardware feature (or its 
absence).
I wonder if your celerons report PBE feature.

-- 
Andriy Gapon
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-23 Thread Andriy Gapon
on 22/01/2013 20:42 Adrian Chadd said the following:
 Hi!
 
 As I said before, the problem with non-HLT loops with event-timer in
 -9 and -head is that it calls the idle function inside a critical
 section (critical_enter and critical_exit) which blocks interrupts
 from occuring.
 
 The EI;HLT instruction pair on i386/amd64 atomically and correctly
 handles things from what I've been told.
 
 However, there's no atomic way to do this using ACPI sleeping, so
 there's a small window where an interrupt may come in but it isn't
 handled; waiting for the next interrupt to occur before it'll wake up
 and respond to that interrupt.

I don't think that this is true of x86 hardware in general.
You might have hit some limitation or a quirk or a bug or an erratum for some
particular hardware.

E.g. a chipset on this machine has a bit described as such:
Set to 1 to skip the C state transition if there is break event
when entering C state.
The bit is set indeed and as far as I can tell the behavior matches the 
description.

Most modern (non-embedded) machines seem to behave this way. Attempt to enter a
deeper C state while a break event is pending still incurs some overhead, but 
it's
not as bad as waiting for the next break event.

 I kept hitting my head against this when doing network testing. :(
 
 Now - specifically for timekeeping it shouldn't matter; that's to do
 with whether the counters are reliable or not (and heck, are even in
 lock-step on CPUs.) But extra latency could show up weirdly, hence why
 I was asking for you to try different timer configurations and idle
 loops.

-- 
Andriy Gapon

-- 
Andriy Gapon
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-23 Thread Adrian Chadd
On 23 January 2013 06:58, Andriy Gapon a...@freebsd.org wrote:

 I don't think that this is true of x86 hardware in general.
 You might have hit some limitation or a quirk or a bug or an erratum for some
 particular hardware.

 E.g. a chipset on this machine has a bit described as such:
 Set to 1 to skip the C state transition if there is break event
 when entering C state.
 The bit is set indeed and as far as I can tell the behavior matches the 
 description.

 Most modern (non-embedded) machines seem to behave this way. Attempt to enter 
 a
 deeper C state while a break event is pending still incurs some overhead, but 
 it's
 not as bad as waiting for the next break event.

I'll reverify the behaviour on my netbooks when I'm back home.

It may be a quirk of an older 9.x, which is fixed in -HEAD. It may be
a quirk of the older generation celeron hardware - in which case, we
need to tell the user somehow..



Adrian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-23 Thread John Nielsen
On Jan 22, 2013, at 2:40 AM, Adrian Chadd adr...@freebsd.org wrote:

 On Jan 21, 2013, at 4:33 AM, Daniel Braniss da...@cs.huji.ac.il wrote:
 
 host: DELL PowerEdge R710, 16GB, 

I administer a Dell PowerEdge R710 and I've been seeing the exact same thing. 
It's currently running FreeBSD 9.0-STABLE #0 r236355. It has a ZFS pool which 
sees moderate load most of the time but can be very high at times (when certain 
scripts run, etc.). I hadn't previously correlated the issue with ZFS load but 
that is very possible.

I set a cron job to restart ntpd when it dies (because the time difference 
exceeds the sanity check). The cron job runs every 20 minutes, but that 
varies greatly when the system stops counting. The time offset from ntpdate 
(which the script runs before restarting ntpd) varies a lot, but always in 
increments of 300 seconds. I've seen everything from 1200 to 23100. (Yes, 
that's 23 thousand seconds aka 6 hours 25 minutes that the system wasn't 
keeping time for.)

Sysctl kern.timecounter.hardware defaults to HPET. I experimented with setting 
it to ACPI-fast but the issue persisted so I put it back.
kern.timecounter.choice: TSC-low(-100) ACPI-fast(900) HPET(950) i8254(0) 
dummy(-100)

I first installed the box with an older 9.0-STABLE and this issue was not 
present. I have been tracking -STABLE on it (albeit irregularly) so I'm not 
sure when the issue came up.

 Have you run tests with the machdep.idle value changed, and fiddling
 kern.eventtimer.periodic / kern.eventtimer.idletick ?

I would love to resolve this and am able to do some experimenting. I've 
_usually_ been seeing the issue 2-3 times every 1-2 days, but I did just make 
some changes:
disabling ZFS compression and deduplication on all pools
updated to 9.1-STABLE from yesterday (r245821)

If the issue persists I will try changing some of the sysctls above and follow 
up with the result. If it goes away, I'll try to remember to report that too.

JN

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-23 Thread Daniel Braniss
 On Jan 22, 2013, at 2:40 AM, Adrian Chadd adr...@freebsd.org wrote:
  On Jan 21, 2013, at 4:33 AM, Daniel Braniss da...@cs.huji.ac.il  wrote:
  
  host: DELL PowerEdge R710, 16GB, 
 
 I administer a Dell PowerEdge R710 and I've been seeing the exact same 
 =thing. It's currently running FreeBSD 9.0-STABLE #0 r236355. It has a =ZFS 
 pool which sees moderate load most of the time but can be very high =at times 
 (when certain scripts run, etc.). I hadn't previously =correlated the issue 
 with ZFS load but that is very possible.  I set a cron job to restart ntpd 
 when it dies (because the time =difference exceeds the sanity check). The 
 cron job runs every 20 =minutes, but that varies greatly when the system 
 stops counting. The =time offset from ntpdate (which the script runs before 
 restarting ntpd) =varies a lot, but always in increments of 300 seconds. I've 
 seen =everything from 1200 to 23100. (Yes, that's 23 thousand seconds aka 6 
 =hours 25 minutes that the system wasn't keeping time for.)
 
 Sysctl kern.timecounter.hardware defaults to HPET. I experimented with 
 =setting it to ACPI-fast but the issue persisted so I put it back.
 kern.timecounter.choice: TSC-low(-100) ACPI-fast(900) HPET(950) i8254(0) 
 =dummy(-100)  I first installed the box with an older 9.0-STABLE and 
 this issue was =not present. I have been tracking -STABLE on it (albeit 
 irregularly) so =I'm not sure when the issue came up.
 
 
 Have you run tests with the machdep.idle value changed, and fiddling
 
 kern.eventtimer.periodic / kern.eventtimer.idletick ?
 
 I would love to resolve this and am able to do some experimenting. I've 
 =_usually_ been seeing the issue 2-3 times every 1-2 days, but I did just 
 =make some changes:
   disabling ZFS compression and deduplication on all pools
   updated to 9.1-STABLE from yesterday (r245821)
 
 If the issue persists I will try changing some of the sysctls above and 
 =follow up with the result. If it goes away, I'll try to remember to =report 
 that too.
 
 JN
 

set kern.eventtimer.timer=LAPIC
this solved it for me.

danny


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Adrian Chadd
Daniel,

Have you run tests with the machdep.idle value changed, and fiddling
kern.eventtimer.periodic / kern.eventtimer.idletick ?



adrian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Daniel Braniss
 Daniel,
 
 Have you run tests with the machdep.idle value changed, and fiddling
 kern.eventtimer.periodic / kern.eventtimer.idletick ?

Adrian,

not yet, for several reasons:
1- as I explained, I can't realy force the problem, it happens when we run some
zfs scripts, like mirror, but have to wait till enough changes happened on
the source, usualy after 24hs.
2- changing to LAPIC seems to have solved the problem.
3- I'm now learning all I can about event timers and you have not answered some
   of my questions :-)

danny


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Julian Stecklina
Thus spake Daniel Braniss da...@cs.huji.ac.il:

 In the meantime here is some info:
 Intel(R) Xeon(R) CPU E5645: running with no problems
   LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)

 Intel(R) Xeon(R) CPU X5550: this is the problematic, at least for the moment
   HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)

Does anyone know why the LAPIC is given a lower priority than HPET in
this case? If you have an LAPIC, it should always be prefered to HPET,
unless something is seriously wrong with it...

Julian

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Ryan Stone
On Tue, Jan 22, 2013 at 7:27 AM, Julian Stecklina 
jstec...@os.inf.tu-dresden.de wrote:

 Does anyone know why the LAPIC is given a lower priority than HPET in
 this case? If you have an LAPIC, it should always be prefered to HPET,
 unless something is seriously wrong with it...


On many processors the lapic timer does not work correctly in states lower
than C1.  There are many processors that will automatically enter a C1E
mode when the processor is idle, and in that state I have seen the lapic
timer run slower than the programmed frequency, causing time to move to
slowly on idle FreeBSD systems.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Adam McDougall

On 01/22/13 07:27, Julian Stecklina wrote:

Thus spake Daniel Braniss da...@cs.huji.ac.il:


In the meantime here is some info:
Intel(R) Xeon(R) CPU E5645: running with no problems
   LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)

Intel(R) Xeon(R) CPU X5550: this is the problematic, at least for the moment
   HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)


Does anyone know why the LAPIC is given a lower priority than HPET in
this case? If you have an LAPIC, it should always be prefered to HPET,
unless something is seriously wrong with it...

Julian

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org



This may help:

Problem with LAPIC timer is that it stops working when CPU goes to C3 
or deeper idle state. These states are not enabled by default, so unless 
you enabled them explicitly, it is safe to use LAPIC. In any case 
present 9-STABLE system should prevent you from using unsafe C-state if 
LAPIC timer is used. From all other perspectives LAPIC is preferable, as 
it is faster and easier to operate then HPET. Latest CPUs fixed the 
LAPIC timer problem, so I don't think that switching to it will be 
pessimistic in foreseeable future.


--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-22 Thread Adrian Chadd
Hi!

As I said before, the problem with non-HLT loops with event-timer in
-9 and -head is that it calls the idle function inside a critical
section (critical_enter and critical_exit) which blocks interrupts
from occuring.

The EI;HLT instruction pair on i386/amd64 atomically and correctly
handles things from what I've been told.

However, there's no atomic way to do this using ACPI sleeping, so
there's a small window where an interrupt may come in but it isn't
handled; waiting for the next interrupt to occur before it'll wake up
and respond to that interrupt.

I kept hitting my head against this when doing network testing. :(

Now - specifically for timekeeping it shouldn't matter; that's to do
with whether the counters are reliable or not (and heck, are even in
lock-step on CPUs.) But extra latency could show up weirdly, hence why
I was asking for you to try different timer configurations and idle
loops.

Thanks,


Adrian

On 22 January 2013 01:55, Daniel Braniss da...@cs.huji.ac.il wrote:
 Daniel,

 Have you run tests with the machdep.idle value changed, and fiddling
 kern.eventtimer.periodic / kern.eventtimer.idletick ?

 Adrian,

 not yet, for several reasons:
 1- as I explained, I can't realy force the problem, it happens when we run 
 some
 zfs scripts, like mirror, but have to wait till enough changes happened on
 the source, usualy after 24hs.
 2- changing to LAPIC seems to have solved the problem.
 3- I'm now learning all I can about event timers and you have not answered 
 some
of my questions :-)

 danny


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


time issues and ZFS

2013-01-21 Thread Daniel Braniss
After many trials (and errors), here are some facts:

host: DELL PowerEdge R710, 16GB, 
 mfi0: Dell PERC H700 Integrated
 mfid0: 14305280MB (29297213440 sectors) RAID volume 'r5' is optimal
 mfi1: Dell PERC 6 
 mfid1: 12393472MB (25381830656 sectors) RAID volume 'Virtual Disk 0' is 
optimal

we have NO problems with FreeBSD-8.3-STABLE, but with 9.1-STABLE, the real-time
clock slows down when doing some zfs stuff like send|receive, typing 'date' 
when less that 1000s went by seems to crorrect the problem,
ntpd kicks in and on track again.

I have a cron job just logging date every 5 minutes, and the loghost sees:

|-- local time on loghost | time on problematic host
Jan 20 19:56:19 store-02.cs.huji.ac.il Jan 20 19:56:19 danny: Sun Jan 20 
19:56:19 IST 2013   -- ok
Jan 20 20:15:00 store-02.cs.huji.ac.il Jan 20 20:15:00 danny: Sun Jan 20 
20:15:00 IST 2013   -- ok
Jan 20 21:30:00 store-02.cs.huji.ac.il Jan 20 20:21:06 danny: Sun Jan 20 
20:21:06 IST 2013   -- off by 1:09
Jan 20 21:33:53 store-02.cs.huji.ac.il Jan 20 20:25:00 danny: Sun Jan 20 
20:25:00 IST 2013   -- off by 1:08
Jan 20 21:38:54 store-02.cs.huji.ac.il Jan 20 20:30:00 danny: Sun Jan 20 
20:30:00 IST 2013   -- off by 1:09
...
Jan 20 22:03:54 store-02.cs.huji.ac.il Jan 20 20:55:00 danny: Sun Jan 20 
20:55:00 IST 2013   -- diff is now constant
..
Jan 20 22:04:13 store-02.cs.huji.ac.il Jan 20 20:55:19 ntpd[1848]: time 
correction of 4134 seconds exceeds sanity limit (1000); set clock manually to 
the correct UTC time.
...
Jan 20 22:58:53 store-02.cs.huji.ac.il Jan 20 21:50:00 danny: Sun Jan 20 
21:50:00 IST 2013


strangely, when running 8.3, ACPI-fast is chosen:
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) 
dummy(-100)
but with 9.1 TSC-low gets chosen:
kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) HPET(950) 
i8254(0) 
dummy(-100)

so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens - 
unless it can't be changed after boot.

I realy need help here!

thanks,
danny



___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Adrian Chadd
Hi,

Try experimenting with kern.eventtimer.periodic and kern.eventtimer.idletick.

If this fixes it for you, please file a PR with all the relevant details.

Thanks!




Adrian


On 21 January 2013 03:33, Daniel Braniss da...@cs.huji.ac.il wrote:
 After many trials (and errors), here are some facts:

 host: DELL PowerEdge R710, 16GB,
  mfi0: Dell PERC H700 Integrated
  mfid0: 14305280MB (29297213440 sectors) RAID volume 'r5' is optimal
  mfi1: Dell PERC 6
  mfid1: 12393472MB (25381830656 sectors) RAID volume 'Virtual Disk 0' is
 optimal

 we have NO problems with FreeBSD-8.3-STABLE, but with 9.1-STABLE, the 
 real-time
 clock slows down when doing some zfs stuff like send|receive, typing 'date'
 when less that 1000s went by seems to crorrect the problem,
 ntpd kicks in and on track again.

 I have a cron job just logging date every 5 minutes, and the loghost sees:

 |-- local time on loghost | time on problematic host
 Jan 20 19:56:19 store-02.cs.huji.ac.il Jan 20 19:56:19 danny: Sun Jan 20
 19:56:19 IST 2013   -- ok
 Jan 20 20:15:00 store-02.cs.huji.ac.il Jan 20 20:15:00 danny: Sun Jan 20
 20:15:00 IST 2013   -- ok
 Jan 20 21:30:00 store-02.cs.huji.ac.il Jan 20 20:21:06 danny: Sun Jan 20
 20:21:06 IST 2013   -- off by 1:09
 Jan 20 21:33:53 store-02.cs.huji.ac.il Jan 20 20:25:00 danny: Sun Jan 20
 20:25:00 IST 2013   -- off by 1:08
 Jan 20 21:38:54 store-02.cs.huji.ac.il Jan 20 20:30:00 danny: Sun Jan 20
 20:30:00 IST 2013   -- off by 1:09
 ...
 Jan 20 22:03:54 store-02.cs.huji.ac.il Jan 20 20:55:00 danny: Sun Jan 20
 20:55:00 IST 2013   -- diff is now constant
 ..
 Jan 20 22:04:13 store-02.cs.huji.ac.il Jan 20 20:55:19 ntpd[1848]: time
 correction of 4134 seconds exceeds sanity limit (1000); set clock manually to
 the correct UTC time.
 ...
 Jan 20 22:58:53 store-02.cs.huji.ac.il Jan 20 21:50:00 danny: Sun Jan 20
 21:50:00 IST 2013


 strangely, when running 8.3, ACPI-fast is chosen:
 kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0)
 dummy(-100)
 but with 9.1 TSC-low gets chosen:
 kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) HPET(950) 
 i8254(0)
 dummy(-100)

 so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens -
 unless it can't be changed after boot.

 I realy need help here!

 thanks,
 danny



 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Daniel Braniss
 Hi,
 
 Try experimenting with kern.eventtimer.periodic and kern.eventtimer.idletick.
 
can you give/point to some info about this?

btw, I just noticed that on this hardware I get:
9.1-STABLE:

 vmstat -i
interrupt  total   rate
irq3: uart1  931  0
irq4: uart05  0
irq19: ehci01331  0
irq20: hpet0 uhci3   1687937   1163
irq21: uhci2 ehci129  0
irq23: atapci048  0
irq256: bce0   52270 36
irq260: mfi0   14690 10
irq261: mfi13088  2
Total1760329   1213

no cpu timer, instead irq20: hpet0 uhci3,
and when 8.3-STABLE:
 vmstat -i
interrupt  total   rate
irq3: uart1 1048  0
irq4: uart05  0
irq19: ehci0  280451  1
irq21: uhci2 ehci129  0
irq23: atapci052  0
cpu0:timer 313544623   1125
irq256: bce030791673110
irq260: mfi0 1372186  4
cpu1:timer   1294093  4
...

total  384382790   1380

is this OK?


 If this fixes it for you, please file a PR with all the relevant details.
 
I will!

 Thanks!
 
 
 
 
 Adrian
 
 
 On 21 January 2013 03:33, Daniel Braniss da...@cs.huji.ac.il wrote:
  After many trials (and errors), here are some facts:
 
  host: DELL PowerEdge R710, 16GB,
   mfi0: Dell PERC H700 Integrated
   mfid0: 14305280MB (29297213440 sectors) RAID volume 'r5' is optimal
   mfi1: Dell PERC 6
   mfid1: 12393472MB (25381830656 sectors) RAID volume 'Virtual Disk 0' is
  optimal
 
  we have NO problems with FreeBSD-8.3-STABLE, but with 9.1-STABLE, the 
  real-time
  clock slows down when doing some zfs stuff like send|receive, typing 'date'
  when less that 1000s went by seems to crorrect the problem,
  ntpd kicks in and on track again.
 
  I have a cron job just logging date every 5 minutes, and the loghost sees:
 
  |-- local time on loghost | time on problematic host
  Jan 20 19:56:19 store-02.cs.huji.ac.il Jan 20 19:56:19 danny: Sun Jan 20
  19:56:19 IST 2013   -- ok
  Jan 20 20:15:00 store-02.cs.huji.ac.il Jan 20 20:15:00 danny: Sun Jan 20
  20:15:00 IST 2013   -- ok
  Jan 20 21:30:00 store-02.cs.huji.ac.il Jan 20 20:21:06 danny: Sun Jan 20
  20:21:06 IST 2013   -- off by 1:09
  Jan 20 21:33:53 store-02.cs.huji.ac.il Jan 20 20:25:00 danny: Sun Jan 20
  20:25:00 IST 2013   -- off by 1:08
  Jan 20 21:38:54 store-02.cs.huji.ac.il Jan 20 20:30:00 danny: Sun Jan 20
  20:30:00 IST 2013   -- off by 1:09
  ...
  Jan 20 22:03:54 store-02.cs.huji.ac.il Jan 20 20:55:00 danny: Sun Jan 20
  20:55:00 IST 2013   -- diff is now constant
  ..
  Jan 20 22:04:13 store-02.cs.huji.ac.il Jan 20 20:55:19 ntpd[1848]: time
  correction of 4134 seconds exceeds sanity limit (1000); set clock manually 
  to
  the correct UTC time.
  ...
  Jan 20 22:58:53 store-02.cs.huji.ac.il Jan 20 21:50:00 danny: Sun Jan 20
  21:50:00 IST 2013
 
 
  strangely, when running 8.3, ACPI-fast is chosen:
  kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) 
  i8254(0)
  dummy(-100)
  but with 9.1 TSC-low gets chosen:
  kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) HPET(950) 
  i8254(0)
  dummy(-100)
 
  so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens -
  unless it can't be changed after boot.
 
  I realy need help here!
 
  thanks,
  danny
 
 
 
  ___
  freebsd-stable@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-stable
  To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Ian Lepore
On Mon, 2013-01-21 at 13:33 +0200, Daniel Braniss wrote:
 After many trials (and errors), here are some facts:
 
 host: DELL PowerEdge R710, 16GB, 
  mfi0: Dell PERC H700 Integrated
  mfid0: 14305280MB (29297213440 sectors) RAID volume 'r5' is optimal
  mfi1: Dell PERC 6 
  mfid1: 12393472MB (25381830656 sectors) RAID volume 'Virtual Disk 0' is 
 optimal
 
 we have NO problems with FreeBSD-8.3-STABLE, but with 9.1-STABLE, the 
 real-time
 clock slows down when doing some zfs stuff like send|receive, typing 'date' 
 when less that 1000s went by seems to crorrect the problem,
 ntpd kicks in and on track again.
 
 I have a cron job just logging date every 5 minutes, and the loghost sees:
 
 |-- local time on loghost | time on problematic host
 Jan 20 19:56:19 store-02.cs.huji.ac.il Jan 20 19:56:19 danny: Sun Jan 20 
 19:56:19 IST 2013 -- ok
 Jan 20 20:15:00 store-02.cs.huji.ac.il Jan 20 20:15:00 danny: Sun Jan 20 
 20:15:00 IST 2013 -- ok
 Jan 20 21:30:00 store-02.cs.huji.ac.il Jan 20 20:21:06 danny: Sun Jan 20 
 20:21:06 IST 2013 -- off by 1:09
 Jan 20 21:33:53 store-02.cs.huji.ac.il Jan 20 20:25:00 danny: Sun Jan 20 
 20:25:00 IST 2013 -- off by 1:08
 Jan 20 21:38:54 store-02.cs.huji.ac.il Jan 20 20:30:00 danny: Sun Jan 20 
 20:30:00 IST 2013 -- off by 1:09
 ...
 Jan 20 22:03:54 store-02.cs.huji.ac.il Jan 20 20:55:00 danny: Sun Jan 20 
 20:55:00 IST 2013 -- diff is now constant
 ..
 Jan 20 22:04:13 store-02.cs.huji.ac.il Jan 20 20:55:19 ntpd[1848]: time 
 correction of 4134 seconds exceeds sanity limit (1000); set clock manually to 
 the correct UTC time.
 ...
 Jan 20 22:58:53 store-02.cs.huji.ac.il Jan 20 21:50:00 danny: Sun Jan 20 
 21:50:00 IST 2013
 
 
 strangely, when running 8.3, ACPI-fast is chosen:
   kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) 
 dummy(-100)
 but with 9.1 TSC-low gets chosen:
   kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) HPET(950) 
 i8254(0) 
 dummy(-100)
 
 so I did sysctl kern.timecounter.hardware=ACPI-fast, but the same happens - 
 unless it can't be changed after boot.
 
 I realy need help here!
 
 thanks,
   danny

What's the output of sysctl kern.eventtimer?  Does the bad behavior
change if you set kern.eventimer.periodic=1?

-- Ian


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Daniel Braniss
...
 
 What's the output of sysctl kern.eventtimer?  

kern.eventtimer.periodic is 0

  Does the bad behavior
 change if you set kern.eventimer.periodic=1?
 

setting kern.eventtimer.timer=LAPIC
instead of the default HPET made the missing cpu timers to appear:
# vmstat -i 
interrupt  total   rate
irq3: uart1 1695  0
irq4: uart05  0
irq19: ehci03875  0
irq20: hpet0 uhci3   5495755   1135
irq21: uhci2 ehci129  0
irq23: atapci048  0
cpu0:timer  7063  1
irq256: bce0  117073 24
irq260: mfi0   51083 10
irq261: mfi13088  0
cpu1:timer   484  0
cpu14:timer   36  0
cpu6:timer   486  0
cpu8:timer38  0
cpu5:timer38  0
cpu15:timer   38  0
cpu7:timer32  0
cpu12:timer   38  0
cpu3:timer40  0
cpu9:timer36  0
cpu10:timer   34  0
cpu11:timer   37  0
cpu2:timer33  0
cpu13:timer   40  0
cpu4:timer36  0
Total5681160   1173

is this relevant?

danny




 -- Ian
 
 


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Ian Lepore
On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
 ...
  
  What's the output of sysctl kern.eventtimer?  
 
 kern.eventtimer.periodic is 0
 
   Does the bad behavior
  change if you set kern.eventimer.periodic=1?
  
 
 setting kern.eventtimer.timer=LAPIC
 instead of the default HPET made the missing cpu timers to appear:
 # vmstat -i 
 interrupt  total   rate
 irq3: uart1 1695  0
 irq4: uart05  0
 irq19: ehci03875  0
 irq20: hpet0 uhci3   5495755   1135
 irq21: uhci2 ehci129  0
 irq23: atapci048  0
 cpu0:timer  7063  1
 irq256: bce0  117073 24
 irq260: mfi0   51083 10
 irq261: mfi13088  0
 cpu1:timer   484  0
 cpu14:timer   36  0
 cpu6:timer   486  0
 cpu8:timer38  0
 cpu5:timer38  0
 cpu15:timer   38  0
 cpu7:timer32  0
 cpu12:timer   38  0
 cpu3:timer40  0
 cpu9:timer36  0
 cpu10:timer   34  0
 cpu11:timer   37  0
 cpu2:timer33  0
 cpu13:timer   40  0
 cpu4:timer36  0
 Total5681160   1173
 
 is this relevant?

I'll have to let someone who knows modern x86 hardware better comment on
the relative merits of hpet vs. lapic timers.  If it was using hpet in
one-shot mode, and changing it to hpet in periodic mode makes the
problem go away, that might be a clue that there's something wrong in
the hpet eventtimer start or interrupt routines.  

I wonder if a single missed interrupt in one-shot mode would bring an
eventtimer to a halt like that?  And if so, then what is it about
manually asking for the date that kicks it into running again?

-- Ian


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Adrian Chadd
I still firmly believe the ACPI event timer code is racy, and what we
may be seeing here is the fallout from that.

It's very possible that we're missing interrupts here - the new
eventtimer code that made it into 9.x puts the halt behind a critical
section, with interrupts disabled. The only platforms that correctly
implement enable-interrupts-and-halt atomically is the HLT (well, and
the don't-sleep-at-all) idle loops on i386/amd64. The default method
is to use the ACPI sleep method, which doesn't do atomic interrupt
enable / halt.

I'm still seeing odd stuff on some of my ACPI-using netbooks when
doing net80211/ath development and it all goes away whenever I fondle
with the above settings.

So, play with kern.eventtimer.periodic, kern.eventtimer.idletick and
machdep.idle (try setting machdep.idle to hlt, or something else
listed in machdep.idle_available) - please report back what the
results are.


Adrian

On 21 January 2013 07:54, Ian Lepore i...@freebsd.org wrote:
 On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
 ...
 
  What's the output of sysctl kern.eventtimer?

 kern.eventtimer.periodic is 0

   Does the bad behavior
  change if you set kern.eventimer.periodic=1?
 

 setting kern.eventtimer.timer=LAPIC
 instead of the default HPET made the missing cpu timers to appear:
 # vmstat -i
 interrupt  total   rate
 irq3: uart1 1695  0
 irq4: uart05  0
 irq19: ehci03875  0
 irq20: hpet0 uhci3   5495755   1135
 irq21: uhci2 ehci129  0
 irq23: atapci048  0
 cpu0:timer  7063  1
 irq256: bce0  117073 24
 irq260: mfi0   51083 10
 irq261: mfi13088  0
 cpu1:timer   484  0
 cpu14:timer   36  0
 cpu6:timer   486  0
 cpu8:timer38  0
 cpu5:timer38  0
 cpu15:timer   38  0
 cpu7:timer32  0
 cpu12:timer   38  0
 cpu3:timer40  0
 cpu9:timer36  0
 cpu10:timer   34  0
 cpu11:timer   37  0
 cpu2:timer33  0
 cpu13:timer   40  0
 cpu4:timer36  0
 Total5681160   1173

 is this relevant?

 I'll have to let someone who knows modern x86 hardware better comment on
 the relative merits of hpet vs. lapic timers.  If it was using hpet in
 one-shot mode, and changing it to hpet in periodic mode makes the
 problem go away, that might be a clue that there's something wrong in
 the hpet eventtimer start or interrupt routines.

 I wonder if a single missed interrupt in one-shot mode would bring an
 eventtimer to a halt like that?  And if so, then what is it about
 manually asking for the date that kicks it into running again?

 -- Ian


 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and ZFS

2013-01-21 Thread Daniel Braniss
 I still firmly believe the ACPI event timer code is racy, and what we
 may be seeing here is the fallout from that.
 
 It's very possible that we're missing interrupts here - the new
 eventtimer code that made it into 9.x puts the halt behind a critical
 section, with interrupts disabled. The only platforms that correctly
 implement enable-interrupts-and-halt atomically is the HLT (well, and
 the don't-sleep-at-all) idle loops on i386/amd64. The default method
 is to use the ACPI sleep method, which doesn't do atomic interrupt
 enable / halt.
 
 I'm still seeing odd stuff on some of my ACPI-using netbooks when
 doing net80211/ath development and it all goes away whenever I fondle
 with the above settings.
 
 So, play with kern.eventtimer.periodic, kern.eventtimer.idletick and
 machdep.idle (try setting machdep.idle to hlt, or something else
 listed in machdep.idle_available) - please report back what the
 results are.
 
 
 Adrian


Adrian,
you mention that ACPI is racy, which event timer are you talking about?

how is the quality chosen?

at the moment switching kern.eventtimer.timer to LAPIC seems to have done the
trick. I'll have to wait another 24hs to make sure.

In the meantime here is some info:
Intel(R) Xeon(R) CPU E5645: running with no problems
  LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0)

Intel(R) Xeon(R) CPU X5550: this is the problematic, at least for the moment
  HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0)

Dual-Core AMD Opteron(tm) Processor 2218: running with no problems
  LAPIC(400) RTC(0)

so if someone is running 9.1 on any of the following and can provide
the output of sysctl kern.eventtimer.choice would be nice:

Intel(R) Xeon(R) CPU E5410
Intel(R) Xeon(R) CPU E5507

btw, all the above are on server MBs.

thanks,
danny


 On 21 January 2013 07:54, Ian Lepore i...@freebsd.org wrote:
  On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
  ...
  
   What's the output of sysctl kern.eventtimer?
 
  kern.eventtimer.periodic is 0
 
Does the bad behavior
   change if you set kern.eventimer.periodic=1?
  
 
  setting kern.eventtimer.timer=LAPIC
  instead of the default HPET made the missing cpu timers to appear:
  # vmstat -i
  interrupt  total   rate
  irq3: uart1 1695  0
  irq4: uart05  0
  irq19: ehci03875  0
  irq20: hpet0 uhci3   5495755   1135
  irq21: uhci2 ehci129  0
  irq23: atapci048  0
  cpu0:timer  7063  1
  irq256: bce0  117073 24
  irq260: mfi0   51083 10
  irq261: mfi13088  0
  cpu1:timer   484  0
  cpu14:timer   36  0
  cpu6:timer   486  0
  cpu8:timer38  0
  cpu5:timer38  0
  cpu15:timer   38  0
  cpu7:timer32  0
  cpu12:timer   38  0
  cpu3:timer40  0
  cpu9:timer36  0
  cpu10:timer   34  0
  cpu11:timer   37  0
  cpu2:timer33  0
  cpu13:timer   40  0
  cpu4:timer36  0
  Total5681160   1173
 
  is this relevant?
 
  I'll have to let someone who knows modern x86 hardware better comment on
  the relative merits of hpet vs. lapic timers.  If it was using hpet in
  one-shot mode, and changing it to hpet in periodic mode makes the
  problem go away, that might be a clue that there's something wrong in
  the hpet eventtimer start or interrupt routines.
 
  I wonder if a single missed interrupt in one-shot mode would bring an
  eventtimer to a halt like that?  And if so, then what is it about
  manually asking for the date that kicks it into running again?
 
  -- Ian

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org