Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2018-02-02 Thread Edd Barrett
Hi,

I'm experiencing this issue too.

On Tue, Dec 26, 2017 at 12:27:31PM -0500, Andrew Davis wrote:
> Virtualization software: QEMU + KVM (2.10.0-1.fc27)

FWIW, there are reports that this bug is absent from qemu-2.11.0.

-- 
Best Regards
Edd Barrett

http://www.theunixzoo.co.uk



Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2018-01-10 Thread srutherford
Would this be consistent with the PIT taking longer to respond? The mode of
KVM used here (mentioned on the KVM list) moves the PIT to userspace and
would make it less accurate. If I'm reading OpenBSD's LAPIC calibration code
right, this might be the case. I believe Linux uses one of the PM Timer or
TSC to do the calibration.
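
To make that hypothesis concrete, here is a rough arithmetic sketch. It is not OpenBSD's calibration code; it only assumes that the LAPIC timer frequency is estimated by counting LAPIC ticks across a PIT-timed window. The function name and the figures in the example are made up for illustration.

```shell
#!/bin/sh
# Sketch only: models a LAPIC timer calibrated against a PIT-timed window.
# All names and numbers here are illustrative assumptions.

# actual_period_us TARGET_US BELIEVED_US ACTUAL_US
# TARGET_US   - tick period the guest wants (e.g. 10000 for 10 ms)
# BELIEVED_US - how long the guest thinks the calibration window lasted
# ACTUAL_US   - how long the window really lasted (slow PIT => larger)
# Prints the real period of the programmed timer in microseconds.
actual_period_us() {
    # The estimated frequency is off by ACTUAL/BELIEVED, so every
    # programmed interval is stretched by the same factor.
    echo $(( $1 * $3 / $2 ))
}
```

For example, `actual_period_us 10000 100000 105000` gives 10500: a window that really lasted 5% longer than the guest believed stretches every nominal 10 ms tick by 5%. Note that this models only a constant error fixed at calibration time.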

(The obvious solution here is to just disable that mode if you are using
OpenBSD, which apparently works.)



--
Sent from: 
http://openbsd-archive.7691.n7.nabble.com/openbsd-dev-bugs-f183916.html



Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2017-12-27 Thread Mark Kettenis
> From: Andrew Davis 
> Date: Wed, 27 Dec 2017 11:39:54 -0500
> 
> Hello again,
> 
> I tested with each of the "acpihpet0", "acpitimer0", and "i8254" timers. 
> The timing problem manifested when using all 3 timers. I ran the date 
> loop with "acpihpet0" and "acpitimer0" until the issue manifested, and 
> let "i8254" run overnight.
> 
> Here are some snippets from the date logs from where I started logging 
> the date loop, and where the timing issue became present.
> 
> acpitimer0:
> 
>      Tue Dec 26 23:57:57 UTC 2017
>      Tue Dec 26 23:57:58 UTC 2017
>      ...
>      Wed Dec 27 00:10:10 UTC 2017
>      Wed Dec 27 00:10:12 UTC 2017
>      Wed Dec 27 00:10:14 UTC 2017
> 
> i8254:
> 
>      Wed Dec 27 00:14:23 UTC 2017
>      Wed Dec 27 00:14:24 UTC 2017
>      ...
>      Wed Dec 27 00:59:30 UTC 2017
>      Wed Dec 27 00:59:31 UTC 2017
>      Wed Dec 27 00:59:33 UTC 2017
> 
> acpihpet0:
> 
>      Wed Dec 27 16:20:54 UTC 2017
>      Wed Dec 27 16:20:55 UTC 2017
>      ...
>      Wed Dec 27 16:32:44 UTC 2017
>      Wed Dec 27 16:32:45 UTC 2017
>      Wed Dec 27 16:32:47 UTC 2017
>      Wed Dec 27 16:32:49 UTC 2017
> 
> The i8254 timer hit a point where the system stopped reporting the 
> proper time altogether. I ran these commands this morning after my 
> OpenBSD VM ran with i8254 overnight, and this is what the "date" command 
> displayed. The proper time is shown below.
> 
>      # sysctl | grep -i timecounter
>      kern.timecounter.tick=1
>      kern.timecounter.timestepwarnings=0
>      kern.timecounter.hardware=i8254
>      kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000) dummy(-100)
> 
>      # date
>      Wed Dec 27 01:35:51 UTC 2017
> 
>      [root@local-linux ~]# date
>      Wed Dec 27 16:11:05 UTC 2017
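
The skipped seconds in logs like these can be found mechanically rather than by eye. A minimal sketch, assuming the loop logs epoch seconds (`date +%s`) instead of the plain `date` output used above; `find_gaps` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: find_gaps is a hypothetical helper, not part of any tool.

# find_gaps [MAX_GAP]
# Reads one epoch timestamp (in seconds) per line on stdin and prints
# every pair of consecutive timestamps more than MAX_GAP seconds apart.
# MAX_GAP defaults to 1, the interval expected from "sleep 1".
find_gaps() {
    awk -v max="${1:-1}" '
        NR > 1 && $1 - prev > max {
            printf "%d -> %d (+%ds)\n", prev, $1, $1 - prev
        }
        { prev = $1 }
    '
}
```

In the guest, something like `while sleep 1; do date +%s; done > date.log` would collect the data, and `find_gaps < date.log` would then list the jumps. This only works while the guest's own wall clock stays correct; it would not have caught the i8254 case, where the clock itself fell behind.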

Your test results are consistent with the local APIC emulation being
broken in Linux/KVM.  Regardless of what hardware is used for the
timecounter, the clock interrupts use the local APIC timer in OpenBSD.

OpenBSD programs the local APIC to interrupt every 10ms in so-called
repeated mode.  The clock interrupt is then responsible for reading
the timecounter to update the current wall clock time and for running
things like timeouts that wake up tasks that are sleeping.  If we get
no clock interrupts those wakeups don't happen, and your sleeps take
longer than what you intended.  But as long as the timecounter doesn't
wrap the wall clock time will be correctly updated once another clock
interrupt comes in.  And that's what happens with the i8254
timecounter.  It wraps fairly quickly, so if the clock interrupts
don't come in for a while, OpenBSD's idea of wall clock time starts to
get out of sync with reality.
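
The wrap-around effect described above can be illustrated with a little arithmetic. This is a sketch under stated assumptions, not the kernel's timecounter code: it models a free-running counter of a given bit width, and the 16-bit width in the examples is assumed purely for illustration. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: models a free-running hardware counter that is BITS wide.
# The 16-bit width used in the examples is an assumption, not a statement
# about how OpenBSD's i8254 timecounter is implemented.

# observed_ticks TRUE_ELAPSED BITS
# Prints the elapsed ticks software would compute from two counter reads
# taken TRUE_ELAPSED ticks apart: only the low BITS bits survive.
observed_ticks() {
    mask=$(( (1 << $2) - 1 ))
    echo $(( $1 & mask ))
}

# lost_ticks TRUE_ELAPSED BITS
# Prints how many ticks go unaccounted for because the counter wrapped
# between the two reads.
lost_ticks() {
    seen=$(observed_ticks "$1" "$2")
    echo $(( $1 - seen ))
}
```

With the PIT's 1193182 Hz input clock, 11932 ticks is about 10 ms, and `lost_ticks 11932 16` reports 0. If the next interrupt only arrives after 100000 ticks (roughly 84 ms), `lost_ticks 100000 16` reports 65536: one full wrap has silently vanished from the wall clock.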

So why do other systems not suffer from this problem?  I'm fairly
certain they also use the local APIC for clock interrupts.  But the
systems you tested (Linux, FreeBSD) probably don't run it in repeated
mode.  Some people consider running the local APIC in repeated mode a
bad idea.  And they might even be right.  Waking a system up at
regular intervals even if there is no real work to do is a bit silly
and wastes power.  Although one could argue that 10ms between wakeups
is long enough for this not to matter much on modern systems.

Maybe we'll change the way we do clock interrupts at some point in the
future.  It would probably help vmm(4).  But this is not a trivial
task and won't happen overnight.  Working around bugs in someone
else's software certainly isn't enough motivation for me to implement
it.  

Cheers,

Mark


> On 12/26/2017 5:44 PM, Mike Larkin wrote:
> > On Tue, Dec 26, 2017 at 03:24:03PM -0500, Andrew Davis wrote:
> >> Hello,
> >>
> >> No, I didn't change the kern.timecounter selection directly. I had tried
> >> disabling the HPET on qemu/kvm (which may have affected this selection?).
> >>
> >> Two of my boxes, both OpenBSD 6.1 report this:
> >>
> >> # sysctl kern.timecounter
> >> kern.timecounter.tick=1
> >> kern.timecounter.timestepwarnings=0
> >> kern.timecounter.hardware=acpihpet0
> >> kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000) dummy(-100)
> >>
> >> Best,
> >> Andrew
> >>
> > Could you try one of the others and let us know if it helps, please?
> >
> > -ml
> >
> >> On 12/26/2017 2:36 PM, Mike Larkin wrote:
> >>> On Tue, Dec 26, 2017 at 12:27:31PM -0500, Andrew Davis wrote:
> >>>> Hello,
> >>>>
> >>>> I'm experiencing some odd timing issues on OpenBSD 6.2 (and 6.1) on the
> >>>> system listed below. This is preventing me from running OpenBSD on my
> >>>> servers. Can you determine if this is a bug in the OpenBSD operating
> >>>> system?
> >>>> I can provide more information if needed.
> >>>>
> >>>> Virtualized environment.
> >>>>
> >>>> Host CPU: 2 x Intel E5-2630 v3 2.4 Ghz
> >>>> Host OS: Fedora 27
> >>>> Virtualization software: QEMU + KVM (2.10.0-1.fc27)
> >>>> Guest Machine: default (pc-i440fx-2.10)
> >>>>

Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2017-12-26 Thread Mike Larkin
On Tue, Dec 26, 2017 at 03:24:03PM -0500, Andrew Davis wrote:
> Hello,
> 
> No, I didn't change the kern.timecounter selection directly. I had tried
> disabling the HPET on qemu/kvm (which may have affected this selection?).
> 
> Two of my boxes, both OpenBSD 6.1 report this:
> 
> # sysctl kern.timecounter
> kern.timecounter.tick=1
> kern.timecounter.timestepwarnings=0
> kern.timecounter.hardware=acpihpet0
> kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000) dummy(-100)
> 
> Best,
> Andrew
> 

Could you try one of the others and let us know if it helps, please?

-ml

> On 12/26/2017 2:36 PM, Mike Larkin wrote:
> > On Tue, Dec 26, 2017 at 12:27:31PM -0500, Andrew Davis wrote:
> > > Hello,
> > > 
> > > I'm experiencing some odd timing issues on OpenBSD 6.2 (and 6.1) on the
> > > system listed below. This is preventing me from running OpenBSD on my
> > > servers. Can you determine if this is a bug in the OpenBSD operating 
> > > system?
> > > I can provide more information if needed.
> > > 
> > > Virtualized environment.
> > > 
> > > Host CPU: 2 x Intel E5-2630 v3 2.4 Ghz
> > > Host OS: Fedora 27
> > > Virtualization software: QEMU + KVM (2.10.0-1.fc27)
> > > Guest Machine: default (pc-i440fx-2.10)
> > > Guest OS: OpenBSD 6.2 (and 6.1).
> > > 
> > > Basically, OpenBSD processes degrade over time to the point where they're
> > > completely unresponsive. This simple date printout script is a good 
> > > example.
> > > It should print out the date once per second, but after roughly ~20 mins 
> > > on
> > > this hardware configuration, it takes 2 seconds to print each line, then 4
> > > seconds to print each line, and so on. After running for about 24 hours, 
> > > the
> > > delay is about 1 minute between line printouts.
> > > 
> > >      while sleep 1; do date; done
> > > 
> > > I've tried tweaking some different settings on the guest and host, such as
> > > disabling the HPET timer and x2apic, neither of which has proven 
> > > effective.
> > > 
> > > I saw mention of adding "kvm-intel.preemption_timer=0" in another recent
> > > thread. This seems to resolve the slowdown issue.
> > > 
> > > However, I have run other guest operating systems on this hardware
> > > configuration (CentOS, Ubuntu, FreeBSD) - none of which required any of
> > > these tweaks, or experienced timing issues. This leads me to believe that 
> > > it
> > > could be related to a bug in OpenBSD.
> > > 
> > > I have access to several machines with this hardware configuration and
> > > tested on multiple machines, to rule out a possible one-off hardware 
> > > issue.
> > > Each host displayed the same behavior.
> > > 
> > > Best regards,
> > > Andrew
> > > 
> > What timecounter source did the OpenBSD guests pick? Did you try selecting
> > one of the other choices to see if this helps?
> > 
> > sysctl kern.timecounter if you're not sure what I'm talking about.
> > 
> > -ml
> 
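
For anyone wanting to confirm whether the "kvm-intel.preemption_timer=0" setting quoted above is in effect on the Linux host, a minimal sketch follows. It assumes the kvm_intel module exposes the parameter at the usual sysfs location; the helper name is hypothetical, and the path can be overridden by passing a different file.

```shell
#!/bin/sh
# Sketch only: preemption_timer_state is a hypothetical helper. The default
# sysfs path is an assumption about where kvm_intel exposes the parameter.

# preemption_timer_state [PARAM_FILE]
# Prints "enabled", "disabled" or "unknown" depending on the contents of
# the parameter file. Returns non-zero when the state cannot be determined.
preemption_timer_state() {
    param_file=${1:-/sys/module/kvm_intel/parameters/preemption_timer}
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 1
    fi
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        N|0) echo "disabled" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}
```

Running `preemption_timer_state` on the host should print "disabled" once the host has been booted with the parameter set to 0, and "unknown" if the module is not loaded or the file is not readable.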



Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2017-12-26 Thread Andrew Davis

Hello,

No, I didn't change the kern.timecounter selection directly. I had 
tried disabling the HPET on qemu/kvm (which may have affected this 
selection?).


Two of my boxes, both OpenBSD 6.1 report this:

# sysctl kern.timecounter
kern.timecounter.tick=1
kern.timecounter.timestepwarnings=0
kern.timecounter.hardware=acpihpet0
kern.timecounter.choice=i8254(0) acpihpet0(1000) acpitimer0(1000) dummy(-100)


Best,
Andrew

On 12/26/2017 2:36 PM, Mike Larkin wrote:
> On Tue, Dec 26, 2017 at 12:27:31PM -0500, Andrew Davis wrote:
>> Hello,
>>
>> I'm experiencing some odd timing issues on OpenBSD 6.2 (and 6.1) on the
>> system listed below. This is preventing me from running OpenBSD on my
>> servers. Can you determine if this is a bug in the OpenBSD operating system?
>> I can provide more information if needed.
>>
>> Virtualized environment.
>>
>> Host CPU: 2 x Intel E5-2630 v3 2.4 Ghz
>> Host OS: Fedora 27
>> Virtualization software: QEMU + KVM (2.10.0-1.fc27)
>> Guest Machine: default (pc-i440fx-2.10)
>> Guest OS: OpenBSD 6.2 (and 6.1).
>>
>> Basically, OpenBSD processes degrade over time to the point where they're
>> completely unresponsive. This simple date printout script is a good example.
>> It should print out the date once per second, but after roughly ~20 mins on
>> this hardware configuration, it takes 2 seconds to print each line, then 4
>> seconds to print each line, and so on. After running for about 24 hours, the
>> delay is about 1 minute between line printouts.
>>
>>      while sleep 1; do date; done
>>
>> I've tried tweaking some different settings on the guest and host, such as
>> disabling the HPET timer and x2apic, neither of which has proven effective.
>>
>> I saw mention of adding "kvm-intel.preemption_timer=0" in another recent
>> thread. This seems to resolve the slowdown issue.
>>
>> However, I have run other guest operating systems on this hardware
>> configuration (CentOS, Ubuntu, FreeBSD) - none of which required any of
>> these tweaks, or experienced timing issues. This leads me to believe that it
>> could be related to a bug in OpenBSD.
>>
>> I have access to several machines with this hardware configuration and
>> tested on multiple machines, to rule out a possible one-off hardware issue.
>> Each host displayed the same behavior.
>>
>> Best regards,
>> Andrew
>>
> What timecounter source did the OpenBSD guests pick? Did you try selecting
> one of the other choices to see if this helps?
>
> sysctl kern.timecounter if you're not sure what I'm talking about.
>
> -ml




Re: Degraded timing performance - QEMU, KVM - OpenBSD 6.2

2017-12-26 Thread Mike Larkin
On Tue, Dec 26, 2017 at 12:27:31PM -0500, Andrew Davis wrote:
> Hello,
> 
> I'm experiencing some odd timing issues on OpenBSD 6.2 (and 6.1) on the
> system listed below. This is preventing me from running OpenBSD on my
> servers. Can you determine if this is a bug in the OpenBSD operating system?
> I can provide more information if needed.
> 
> Virtualized environment.
> 
> Host CPU: 2 x Intel E5-2630 v3 2.4 Ghz
> Host OS: Fedora 27
> Virtualization software: QEMU + KVM (2.10.0-1.fc27)
> Guest Machine: default (pc-i440fx-2.10)
> Guest OS: OpenBSD 6.2 (and 6.1).
> 
> Basically, OpenBSD processes degrade over time to the point where they're
> completely unresponsive. This simple date printout script is a good example.
> It should print out the date once per second, but after roughly ~20 mins on
> this hardware configuration, it takes 2 seconds to print each line, then 4
> seconds to print each line, and so on. After running for about 24 hours, the
> delay is about 1 minute between line printouts.
> 
>     while sleep 1; do date; done
> 
> I've tried tweaking some different settings on the guest and host, such as
> disabling the HPET timer and x2apic, neither of which has proven effective.
> 
> I saw mention of adding "kvm-intel.preemption_timer=0" in another recent
> thread. This seems to resolve the slowdown issue.
> 
> However, I have run other guest operating systems on this hardware
> configuration (CentOS, Ubuntu, FreeBSD) - none of which required any of
> these tweaks, or experienced timing issues. This leads me to believe that it
> could be related to a bug in OpenBSD.
> 
> I have access to several machines with this hardware configuration and
> tested on multiple machines, to rule out a possible one-off hardware issue.
> Each host displayed the same behavior.
> 
> Best regards,
> Andrew
> 

What timecounter source did the OpenBSD guests pick? Did you try selecting
one of the other choices to see if this helps?

sysctl kern.timecounter if you're not sure what I'm talking about.

-ml
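
For readers following along, a small sketch of how the alternatives could be picked out of the `sysctl kern.timecounter` output. The helper name `other_timecounters` is hypothetical; it only parses text in the format shown elsewhere in this thread, and it skips entries with a negative quality such as `dummy(-100)` on the assumption that those are not meant to be selected for normal use.

```shell
#!/bin/sh
# Sketch only: other_timecounters is a hypothetical helper that parses
# "sysctl kern.timecounter" output in the format quoted in this thread.

# other_timecounters
# Reads the sysctl output on stdin and prints, one per line, every
# timecounter from the "choice" list that is not the current "hardware"
# selection and whose quality is not negative.
other_timecounters() {
    awk -F= '
        $1 == "kern.timecounter.hardware" { current = $2 }
        $1 == "kern.timecounter.choice"   { choice = $2 }
        END {
            n = split(choice, entry, " ")
            for (i = 1; i <= n; i++) {
                name = entry[i]
                sub(/\(.*$/, "", name)        # drop the "(quality)" suffix
                quality = entry[i]
                sub(/^[^(]*\(/, "", quality)  # keep only the number
                sub(/\)$/, "", quality)
                if (name != current && quality + 0 >= 0)
                    print name
            }
        }
    '
}
```

On the guest, `sysctl kern.timecounter | other_timecounters` would list the candidates, and one of them can then be selected as root with, for example, `sysctl kern.timecounter.hardware=acpitimer0`.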