Re: Using bintime() in acpi_cpu_idle()?

2012-07-30 Thread Alexander Motin

On 30.07.2012 07:33, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 15:26, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 11:37, Bruce Evans wrote:
...

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?


Accuracy is not required at all. +-20% is not a problem.


If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.


The problem with the ticker is that it may stop during idle periods, and
idle is exactly what happens here. Unlike timecounter usage, here we don't
need CPU synchronicity, but we need it working during deep sleeps.


The ticker is the same as the timecounter in many cases of interest.  If
the TSC stops then it cannot be used for timecounting unless
timecounting
is reinitialized.  Timecounting should be reinitialized after deep
sleeps,
but you say you need it to work during deep sleeps.


Timecounter already has detection logic to disable TSC in cases where
it is unreliable. I don't want to replicate it here. I need a time source
that need not be precise or synchronized, but is reliable and fast.


Yes, this logic gives exactly what you don't want (an inefficient
timecounter), by preventing use of the TSC for the timecounter, although
the TSC is perfectly usable for the ticker and here.


Can you teach me how to use a ticker that is not ticking? If the TSC was
considered unusable for the timecounter for reasons unrelated to SMP, how
can I use it as the ticker?



I wouldn't trust timecounters for some time after waking up after a
deep sleep.  If their clock stopped then the times read might only be
very out of date.  If their clock didn't stop, then they might have
wrapped or otherwise overflowed and the times read would be garbage.
Is there any locking or ordering to prevent them being used before they
are reinitialized?


I am not sure what reinitialization you are talking about. IIRC, there
is no wake-up code for the TSC. None of the other timecounters have
problems with C-states.


It is the timecounter code that needs reinitializing.  If the TSC stops,
or wraps mod 2**32, then its counts become garbage for the purpose of
timecounting.  Maybe it is not used for timecounting in either of these
cases.  But these cases shouldn't prevent its use for timecounting.

The 2**32 number is because timecounters only use 32 bits of hardware
counters (for efficiency).  So even if the hardware has some magic to
not stop the TSC while sleeping (maybe it fakes not stopping it by
reloading on wakeup), it is still unusable by timecounters after sleeping
for a second or 2 so that it wraps.  The software needs similar faking
to reload the timecounter on wakeup.  This makes use of timecounters in
sleep/wakeup code fragile.
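The 2**32 limit can be made concrete with a small userland sketch. It is not kernel code; the function names and the example frequency are assumptions chosen for illustration. It computes how long a 32-bit view of a counter lasts before wrapping, and shows that an unsigned delta survives a single wrap but silently loses any whole multiples of 2**32 ticks.

```c
#include <stdint.h>

/* Seconds until a 32-bit view of a counter running at freq_hz wraps. */
static double
wrap_seconds(uint64_t freq_hz)
{
	return (4294967296.0 / (double)freq_hz);
}

/*
 * Delta between two 32-bit samples.  Unsigned subtraction is correct
 * modulo 2**32, so one wrap between samples is harmless, but an elapsed
 * count of 2**32 or more is truncated without any indication.
 */
static uint32_t
delta32(uint32_t prev, uint32_t now)
{
	return (now - prev);
}

/* What a 32-bit consumer sees after `elapsed` real ticks from `start`. */
static uint32_t
observed_delta(uint64_t start, uint64_t elapsed)
{
	return (delta32((uint32_t)start, (uint32_t)(start + elapsed)));
}
```

For a counter running at 2 GHz, wrap_seconds() gives roughly 2.1 seconds, which is in line with the "second or 2" quoted above.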


At this moment I am not talking about S-states sleeping for hours. I am
talking about C-states lasting milliseconds. This means the TSC may stop
and start 10K times per second or even more. Attempting to save and restore
its state would consume so many resources that it would probably be useless.


As for the wrap after 2 seconds, I would be happy to make the CPU sleep
for that long, but for now 100ms is all I can hope for, even on an idle system.



At boot time there is a dummy timecounter that returns bogo-times.
Apparently sleeping doesn't occur before the timecounter is switched to
a real one.  The dummy timecounter isn't switched back to after boot
time.  But it probably should be, since the hardware timecounter may
have stopped or wrapped.  Sleeping could just set a flag to indicate
this state, but then you would have to provide a fake time anyway on
finding the flag set.  Boot time just points to the dummy timecounter
so as not to check this flag in all early timecounter hardware calls.


And how can a dummy timecounter that counts something, but not time, help
me to measure sleep time?


--
Alexander Motin
___
freebsd-acpi@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org


Re: Using bintime() in acpi_cpu_idle()?

2012-07-30 Thread Alexander Motin

On 30.07.2012 09:25, Alexander Motin wrote:

[...]


Never mind, let it be a compromise solution -- the ticker for the C1 state,
where performance is most important and where the TSC works, and the ACPI
timer for the others.
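A minimal sketch of that compromise, in userland C. The enum, the function names and the rule "C1 or shallower uses the ticker" are illustrative assumptions based only on the sentence above, not the actual acpi_cpu_idle() change.

```c
#include <stdint.h>

/* Hypothetical identifiers for the two clocks named in the message. */
enum idle_clock { IDLE_CLOCK_TICKER, IDLE_CLOCK_ACPI_TIMER };

/*
 * Choose the clock used to time one idle period.  The cheap CPU ticker
 * is used for C1, where the TSC is said to keep working; deeper states
 * fall back to the slower ACPI timer.
 */
static enum idle_clock
pick_idle_clock(int cstate)
{
	return (cstate <= 1 ? IDLE_CLOCK_TICKER : IDLE_CLOCK_ACPI_TIMER);
}

/*
 * Convert a raw count difference to microseconds for a clock running
 * at freq_hz.  The multiplication is done in 64 bits, which is ample
 * for idle periods of milliseconds.
 */
static uint64_t
counts_to_usec(uint64_t freq_hz, uint64_t start, uint64_t end)
{
	return ((end - start) * 1000000u / freq_hz);
}
```

Each clock needs its own frequency for the conversion, so the caller has to remember which clock it sampled before sleeping and read the same one after waking.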


--
Alexander Motin


Re: Using bintime() in acpi_cpu_idle()?

2012-07-30 Thread Bruce Evans

On Mon, 30 Jul 2012, Alexander Motin wrote:


On 30.07.2012 07:33, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:

...
Timecounter already has detection logic to disable TSC in cases where
it is unreliable. I don't want to replicate it here. I need a time source
that need not be precise or synchronized, but is reliable and fast.


Yes, this logic gives exactly what you don't want (an inefficient
timecounter), by preventing use of the TSC for the timecounter, although
the TSC is perfectly usable for the ticker and here.


Can you teach me how to use a ticker that is not ticking? If the TSC was
considered unusable for the timecounter for reasons unrelated to SMP, how
can I use it as the ticker?


No :-).  I can't teach you how to use either the ticker or the timecounter
if their clock is not ticking.  I'm just saying that if you can
blindly use a timecounter, then you can blindly use the ticker.  The
working of both depends on their clock not stopping ticking, and that in
many cases their clock is the same (the TSC).

The TSC is considered usable for the ticker under weak conditions:
- it exists according to CPUID_TSC
- it is not disabled by the machdep.disable_tsc tuneable
- its dynamic probe finds that its frequency is nonzero.  The probe
  has some more cpuid tests and other complications which may prevent
  it being fully dynamic.  There is another tuneable,
  machdep.disable_tsc_calibration which prevents the dynamic frequency
  determination.  I think the frequency comes from a table then, and
  is never zero, so this doesn't prevent the TSC being used for the
  ticker.
- the 2 tuneables are of course undocumented in /usr/share/man.  There is
  hardly any useful documentation of the TSC there either.  zgrep finds
  TSC mainly in timecounters(4) and hwpmc(4).  In timecounters(4),
  the references to the TSC are useless since they are just literal
  output of $(sysctl kern.timecounter).  In hwpmc(4), the RDTSC
  instruction but not much more is mentioned.

The TSC is considered usable for a timecounter under the above conditions,
but its default quality is low so it rarely gets used.  Its quality is
changed under the following conditions:
- APM enabled: reduce quality to nearly -infinity
- CPU can deep sleep, and Intel CPU, and TSC not invariant: reduce quality
  to nearly -infinity, because (only) Intel CPUs are known to stop the
  TSC in deep sleeps under these conditions.  This is what you should have
  told me to justify use of binuptime() :-).  Users can still configure
  the TSC as a timecounter, but this would break more than your use of
  binuptime() if the TSC actually stops.
- SMP configured, and > 1 CPU:
  - vm guest: reduce quality significantly, but not to nearly -infinity
  - else do cpuid and dynamic synchronization tests:
    - fail tests: reduce as for vm guest
    - pass tests: increase a little, to just above ACPI-fast IIRC
    - pass synchronization tests, but not invariant: keep default.
- SMP not configured, or only 1 CPU: increase a little iff invariant.
  Invariant means P-state invariant.  I forgot that the invariance flag
  was a tuneable.  This tuneable, kern.timecounter.invariant_tsc, is
  of course undocumented.  It conditionalizes more than this case.
  Other bugs in it are:
  - it is in a different namespace than the tuneables described above.
  - this different namespace is worse, since the flag applies to more
    than the timecounter decision.  It also gives the ticker invariance
    flag, and controls whether there are event handlers for frequency
    changes.
  - you can force the flag on using the tuneable, but you can't force
    it off.
- for SMP, there is also the kern.timecounter.smp_tsc tuneable.  This has
  much the same bugs as kern.timecounter.invariant_tsc:
  - it is of course undocumented
  - you can force it on, but you can't force it off
  - however, its namespace seems to be not incorrect, since it seems
    to only control timecounter quality (very indirectly now, by
    modifying the dynamic probes.  It used to be a simple flag to
    modify the SMP config option).
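The quality rules listed above can be read as a small decision function. The sketch below is a paraphrase of that list in userland C, not the kernel's TSC probe code: the struct, its field names and the numeric quality values are all hypothetical, and only the ordering of the values is meant to carry meaning.

```c
#include <stdbool.h>

/* Hypothetical quality values; only their relative order matters. */
enum {
	Q_NEVER   = -1000,	/* "nearly -infinity" */
	Q_REDUCED = -100,	/* reduced significantly */
	Q_DEFAULT = 0,		/* unchanged default */
	Q_RAISED  = 100		/* increased a little */
};

/* Hypothetical summary of what the probes found. */
struct tsc_env {
	bool	apm_enabled;
	bool	can_deep_sleep;
	bool	intel_cpu;
	bool	invariant;	/* P-state invariant */
	bool	smp_configured;
	int	ncpus;
	bool	vm_guest;
	bool	sync_ok;	/* dynamic synchronization tests passed */
};

static int
tsc_tc_quality(const struct tsc_env *e)
{
	if (e->apm_enabled)
		return (Q_NEVER);
	/* Only Intel CPUs are known to stop the TSC in deep sleeps. */
	if (e->can_deep_sleep && e->intel_cpu && !e->invariant)
		return (Q_NEVER);
	if (e->smp_configured && e->ncpus > 1) {
		if (e->vm_guest || !e->sync_ok)
			return (Q_REDUCED);
		return (e->invariant ? Q_RAISED : Q_DEFAULT);
	}
	/* Not SMP, or only one CPU: raise only if invariant. */
	return (e->invariant ? Q_RAISED : Q_DEFAULT);
}
```

Note that none of these branches affects the ticker, which is the point being made: the ticker only needs the weaker conditions from the first list.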

Stopping of the TSC in deep sleeps doesn't prevent its use as the ticker.
This should mostly work for the main use of the ticker, for thread
runtimes, because most threads never idle directly, but switch to the
idle thread for some CPU.  I think deep sleeps break runtime accounting
for idle threads (if the ticker stops).  Has anyone seen this (idle
times near 0 on mostly-idle systems that have spent days idling)?


I wouldn't trust timecounters for some time after waking up after a
deep sleep.  If their clock stopped then the times read might only be
...

I am not sure what reinitialization you are talking about. IIRC, there
is no wake-up code for the TSC. None of the other timecounters have
problems with C-states.


It is the timecounter code that needs reinitializing.  If the TSC stops,
or wraps mod 2**32, then its counts become garbage for the purpose of
timecounting.  Maybe it is not used for 

Re: Using bintime() in acpi_cpu_idle()?

2012-07-30 Thread Bruce Evans

On Mon, 30 Jul 2012, Alexander Motin wrote:


[...] ...
Never mind, let it be a compromise solution -- the ticker for the C1 state,
where performance is most important and where the TSC works, and the ACPI
timer for the others.


I like that.

Something similar should work for making the TSC usable in timecounters
even if it stops during deep sleeps.  The timecounter hardware would
have to be switched or reinitialized if it might have stopped.

Bruce


Re: Using bintime() in acpi_cpu_idle()?

2012-07-29 Thread Bruce Evans

On Sun, 29 Jul 2012, Alexander Motin wrote:

With the ACPI timer gradually becoming one of the slowest in the system, is
there some reason to use it directly in acpi_cpu_idle()? I've made a patch:

http://people.freebsd.org/~mav/sleep_time.patch
to use binuptime() instead. Using even HPET via the system timecounter (not
to mention TSC) significantly improves performance on some
workloads if this code is not covered by the MWAIT optimization in cpu_idle().
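The patch itself is not reproduced in this thread, so the following is only a guess at the general shape of measuring a sleep with binuptime(): take a bintime before and after, subtract, and convert to microseconds. The struct below is a userland copy of the seconds-plus-64-bit-fraction layout, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Userland stand-in for a bintime: seconds plus a 64-bit binary fraction. */
struct bt {
	int64_t		sec;	/* whole seconds */
	uint64_t	frac;	/* fraction of a second, in units of 2**-64 s */
};

/* end - start, borrowing from sec when the fraction underflows. */
static struct bt
bt_sub(struct bt end, struct bt start)
{
	struct bt d;

	d.frac = end.frac - start.frac;
	d.sec = end.sec - start.sec - (end.frac < start.frac ? 1 : 0);
	return (d);
}

/* Convert to microseconds using the top 32 bits of the fraction. */
static uint64_t
bt_to_usec(struct bt t)
{
	return ((uint64_t)t.sec * 1000000u +
	    (((t.frac >> 32) * 1000000u) >> 32));
}

/* Length of a sleep that began at `start` and ended at `end`. */
static uint64_t
sleep_usec(struct bt start, struct bt end)
{
	return (bt_to_usec(bt_sub(end, start)));
}
```

Whatever timecounter is currently selected sits behind the two samples, which is why the cost of this approach depends on the user's timecounter choice.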


Does it work with a perverse timecounter like the i8254?  The
user is permitted to switch to any supported timecounter.  There are
other perverse ones:
- ACPI.  This seems to be unavailable if the system thinks ACPI-fast
  works.  Bug.  The user should be able to downgrade to it if ACPI-fast
  in fact doesn't work.  Since it reads the hardware more than once,
  it is much slower than direct use of the hardware.
- ACPI-fast.  Even this is perverse.  It only reads the hardware once,
  but goes through many software layers.

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?  If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.
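To put a number on "uncalibrated scaling", here is a sketch that assumes the phrase refers to converting ACPI PM timer ticks to microseconds with a plain shift, treating the timer as 4 ticks per microsecond. That reading is an assumption; the nominal PM timer rate of 3.579545 MHz is standard, and the function names are made up for the example.

```c
#include <stdint.h>

#define ACPI_PM_FREQ_HZ	3579545u	/* nominal ACPI PM timer rate */

/* Uncalibrated: pretend the timer runs at exactly 4 ticks per usec. */
static uint32_t
pm_ticks_to_usec_shift(uint32_t ticks)
{
	return (ticks >> 2);
}

/* Scaled by the nominal frequency instead. */
static uint32_t
pm_ticks_to_usec_scaled(uint32_t ticks)
{
	return ((uint32_t)((uint64_t)ticks * 1000000u / ACPI_PM_FREQ_HZ));
}

/* Relative error of the shift version in percent (negative = too low). */
static double
shift_error_percent(uint32_t ticks)
{
	double exact = (double)ticks * 1e6 / ACPI_PM_FREQ_HZ;

	return (((double)(ticks >> 2) - exact) / exact * 100.0);
}
```

Under that assumption the shift underestimates by about 10.5%, which would still be inside a +-20% tolerance.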

Bruce


Re: Using bintime() in acpi_cpu_idle()?

2012-07-29 Thread Alexander Motin

On 29.07.2012 11:37, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:


With the ACPI timer gradually becoming one of the slowest in the system, is
there some reason to use it directly in acpi_cpu_idle()? I've made a
patch:
http://people.freebsd.org/~mav/sleep_time.patch
to use binuptime() instead. Using even HPET via the system timecounter
(not to mention TSC) significantly improves performance
on some workloads if this code is not covered by the MWAIT optimization in
cpu_idle().


Does it work with a perverse timecounter like the i8254?


At least on my test system it does, though predictably much slower
than the others.



The
user is permitted to switch to any supported timecounter.  There are
other perverse ones:
- ACPI.  This seems to be unavailable if the system thinks ACPI-fast
   works.  Bug.  The user should be able to downgrade to it if ACPI-fast
   in fact doesn't work.  Since it reads the hardware more than once,
   it is much slower than direct use of the hardware.
- ACPI-fast.  Even this is perverse.  It only reads the hardware once,
   but goes through many software layers.

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?


Accuracy is not required at all. +-20% is not a problem.


If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.


The problem with the ticker is that it may stop during idle periods, and
idle is exactly what happens here. Unlike timecounter usage, here we don't
need CPU synchronicity, but we need it working during deep sleeps.


--
Alexander Motin


Re: Using bintime() in acpi_cpu_idle()?

2012-07-29 Thread Bruce Evans

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 11:37, Bruce Evans wrote:
...

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?


Accuracy is not required at all. +-20% is not a problem.


If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.


The problem with the ticker is that it may stop during idle periods, and
idle is exactly what happens here. Unlike timecounter usage, here we don't
need CPU synchronicity, but we need it working during deep sleeps.


The ticker is the same as the timecounter in many cases of interest.  If
the TSC stops then it cannot be used for timecounting unless timecounting
is reinitialized.  Timecounting should be reinitialized after deep sleeps,
but you say you need it to work during deep sleeps.

I wouldn't trust timecounters for some time after waking up after a
deep sleep.  If their clock stopped then the times read might only be
very out of date.  If their clock didn't stop, then they might have
wrapped or otherwise overflowed and the times read would be garbage.
Is there any locking or ordering to prevent them being used before they
are reinitialized?

Bruce


Re: Using bintime() in acpi_cpu_idle()?

2012-07-29 Thread Alexander Motin

On 29.07.2012 15:26, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 11:37, Bruce Evans wrote:
...

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?


Accuracy is not required at all. +-20% is not a problem.


If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.


The problem with the ticker is that it may stop during idle periods, and
idle is exactly what happens here. Unlike timecounter usage, here we don't
need CPU synchronicity, but we need it working during deep sleeps.


The ticker is the same as the timecounter in many cases of interest.  If
the TSC stops then it cannot be used for timecounting unless timecounting
is reinitialized.  Timecounting should be reinitialized after deep sleeps,
but you say you need it to work during deep sleeps.


Timecounter already has detection logic to disable TSC in cases where it
is unreliable. I don't want to replicate it here. I need a time source
that need not be precise or synchronized, but is reliable and fast.



I wouldn't trust timecounters for some time after waking up after a
deep sleep.  If their clock stopped then the times read might only be
very out of date.  If their clock didn't stop, then they might have
wrapped or otherwise overflowed and the times read would be garbage.
Is there any locking or ordering to prevent them being used before they
are reinitialized?


I am not sure what reinitialization you are talking about. IIRC, there
is no wake-up code for the TSC. None of the other timecounters have
problems with C-states.


--
Alexander Motin


Re: Using bintime() in acpi_cpu_idle()?

2012-07-29 Thread Bruce Evans

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 15:26, Bruce Evans wrote:

On Sun, 29 Jul 2012, Alexander Motin wrote:


On 29.07.2012 11:37, Bruce Evans wrote:
...

binuptime() is more accurate than uncalibrated scaling.  Is accuracy
required?


Accuracy is not required at all. +-20% is not a problem.


If not, the CPU ticker might work, and is faster than HPET,
and is not under user control for perverse settings.  It normally
reduces to readtsc() with no serializing instruction even in proposed
changes.  This is good enough for process times (not very good) and
depends on the CPU not changing.  Its calibration is very accurate
(similar to timecounters) modulo bugs, but not always up to date.


The problem with the ticker is that it may stop during idle periods, and
idle is exactly what happens here. Unlike timecounter usage, here we don't
need CPU synchronicity, but we need it working during deep sleeps.


The ticker is the same as the timecounter in many cases of interest.  If
the TSC stops then it cannot be used for timecounting unless timecounting
is reinitialized.  Timecounting should be reinitialized after deep sleeps,
but you say you need it to work during deep sleeps.


Timecounter already has detection logic to disable TSC in cases where it
is unreliable. I don't want to replicate it here. I need a time source
that need not be precise or synchronized, but is reliable and fast.


Yes, this logic gives exactly what you don't want (an inefficient
timecounter), by preventing use of the TSC for the timecounter, although
the TSC is perfectly usable for the ticker and here.


I wouldn't trust timecounters for some time after waking up after a
deep sleep.  If their clock stopped then the times read might only be
very out of date.  If their clock didn't stop, then they might have
wrapped or otherwise overflowed and the times read would be garbage.
Is there any locking or ordering to prevent them being used before they
are reinitialized?


I am not sure what reinitialization you are talking about. IIRC, there
is no wake-up code for the TSC. None of the other timecounters have
problems with C-states.


It is the timecounter code that needs reinitializing.  If the TSC stops,
or wraps mod 2**32, then its counts become garbage for the purpose of
timecounting.  Maybe it is not used for timecounting in either of these
cases.  But these cases shouldn't prevent its use for timecounting.

The 2**32 number is because timecounters only use 32 bits of hardware
counters (for efficiency).  So even if the hardware has some magic to
not stop the TSC while sleeping (maybe it fakes not stopping it by
reloading on wakeup), it is still unusable by timecounters after sleeping
for a second or 2 so that it wraps.  The software needs similar faking
to reload the timecounter on wakeup.  This makes use of timecounters in
sleep/wakeup code fragile.

At boot time there is a dummy timecounter that returns bogo-times.
Apparently sleeping doesn't occur before the timecounter is switched to
a real one.  The dummy timecounter isn't switched back to after boot
time.  But it probably should be, since the hardware timecounter may
have stopped or wrapped.  Sleeping could just set a flag to indicate
this state, but then you would have to provide a fake time anyway on
finding the flag set.  Boot time just points to the dummy timecounter
so as not to check this flag in all early timecounter hardware calls.
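The pointer-switching idea in the last paragraph can be sketched in userland C as follows. All names here are hypothetical, and the "hardware" is just a variable. The point is only that readers go through one pointer, so switching to a dummy around sleep needs no flag test on each read.

```c
#include <stdint.h>

/* Hypothetical timecounter hardware descriptor. */
struct tc_hw {
	uint32_t	(*get_count)(void);
	const char	*name;
};

static uint32_t dummy_now;

/* Dummy counter: advances by one per read, unrelated to real time. */
static uint32_t
dummy_get_count(void)
{
	return (++dummy_now);
}

static uint32_t fake_hw_count;	/* stands in for a hardware register */

static uint32_t
hw_get_count(void)
{
	return (fake_hw_count);
}

static struct tc_hw dummy_tc = { dummy_get_count, "dummy" };
static struct tc_hw real_tc = { hw_get_count, "hw" };

/* Readers always go through this pointer; it starts on the dummy. */
static struct tc_hw *active_tc = &dummy_tc;

/* Point at the dummy while the real counter may be stopped or wrapped. */
static void
tc_enter_sleep(void)
{
	active_tc = &dummy_tc;
}

/* Switch back once the real counter is known to be usable again. */
static void
tc_leave_sleep(void)
{
	active_tc = &real_tc;
}

static uint32_t
tc_read(void)
{
	return (active_tc->get_count());
}
```

The values read while the dummy is active are monotonic but meaningless as time, so a scheme like this keeps callers from seeing garbage without giving them a usable measurement.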

Bruce