[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-14 Thread Eryk Sun


Eryk Sun  added the comment:

>> Try changing EnterNonRecursiveMutex() to break out of the loop in 
>> this case
>
> This does work, but unfortunately a little too well - in a single 
> test I saw several instances where that approach returned 
> _earlier_ than the timeout.

It's documented that a timeout between N and N+1 ticks can be satisfied 
anywhere in that range. In practice I see a wider range. In the kernel, 
variations in wait time could depend on when the due time is calculated in the 
interrupt cycle, when the next interrupt occurs and the interrupt time is 
updated, and when the thread is dispatched. A benefit of using a 
high-resolution external deadline is that waiting will never return early, but 
it may return later than it otherwise would, e.g. if re-waiting for a remaining 
1 ms actually takes 20 ms.
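
The trade-off described above (never early, possibly late) can be sketched in Python. This is only an illustration of the external-deadline idea, not the CPython implementation; the helper name wait_until_deadline is hypothetical, and time.perf_counter() stands in for the high-resolution clock.

```python
import threading
import time


def wait_until_deadline(event, timeout):
    """Wait for `event` up to `timeout` seconds without reporting a timeout early.

    Hypothetical helper: the deadline is computed once from a high-resolution
    clock, and the wait is repeated for whatever time remains.
    """
    deadline = time.perf_counter() + timeout
    remaining = timeout
    while True:
        if event.wait(remaining):
            return True  # the event was set before the deadline
        remaining = deadline - time.perf_counter()
        if remaining <= 0:
            return False  # the deadline has really passed
        # The underlying wait came back early: wait again for the remainder.
        # Each re-wait may itself overshoot, which is the "later" case.


start = time.perf_counter()
signalled = wait_until_deadline(threading.Event(), 0.05)
elapsed = time.perf_counter() - start
print(signalled, f"{elapsed * 1000:.3f} ms")
```

The loop can only overshoot the deadline, never undershoot it, because the exit test uses the same clock that defined the deadline.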

There are many unrelated WaitForSingleObject and WaitForMultipleObjects in the 
interpreter, extension modules, and code that uses _winapi.WaitForSingleObject 
and _winapi.WaitForMultipleObjects. For example, time.sleep() allows 
WAIT_TIMEOUT to override the deadline. I suggest measuring the 
performance-counter interval for time.sleep(0.001) on both the main thread 
(Sleep based) and a new thread (WaitForSingleObjectEx based).
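
A minimal sketch of that measurement, assuming 20 samples per thread is enough to see the spread; measure_sleep and summarize are hypothetical helper names, and the numbers depend entirely on the platform and timer settings.

```python
import threading
import time


def measure_sleep(duration=0.001, repeat=20):
    """Return the performance-counter interval of each time.sleep(duration)."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        time.sleep(duration)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(label, samples):
    """Print the shortest and longest observed interval in milliseconds."""
    print(f"{label}: min={min(samples) * 1e3:.3f} ms, "
          f"max={max(samples) * 1e3:.3f} ms")


# Same measurement on the main thread and on a new thread.
main_samples = measure_sleep()

thread_samples = []
worker = threading.Thread(target=lambda: thread_samples.extend(measure_sleep()))
worker.start()
worker.join()

summarize("main thread", main_samples)
summarize("new thread", thread_samples)
```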

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-14 Thread Ryan Hileman


Ryan Hileman  added the comment:

> It shouldn't behave drastically different just because the user closed the 
> laptop lid for an hour

I talked to someone who's been helping with the Go time APIs and it seems like 
that holds pretty well for interactive timeouts, but makes no sense for 
network-related code. If you lost a network connection (with, say, a 30 second timeout) 
due to the lid being closed, you don't want to wait 30 seconds after opening 
the lid for the application to realize it needs to reconnect. (However there's 
probably no good way to design Python's locking system around both cases, so 
it's sufficient to say "lock timers won't advance during suspend" and make the 
application layer work around that on its own in the case of network code)

> Try changing EnterNonRecursiveMutex() to break out of the loop in this case

This does work, but unfortunately a little too well - in a single test I saw 
several instances where that approach returned _earlier_ than the timeout.

I assume the reason for this loop is the call can get interrupted with a "needs 
retry" state. If so, you'd still see 16ms of jitter anytime that happens as 
long as it's backed by a quantized time source.

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-14 Thread Eryk Sun


Eryk Sun  added the comment:

> Seems like Windows 7 may need to be considered as well, as 
> per vstinner's bpo-32592 mention?

Python 3.9 doesn't support Windows 7. Moreover, the interpreter DLL in 3.9 
implicitly imports PathCchCanonicalizeEx, PathCchCombineEx, and 
PathCchSkipRoot, which were added in Windows 8. So it won't even load in 
Windows 7.

> Have there been any issues filed about the deadline behaviors 
> across system suspend?

Not that I'm aware of, but waits should be correct and consistent in principle. 
It shouldn't behave drastically different just because the user closed the 
laptop lid for an hour.

> Looks like Linux (CLOCK_MONOTONIC) and macOS (mach_absolute_time())
> already don't track suspend time in time.monotonic(). I think that's
> enough to suggest that long-term Windows shouldn't either

I'm not overly concerned here with cross-platform consistency. If Windows 
hadn't changed the behavior of wait timeouts, then I wouldn't worry about it 
since most clocks in Windows are biased by the time spent suspended. It's a 
bonus that this change would also improve cross-platform consistency for 
time.monotonic(). 

> I tested QueryUnbiasedInterruptTime() and it exhibits the same 
> 16ms jitter as GetTickCount64() (which I expected), 

For bpo-41299, it occurs to me that we've only ever used _PY_EMULATED_WIN_CV, 
in which case PyCOND_TIMEDWAIT() returns 1 for a timeout, as implemented in 
_PyCOND_WAIT_MS(). Try changing EnterNonRecursiveMutex() to break out of the 
loop in this case. For example:

    } else if (milliseconds != 0) {
        /* wait at least until the target */
        ULONGLONG now, target;
        /* interrupt time is reported in 100 ns units */
        QueryUnbiasedInterruptTime(&target);
        target += (ULONGLONG)milliseconds * 10000;
        while (mutex->locked) {
            int ret = PyCOND_TIMEDWAIT(&mutex->cv, &mutex->cs,
                                       (long long)milliseconds * 1000);
            if (ret < 0) {
                result = WAIT_FAILED;
                break;
            }
            if (ret == 1) { /* timeout */
                break;
            }
            QueryUnbiasedInterruptTime(&now);
            if (target <= now)
                break;
            /* convert the remainder back to ms, rounding up */
            milliseconds = (DWORD)((target - now + 9999) / 10000);
        }
    }

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-13 Thread Ryan Hileman


Ryan Hileman  added the comment:

> The monotonic clock should thus be based on QueryUnbiasedInterruptTime

My primary complaint here is that Windows is the only major platform with a low 
resolution monotonic clock. Using QueryUnbiasedInterruptTime() on older OS 
versions wouldn't entirely help that, so we have the same consistency issue 
(just on a smaller scale). I would personally need to still use 
time.perf_counter() instead of time.monotonic() due to this, but I'm not 
totally against it.

> For consistency, an external deadline (e.g. for SIGINT support) should work 
> the same way.

Have there been any issues filed about the deadline behaviors across system 
suspend?

> which I presume includes most users of Python 3.9+

Seems like Windows 7 may need to be considered as well, as per vstinner's 
bpo-32592 mention?

> starting with Windows 8, WaitForSingleObject() and WaitForMultipleObjects() 
> exclude time when the system is suspended

Looks like Linux (CLOCK_MONOTONIC) and macOS (mach_absolute_time()) already 
don't track suspend time in time.monotonic(). I think that's enough to suggest 
that long-term Windows shouldn't either, but I don't know how to reconcile that 
with my desire for Windows not to be the only platform with low resolution 
monotonic time by default.

> then the change to use QueryPerformanceCounter() to resolve bpo-41299 should 
> be reverted. The deadline should instead be computed with 
> QueryUnbiasedInterruptTime()

I don't agree with this, as it would regress the fix. This is more of a topic 
for bpo-41299, but I tested QueryUnbiasedInterruptTime() and it exhibits the 
same 16ms jitter as GetTickCount64() (which I expected), so non-precise 
interrupt time can't solve this issue. I do think 
QueryUnbiasedInterruptTimePrecise() would be a good fit. I think making this 
particular timeout unbiased (which would be a new behavior) should be a lower 
priority than making it not jitter.

> For Windows we could implement the following clocks:

I think that list is great and making those enums work with clock_gettime on 
Windows sounds like a very clear improvement to the timing options available. 
Having the ability to query each clock source directly would also reduce the 
impact if time.monotonic() does not perfectly suit a specific application.

---

I think my current positions after writing all of this are:

- I would probably be in support of a 3.11+ change for time.monotonic() to use 
QueryUnbiasedInterruptTime() pre-Windows 10, and dynamically use 
QueryUnbiasedInterruptTimePrecise() on Windows 10+. Ideally the Windows 
clock_gettime() code lands in the same release, so users can directly pick 
their time source if necessary. This approach also helps my goal of making 
time.monotonic()'s suspend behavior more uniform across platforms.

- Please don't revert bpo-41299 (especially the backports), as it does fix the 
issue and tracking suspend time is the same (not a regression) as the previous 
GetTickCount64() code. I think the lock timeouts should stick with QPC 
pre-Windows-10 to fix the jitter, but could use 
QueryUnbiasedInterruptTimePrecise() on Windows 10+ (which needs the same 
runtime check as the time.monotonic() change, thus could probably land in the 
same patch set).

- I'm honestly left with more questions than I started with after diving into the 
GetSystemTimePreciseAsFileTime() rabbit hole. I assume it's not a catastrophic 
issue? Maybe it's a situation where adding the clock_gettime() enums would 
sufficiently help anyone who cares about the exact behavior during clock 
modification. I don't have strong opinions about it, besides it being a shame 
that Windows currently has lower precision timestamps in general. Could be 
worth doing a survey of other languages' choices, but any further discussion 
can probably go to bpo-19007.

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-13 Thread Eryk Sun


Eryk Sun  added the comment:

On second thought, starting with Windows 8, WaitForSingleObject() and 
WaitForMultipleObjects() exclude time when the system is suspended. For 
consistency, an external deadline (e.g. for SIGINT support) should work the 
same way. The monotonic clock should thus be based on 
QueryUnbiasedInterruptTime(). We can conditionally use 
QueryUnbiasedInterruptTimePrecise() in Windows 10, which I presume includes 
most users of Python 3.9+ on Windows since Windows 8.1 only has a 3% share of 
desktop/laptop systems.
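
The conditional selection could look roughly like the ctypes sketch below. It is not a proposed patch (the real change would live in C in Python/pytime.c). Assumptions: QueryUnbiasedInterruptTimePrecise() is looked up in kernelbase.dll and QueryUnbiasedInterruptTime() in kernel32.dll, both report 100 ns units, and select_unbiased_clock is a hypothetical name. On non-Windows platforms the sketch simply falls back to time.monotonic_ns().

```python
import ctypes
import sys
import time


def select_unbiased_clock():
    """Return (name, callable) where the callable yields nanoseconds.

    Tries the precise Windows 10+ function first, then the coarse one.
    The DLL names are assumptions; a failed lookup just moves on.
    """
    if sys.platform != "win32":
        return "time.monotonic_ns", time.monotonic_ns

    candidates = (
        ("kernelbase", "QueryUnbiasedInterruptTimePrecise"),
        ("kernel32", "QueryUnbiasedInterruptTime"),
    )
    for dll_name, func_name in candidates:
        try:
            func = getattr(ctypes.WinDLL(dll_name), func_name)
        except (OSError, AttributeError):
            continue  # not exported on this Windows version
        func.argtypes = [ctypes.POINTER(ctypes.c_ulonglong)]
        func.restype = None  # the return value is ignored in this sketch

        def read_ns(func=func):
            value = ctypes.c_ulonglong()
            func(ctypes.byref(value))
            return value.value * 100  # 100 ns units -> nanoseconds

        return func_name, read_ns

    return "time.monotonic_ns", time.monotonic_ns


clock_name, clock = select_unbiased_clock()
print(clock_name, clock())
```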

If we can agree on the above, then the change to use QueryPerformanceCounter() 
to resolve bpo-41299 should be reverted. The deadline should instead be 
computed with QueryUnbiasedInterruptTime(). It's limited to the resolution of 
the system interrupt time, but at least compared to GetTickCount64() it returns 
the real interrupt time instead of an idealized 64 ticks/second.

> expose all of the Windows clocks directly (through clock_gettime enums?)

_Py_clock_gettime() and _Py_clock_getres() could be implemented in 
Python/pytime.c. For Windows we could implement the following clocks:

CLOCK_REALTIME            GetSystemTimePreciseAsFileTime
CLOCK_REALTIME_COARSE     GetSystemTimeAsFileTime
CLOCK_MONOTONIC_COARSE    QueryUnbiasedInterruptTime
CLOCK_PROCESS_CPUTIME_ID  GetProcessTimes
CLOCK_THREAD_CPUTIME_ID   GetThreadTimes
CLOCK_PERF_COUNTER        QueryPerformanceCounter

Windows 10+
CLOCK_MONOTONIC           QueryUnbiasedInterruptTimePrecise
CLOCK_BOOTTIME            QueryInterruptTimePrecise
CLOCK_BOOTTIME_COARSE     QueryInterruptTime
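
For comparison, this is how the existing clock_gettime() interface looks from Python on platforms where the time module already exposes it. The sketch only probes constants that may or may not exist on a given build; available_clocks is a hypothetical helper, and nothing here implies that the Windows mapping above has been implemented.

```python
import time

# Clock ids from the time module; each may be missing on a given platform.
CLOCK_NAMES = (
    "CLOCK_REALTIME",
    "CLOCK_MONOTONIC",
    "CLOCK_BOOTTIME",
    "CLOCK_PROCESS_CPUTIME_ID",
    "CLOCK_THREAD_CPUTIME_ID",
)


def available_clocks():
    """Map each supported clock name to (current value, resolution) in seconds."""
    result = {}
    if not hasattr(time, "clock_gettime"):
        return result  # this build has no clock_gettime() at all
    for name in CLOCK_NAMES:
        clock_id = getattr(time, name, None)
        if clock_id is None:
            continue
        try:
            result[name] = (time.clock_gettime(clock_id),
                            time.clock_getres(clock_id))
        except OSError:
            continue  # constant defined but rejected by the OS
    return result


for name, (value, resolution) in available_clocks().items():
    print(f"{name:26} value={value:.6f} resolution={resolution:.9f}")
```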

> it may also be worth replacing time.time()'s GetSystemTimeAsFileTime with 
> GetSystemTimePreciseAsFileTime

See bpo-19007, which is nearly 8 years old.

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-12 Thread STINNER Victor


STINNER Victor  added the comment:

> To reduce the adverse effects of this frequency offset error, recent versions 
> of Windows, particularly Windows 8, use multiple hardware timers to detect 
> the frequency offset and compensate for it to the extent possible. This 
> calibration process is performed when Windows is started.

Technically, it remains possible to install Python on Windows 7, see: bpo-32592.

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-12 Thread Ryan Hileman


Ryan Hileman  added the comment:

I think a lot of that is based on very outdated information. It's worth reading 
this article: 
https://docs.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps

I will repeat Microsoft's current recommendation (from that article):

> Windows has and will continue to invest in providing a reliable and efficient 
> performance counter. When you need time stamps with a resolution of 1 
> microsecond or better and you don't need the time stamps to be synchronized 
> to an external time reference, choose QueryPerformanceCounter, 
> KeQueryPerformanceCounter, or KeQueryInterruptTimePrecise. When you need 
> UTC-synchronized time stamps with a resolution of 1 microsecond or better, 
> choose GetSystemTimePreciseAsFileTime or KeQuerySystemTimePrecise.

(Based on that, it may also be worth replacing time.time()'s 
GetSystemTimeAsFileTime with GetSystemTimePreciseAsFileTime in CPython, as 
GetSystemTimePreciseAsFileTime is available in Windows 8 and newer)

PEP 418:

> It has a much higher resolution, but has lower long term precision than 
> GetTickCount() and timeGetTime() clocks. For example, it will drift compared 
> to the low precision clocks.

Microsoft on drift (from the article above):

> To reduce the adverse effects of this frequency offset error, recent versions 
> of Windows, particularly Windows 8, use multiple hardware timers to detect 
> the frequency offset and compensate for it to the extent possible. This 
> calibration process is performed when Windows is started.

Modern Windows also automatically detects and works around stoppable TSC, as 
well as several other issues:

> Some processors can vary the frequency of the TSC clock or stop the 
> advancement of the TSC register, which makes the TSC unsuitable for timing 
> purposes on these processors. These processors are said to have non-invariant 
> TSC registers. (Windows will automatically detect this, and select an 
> alternative time source for QPC).

It seems like Microsoft considers QPC to be a significantly better time source 
now than when PEP 418 was written.

Another related conversation is whether Python can just expose all of the 
Windows clocks directly (through clock_gettime enums?), as that gives anyone 
who really wants full control over their timestamps a good escape hatch.

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-12 Thread STINNER Victor


STINNER Victor  added the comment:

Changing a clock is tricky. There are many things to consider:

* Is it really monotonic in all cases?
* Does it have a better resolution than the previous clock?
* Corner cases: does it include time spent in time.sleep() and while the system 
is suspended?
* etc.

--

When I designed PEP 418 (in 2012), QueryPerformanceCounter() was not reliable:

"It has a much higher resolution, but has lower long term precision than 
GetTickCount() and timeGetTime() clocks. For example, it will drift compared to 
the low precision clocks."
https://www.python.org/dev/peps/pep-0418/#windows-queryperformancecounter

And there were a few bugs like: "The performance counter value may unexpectedly 
leap forward because of a hardware bug".

A Microsoft blog article explains that users wanting a steady clock with 
precision higher than GetTickCount() should interpolate GetTickCount() using 
QueryPerformanceCounter(). If I recall correctly, this is what Firefox did for 
instance.
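
The interpolation idea can be sketched as follows. This is a toy model, not the Firefox or Windows implementation: InterpolatedClock is a hypothetical class, both sources are passed in as callables returning integer nanoseconds, and fake sources are used so the behaviour is reproducible.

```python
class InterpolatedClock:
    """Coarse clock refined by a fine counter between coarse ticks."""

    def __init__(self, coarse, fine, tick):
        self._coarse = coarse          # steady but quantized source
        self._fine = fine              # high-resolution source
        self._tick = tick              # nominal coarse tick length
        self._last_coarse = coarse()
        self._fine_at_tick = fine()
        self._last = self._last_coarse

    def now(self):
        coarse_value = self._coarse()
        fine_value = self._fine()
        if coarse_value != self._last_coarse:
            # A new coarse tick was observed: restart the interpolation.
            self._last_coarse = coarse_value
            self._fine_at_tick = fine_value
        # Never interpolate further than one tick past the coarse value.
        offset = min(fine_value - self._fine_at_tick, self._tick)
        # Clamp so that the result never goes backwards.
        self._last = max(self._last, coarse_value + offset)
        return self._last


TICK_NS = 16_000_000  # roughly one 64 Hz interrupt period, rounded
fake_now = [0]


def fine():
    return fake_now[0]


def coarse():
    return fake_now[0] // TICK_NS * TICK_NS


clock = InterpolatedClock(coarse, fine, TICK_NS)
readings = []
for t in (5_000_000, 17_000_000, 20_000_000):
    fake_now[0] = t
    readings.append(clock.now())
print(readings)
```

Because a tick is only noticed when the clock is read, the interpolated value can lag the fine counter by up to the delay in noticing the tick, but it stays monotonic and anchored to the coarse clock.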

Eryk: "That said, Windows 10 also provides QueryInterruptTimePrecise(), which 
is a hybrid solution. It uses the performance counter to interpolate a 
timestamp between interrupts. I'd prefer to use this for time.monotonic() 
instead of QPC, if it's available via GetProcAddress()."

Oh, good that they provided an implementation for that :-)

--

> V8 uses QueryPerformanceCounter after checking for old CPUs: 
> https://github.com/v8/v8/blob/dc712da548c7fb433caed56af9a021d964952728/src/base/platform/time.cc#L672

It uses CPUID to check for "non stoppable time stamp counter": 
https://github.com/v8/v8/blob/master/src/base/cpu.cc

  // Check if CPU has non stoppable time stamp counter.
  const unsigned parameter_containing_non_stop_time_stamp_counter = 0x80000007;
  if (num_ext_ids >= parameter_containing_non_stop_time_stamp_counter) {
    __cpuid(cpu_info, parameter_containing_non_stop_time_stamp_counter);
    has_non_stop_time_stamp_counter_ = (cpu_info[3] & (1 << 8)) != 0;
  }

Maybe we could use such a check in Python: use GetTickCount() on old CPUs, or 
QueryPerformanceCounter() otherwise. MSVC provides the __cpuid() function:
https://docs.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=msvc-160

--

> Swift originally used QueryPerformanceCounter, but switched to 
> QueryUnbiasedInterruptTime() because they didn't want to count time the 
> system spent asleep

Oh, I recall that it was a tricky question. PEP 418 simply says:
"The behaviour of clocks after a system suspend is not defined in the 
documentation of new functions."

See "Include Sleep" and "Include Suspend" columns of my table:
https://www.python.org/dev/peps/pep-0418/#monotonic-clocks

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-11 Thread Terry J. Reedy


Change by Terry J. Reedy :


--
nosy: +belopolsky, p-ganssle, vstinner
versions:  -Python 3.10, Python 3.8, Python 3.9




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-09 Thread Ryan Hileman


Ryan Hileman  added the comment:

Great information, thanks!

> Windows 10 also provides QueryInterruptTimePrecise(), which is a hybrid 
> solution. It uses the performance counter to interpolate a timestamp between 
> interrupts. I'd prefer to use this for time.monotonic() instead of QPC, if 
> it's available via GetProcAddress()

My personal vote is to use the currently most common clock source (QPC) for now 
for monotonic(), because it's the same across Windows versions and the most 
likely to produce portable monotonic timestamps between apps/languages on the 
same system. It's also the easiest patch, as there's already a code path for 
QPC.

(As someone building multi-app experiences around Python, I don't want to check 
the Windows version to see which time base Python is using. I'd feel better 
about switching to QITP() if/when Python drops Windows 8 support.)

A later extension of this idea (maybe behind a PEP) could be to survey the 
existing timers available on each platform and consider whether it's worth 
extending `time` to expose them all, and unify cross-platform the ones that are 
exposed (e.g. better formalize/document which clocks will advance while the 
machine is asleep on each platform).

--




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-09 Thread Eryk Sun


Eryk Sun  added the comment:

You resolved bpo-41299 using QueryPerformanceCounter(), so we're already a step 
toward making it the default monotonic clock. Personally, I've only relied on 
QPC for short intervals, but, as you've highlighted above, other language 
runtimes use it for their monotonic clock. Since Vista, it's apparently more 
reliable in terms of calibration and ensuring that a processor TSC is only used 
if it's known to be invariant and constant.

That said, Windows 10 also provides QueryInterruptTimePrecise(), which is a 
hybrid solution. It uses the performance counter to interpolate a timestamp 
between interrupts. I'd prefer to use this for time.monotonic() instead of QPC, 
if it's available via GetProcAddress().

QueryInterruptTimePrecise() is about 1.38 times the cost of QPC (on average 
across 100 million calls). Both functions are significantly more expensive than 
QueryInterruptTime() and GetTickCount64(), which simply return a value that's 
read from shared memory (i.e. the KUSER_SHARED_DATA structure).
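
A rough way to compare call costs from Python, assuming timeit overhead is similar for each function. It measures the Python-level wrappers (time.monotonic(), time.perf_counter(), time.time()), not the Win32 calls themselves, so the ratios will differ from the 1.38 figure above; cost_per_call_ns is a hypothetical helper.

```python
import time
import timeit


def cost_per_call_ns(func, calls=200_000):
    """Average wall-clock cost of one call to func, in nanoseconds."""
    seconds = timeit.timeit(func, number=calls)
    return seconds / calls * 1e9


costs = {
    name: cost_per_call_ns(getattr(time, name))
    for name in ("monotonic", "perf_counter", "time")
}

baseline = costs["perf_counter"]
for name, ns in costs.items():
    print(f"time.{name}(): {ns:.1f} ns/call, "
          f"{ns / baseline:.2f}x perf_counter")
```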

> QueryUnbiasedInterruptTime() is available on Windows 8 while 
> QueryInterruptTime() is new as of Windows 10. The "Unbiased" 
> just refers to whether it advances during sleep.

QueryInterruptTime() and QueryUnbiasedInterruptTime() don't provide 
high-resolution timestamps. They're updated by the system timer interrupt 
service routine, which defaults to 64 interrupts/second. The time increment 
depends on when the counter is read by the ISR, but it averages out to 
approximately the interrupt period (e.g. 15.625 ms).
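
The increment can be observed from Python by polling a clock until its value changes. A minimal sketch, assuming a handful of changes is representative; observed_increment is a hypothetical helper, and on a clock with a 64 Hz tick each observed change takes about 15.625 ms (1000 / 64), so the loop is slow there by design.

```python
import time


def observed_increment(clock, changes=20):
    """Smallest step seen between consecutive distinct readings of clock()."""
    smallest = None
    previous = clock()
    seen = 0
    while seen < changes:
        current = clock()
        if current != previous:
            step = current - previous
            if smallest is None or step < smallest:
                smallest = step
            previous = current
            seen += 1
    return smallest


nominal_period_ms = 1000 / 64  # 64 interrupts/second
print(f"nominal interrupt period: {nominal_period_ms} ms")
print(f"time.monotonic_ns() step: {observed_increment(time.monotonic_ns)} ns")
print(f"time.perf_counter_ns() step: {observed_increment(time.perf_counter_ns)} ns")
```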

> I'm not actually sure whether time.monotonic() in Python counts 
> time spent asleep, or whether that's desirable. 

POSIX doesn't specify whether CLOCK_MONOTONIC [1] should include the time that 
elapses while the system is in standby mode. In Linux, CLOCK_BOOTTIME includes 
this time, and CLOCK_MONOTONIC excludes it. Windows 
QueryUnbiasedInterruptTime[Precise]() excludes it.

> Perhaps the long term answer would be to introduce separate 
> "asleep" and "awake" monotonic clocks in Python

Both may not be supportable on all platforms, but they're supported in Linux, 
Windows 10, and macOS. The latter has mach_continuous_time(), which includes 
the time in standby mode, and mach_absolute_time(), which excludes it.
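
On Linux the two clocks can be compared directly from Python. A small sketch, assuming time.CLOCK_BOOTTIME is defined on the running build (the function returns None otherwise); suspended_seconds is a hypothetical helper, and the difference is only an estimate of the time spent suspended since boot.

```python
import time


def suspended_seconds():
    """Estimate time spent suspended as CLOCK_BOOTTIME - CLOCK_MONOTONIC.

    Returns None where the time module does not expose CLOCK_BOOTTIME.
    """
    boottime_id = getattr(time, "CLOCK_BOOTTIME", None)
    if boottime_id is None or not hasattr(time, "clock_gettime"):
        return None
    try:
        awake = time.clock_gettime(time.CLOCK_MONOTONIC)  # excludes suspend
        total = time.clock_gettime(boottime_id)           # includes suspend
    except OSError:
        return None
    return total - awake


estimate = suspended_seconds()
if estimate is None:
    print("CLOCK_BOOTTIME is not available on this platform")
else:
    print(f"approximately {estimate:.3f} s spent suspended since boot")
```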

---
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/clock_gettime.html

--
nosy: +eryksun




[issue44328] time.monotonic() should use a different clock source on Windows

2021-06-06 Thread Ryan Hileman


Ryan Hileman  added the comment:

I found these two references, which suggest QueryPerformanceCounter() may be bad 
because it can drift:
- https://stackoverflow.com/questions/35601880/windows-timing-drift-of-performancecounter-c
- https://bugs.python.org/issue10278#msg143209

However, these posts are fairly old, and the StackOverflow post also says the 
drift is small on newer hardware / Windows.

Microsoft's current stance is that QueryPerformanceCounter() is good: 
https://docs.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps

> Guidance for acquiring time stamps
> Windows has and will continue to invest in providing a reliable and efficient 
> performance counter. When you need time stamps with a resolution of 1 
> microsecond or better and you don't need the time stamps to be synchronized 
> to an external time reference, choose QueryPerformanceCounter

I looked into how a few other languages provide monotonic time on Windows:

Golang seems to read the interrupt time (presumably equivalent to 
QueryInterruptTime) directly by address. 
https://github.com/golang/go/blob/a3868028ac8470d1ab7782614707bb90925e7fe3/src/runtime/sys_windows_amd64.s#L499

Rust uses QueryPerformanceCounter: 
https://github.com/rust-lang/rust/blob/38ec87c1885c62ed8c66320ad24c7e535535e4bd/library/std/src/time.rs#L91

V8 uses QueryPerformanceCounter after checking for old CPUs: 
https://github.com/v8/v8/blob/dc712da548c7fb433caed56af9a021d964952728/src/base/platform/time.cc#L672

Ruby uses QueryPerformanceCounter: 
https://github.com/ruby/ruby/blob/44cff500a0ad565952e84935bc98523c36a91b06/win32/win32.c#L4712

C# implements QueryPerformanceCounter on other platforms using CLOCK_MONOTONIC, 
indicating that they should be roughly equivalent: 
https://github.com/dotnet/runtime/blob/01b7e73cd378145264a7cb7a09365b41ed42b240/src/coreclr/pal/src/misc/time.cpp#L175

Swift originally used QueryPerformanceCounter, but switched to 
QueryUnbiasedInterruptTime() because they didn't want to count time the system 
spent asleep: 
https://github.com/apple/swift-corelibs-libdispatch/commit/766d64719cfdd07f97841092bec596669261a16f

--

Note that none of these languages use GetTickCount64(). Swift is an interesting 
counter point, and I noticed QueryUnbiasedInterruptTime() is available on 
Windows 8 while QueryInterruptTime() is new as of Windows 10. The "Unbiased" 
just refers to whether it advances during sleep.

I'm not actually sure whether time.monotonic() in Python counts time spent 
asleep, or whether that's desirable. Some kinds of timers using monotonic time 
should definitely freeze during sleep so they don't cause a flurry of activity 
on wake, but others definitely need to roughly track wall clock time, even 
during sleep.

Perhaps the long term answer would be to introduce separate "asleep" and 
"awake" monotonic clocks in Python, and possibly deprecate perf_counter() if 
it's redundant after this (as I think it's aliased to monotonic() on 
non-Windows platforms anyway).
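
Whether the two clocks share a source on a given platform can be checked with time.get_clock_info(), which reports the underlying implementation of each clock. A short sketch; same_source is just an illustrative variable name.

```python
import time

# Report the implementation and properties of each clock on this platform.
for name in ("monotonic", "perf_counter", "time"):
    info = time.get_clock_info(name)
    print(f"{name:13} implementation={info.implementation} "
          f"resolution={info.resolution} monotonic={info.monotonic} "
          f"adjustable={info.adjustable}")

# True when monotonic() and perf_counter() are backed by the same function.
same_source = (time.get_clock_info("monotonic").implementation
               == time.get_clock_info("perf_counter").implementation)
print("monotonic and perf_counter share a source:", same_source)
```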

--
title: time.monotonic() should use QueryPerformanceCounter() on Windows -> 
time.monotonic() should use a different clock source on Windows
