[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2021-03-17 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
assignee:  -> docs@python
components: +Documentation -Windows
nosy: +docs@python
type: performance -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2021-03-17 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2021-03-17 Thread Eryk Sun


Eryk Sun  added the comment:

> Given that extra info, I'd say we're fine to document that our timeouts 
> can't do any better than the OS, which "for example, is typically 
> around 15ms on Windows", and recommend using non-blocking calls 
> instead.

The 15.625 ms resolution limit is fine, as long as performance is predictable. 
I don't like the random inconsistency introduced by extending only certain 
waits, in different ways, to support SIGINT and/or waits longer than 49.7 days. 
For example, time.sleep() doesn't ignore WAIT_TIMEOUT to recompute the 
remaining time, so it's not subject to the resolution limit that's imposed by 
GetTickCount64().

I'd prefer a common implementation of _Py_Sleep, _Py_WaitForSingleObject, and 
_Py_WaitForMultiple objects in order to be able to definitively state that all 
wait timeouts are unconditionally limited to the resolution reported by 
time.get_clock_info('monotonic').resolution; are not limited to 49.7 days; and 
can be interrupted by Ctrl+C in the main thread -- except for waiting on I/O. 
(There's an open issue to enable Ctrl+C to cancel synchronous I/O in the main 
thread -- such as reading from a pipe.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2021-03-17 Thread Steve Dower


Steve Dower  added the comment:

Given that extra info, I'd say we're fine to document that our timeouts can't 
do any better than the OS, which "for example, is typically around 15ms on 
Windows", and recommend using non-blocking calls instead.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2021-03-17 Thread Eryk Sun


Eryk Sun  added the comment:

See the related discussion in bpo-41299. The system interrupt period can be 
lowered to 1 ms with timeBeginPeriod() -- or lower with undocumented 
NtSetTimerResolution(). This affects the resolution of dispatcher waits such as 
Sleep() and WaitForSingleObject(), as well as the resolution of 
Query[Unbiased]InterruptTime(). But apparently GetTickCount[64]() has a fixed 
update frequency of 64 ticks/second (15.625 ms period), regardless of the 
system interrupt period. The combination of WaitForSingleObject() with 
GetTickCount64() has marginally better resolution and consistent performance if 
the system interrupt period is lowered to 1 ms.

If a wait needs to be resumable in order to handle SIGINT, and it should 
perform consistently with other waits, then there are two choices. One option 
is to base the timeout on Query[Unbiased]InterruptTime(), for which the 
resolution depends on the system interrupt period. Conversely, if we want all 
waits to perform consistently and independent of the system interrupt period, 
then they should all depend on GetTickCount64(). This could be implemented in 
one place, such as Modules/signalmodule.c, with the addition of _Py_Sleep(), 
Py_WaitForSingleObject(), and _Py_WaitForMultipleObjects(). On the main thread, 
these wait functions would include the SIGINT event and resume the wait if the 
SIGINT handler doesn't raise an exception.

--
versions: +Python 3.10, Python 3.8, Python 3.9 -Python 3.6, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-09-02 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

Another possibility might be to not use the Windows timeout clock, at least not 
for short timeouts.  The following shows that tk does about 970 1 millesecond 
timeouts in 1 second (on my machine).
---
import tkinter as tk

root = tk.Tk()
cbid = None
cbcount = 0

def cb():
global cbid, cbcount
cbcount += 1
cbid = root.after(1, cb)

def cbcancel():
print(cbcount)
root.after_cancel(cbid)

root.after(1000, cbcancel)
cbid = root.after(1, cb)
root.mainloop()
---

Here is a proof-of-concept queue-get-with-timeout function with sub-millesecond 
resolution.
---
import tkinter as tk
from queue import Queue, Empty
from time import perf_counter

q = Queue()
fails = 0
value = ''

def qcb():
try:
global value
value = q.get(block=False)
except Empty:
global fails
fails += 1
if perf_counter() < stop:
root.after(0, qcb)
return
root.destroy()

def qget_to(timeout):
global root, stop
root = tk.Tk()
root.withdraw()
stop = perf_counter() + timeout
qid = root.after(0, qcb)
#q.put(1)
root.mainloop()
print('failures:', fails, 'value:', value)

qget_to(.001)
---

With the put commented out, there are 27 fails (on my machine).  When a value 
is already available, there are none.  The loop could be parameterized to 
repeatedly call the no-block version of any similar function.  Large enough 
time-outs could be partially fulfilled using the system timeout function.

--
nosy: +terry.reedy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-30 Thread Steve Dower


Steve Dower  added the comment:

Agreed with not putting platform-specific details everywhere, but in this case 
we can probably have a generic wording for "Blocks for `timeout` seconds as 
best as is available on the current operating system".

It is a general problem across all timeouts, so someone probably ought to do a 
full pass to find anywhere it matters. I'm not volunteering for that though!

About the only other possible fix is to figure out the current resolution, and 
replace any waits with shorter timeouts with a spin-lock. But in general I 
would rather leave this to the operating system and just document that we can't 
always overrule decisions made by the OS.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread STINNER Victor


STINNER Victor  added the comment:

I dislike the idea of documentation the Windows clocks in the queue 
documentation, it's not the right place. I dislike the idea of documentating 
Windows "internals" in the Python documentation. You should see Python as a 
"thin wrapper" to the operating system. If you have an issue on a platform, you 
should look at the platform documentation. Python already provides 
time.get_clock_info('monotonic').

>From what I read, changing the system clock resolution is a bad idea, since 
>the Windows kernel is not designed for that, and it can reduce the battery 
>life.

About Windows vs Linux, as I wrote, you are looking at the wrong way. 
queue.get() is not a bottleneck if you use threads, asyncio, multiprocessing or 
whatever.

Again, this issue is not a bug. And I dislike the idea of documentating the 
behaviour. So I still consider that the issue should be closed as "not a bug". 
But I leave it open since it seems like you want to use it to discuss Windows 
clocks ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread Eryk Sun


Eryk Sun  added the comment:

Changing the system timer resolution doesn't appear to help, given the current 
design combined with what looks like a bug in Windows (at least in Windows 10). 
We can set the system clock resolution below a millisecond via 
NtSetTimerResolution. I tested at 0.4882 ms, which allowed sleeping for 0.5 ms 
via NtDelayExectution. (NT interval resolution is 100 ns, nominally). But 
Python's threading Lock timeout is still limited to about 15-16 ms. I'd expect 
1 ms resolution given we're ultimately calling WaitForSingleObjectEx via 
PyCOND_TIMEDWAIT -> _PyCOND_WAIT_MS.

It appears the problem is that GetTickCount[64] continues to increment by 
15.625 ms (i.e. by 15 to 16), which looks like a bug to me, but may be a 
compatibility behavior. We use the tick count in EnterNonRecursiveMutex in the 
following block of code:

} else if (milliseconds != 0) {
/* wait at least until the target */
DWORD now, target = GetTickCount() + milliseconds;
while (mutex->locked) {
if (PyCOND_TIMEDWAIT(>cv, >cs, (long 
long)milliseconds*1000) < 0) {
result = WAIT_FAILED;
break;
}
now = GetTickCount();
if (target <= now)
break;
milliseconds = target-now;
}
}

--
nosy: +eryksun

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread Paul Overing


Paul Overing  added the comment:

Thank you to everyone for looking at my issue esp Josh.r for the detailed 
response. I now understand the issue and I just wanted to bring it to the 
group's attention for everyone's benefit.  Vstinner is correct; Linux is my 
solution as it is my production environment but I was using my windows laptop 
for dev as I've been doing frequent Asia trips lately and noticed the 
difference.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Victor, that was a little overboard. By that logic, there doesn't need to be a 
Windows version of Python.

That said, Paul doesn't seem to understand that the real resolution limit isn't 
1 ms; that's the lower limit on arguments to the API, but the real limit is the 
system clock, which has a granularity in the 10-16 ms range. It's a problem 
with Windows in general, and the cure is worse than the disease.

Per 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724411(v=vs.85).aspx 
, the resolution of the system timer is typically in the range of 10 
milliseconds to 16 milliseconds.

Per 
https://docs.microsoft.com/en-us/windows/desktop/Sync/wait-functions#wait-functions-and-time-out-intervals
 :

> Wait Functions and Time-out Intervals

> The accuracy of the specified time-out interval depends on the resolution of 
> the system clock. The system clock "ticks" at a constant rate. If the 
> time-out interval is less than the resolution of the system clock, the wait 
> may time out in less than the specified length of time. If the time-out 
> interval is greater than one tick but less than two, the wait can be anywhere 
> between one and two ticks, and so on.

All the Windows synchronization primitives (e.g. WaitForSingleObjectEx 
https://docs.microsoft.com/en-us/windows/desktop/api/synchapi/nf-synchapi-waitforsingleobjectex
 , which is what ultimately implements timed lock acquisition on Windows) are 
based on the system clock, so without drastic measures, it's impossible to get 
better granularity than the 10-16 ms of the default system clock configuration.

The link on "Wait Functions and Time-out Intervals" does mention that this 
granularity *can* be increased, but it recommends against fine-grained tuning 
(so you can't just tweak it before a wait and undo the tweak after; the only 
safe thing to do is change it on program launch and undo it on program exit). 
Even then, it's a bad idea for Python to use it; per timeBeginPeriod's own docs 
( 
https://docs.microsoft.com/en-us/windows/desktop/api/timeapi/nf-timeapi-timebeginperiod
 ):

> This function affects a global Windows setting. Windows uses the lowest value 
> (that is, highest resolution) requested by any process. Setting a higher 
> resolution can improve the accuracy of time-out intervals in wait functions. 
> However, it can also reduce overall system performance, because the thread 
> scheduler switches tasks more often. High resolutions can also prevent the 
> CPU power management system from entering power-saving modes. Setting a 
> higher resolution does not improve the accuracy of the high-resolution 
> performance counter.

Basically, to improve the resolution of timed lock acquisition, we'd have to 
change the performance profile of the entire OS while Python was running, 
likely increasing power usage and possibly reducing performance. Global 
solutions to local problems are a bad idea.

The most reasonable solution to the problem is to simply document it (maybe not 
for queue.Queue, but for the threading module). Possibly even provide an 
attribute in the threading module similar to  threading.TIMEOUT_MAX that 
reports the system clock's granularity for informational purposes (might need 
to be a function so it reports the potentially changing granularity).

Other, less reasonable solutions, would be:

1. Expose a function (with prominent warnings about not using it in a fine 
grained manner, and the effects on power management and performance) that would 
increase the system clock granularity as much as possible timeGetDevCaps 
reports possible (possibly limited to a user provided suggestion, so while the 
clock could go to 1 ms resolution, the user could request only 5 ms resolution 
to reduce the costs of doing so). Requires some additional state (whether 
timeBeginPeriod has been called, and with what values) so timeEndPeriod can be 
called properly before each adjustment and when Python exits. Pro is the code 
is *relatively* simple and would mostly fix the problem. Cons are that it 
wouldn't be super discoverable (unless we put notes in every place that uses 
timeouts, not just in threading docs), it encourages bad behavior (one 
application deciding its needs are more important that conserving power), and 
we'd have to be *really* careful to pair our calls universally (timeEndPeriod 
mus
 t be called, even when other cleanup is skipped, such as when calling 
os._exit; AFAICT, the docs imply that per-process adjustments to the clock 
aren't undone even when the process completes, which means failure to pair all 
calls would leave the system with a suboptimal system clock resolution that 
would remain in effect until rebooted).

2. (Likely a terrible idea, and like option 1, should be explicitly opt-in, not 
enabled by default) Offer the option to have Python lock timeouts only use 
WaitForSingleObjectEx only to sleep to within one system clock tick of the 
target time (and not 

[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread STINNER Victor


STINNER Victor  added the comment:

>  It makes the windows queue.Queue implementations with timeouts unusable in 
> low latency environments and should be documented as such. As it is slow and 
> highly variable.

Well, it seems like you have a solution which is called Linux, no?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread Paul Overing


Paul Overing  added the comment:

Thank you for your quick response.  But 1ms resolution or not, 14.5ms does not 
seem reasonable given that in Ubuntu it is 0.049ms. Ubuntu is on average 295x 
faster.  

Also the variability in times is much higher in Windows max/avg = 9.547, 
avg/min =  8.69.

It makes the windows queue.Queue implementations with timeouts unusable in low 
latency environments and should be documented as such. As it is slow and highly 
variable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread STINNER Victor


STINNER Victor  added the comment:

> Victor or Steve, do you know the resolution of perf_counter() and monotonic() 
> on Windows?

You can use:

>>> time.get_clock_info('perf_counter')
namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', 
monotonic=True, resolution=1e-09)

perf_counter() has usually a resolution much smaller than 1 ms (ex: 100 ns), 
whereas monotonic() has a bad resolution (around 15.6 ms).

Rounding matters for the monotonic clock on Windows :-) Python should round 
away from zero (ROUND_UP) for timeouts.

--

> I have found that using timeout with a python queue.Queue() results in an 
> average delay of  14.5ms in Windows over Ubuntu in python 3.6.5/3.7, I 
> subtracted the 1 ms timeout.

This behaviour looks correct to me. I suggest to close the issue.

Python rely on Windows API. Queue.get() uses a threading.Lock. On Windows, 
lock.acquire(timeout=timeout) uses EnterNonRecursiveMutex() which expects a 
number of a milliseconds. Even if you succeed to enhance the resolution of the 
system clock, you will still be stuck to a resolution of 1 ms because of the 
Windows API.

I'm not sure of what is your underlying question. Does it look inefficient to 
you? You can use threads or asyncio to write more efficient code. But this is 
out of the scope of the bug tracker :-)

I suggest to close this issue as "not a bug".

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-28 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Victor or Steve, do you know the resolution of perf_counter() and monotonic() 
on Windows?

--
nosy: +rhettinger, vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-28 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-28 Thread Paul Overing


New submission from Paul Overing :

I have found that using timeout with a python queue.Queue() results in an 
average delay of  14.5ms in Windows over Ubuntu in python 3.6.5/3.7, I 
subtracted the 1 ms timeout.

I have also tried datetime.datetime.now() to measure the time, with similar 
results. 

Does that make sense?


import queue
import time

max=0
min=1
num = 1
sum = 0
for x in range(1):
q = queue.Queue()
start = time.perf_counter()
try:
msg = q.get(block=True, timeout=.001)
except:
end = time.perf_counter()

chng = end-start -.001

if chng > max:
max = chng
if chng < min:
min = chng
sum = sum + chng

print(f"Avg: {sum/num}")
print(f"Min: {min}")
print(f"Max: {max}")

Results:

Window 10, Python 3.6.5/3.7
Avg: 0.014549868429997701
Min: 0.00167414200046
Max: 0.1389025150284

Ubuntu 16.04, Python 3.6.5/3.7
Avg: 6.275181290839012e-05
Min: 4.89290034456644e-05
Max: 8.690999695681965e-05

--
components: Windows
messages: 324277
nosy: Gammaguy, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms
type: performance
versions: Python 3.6, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com