On 12Aug2014 09:56, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info>
wrote:
Cameron Simpson wrote:
On 12Aug2014 02:07, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info>
wrote:
Is this documented somewhere?
In python/2.7.6/reference/simple_stmts.html#index-22, "print" is described
in terms of a "write" for each object, and a "write" for the separators.
There is no mention of locking.
Ah, thanks!
On that basis, I would find the interleaving described normal and
expected. And certainly not "broken".
I personally didn't describe it as "broken",
Yes, sorry.
but it is, despite the
documentation. I just ran a couple of trials where I collected the output
of sys.stdout while 50 threads blasted "Spam ABCD EFGH" (plus the implicit
newline) to stdout as fast as possible using print. The result was that out
of 248165 lines[1], 595 were mangled. Many of the mangled lines were the
expected simple run-ons:
Spam ABCD EFGHSpam ABCD EFGH\n\n
which makes sense given the documentation, but there were lots of anomalies.
Mysterious spaces appearing in the strings:
Spam ABCD EFGH Spam ABCD EFGH\n\n
Spam ABCD EFGH Spam ABCD EFGH\n Spam ABCD EFGH\n
occasional collisions mid-string:
Spam ABSpam ABCD EFGH\nCD EFGH\n
letters disappearing:
Spam AB\nD EFGH\n
and at least one utterly perplexing (to me) block of ASCII NULs appearing in
the middle of the output:
\x00\x00\x00...\x00\x00\n
This is with Python 2.7.2 on Linux.
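A trial like the one described could be sketched roughly as follows. This is not the script that produced the numbers above. It is written for Python 3 (where print is already a function), it captures output in a temporary file rather than sys.stdout, and the names LINE, blast, run_trial and count_mangled are made up for illustration.

```python
import tempfile
import threading

LINE = "Spam ABCD EFGH"  # the message used in the trial described above


def blast(stream, count):
    """Print LINE to stream `count` times, with no locking at all."""
    for _ in range(count):
        print(LINE, file=stream)


def run_trial(num_threads=50, count=5000):
    """Run the threads against a temporary file and return what was written."""
    with tempfile.TemporaryFile(mode="w+") as stream:
        threads = [threading.Thread(target=blast, args=(stream, count))
                   for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        stream.seek(0)  # rewind (this also flushes pending writes)
        return stream.read()


def count_mangled(text):
    """Return (total_lines, mangled_lines) for the captured output."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    mangled = sum(1 for line in lines if line != LINE)
    return len(lines), mangled
```

Calling count_mangled(run_trial()) gives the two figures to compare against the expected 50 * 5000 lines. Results will likely differ between Python versions and platforms.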
Sounds like print is not thread safe. Which it does not promise to be. But I
would normally expect most file.write methods to be thread safe. Naively.
Just use a lock! And rebind "print"! Or use the logging system!
Personally, I believe that print ought to do its own locking.
I don't, but I kind of believe "file"s should have thread safe write calls.
Again, not guaranteed AFAIR.
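For the logging route suggested above, a minimal sketch might look like this. It relies on the documented behaviour that each logging handler serialises access to its stream with its own lock. The logger name "spam_demo" and the helpers make_logger, blast and run are hypothetical, and a StringIO stands in for stdout so the result can be inspected.

```python
import io
import logging
import threading


def make_logger(stream):
    """Return a logger that writes bare messages to `stream`."""
    logger = logging.getLogger("spam_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the output away from the root logger
    for old in list(logger.handlers):
        logger.removeHandler(old)  # start clean if called more than once
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def blast(logger, count):
    """Log the test message `count` times."""
    for _ in range(count):
        logger.info("Spam ABCD EFGH")


def run(num_threads=8, count=100):
    """Log from several threads at once and return the captured text."""
    stream = io.StringIO()
    logger = make_logger(stream)
    threads = [threading.Thread(target=blast, args=(logger, count))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stream.getvalue()
```

Because the handler holds its lock while it writes each record, every message and its newline should come out as one unit.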
And print is a
statement, although in this case there's no need to support anything older
than 2.6, so something like this ought to work:
    from __future__ import print_function
    import threading

    _print = print
    _rlock = threading.RLock()

    def print(*args, **kwargs):
        # Serialise whole print calls so output from threads cannot interleave.
        with _rlock:
            _print(*args, **kwargs)
Sadly, using print as a function alone isn't enough to fix this problem, but
in my quick tests, using locking as above does fix it, and with no
appreciable slowdown.
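One way to check that claim is sketched below, assuming Python 3. The wrapper is named locked_print here instead of rebinding print, the output goes to a StringIO so it can be inspected, and check is a made-up helper name.

```python
import builtins
import io
import threading

_rlock = threading.RLock()


def locked_print(*args, **kwargs):
    """Call the built-in print while holding a lock."""
    with _rlock:
        builtins.print(*args, **kwargs)


def check(num_threads=8, count=200):
    """Print from several threads; return (total_lines, mangled_lines)."""
    stream = io.StringIO()

    def worker():
        for _ in range(count):
            locked_print("Spam ABCD EFGH", file=stream)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    lines = stream.getvalue().splitlines()
    mangled = sum(1 for line in lines if line != "Spam ABCD EFGH")
    return len(lines), mangled
```

This only holds if every writer to the stream goes through locked_print. Any code that writes to the same stream directly bypasses the lock.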
I would expect file.write to be fast enough that the lock would usually be
free. With no evidence, just personal expectation. Taking a free lock should be
almost instant.
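That expectation is easy to measure. The sketch below times taking and releasing an uncontended RLock with timeit. The names take_lock and per_call_seconds are invented here, and the figure will vary by machine and interpreter.

```python
import threading
import timeit

lock = threading.RLock()


def take_lock():
    """Acquire and release a lock that nobody else is holding."""
    with lock:
        pass


def per_call_seconds(number=100000):
    """Average time in seconds for one uncontended acquire/release pair."""
    return timeit.timeit(take_lock, number=number) / number
```

The result includes the cost of the function call itself, so it is an upper bound on the lock overhead rather than an exact figure.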
[1] Even the number of lines of output demonstrates a bug. I had fifty
threads printing 5000 times each, which makes 250000 lines, not 248165.
Sounds like the file internals are unsafe. Ugh.
Cheers,
Cameron Simpson <c...@zip.com.au>
If it ain't broken, keep playing with it.