[issue6721] Locks in the standard library should be sanitized on fork
Connor Wolf added the comment:

> Python 3.5.1+ (default, Mar 30 2016, 22:46:26)

Whatever the stock 3.5 on ubuntu 16.04 x64 is.

I've actually been running into a whole horde of really bizarre issues related to what I /think/ is locking in stdout.

Basically, I have a context where I have thousands and thousands of (relatively short lived) `multiprocessing.Process()` processes, and over time they all get wedged (basically, I have ~4-32 processes alive at any time, but they all get recycled every few minutes).

After doing some horrible (https://github.com/fake-name/ReadableWebProxy/blob/master/logSetup.py#L21-L78) hackery in the logging module, I'm not seeing processes get wedged there, but I do still encounter issues with what I can only assume is a lock in the print statement.

I'm hooking into a wedged process using [pystuck](https://github.com/alonho/pystuck)

durr@rwpscrape:/media/Storage/Scripts/ReadableWebProxy⟫ pystuck --port 6675
Welcome to the pystuck interactive shell.
Use the 'modules' dictionary to access remote modules (like 'os', or '__main__')
Use the `%show threads` magic to display all thread stack traces.

In [1]: show threads
<_MainThread(MainThread, started 140574012434176)>
  File "runScrape.py", line 74, in <module>
    go()
  File "runScrape.py", line 57, in go
    runner.run()
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 453, in run
    living = sum([manager.check_run_jobs() for manager in managers])
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 453, in <listcomp>
    living = sum([manager.check_run_jobs() for manager in managers])
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 364, in check_run_jobs
    proc.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 267, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 74, in _launch
    code = process_obj._bootstrap()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 145, in run
    run.go()
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 101, in go
    self.log.info("RunInstance starting!")
  File "/usr/lib/python3.5/logging/__init__.py", line 1279, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.5/logging/__init__.py", line 1415, in _log
    self.handle(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 1425, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 1487, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 855, in handle
    self.emit(record)
  File "/media/Storage/Scripts/ReadableWebProxy/logSetup.py", line 134, in emit
    print(outstr)

<Thread(Thread-4, started daemon 140573656733440)>
  File "/usr/lib/python3.5/threading.py", line 882, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.5/dist-packages/rpyc/utils/server.py", line 241, in start
    self.accept()
  File "/usr/local/lib/python3.5/dist-packages/rpyc/utils/server.py", line 128, in accept
    sock, addrinfo = self.listener.accept()
  File "/usr/lib/python3.5/socket.py", line 195, in accept
    fd, addr = self._accept()

<Thread(Thread-5, started daemon 140573665126144)>
  File "/usr/local/lib/python3.5/dist-packages/pystuck/thread_probe.py", line 15, in thread_frame_generator
    yield (thread_, frame)

So, somehow the print() statement is blocking, which I have /no/ idea how to go about debugging. I assume there's a lock /in/ the print statement function call, and I'm probably going to look into wrapping both the print() call and the multiprocessing.Process() call execution in a single, shared multiprocessing lock, but that seems like a very patchwork solution to something that should just work.

--
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue6721] Locks in the standard library should be sanitized on fork
Connor Wolf added the comment: Arrrgh, s/threading/multiprocessing/g in my last message.
[issue6721] Locks in the standard library should be sanitized on fork
Connor Wolf added the comment:

> IMNSHO, working on "fixes" for this issue while ignoring the larger
> application design flaw elephant in the room doesn't make a lot of sense.

I understand the desire for a canonically "correct" fix, but it seems the issue with fixing it "correctly" has led to the /actual/ implementation being broken for at least 6 years now.

As it is, my options are:

A. Rewrite the many, many libraries I use that internally spawn threads.
B. Not use multiprocessing.

(A) is prohibitive from a time perspective (I don't even know how many libraries I'd have to rewrite!), and (B) means I'd get 1/24th of my VM's performance, so it's somewhat prohibitive.

At the moment, I've thrown together a horrible, horrible fix where I reach into the logging library (which is where I'm seeing deadlocks), and manually iterate over all attached log managers, resetting the locks in each immediately when each process spawns. This is, I think it can be agreed, a horrible, horrible hack, but in my particular case it works (the worst case result is garbled console output for a line or two).

---

If a canonical fix is not possible, at least add a facility to the threading fork() call that lets the user decide what to do. In my case, my program is wedging in the logging system, and I am entirely OK with having transiently garbled logs, if it means I don't wind up deadlocking and having to force kill the interpreter (which is, I think, far /more/ destructive an action).

If I could basically do `multiprocessing.Process(*args, **kwargs, _clear_locks=True)`, that would be entirely sufficient, and not change existing behaviour at all.
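The kind of lock reset described above might look something like the sketch below. It is not the code from the linked `logSetup.py`; the function name `reset_logging_locks` is hypothetical. It relies on `logging.Handler.createLock()` to replace each handler's lock and, where available (Python 3.7+ on Unix), on `os.register_at_fork` to run the reset in every forked child.

```python
import logging
import os


def reset_logging_locks():
    """Give every handler attached to a known logger a fresh lock.

    Meant to run in a freshly forked child, where an inherited handler lock
    may be stuck in the "held" state. Returns the number of handlers reset.
    """
    loggers = [logging.getLogger()]
    # loggerDict also holds PlaceHolder objects, which have no handlers.
    loggers += [
        obj
        for obj in list(logging.Logger.manager.loggerDict.values())
        if isinstance(obj, logging.Logger)
    ]
    seen = set()
    for logger in loggers:
        for handler in logger.handlers:
            if id(handler) not in seen:
                seen.add(id(handler))
                handler.createLock()  # replaces handler.lock with a new RLock
    return len(seen)


# Assumption: run the reset automatically in each forked child when the
# platform supports it; otherwise call reset_logging_locks() by hand at the
# top of the child's target function.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=reset_logging_locks)
```

As the message says, the cost of this approach is that a record being written at the moment of the fork may come out garbled.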
[issue6721] Locks in the standard library should be sanitized on fork
Connor Wolf added the comment: Is anything happening with these fixes? This is still an issue (I'm running into it now).

--
nosy: +Connor.Wolf
[issue26105] Python JSON module doesn't actually produce JSON
New submission from Connor Wolf:

The Python JSON library doesn't emit JSON by default.

Basically, `json.dumps(float('nan'))` produces something that kind of looks like JSON, but isn't (specifically, `'NaN'`). Valid JSON would have to use `null` instead.

JSON *does not* allow `NaN`, `Infinity`, or `-Infinity`.

`json.dump[s]` has the parameter `allow_nan`, but it's `True` by default, so basically it's not actually JSON by default. The default for emitting JSON should actually emit JSON. `allow_nan` must be `False` by default.

--
components: Library (Lib)
messages: 258179
nosy: Connor.Wolf
priority: normal
severity: normal
status: open
title: Python JSON module doesn't actually produce JSON
versions: Python 3.4
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26105>
___
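The behaviour reported above can be checked with a few lines. The wrapper name `dumps_strict` is hypothetical; it only pins `allow_nan=False`, which makes `json.dumps` raise `ValueError` on non-finite floats instead of emitting `NaN` or `Infinity`.

```python
import json

# Default behaviour: non-finite floats are written as bare tokens that are
# not part of the JSON grammar.
print(json.dumps(float("nan")))
print(json.dumps(float("inf")))


def dumps_strict(obj, **kwargs):
    """Serialize obj, raising ValueError on NaN/Infinity instead of emitting them."""
    return json.dumps(obj, allow_nan=False, **kwargs)
```

With the strict wrapper, the failure shows up at the point of serialization in Python rather than as a silently dropped message in another language's parser.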
[issue26105] Python JSON module doesn't actually produce JSON
Connor Wolf added the comment:

The problem here is that JSON is *everywhere*, and I only ran into this particular issue after a whole bunch of digging as to why my "JSON" messages were disappearing in some JavaScript.

Basically, with the default the way it is, you have interoperability bombs in every project that uses it to interface with other languages.

In my case, I'm using Flask-SocketIO ( https://github.com/miguelgrinberg/Flask-SocketIO ), which uses JSON as its transport, and it works fine until you have a NaN or infinity in your data, at which point the socket.io in the browser starts *silently* eating messages.

Basically, if I call json.dumps, the principle of least astonishment dictates that you actually get, you know, JSON.

If you have a module called something like `pyson`, and it's partially JSON compatible, that makes sense. For the JSON module to fail at the very thing it's named after is kind of ludicrous.
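One way to defuse the interoperability problem described above, until the data source is fixed, is to replace non-finite floats with `None` before serializing. This is a minimal sketch, not anything Flask-SocketIO provides; `replace_non_finite` and `dumps_for_browser` are hypothetical helper names, and mapping to `null` is a choice that discards the distinction between NaN and the infinities.

```python
import json
import math


def replace_non_finite(value):
    """Return a copy of value with NaN/Infinity floats replaced by None.

    Only dict values, lists, and tuples are walked; dict keys are left as-is.
    """
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: replace_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [replace_non_finite(item) for item in value]
    return value


def dumps_for_browser(obj):
    """Serialize obj to strictly valid JSON, mapping non-finite floats to null."""
    # allow_nan=False acts as a safety net if anything slipped through.
    return json.dumps(replace_non_finite(obj), allow_nan=False)
```

For example, `dumps_for_browser({"a": float("nan"), "b": [1.5, float("inf")]})` yields `{"a": null, "b": [1.5, null]}`, which a standards-compliant parser in the browser accepts.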
[issue22993] Plistlib fails on certain valid plist values
New submission from Connor Wolf:

I'm using plistlib to process plist files produced by an iPhone app. Somehow, the application is generating plist files with an absolute date value along the lines of `-12-30T00:00:00Z`.

This is a valid date, and the Apple plist libraries can handle this without issue. However, it causes a ValueError if you load a plist containing it.

Minimal example:

python file:
```
import plistlib
test = plistlib.readPlist('./test.plist')
```

plist file:
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Test</key>
	<date>-12-30T00:00:00Z</date>
</dict>
</plist>
```

This fails on both python3 and python2, with the exact same error:
```
herp@mainnas:/media/Storage/Scripts/pFail$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    test = plistlib.readPlist('./test.plist')
  File "/usr/lib/python3.4/plistlib.py", line 164, in readPlist
    dict_type=_InternalDict)
  File "/usr/lib/python3.4/plistlib.py", line 995, in load
    return p.parse(fp)
  File "/usr/lib/python3.4/plistlib.py", line 325, in parse
    self.parser.ParseFile(fileobj)
  File "/usr/lib/python3.4/plistlib.py", line 337, in handle_end_element
    handler()
  File "/usr/lib/python3.4/plistlib.py", line 413, in end_date
    self.add_object(_date_from_string(self.get_data()))
  File "/usr/lib/python3.4/plistlib.py", line 291, in _date_from_string
    return datetime.datetime(*lst)
ValueError: year is out of range
herp@mainnas:/media/Storage/Scripts/pFail$ python test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    test = plistlib.readPlist('./test.plist')
  File "/usr/lib/python2.7/plistlib.py", line 78, in readPlist
    rootObject = p.parse(pathOrFile)
  File "/usr/lib/python2.7/plistlib.py", line 406, in parse
    parser.ParseFile(fileobj)
  File "/usr/lib/python2.7/plistlib.py", line 418, in handleEndElement
    handler()
  File "/usr/lib/python2.7/plistlib.py", line 474, in end_date
    self.addObject(_dateFromString(self.getData()))
  File "/usr/lib/python2.7/plistlib.py", line 198, in _dateFromString
    return datetime.datetime(*lst)
ValueError: year is out of range
herp@mainnas:/media/Storage/Scripts/pFail$
```

--
components: Library (Lib)
messages: 232107
nosy: Connor.Wolf
priority: normal
severity: normal
status: open
title: Plistlib fails on certain valid plist values
type: behavior
versions: Python 2.7, Python 3.4
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22993>
___
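The last frame of both tracebacks is `datetime.datetime(*lst)`, so the error comes from `datetime` itself rather than from the XML parsing. The sketch below shows the limit involved without assuming what year the plist actually contained: `datetime` only accepts years from `datetime.MINYEAR` to `datetime.MAXYEAR`. The helper name `try_build_date` is hypothetical.

```python
import datetime


def try_build_date(year, month, day):
    """Return a datetime, or None if any component is outside datetime's range."""
    try:
        return datetime.datetime(year, month, day)
    except ValueError:
        # Raised, for example, when year < datetime.MINYEAR.
        return None


# The supported year range is fixed by the datetime module.
print(datetime.MINYEAR, datetime.MAXYEAR)
```

Any plist date whose year falls outside that range cannot be represented as a `datetime.datetime`, regardless of whether other plist implementations accept it.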
[issue22993] Plistlib fails on certain valid plist values
Connor Wolf added the comment: Aaaand there is no markup processing. How do I edit my report?