[issue6721] Locks in the standard library should be sanitized on fork

2016-07-08 Thread Connor Wolf

Connor Wolf added the comment:

> Python 3.5.1+ (default, Mar 30 2016, 22:46:26)

Whatever the stock 3.5 on Ubuntu 16.04 x64 is.

I've actually been running into a whole horde of really bizarre issues related 
to what I /think/ is locking in stdout. 

Basically, I have a workload that spawns thousands and thousands of (relatively 
short-lived) `multiprocessing.Process()` workers, and over time they all get 
wedged (I have ~4-32 processes alive at any time, but they all get recycled 
every few minutes).
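For context, the failure mode here boils down to fork inheriting held locks. This is a minimal sketch of my own (not the reporter's code, and it assumes a POSIX system where `os.fork()` is available): a worker thread holds a lock at `fork()` time, so the child inherits the lock in the *acquired* state, while the thread that would have released it doesn't exist in the child.

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    # Keep the lock held long enough for the parent to fork meanwhile.
    with lock:
        time.sleep(2)

t = threading.Thread(target=hold_lock, daemon=True)
t.start()
time.sleep(0.2)  # let the worker thread actually acquire the lock

pid = os.fork()
if pid == 0:
    # Child: any attempt to acquire `lock` here would block forever,
    # because the holder thread was not copied by fork().  Report the
    # inherited state via the exit code instead of deadlocking.
    os._exit(1 if lock.locked() else 0)

_, status = os.waitpid(pid, 0)
child_saw_locked = os.WEXITSTATUS(status) == 1
print("child inherited the lock in the acquired state:", child_saw_locked)
```

Any stdlib module that takes a lock around I/O (logging, stream writes) can hit this if another thread is mid-write at the instant of the fork.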

After doing some horrible hackery in the logging module 
(https://github.com/fake-name/ReadableWebProxy/blob/master/logSetup.py#L21-L78), 
I'm no longer seeing processes get wedged there, but I do still hit what I can 
only assume is a lock in the print() call. I'm inspecting a wedged process using 
[pystuck](https://github.com/alonho/pystuck):

durr@rwpscrape:/media/Storage/Scripts/ReadableWebProxy⟫ pystuck --port 6675
Welcome to the pystuck interactive shell.
Use the 'modules' dictionary to access remote modules (like 'os', or '__main__')
Use the `%show threads` magic to display all thread stack traces.

In [1]: show threads
<_MainThread(MainThread, started 140574012434176)>
  File "runScrape.py", line 74, in <module>
    go()
  File "runScrape.py", line 57, in go
    runner.run()
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 453, in run
    living = sum([manager.check_run_jobs() for manager in managers])
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 453, in <listcomp>
    living = sum([manager.check_run_jobs() for manager in managers])
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 364, in check_run_jobs
    proc.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 267, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 74, in _launch
    code = process_obj._bootstrap()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 145, in run
    run.go()
  File "/media/Storage/Scripts/ReadableWebProxy/WebMirror/Runner.py", line 101, in go
    self.log.info("RunInstance starting!")
  File "/usr/lib/python3.5/logging/__init__.py", line 1279, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.5/logging/__init__.py", line 1415, in _log
    self.handle(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 1425, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 1487, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.5/logging/__init__.py", line 855, in handle
    self.emit(record)
  File "/media/Storage/Scripts/ReadableWebProxy/logSetup.py", line 134, in emit
    print(outstr)

<Thread(Thread-4, started daemon 140573656733440)>
  File "/usr/lib/python3.5/threading.py", line 882, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.5/dist-packages/rpyc/utils/server.py", line 241, in start
    self.accept()
  File "/usr/local/lib/python3.5/dist-packages/rpyc/utils/server.py", line 128, in accept
    sock, addrinfo = self.listener.accept()
  File "/usr/lib/python3.5/socket.py", line 195, in accept
    fd, addr = self._accept()

<Thread(Thread-5, started daemon 140573665126144)>
  File "/usr/local/lib/python3.5/dist-packages/pystuck/thread_probe.py", line 15, in thread_frame_generator
    yield (thread_, frame)

So, somehow the print() call is blocking, and I have /no/ idea how to go about 
debugging that. I assume there's a lock /in/ the machinery behind print(), and 
I'm probably going to look into wrapping both the print() call and the 
multiprocessing.Process() startup in a single, shared multiprocessing lock, but 
that seems like a very patchwork solution to something that should just work.
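The workaround described above would look something like this (a sketch of the idea only; `safe_print` and `safe_start` are hypothetical helper names, not anything from the stdlib or the reporter's code):

```python
import multiprocessing

# One lock, created before any fork, so the parent and every child
# share the same underlying semaphore.
_io_lock = multiprocessing.Lock()

def safe_print(*args, **kwargs):
    # Serialize stdout writes across all processes.
    with _io_lock:
        print(*args, **kwargs)

def safe_start(proc):
    # Never fork while the I/O lock is held: a child started here
    # cannot inherit the lock in the acquired state.
    with _io_lock:
        proc.start()
```

The key point is that `proc.start()` and the writes contend on the same lock, so a fork can never happen mid-write; it is still patchwork, because it only protects the locks you know about.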

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6721] Locks in the standard library should be sanitized on fork

2016-07-08 Thread Connor Wolf

Connor Wolf added the comment:

Arrrgh, s/threading/multiprocessing/g in my last message.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
___



[issue6721] Locks in the standard library should be sanitized on fork

2016-07-08 Thread Connor Wolf

Connor Wolf added the comment:

> IMNSHO, working on "fixes" for this issue while ignoring the larger 
> application design flaw elephant in the room doesn't make a lot of sense.

I understand the desire for a canonically "correct" fix, but it seems the 
difficulty of fixing it "correctly" has led to the /actual/ implementation being 
broken for at least 6 years now.

As it is, my options are:
A. Rewrite the many, many libraries I use that internally spawn threads.
B. Not use multiprocessing.

(A) is prohibitive from a time perspective (I don't even know how many 
libraries I'd have to rewrite!), and (B) means I'd get 1/24th of my VM's 
performance, so it's also somewhat prohibitive.

At the moment, I've thrown together a horrible, horrible fix where I reach into 
the logging library (which is where I'm seeing deadlocks) and manually iterate 
over all attached log managers, resetting the locks in each as soon as a 
process spawns. 
This is, I think we can all agree, a horrible, horrible hack, but in my 
particular case it works (the worst-case result is garbled console output for a 
line or two). 
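The hack boils down to something like the following (a simplified reconstruction, not the linked code; `reset_logging_locks` is a name I'm making up here), called in the child immediately after it spawns:

```python
import logging

def reset_logging_locks():
    # Collect every handler attached to the root logger or any named
    # logger, then give each one a brand-new lock via createLock().
    # A lock inherited across fork in the acquired state is simply
    # discarded; the worst case is interleaved/garbled output.
    handlers = set(logging.getLogger().handlers)
    for logger in logging.Logger.manager.loggerDict.values():
        if isinstance(logger, logging.Logger):
            handlers.update(logger.handlers)
    for handler in handlers:
        handler.createLock()
```

`Handler.createLock()` is the stdlib hook that (re)creates the handler's internal lock, which is why this works at all.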

---

If a canonical fix is not possible, at least add a facility to the 
multiprocessing fork() path that lets the user decide what to do. In my case, my 
program is wedging in the logging system, and I am entirely OK with transiently 
garbled logs if it means I don't wind up deadlocking and having to force-kill 
the interpreter (which is, I think, a far /more/ destructive action).

If I could basically do `multiprocessing.Process(*args, **kwargs, 
_clear_locks=True)`, that would be entirely sufficient, and wouldn't change 
existing behaviour at all.
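Such an opt-in could be prototyped today as a `multiprocessing.Process` subclass (a hypothetical sketch; `LockClearingProcess` is not a real API, and this version resets only logging's locks):

```python
import logging
import multiprocessing

class LockClearingProcess(multiprocessing.Process):
    # run() executes in the child process, so inherited locks can be
    # reset here, before any user code touches them.
    def run(self):
        handlers = set(logging.getLogger().handlers)
        for logger in logging.Logger.manager.loggerDict.values():
            if isinstance(logger, logging.Logger):
                handlers.update(logger.handlers)
        for handler in handlers:
            handler.createLock()  # drop the inherited, possibly-held lock
        super().run()
```

Dropping this in as a `Process` replacement leaves existing code paths untouched, which is the point of making it opt-in.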

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
___



[issue6721] Locks in the standard library should be sanitized on fork

2016-07-08 Thread Connor Wolf

Connor Wolf added the comment:

Is anything happening with these fixes? This is still an issue (I'm running 
into it right now).

--
nosy: +Connor.Wolf

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
___



[issue26105] Python JSON module doesn't actually produce JSON

2016-01-13 Thread Connor Wolf

New submission from Connor Wolf:

The Python JSON library doesn't emit valid JSON by default.

Basically, `json.dumps(float('nan'))` produces something that kind of looks 
like JSON, but isn't (specifically, `NaN`). The valid serialization would be 
`null`, which is what JavaScript's `JSON.stringify` emits.

JSON does *not* allow `NaN`, `Infinity`, or `-Infinity`. 

`json.dump[s]` has the parameter `allow_nan`, but it's `True` by default, so 
basically the output is not actually JSON by default.

The default for emitting JSON should actually emit JSON: `allow_nan` should be 
`False` by default.
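The behaviour is easy to confirm:

```python
import json

# Default: allow_nan=True, so the output is JavaScript-flavoured, not JSON.
print(json.dumps({"x": float("nan")}))   # {"x": NaN}

# Strict mode refuses to serialize, rather than emitting invalid JSON.
try:
    json.dumps({"x": float("nan")}, allow_nan=False)
except ValueError as exc:
    print("allow_nan=False raises:", exc)
```

Note that even `allow_nan=False` doesn't produce `null`; it raises, so the caller has to handle non-finite values themselves.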

--
components: Library (Lib)
messages: 258179
nosy: Connor.Wolf
priority: normal
severity: normal
status: open
title: Python JSON module doesn't actually produce JSON
versions: Python 3.4

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26105>
___



[issue26105] Python JSON module doesn't actually produce JSON

2016-01-13 Thread Connor Wolf

Connor Wolf added the comment:

The problem here is that JSON is *everywhere*, and I only ran into this 
particular issue after a whole bunch of digging into why my "JSON" messages 
were disappearing in some JavaScript. 

Basically, with the default the way it is, you have interoperability bombs in 
every project that uses it to interface with other languages. In my case, I'm 
using Flask-SocketIO ( https://github.com/miguelgrinberg/Flask-SocketIO ), 
which uses JSON as its transport, and it works fine until you have a NaN or 
Infinity in your data, at which point the socket.io client in the browser 
starts *silently* eating messages.

Basically, if I call json.dumps, the principle of least astonishment dictates 
that I actually get, you know, JSON.

If you had a module called something like `pyson` that was only partially JSON 
compatible, that would make sense. For the `json` module to fail at the very 
thing it's named after is kind of ludicrous.
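Since `allow_nan=False` raises rather than emitting `null`, matching what `JSON.stringify` does currently takes a manual pre-pass; a workaround sketch (`nan_to_null` is a name I'm inventing for illustration):

```python
import json
import math

def nan_to_null(obj):
    # Recursively replace non-finite floats with None, which json
    # serializes as null -- the same thing JSON.stringify produces.
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {k: nan_to_null(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [nan_to_null(v) for v in obj]
    return obj

print(json.dumps(nan_to_null({"x": float("nan"), "y": 1.0})))
# {"x": null, "y": 1.0}
```

Having to sanitize every payload by hand is exactly the kind of interoperability tax the default pushes onto users.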

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26105>
___



[issue22993] Plistlib fails on certain valid plist values

2014-12-03 Thread Connor Wolf

New submission from Connor Wolf:

I'm using plistlib to process plist files produced by an iPhone app. Somehow, 
the application is generating plist files with an absolute date value along the 
lines of `-12-30T00:00:00Z`.

This is a valid date, and the Apple plist libraries can handle it without 
issue. However, plistlib raises a ValueError if you load a plist containing it.

Minimal example:

python file:  
```
import plistlib
test = plistlib.readPlist('./test.plist')
```

plist file:  
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Test</key>
    <date>-12-30T00:00:00Z</date>
</dict>
</plist>
```

This fails on both Python 3 and Python 2, with the exact same error:

```
herp@mainnas:/media/Storage/Scripts/pFail$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    test = plistlib.readPlist('./test.plist')
  File "/usr/lib/python3.4/plistlib.py", line 164, in readPlist
    dict_type=_InternalDict)
  File "/usr/lib/python3.4/plistlib.py", line 995, in load
    return p.parse(fp)
  File "/usr/lib/python3.4/plistlib.py", line 325, in parse
    self.parser.ParseFile(fileobj)
  File "/usr/lib/python3.4/plistlib.py", line 337, in handle_end_element
    handler()
  File "/usr/lib/python3.4/plistlib.py", line 413, in end_date
    self.add_object(_date_from_string(self.get_data()))
  File "/usr/lib/python3.4/plistlib.py", line 291, in _date_from_string
    return datetime.datetime(*lst)
ValueError: year is out of range
herp@mainnas:/media/Storage/Scripts/pFail$ python test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    test = plistlib.readPlist('./test.plist')
  File "/usr/lib/python2.7/plistlib.py", line 78, in readPlist
    rootObject = p.parse(pathOrFile)
  File "/usr/lib/python2.7/plistlib.py", line 406, in parse
    parser.ParseFile(fileobj)
  File "/usr/lib/python2.7/plistlib.py", line 418, in handleEndElement
    handler()
  File "/usr/lib/python2.7/plistlib.py", line 474, in end_date
    self.addObject(_dateFromString(self.getData()))
  File "/usr/lib/python2.7/plistlib.py", line 198, in _dateFromString
    return datetime.datetime(*lst)
ValueError: year is out of range
herp@mainnas:/media/Storage/Scripts/pFail$
```
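The `ValueError` comes from `datetime` itself: plistlib parses the date fields and passes them to `datetime.datetime`, which only accepts years from `MINYEAR` (1) through `MAXYEAR` (9999):

```python
import datetime

# The representable year range for datetime objects.
print(datetime.MINYEAR, datetime.MAXYEAR)  # 1 9999

# A year of 0 (or a negative year), however valid in a plist,
# cannot be represented, so construction fails:
try:
    datetime.datetime(0, 12, 30)
except ValueError as exc:
    print(exc)
```

So any fix would have to either clamp such dates or represent them with something other than `datetime.datetime`.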

--
components: Library (Lib)
messages: 232107
nosy: Connor.Wolf
priority: normal
severity: normal
status: open
title: Plistlib fails on certain valid plist values
type: behavior
versions: Python 2.7, Python 3.4

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22993>
___



[issue22993] Plistlib fails on certain valid plist values

2014-12-03 Thread Connor Wolf

Connor Wolf added the comment:

Aaaand there is no markup processing. How do I edit my report?

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22993>
___