[issue40847] New parser considers empty line following a backslash to be a syntax error, old parser didn't
Adam Williamson added the comment: I'm not the best person to ask what I'd "consider" to be a bug or not, to be honest. I'm just a Fedora packaging guy trying to make our packages build with Python 3.9 :) If this is still an important question, I'd suggest asking the folks from the Black issue and PR I linked to, that's the "real world" case if any. -- ___ Python tracker <https://bugs.python.org/issue40847> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40848] compile() can compile a bare starred expression with `PyCF_ONLY_AST` flag with the old parser, but not the new one
Adam Williamson added the comment:

I realized I forgot to give the context, so in case it's important: this comes from the black test suite:

https://github.com/psf/black/issues/1441

That test suite has a file full of expressions that it expects to be able to parse this way (it uses `ast.parse()`, which in turn calls `compile()` with this flag). A bare (*starred) line is part of that file:

https://github.com/psf/black/blob/master/tests/data/expression.py#L149

and has been for as long as black has existed. Presumably, if this isn't going to be fixed, we'll need to adapt this black test file to test a starred expression in a 'valid' way, somehow.

--
Python tracker <https://bugs.python.org/issue40848>
[issue40848] compile() can compile a bare starred expression with `PyCF_ONLY_AST` flag with the old parser, but not the new one
New submission from Adam Williamson:

Not 100% sure this would be considered a bug, but it seems at least worth filing to check. This is a behaviour difference between the new parser and the old one. It's very easy to reproduce:

sh-5.0# PYTHONOLDPARSER=1 python3
Python 3.9.0b1 (default, May 29 2020, 00:00:00)
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from _ast import *
>>> compile("(*starred)", "", "exec", flags=PyCF_ONLY_AST)
>>>
sh-5.0# python3
Python 3.9.0b1 (default, May 29 2020, 00:00:00)
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from _ast import *
>>> compile("(*starred)", "", "exec", flags=PyCF_ONLY_AST)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 1
    (*starred)
    ^
SyntaxError: invalid syntax

That is, you can compile() the expression "(*starred)" with the PyCF_ONLY_AST flag set with the old parser, but not with the new one. Without PyCF_ONLY_AST you get a SyntaxError with both parsers, though with the old parser the error message is "can't use starred expression here", not "invalid syntax".

--
components: Interpreter Core
messages: 370620
nosy: adamwill
priority: normal
severity: normal
status: open
title: compile() can compile a bare starred expression with `PyCF_ONLY_AST` flag with the old parser, but not the new one
versions: Python 3.9

Python tracker <https://bugs.python.org/issue40848>
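A minimal sketch of the kind of check described in this report, using only the standard `ast` module. The helper name `parses_as_ast` is made up for illustration. Whether the bare starred expression is accepted depends on the interpreter version and parser in use, so no output is claimed for it here.

```python
import ast


def parses_as_ast(source):
    """Return True if ast.parse() accepts the source, False on SyntaxError."""
    try:
        # ast.parse() calls compile() with the PyCF_ONLY_AST flag set.
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The bare starred expression from the report; result varies by version.
print(parses_as_ast("(*starred)"))

# A starred expression inside a tuple display is ordinary valid syntax.
print(parses_as_ast("(*starred,)"))
```

Running this under both parsers (where the old one is still selectable) is a quick way to see the difference without going through the REPL.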
[issue40847] New parser considers empty line following a backslash to be a syntax error, old parser didn't
New submission from Adam Williamson:

While debugging issues with the black test suite in Python 3.9, I found one which black upstream says is a CPython issue, so I'm filing it here. Reproduction is very easy. Just use this four-line tester:

print("hello, world")
\

print("hello, world 2")

With that saved as `test.py`, check the results:

sh-5.0# PYTHONOLDPARSER=1 python3 test.py
hello, world
hello, world 2
sh-5.0# python3 test.py
  File "/builddir/build/BUILD/black-19.10b0/test.py", line 3

    ^
SyntaxError: invalid syntax

The reason black has this test (well, a similar test - in black's test, the file *starts* with the backslash then the empty line, but the result is the same) is covered in https://github.com/psf/black/issues/922 and https://github.com/psf/black/pull/948 .

--
components: Interpreter Core
messages: 370618
nosy: adamwill
priority: normal
severity: normal
status: open
title: New parser considers empty line following a backslash to be a syntax error, old parser didn't
type: behavior
versions: Python 3.9

Python tracker <https://bugs.python.org/issue40847>
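The same reproduction can be done without a file on disk. This is only a sketch: `SOURCE` and `compiles` are names invented here, and the result depends on which Python version and parser runs it, so none is shown.

```python
# Line 1 is a statement, line 2 is a lone backslash, line 3 is empty,
# line 4 is another statement: the four-line tester from the report.
SOURCE = 'print("hello, world")\n\\\n\nprint("hello, world 2")\n'


def compiles(source):
    """Return True if compile() accepts the source, False on SyntaxError."""
    try:
        compile(source, "test.py", "exec")
    except SyntaxError as exc:
        print("SyntaxError reported at line", exc.lineno)
        return False
    return True


print(compiles(SOURCE))
```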
[issue37951] Disallow fork in a subinterpreter broke subprocesses in mod_wsgi daemon mode
Adam Williamson added the comment:

It's this function:

https://github.com/freeipa/freeipa/blob/master/ipalib/install/kinit.py#L66

The function `run` is imported from `ipapython.ipautil`; it's defined here:

https://github.com/freeipa/freeipa/blob/master/ipapython/ipautil.py#L391

All of this is being run inside a WSGI application.

--
Python tracker <https://bugs.python.org/issue37951>
[issue37951] Disallow fork in a subinterpreter broke subprocesses in mod_wsgi daemon mode
Adam Williamson added the comment:

Well, now our (Fedora QA's) automated testing of FreeIPA is showing what looks like a problem with preexec_fn (rather than fork) being disallowed:

https://bugzilla.redhat.com/show_bug.cgi?id=1759290

Login to the FreeIPA webUI is failing, and at the time it fails we see this error message on the server end:

[Mon Oct 07 09:22:19.521604 2019] [wsgi:error] [pid 32989:tid 139746234119936] [remote 10.0.2.102:56054] ipa: DEBUG: args=['/usr/bin/kinit', 'admin', '-c', '/run/ipa/ccaches/kinit_32989', '-E']
[Mon Oct 07 09:22:19.521996 2019] [wsgi:error] [pid 32989:tid 139746234119936] [remote 10.0.2.102:56054] ipa: DEBUG: Process execution failed
[Mon Oct 07 09:22:19.522189 2019] [wsgi:error] [pid 32989:tid 139746234119936] [remote 10.0.2.102:56054] ipa: INFO: 401 Unauthorized: preexec_fn not supported within subinterpreters

--
nosy: +adamwill
status: pending -> open

Python tracker <https://bugs.python.org/issue37951>
[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes
Adam Williamson <awill...@redhat.com> added the comment:

Yeah, I've added a comment there. I agree we can keep subsequent discussion in that issue. Closing this as a dupe.

I actually have the same thought as you, but I suspect making something that "worked" before start throwing an error might be a hard sell for some. Perhaps at least some kind of warning?

--
resolution: -> duplicate
stage: -> resolved
status: open -> closed

Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32988>
[issue12750] add cross-platform support for %s strftime-format code
Adam Williamson <awill...@redhat.com> added the comment:

On the "attractive nuisance" angle: I just ran right into this problem, and reported https://bugs.python.org/issue32988 . As I suggested there, if Python doesn't try to fix this, I'd suggest it should at least *explicitly document* that using %s is unsupported and dangerous in more than one way (might not work on all platforms, does not do what it should for 'aware' datetimes on platforms where it *does* work).

I think explicitly telling people NOT to use it would be better than just not mentioning it. At least for me, when I saw real code using it and that the docs just didn't mention it, my initial thought was "I guess it must be OK, and the docs just missed it out for some reason". If I'd gone to the docs and seen an explicit note that it's not supported and doesn't work right, that would've been much clearer and I wouldn't have had to figure that out for myself :)

For Python 2, btw, the arrow library might be a suitable alternative to suggest: you can do something like this, assuming you have an aware datetime object called 'awaredate' you want to get the timestamp for:

import arrow
ts = arrow.get(awaredate).timestamp

and it does the right thing.

--
nosy: +adamwill

Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue12750>
[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes
Adam Williamson <awill...@redhat.com> added the comment:

I'd suggest that if that is the case, it would be better for the docs to *specifically mention* that `%s` is not supported and should not be used, rather than simply not mentioning it.

When it's used in real code (note someone in the SO issue mentions "I have been going crazy trying to figure out why i see strftime("%s") a lot, yet it's not in the docs") and just *not mentioned* in the docs, this tends to give the impression that it's something usable that was perhaps just forgotten from the docs, or something. The situation would be much clearer if the docs said "DO NOT USE THIS, IT'S DANGEROUS AND DOESN'T DO WHAT YOU THINK" in big letters. (And suggested using .timestamp() on Python 3.3+, and possibly arrow's .timestamp on 2.7?)

--
Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32988>
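For reference, a short sketch of the `.timestamp()` route on Python 3.3+, using the same instant as the original report (2017-01-01 00:00 UTC). The variable names are illustrative only, and the fixed -6 hour offset is just a stand-in for a non-UTC zone.

```python
from datetime import datetime, timedelta, timezone

# The aware datetime from the report: midnight UTC on 2017-01-01.
aware = datetime(2017, 1, 1, tzinfo=timezone.utc)
print(int(aware.timestamp()))  # 1483228800

# The same instant expressed at a fixed UTC-6 offset has the same
# timestamp, because timestamp() honours tzinfo on aware objects.
shifted = aware.astimezone(timezone(timedelta(hours=-6)))
print(int(shifted.timestamp()))  # 1483228800
```

Unlike `strftime('%s')`, the result here does not change with the system's local timezone.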
[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes
Adam Williamson <awill...@redhat.com> added the comment:

Paul: right. This is on Linux - specifically Fedora Linux, but I don't think it matters. glibc strftime and strptime depend on an underlying struct called 'tm'. 'man strftime' says:

%s The number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC). (TZ) (Calculated from mktime(tm).)

And 'man mktime' says:

"The mktime() function converts a broken-down time structure, expressed as local time, to calendar time representation. ... On success, mktime() returns the calendar time (seconds since the Epoch), expressed as a value of type time_t."

I am finding it hard to determine whether various C standards require the tm struct and mktime and strftime and so on to handle timezones, but I'm sort of inclining to the answer that "no they don't".

Basically I suspect what's going on in this case is that the timezone information gets lost somewhere in the chain down from Python to system strftime to system mktime, and Python doesn't make any adjustment to the actual date / time values before calling system strftime to try and account for this.

I think Python must do *something* more than purely converting to a tm and calling system strftime, though, as %Z does work, which it wouldn't if Python was purely converting to a non-timezone-aware tm struct and calling system strftime, I don't think...

--
Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32988>
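The local-time behaviour of mktime() quoted above can be seen from Python directly. This is a rough sketch, not a description of what CPython's strftime does internally: it only contrasts `time.mktime()`, which reads a broken-down time as local time, with `calendar.timegm()`, which reads it as UTC. The variable names are made up for the example.

```python
import calendar
import time
from datetime import datetime, timezone

aware = datetime(2017, 1, 1, tzinfo=timezone.utc)

# timegm() interprets the broken-down time as UTC, so the result is the
# same on every machine.
utc_seconds = calendar.timegm(aware.utctimetuple())
print(utc_seconds)  # 1483228800

# mktime() interprets the same fields as *local* time, so this value
# shifts with the system timezone, like the strftime('%s') output in
# the report.
local_seconds = int(time.mktime(aware.timetuple()))
print(local_seconds - utc_seconds)  # seconds of skew; 0 only if local time is UTC
```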
[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes
New submission from Adam Williamson <awill...@redhat.com>:

Test script:

import pytz
import datetime
utc = pytz.timezone('UTC')
print(datetime.datetime(2017, 1, 1, tzinfo=utc).strftime('%s'))

Try running it with various system timezones:

[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='UTC' python /tmp/test2.py
1483228800
[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='America/Winnipeg' python /tmp/test2.py
1483250400
[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='America/Vancouver' python /tmp/test2.py
1483257600

That's Python 2.7.14; same results with Python 3.6.4.

This does not seem correct. The correct Unix time for an aware datetime object should be a constant: for 2017-01-01 00:00 UTC it *is* 1483228800 . No matter what the system's local timezone, that should be the output of strftime('%s'), surely. What it seems to be doing instead is just outputting the Unix time for 2017-01-01 00:00 in the system timezone.

I *do* note that strftime('%s') is completely undocumented in Python; neither https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior nor https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior mentions it. However, it does exist, and is used in the real world; I found this usage of it, and the bug, in a real project, Pagure.

--
components: Library (Lib)
messages: 313169
nosy: adamwill
priority: normal
severity: normal
status: open
title: datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes
versions: Python 2.7, Python 3.6

Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32988>
[issue30062] datetime in Python 3.6+ no longer respects 'TZ' environment variable
Adam Williamson added the comment:

Hmm, after a bit more poking I found this:

https://docs.python.org/3/library/time.html#time.tzset

"Note: Although in many cases, changing the TZ environment variable may affect the output of functions like localtime() without calling tzset(), this behavior should not be relied on."

It seems like that's kinda what we're dealing with here. If I extend my tests to change TZ, call the test function, then call `time.tzset()` and call the test function again, the *second* call to the test function gives the different result, i.e. the `time.tzset()` call does what it claims and changes Python's conception of the 'current' timezone.

So while that note has been there all along, it seems like the behaviour actually changed between 3.5 and 3.6, and a change to 'TZ' is now less likely to be respected without a `tzset()` call. But given the doc note, perhaps that can't be considered a bug.

anaconda doesn't call `time.tzset()` anywhere at present. It's also multi-threaded, so making sure all the threads call `time.tzset()` after any thread has changed what the 'current' timezone is will be lots of fun to implement, I guess :/

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30062>
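A small sketch of the change-TZ-then-tzset() pattern described above. It assumes a Unix platform (where `time.tzset()` exists), and it uses POSIX-style TZ strings (`UTC0`, `EST5`) rather than zone names like 'America/Winnipeg' so that it does not depend on an installed timezone database. The helper name `epoch_wall_clock` is invented for this example.

```python
import os
import time
from datetime import datetime


def epoch_wall_clock(tz):
    """Return the naive local datetime of timestamp 0 with TZ set to tz."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()  # Unix only: make the C library re-read TZ
    try:
        return datetime.fromtimestamp(0)
    finally:
        # Put the process back the way we found it.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


print(epoch_wall_clock("UTC0"))  # 1970-01-01 00:00:00
print(epoch_wall_clock("EST5"))  # 1969-12-31 19:00:00
```

Note that TZ and the tzset() state are process-wide, so in a multi-threaded program like the one described, a helper of this shape would still need some coordination between threads.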
[issue30062] datetime in Python 3.6+ no longer respects 'TZ' environment variable
New submission from Adam Williamson:

I can't figure out yet why this is, but it's very easy to demonstrate:

[adamw@adam anaconda (time-log %)]$ python35
Python 3.5.2 (default, Feb 11 2017, 18:09:24)
[GCC 7.0.1 20170209 (Red Hat 7.0.1-0.7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import datetime
>>> os.environ['TZ'] = 'America/Winnipeg'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 18, 0)
>>> os.environ['TZ'] = 'Europe/London'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1970, 1, 1, 1, 0)
>>>
[adamw@adam anaconda (time-log %)]$ python3
Python 3.6.0 (default, Mar 21 2017, 17:30:34)
[GCC 7.0.1 20170225 (Red Hat 7.0.1-0.10)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import datetime
>>> os.environ['TZ'] = 'America/Winnipeg'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 16, 0)
>>> os.environ['TZ'] = 'Europe/London'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 16, 0)
>>>

That is, when deciding what timezone to use for operations that involve one, if the 'TZ' environment variable was set, Python 3.5 would use the timezone it was set to. Python 3.6 does not, it ignores it. As you can see, if I twiddle the 'TZ' setting and call `datetime.datetime.fromtimestamp(0)` repeatedly under Python 3.5, I get different results - each one is the wall clock time at the epoch (timestamp 0) in the timezone specified as 'TZ'. If I do the same on Python 3.6, the 'TZ' setting is ignored and I always get the same result (the wall clock time of 'the epoch' in Vancouver, which is my real timezone, and which I guess is being picked up from /etc/localtime or whatever).

This wound up causing a problem in the Fedora / Red Hat installer, anaconda:

https://bugzilla.redhat.com/show_bug.cgi?id=1433560

The 'current time zone' can be changed in anaconda. Shortly after it starts up, it automatically tries to guess the correct time zone via geolocation, and the user can also explicitly choose a timezone in the installer interface (or set one in a kickstart). Whenever the timezone is set in this way, an underlying library (libtimezonemap - https://launchpad.net/timezonemap) sets 'TZ' to the chosen timezone. It turns out other code in anaconda relies on Python respecting that setting, which Python 3.6 does not do. As a consequence, anaconda with Python 3.6 winds up setting the system time incorrectly. Also, the timestamps on all its log files are different now, and there may well be other consequences I didn't figure out yet.

The same applies to, e.g., `datetime.datetime.now()`: you can perform the same experiment with Python 3.5 and 3.6. If you change the 'TZ' env var while calling `datetime.datetime.now()` after each change, on Python 3.5, the naive datetime object it returns is the current time *in that timezone*. On Python 3.6, regardless of what 'TZ' is set to, it always gives you the same time.

Is this an intended and/or desired change that we should adjust to somehow? Is there another way a running Python process can change what "the current" timezone is, for the purposes of datetime calculations like this?

--
components: Library (Lib)
messages: 291574
nosy: adamwill
priority: normal
severity: normal
status: open
title: datetime in Python 3.6+ no longer respects 'TZ' environment variable
versions: Python 3.6, Python 3.7

Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue30062>
[issue29113] modulefinder no longer finds all required modules for Python itself, due to use of __import__ in sysconfig
New submission from Adam Williamson:

I'm not sure if this is really considered a bug or just an unavoidable limitation, but as it involves part of the stdlib operating on Python itself, I figured it was at least worth reporting.

In Fedora we have a fairly simple little script called python-deps:

https://github.com/rhinstaller/anaconda/blob/master/dracut/python-deps

which is used to figure out the dependencies of a couple of Python scripts used in the installer's initramfs environment, so the necessary bits of Python (but not the rest of it) can be included in the installer's initramfs.

Unfortunately, with Python 3.6, this seems to be broken for the core of Python itself, because of this change:

https://github.com/python/cpython/commit/a6431f2c8cf4783c2fd522b2f6ee04c3c204237f

which changed sysconfig.py from doing "from _sysconfigdata import build_time_vars" to using __import__ . I *think* that modulefinder can't cope with this use of __import__ and so misses that sysconfig requires "_sysconfigdata_m_linux_x86_64-linux-gnu" (or whatever the actual name is on your particular platform and arch).

This results in us not including the platform-specific module in the installer initramfs, so Python blows up on startup when the 'site' module tries to import the 'sysconfig' module.

We could work around this one way or another in the python-deps script, but I figured the issue was at least worth an upstream report to see if it's considered a significant issue or not.

You can reproduce the problem quite trivially by writing a test script which just does, e.g., "import site", and then running the example code from the ModuleFinder docs on it:

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('test.py')

print('Loaded modules:')
for name, mod in finder.modules.items():
    print('%s: ' % name, end='')
    print(','.join(list(mod.globalnames.keys())[:3]))

If you examine the output, you'll see that the 'sysconfig' module is included, but the platform-specific module is not.

--
components: Library (Lib)
messages: 284304
nosy: adamwill
priority: normal
severity: normal
status: open
title: modulefinder no longer finds all required modules for Python itself, due to use of __import__ in sysconfig
versions: Python 3.6

Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29113>
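The limitation can also be shown in isolation, without involving sysconfig. The sketch below is a self-contained illustration: the script contents and the `modules_found` helper are invented for the example, and `colorsys` and `fractions` are just two arbitrary stdlib modules. ModuleFinder works from import statements in the compiled bytecode, so a module loaded through a call to `__import__()` does not show up.

```python
import os
import tempfile
from modulefinder import ModuleFinder

SCRIPT = (
    "import colorsys\n"          # ordinary import statement
    "name = 'fractions'\n"
    "mod = __import__(name)\n"   # dynamic import through a function call
)


def modules_found(source):
    """Run ModuleFinder on the given source and return the module names."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.py")
        with open(path, "w") as handle:
            handle.write(source)
        finder = ModuleFinder()
        finder.run_script(path)
        return set(finder.modules)


found = modules_found(SCRIPT)
print("colorsys" in found)   # True: found via the import statement
print("fractions" in found)  # False: the __import__() call is not seen
```

A dependency script that relies on ModuleFinder would therefore need to add such dynamically imported modules by hand.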
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

https://github.com/tiran/defusedxml/pull/4 should fix this, I hope.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

https://paste.fedoraproject.org/511245/14824393/ is my cut at a fix for this, gonna test it out now.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

Digging some more, it looks like *only* Python 3.3 went so far out of its way to hide the pure-Python iterparse() - the code was changed again in 3.4 and it doesn't do that any more. So I think a way forward here is to make the code that uses _IterParseIterator specific to Python 3.3, and use the Python 2.7 code (i.e. just use the iterparse() function) for 3.2 and 3.4+.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

Aha, so thanks to my colleague Patrick Uiterwijk, we see the problem. Since Python 3.3, Python doesn't actually use that pure-Python iterparse() function if it can instead replace it with a C version:

https://github.com/python/cpython/blob/3.3/Lib/xml/etree/ElementTree.py#L1705

"# Overwrite 'ElementTree.parse' and 'iterparse' to use the C XMLParser"

So the reason defusedxml wants to use _IterParseIterator on Python 3 is that if it just uses xml.etree.ElementTree.iterparse(), it's getting the 'accelerated' C implementation, not the pure-Python implementation it wants.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

serhiy: so, the funny thing is this: your fix is ultimately a reversion. Though we have to dig way back into the bowels of defusedxml to see this. Specifically, to this commit!

https://github.com/tiran/defusedxml/commit/03d4fc6cf246a209c2cf892b33f5b6cf5af4ecbd

That's the point at which Christian introduced a divergence between Python 2 and Python 3 here, and essentially the same divergence remains between the `elif PY3:` and `else: # Python 2.7` blocks now. The Python 2.7 block in current defusedxml is in fact the same as your block, because `_iterparse` is just the parent `iterparse` function, as discovered by `_get_py3_cls()`.

So before applying your change, I kinda want to understand why Christian introduced this divergence in the first place. The commit message claims it's because Python 3.3 hid some pure python, but I don't quite understand that:

https://github.com/python/cpython/blob/3.3/Lib/xml/etree/ElementTree.py

looks like iterparse() was still perfectly available and usable for this purpose in 3.3, just as it was in 3.2 and still appears to be in 3.6.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
Adam Williamson added the comment:

Ammar: yep, that's correct. There's code in defused's ElementTree.py - _get_py3_cls() - which passes different values to _generate_etree_functions based on the Python 3 version. For Python 3.2+, defused 0.4.1 expects to use the _IterParseIterator class from xml.etree.ElementTree, but that got removed in 3.6, so if you just use defused 0.4.1 with Python 3.6, it fails as soon as you try to import defusedxml.ElementTree at all:

>>> import defusedxml.ElementTree
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/defusedxml-0.4.1/defusedxml/ElementTree.py", line 62, in <module>
    _XMLParser, _iterparse, _IterParseIterator, ParseError = _get_py3_cls()
  File "/tmp/defusedxml-0.4.1/defusedxml/ElementTree.py", line 56, in _get_py3_cls
    _IterParseIterator = pure_pymod._IterParseIterator
AttributeError: module 'xml.etree.ElementTree' has no attribute '_IterParseIterator'

Christian made a change to make _get_py3_cls() pass None to _generate_etree_functions() so you can at least import defusedxml.ElementTree, but he didn't change _generate_etree_functions() itself, so it has no code path that handles this; for Python 3.2+ it's expecting to get a real iterator, not None, and it just breaks completely trying to use None as an iterator:

sh-4.3# echo "" > test.xml
sh-4.3# python3
>>> import defusedxml.ElementTree
>>> parser = defusedxml.ElementTree.iterparse('test.xml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/defusedxml-0.4.1/defusedxml/common.py", line 141, in iterparse
    return _IterParseIterator(source, events, parser, close_source)
TypeError: 'NoneType' object is not callable

Serhiy, thanks for the suggestion! We'll try that out.

--
Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>
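To illustrate the failure mode in general terms (this is not defusedxml's code, and it provides none of defusedxml's protections), here is a sketch of looking up a private stdlib attribute defensively and falling back to the public `iterparse()` function. The variable names are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Private names can disappear between releases, so look the class up
# with a default instead of assuming it exists.
private_iterator = getattr(ET, "_IterParseIterator", None)
print("private iterator available:", private_iterator is not None)

# The public iterparse() function is the documented entry point; by
# default it reports an 'end' event for each element as it closes.
events = ET.iterparse(io.BytesIO(b"<root><child/></root>"))
tags = [elem.tag for _event, elem in events]
print(tags)  # ['child', 'root']
```

Code that receives `None` from such a lookup needs an explicit branch for that case, which is the piece missing in the second traceback above.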
[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
New submission from Adam Williamson:

The changes made to xml.etree.ElementTree in this commit:

https://github.com/python/cpython/commit/12a626fae80a57752ccd91ad25b5a283e18154ec

break defusedxml, Christian Heimes' library of modified parsers that's intended to be safe for parsing untrusted input. As of now, it's not possible to have defusedxml working properly with Python 3.6; its ElementTree parsers cannot work properly.

Of course, defusedxml is an external library that does 'inappropriate' things (like fiddling around with internals of the xml library). So usually this should be considered just a problem for defusedxml to deal with somehow, and indeed I've reported it there:

https://github.com/tiran/defusedxml/issues/3

That report has more details on the precise problem.

I thought it was worthwhile reporting to Python itself as well, however, for a specific reason. The Python docs for the xml library explicitly cover and endorse the use of defusedxml:

"defusedxml is a pure Python package with modified subclasses of all stdlib XML parsers that prevent any potentially malicious operation. Use of this package is recommended for any server code that parses untrusted XML data." - https://docs.python.org/3.6/library/xml.html#the-defusedxml-and-defusedexpat-packages

so as things stand, the Python 3.6 docs will explicitly recommend people use a module which does not work with Python 3.6. Is this considered a serious problem?

It also looks to me (though I'm hardly an expert) as if it might be quite difficult and ugly to fix this on the defusedxml side, and the 'nicest' fix might actually be to tweak Python's xml module back a bit more to how it was in < 3.6 (but without losing the optimization from the commit in question) so it's easier for defusedxml to get at the internals it needs...but I could well be wrong about that.

Thanks!

--
components: XML
messages: 283854
nosy: adamwill
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
type: behavior
versions: Python 3.6

Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29050>