[issue40847] New parser considers empty line following a backslash to be a syntax error, old parser didn't

2020-06-08 Thread Adam Williamson


Adam Williamson  added the comment:

I'm not the best person to ask what I'd "consider" to be a bug or not, to be 
honest. I'm just a Fedora packaging guy trying to make our packages build with 
Python 3.9 :) If this is still an important question, I'd suggest asking the 
folks from the Black issue and PR I linked to, that's the "real world" case if 
any.

--

___
Python tracker 
<https://bugs.python.org/issue40847>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40848] compile() can compile a bare starred expression with `PyCF_ONLY_AST` flag with the old parser, but not the new one

2020-06-02 Thread Adam Williamson


Adam Williamson  added the comment:

Realized I forgot to give it, so in case it's important, the context here is 
the black test suite:

https://github.com/psf/black/issues/1441

that test suite has a file full of expressions that it expects to be able to 
parse this way (it uses `ast.parse()`, which in turn calls `compile()` with 
this flag). A bare (*starred) line is part of that file:

https://github.com/psf/black/blob/master/tests/data/expression.py#L149

and has been for as long as black has existed. Presumably if this isn't going 
to be fixed we'll need to adapt this black test file to test a starred 
expression in a 'valid' way, somehow.
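
One possible way to adapt such a test line is sketched below. The candidate lines are illustrative guesses rather than what black actually adopted, and `parses` is a hypothetical helper; each candidate simply places a starred expression in a context the grammar accepts.

```python
import ast

# Hypothetical replacements for a bare "(*starred)" line; each one uses the
# starred expression in a context that the Python grammar allows.
CANDIDATES = [
    "print(*starred)",       # argument unpacking in a call
    "[*starred]",            # unpacking into a list display
    "(*starred,)",           # unpacking into a tuple (note the trailing comma)
    "first, *rest = items",  # starred assignment target
]


def parses(source):
    """Return True if ast.parse() accepts the source, False on SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    for source in CANDIDATES:
        print(source, "->", parses(source))
```

Only parsing is checked here, so none of the names have to exist at run time.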

--

___
Python tracker 
<https://bugs.python.org/issue40848>
___



[issue40848] compile() can compile a bare starred expression with `PyCF_ONLY_AST` flag with the old parser, but not the new one

2020-06-02 Thread Adam Williamson


New submission from Adam Williamson :

Not 100% sure this would be considered a bug, but it seems at least worth 
filing to check. This is a behaviour difference between the new parser and the 
old one. It's very easy to reproduce:

 sh-5.0# PYTHONOLDPARSER=1 python3
Python 3.9.0b1 (default, May 29 2020, 00:00:00) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from _ast import *
>>> compile("(*starred)", "", "exec", flags=PyCF_ONLY_AST)

>>> 
 sh-5.0# python3
Python 3.9.0b1 (default, May 29 2020, 00:00:00) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from _ast import *
>>> compile("(*starred)", "", "exec", flags=PyCF_ONLY_AST)
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 1
(*starred)
  ^
SyntaxError: invalid syntax

That is, you can compile() the expression "(*starred)" with PyCF_ONLY_AST flag 
set with the old parser, but not with the new one. Without PyCF_ONLY_AST you 
get a SyntaxError with both parsers, though with the old parser the error 
message is "can't use starred expression here", not "invalid syntax".
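
The same check can be scripted rather than typed at the prompt. This is a minimal sketch that goes through `ast.parse()` (which passes the `PyCF_ONLY_AST` flag to `compile()`); `describe_parse` is a hypothetical helper name, and the result depends on the interpreter version and parser in use, so no particular output is claimed.

```python
import ast
import sys


def describe_parse(source):
    """Return 'parsed' or the SyntaxError message for the given source."""
    try:
        ast.parse(source)  # compile() with the PyCF_ONLY_AST flag set
    except SyntaxError as exc:
        return "SyntaxError: %s" % exc.msg
    return "parsed"


if __name__ == "__main__":
    # Print the interpreter version next to the outcome so that runs under
    # different interpreters (or parser settings) can be compared.
    print(sys.version.split()[0], describe_parse("(*starred)"))
```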

--
components: Interpreter Core
messages: 370620
nosy: adamwill
priority: normal
severity: normal
status: open
title: compile() can compile a bare starred expression with `PyCF_ONLY_AST` 
flag with the old parser, but not the new one
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue40848>
___



[issue40847] New parser considers empty line following a backslash to be a syntax error, old parser didn't

2020-06-02 Thread Adam Williamson


New submission from Adam Williamson :

While debugging issues with the black test suite in Python 3.9, I found one 
which black upstream says is a CPython issue, so I'm filing it here.

Reproduction is very easy. Just use this four-line tester:

print("hello, world")
\

print("hello, world 2")

with that saved as `test.py`, check the results:

 sh-5.0# PYTHONOLDPARSER=1 python3 test.py
hello, world
hello, world 2
 sh-5.0# python3 test.py
  File "/builddir/build/BUILD/black-19.10b0/test.py", line 3

^
SyntaxError: invalid syntax

The reason black has this test (well, a similar test - in black's test, the 
file *starts* with the backslash then the empty line, but the result is the 
same) is covered in https://github.com/psf/black/issues/922 and 
https://github.com/psf/black/pull/948 .
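
For comparing interpreters without creating a file, the four-line tester can be fed to `compile()` directly. This is only a sketch: `compiles` is a hypothetical helper, and whether the source is accepted depends on the Python version and parser, which is the point of the comparison.

```python
# The four-line tester from the report: a print, a line holding only a
# backslash, an empty line, and a second print.
SOURCE = 'print("hello, world")\n\\\n\nprint("hello, world 2")\n'


def compiles(source):
    """Return True if compile() accepts the source, False on SyntaxError."""
    try:
        compile(source, "test.py", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print("backslash followed by empty line compiles:", compiles(SOURCE))
```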

--
components: Interpreter Core
messages: 370618
nosy: adamwill
priority: normal
severity: normal
status: open
title: New parser considers empty line following a backslash to be a syntax 
error, old parser didn't
type: behavior
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue40847>
___



[issue37951] Disallow fork in a subinterpreter broke subprocesses in mod_wsgi daemon mode

2019-10-07 Thread Adam Williamson


Adam Williamson  added the comment:

It's this function:

https://github.com/freeipa/freeipa/blob/master/ipalib/install/kinit.py#L66

The function `run` is imported from `ipapython.ipautil`, it's defined here:

https://github.com/freeipa/freeipa/blob/master/ipapython/ipautil.py#L391

all of this is being run inside a WSGI application.

--

___
Python tracker 
<https://bugs.python.org/issue37951>
___



[issue37951] Disallow fork in a subinterpreter broke subprocesses in mod_wsgi daemon mode

2019-10-07 Thread Adam Williamson


Adam Williamson  added the comment:

Well, now our (Fedora QA's) automated testing of FreeIPA is showing what looks 
like a problem with preexec_fn (rather than fork) being disallowed:

https://bugzilla.redhat.com/show_bug.cgi?id=1759290

Login to the FreeIPA webUI is failing, and at the time it fails we see this 
error message on the server end:

[Mon Oct 07 09:22:19.521604 2019] [wsgi:error] [pid 32989:tid 139746234119936] 
[remote 10.0.2.102:56054] ipa: DEBUG: args=['/usr/bin/kinit', 'admin', '-c', 
'/run/ipa/ccaches/kinit_32989', '-E']
[Mon Oct 07 09:22:19.521996 2019] [wsgi:error] [pid 32989:tid 139746234119936] 
[remote 10.0.2.102:56054] ipa: DEBUG: Process execution failed
[Mon Oct 07 09:22:19.522189 2019] [wsgi:error] [pid 32989:tid 139746234119936] 
[remote 10.0.2.102:56054] ipa: INFO: 401 Unauthorized: preexec_fn not supported 
within subinterpreters
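
A common reason for passing `preexec_fn` is to call `os.setsid()` in the child, and `subprocess` offers `start_new_session=True` for that case. The sketch below assumes that use; this thread does not show what FreeIPA's `preexec_fn` actually does, and `run_in_new_session` is a hypothetical helper, not FreeIPA code.

```python
import subprocess
import sys


def run_in_new_session(args):
    """Run a command in a new session without using preexec_fn.

    start_new_session=True asks subprocess itself to call setsid() in the
    child (POSIX only), so no Python callable has to run between fork and
    exec.
    """
    return subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,
        check=False,
    )


if __name__ == "__main__":
    result = run_in_new_session([sys.executable, "-c", "print('ok')"])
    print(result.returncode, result.stdout)
```

If the callable does something else (for example changing credentials or environment), this substitution does not apply and a different approach is needed.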

--
nosy: +adamwill
status: pending -> open

___
Python tracker 
<https://bugs.python.org/issue37951>
___



[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes

2018-03-02 Thread Adam Williamson

Adam Williamson <awill...@redhat.com> added the comment:

Yeah, I've added a comment there. I agree we can keep subsequent discussion in 
that issue. Closing this as a dupe.

I actually have the same thought as you, but I suspect making something that 
"worked" before start throwing an error might be a hard sell for some. Perhaps 
at least some kind of warning?

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32988>
___



[issue12750] add cross-platform support for %s strftime-format code

2018-03-02 Thread Adam Williamson

Adam Williamson <awill...@redhat.com> added the comment:

On the "attractive nuisance" angle: I just ran right into this problem, and 
reported https://bugs.python.org/issue32988 .

As I suggested there, if Python doesn't try to fix this, I'd suggest it should 
at least *explicitly document* that using %s is unsupported and dangerous in 
more than one way (might not work on all platforms, does not do what it should 
for 'aware' datetimes on platforms where it *does* work). I think explicitly 
telling people NOT to use it would be better than just not mentioning it. At 
least for me, when I saw real code using it and that the docs just didn't 
mention it, my initial thought was "I guess it must be OK, and the docs just 
missed it out for some reason". If I'd gone to the docs and seen an explicit 
note that it's not supported and doesn't work right, that would've been much 
clearer and I wouldn't have had to figure that out for myself :)

For Python 2, btw, the arrow library might be a suitable alternative to 
suggest: you can do something like this, assuming you have an aware datetime 
object called 'awaredate' you want to get the timestamp for:

import arrow
ts = arrow.get(awaredate).timestamp

and it does the right thing.
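
A standard-library-only alternative is sketched below, for cases where adding a dependency is not an option. It relies on `calendar.timegm()` and `datetime.utctimetuple()`, which exist in both Python 2 and Python 3; the example itself is written for Python 3 because it uses `datetime.timezone` to build the aware objects. `aware_to_timestamp` is a hypothetical helper name.

```python
import calendar
from datetime import datetime, timedelta, timezone


def aware_to_timestamp(aware):
    """Return the Unix timestamp (whole seconds) of an aware datetime."""
    if aware.tzinfo is None or aware.utcoffset() is None:
        raise ValueError("expected an aware datetime")
    # utctimetuple() converts to UTC first; timegm() then treats the tuple
    # as UTC, so the system's local timezone never enters the calculation.
    return calendar.timegm(aware.utctimetuple())


utc_midnight = datetime(2017, 1, 1, tzinfo=timezone.utc)
same_instant = datetime(2016, 12, 31, 18, 0, tzinfo=timezone(timedelta(hours=-6)))

if __name__ == "__main__":
    print(aware_to_timestamp(utc_midnight))
    print(aware_to_timestamp(same_instant))
```

Both objects describe the same instant, so both calls give the same value.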

--
nosy: +adamwill

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue12750>
___



[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes

2018-03-02 Thread Adam Williamson

Adam Williamson <awill...@redhat.com> added the comment:

I'd suggest that if that is the case, it would be better for the docs to 
*specifically mention* that `%s` is not supported and should not be used, 
rather than simply not mentioning it.

When it's used in real code (note someone in the SO issue mentions "I have been 
going crazy trying to figure out why i see strftime("%s") a lot, yet it's not 
in the docs") and just *not mentioned* in the docs, this tends to give the 
impression that it's something usable that was perhaps just forgotten from the 
docs, or something. The situation would be much clearer if the docs said "DO 
NOT USE THIS, IT'S DANGEROUS AND DOESN'T DO WHAT YOU THINK" in big letters. 
(And suggested using .timestamp() on Python 3.3+, and possibly arrow's 
.timestamp on 2.7?)

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32988>
___



[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes

2018-03-02 Thread Adam Williamson

Adam Williamson <awill...@redhat.com> added the comment:

Paul: right. This is on Linux - specifically Fedora Linux, but I don't think it 
matters. glibc strftime and strptime depend on an underlying struct called 
'tm'. 'man strftime' says:

   %s The number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 
(UTC). (TZ) (Calculated from mktime(tm).)

And 'man mktime' says:

"The mktime() function converts a broken-down time structure, expressed as 
local time, to calendar time representation. ... On success, mktime() returns 
the calendar time (seconds since the Epoch), expressed as a value of type 
time_t."

I am finding it hard to determine whether various C standards require the tm 
struct and mktime and strftime and so on to handle timezones, but I'm sort of 
inclining to the answer that "no they don't".

Basically I suspect what's going on in this case is that the timezone 
information gets lost somewhere in the chain down from Python to system 
strftime to system mktime, and Python doesn't make any adjustment to the actual 
date / time values before calling system strftime to try and account for this.

I think Python must do *something* more than purely converting to a tm and 
calling system strftime, though, as %Z does work, which it wouldn't if Python 
was purely converting to a non-timezone-aware tm struct and calling system 
strftime, I don't think...
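
The "expressed as local time" part of the mktime() description can be observed from Python, since `time.mktime()` wraps the C function. This is a Unix-only sketch (it needs `time.tzset()`); `mktime_under_tz` is a hypothetical helper, and it illustrates mktime's behaviour only, not the exact code path that `strftime('%s')` takes.

```python
import calendar
import os
import time

# 2017-01-01 00:00:00 as a struct_time-style tuple; the last field (-1)
# leaves the DST decision to the C library.
MIDNIGHT = (2017, 1, 1, 0, 0, 0, 6, 1, -1)


def mktime_under_tz(tz_name, fields=MIDNIGHT):
    """Return int(time.mktime(fields)) with the process TZ set to tz_name.

    The previous TZ value is restored afterwards.
    """
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()
    try:
        return int(time.mktime(fields))
    finally:
        if old is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = old
        time.tzset()


if __name__ == "__main__":
    # timegm() always treats the fields as UTC ...
    print("timegm:", calendar.timegm(MIDNIGHT[:6]))
    # ... while mktime() treats the same fields as local time.
    for tz in ("UTC", "America/Winnipeg", "America/Vancouver"):
        print(tz, mktime_under_tz(tz))
```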

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32988>
___



[issue32988] datetime.datetime.strftime('%s') always uses local timezone, even with aware datetimes

2018-03-02 Thread Adam Williamson

New submission from Adam Williamson <awill...@redhat.com>:

Test script:

import pytz
import datetime
utc = pytz.timezone('UTC')
print(datetime.datetime(2017, 1, 1, tzinfo=utc).strftime('%s'))

Try running it with various system timezones:

[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='UTC' python /tmp/test2.py
1483228800
[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='America/Winnipeg' python 
/tmp/test2.py
1483250400
[adamw@xps13k pagure (more-timezone-fun %)]$ TZ='America/Vancouver' python 
/tmp/test2.py
1483257600

That's Python 2.7.14; same results with Python 3.6.4.

This does not seem correct. The correct Unix time for an aware datetime object 
should be a constant: for 2017-01-01 00:00 UTC it *is* 1483228800 . No matter 
what the system's local timezone, that should be the output of strftime('%s'), 
surely. What it seems to be doing instead is just outputting the Unix time for 
2017-01-01 00:00 in the system timezone.
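
For contrast, `datetime.timestamp()` (Python 3.3+) gives the constant value for an aware object regardless of the system timezone. The sketch below is Unix-only because it switches timezones with `time.tzset()`, it uses `datetime.timezone.utc` instead of pytz to stay within the standard library, and `timestamp_under_tz` is a hypothetical helper.

```python
import os
import time
from datetime import datetime, timezone

AWARE = datetime(2017, 1, 1, tzinfo=timezone.utc)


def timestamp_under_tz(tz_name):
    """Return int(AWARE.timestamp()) with the process TZ set to tz_name."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()
    try:
        # For an aware datetime, timestamp() is computed from the object's
        # own UTC offset and does not consult the local timezone.
        return int(AWARE.timestamp())
    finally:
        if old is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = old
        time.tzset()


if __name__ == "__main__":
    for tz in ("UTC", "America/Winnipeg", "America/Vancouver"):
        print(tz, timestamp_under_tz(tz))
```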

I *do* note that strftime('%s') is completely undocumented in Python; neither 
https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior 
nor 
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior 
mentions it. However, it does exist, and is used in the real world; I found 
this usage of it, and the bug, in a real project, Pagure.

--
components: Library (Lib)
messages: 313169
nosy: adamwill
priority: normal
severity: normal
status: open
title: datetime.datetime.strftime('%s') always uses local timezone, even with 
aware datetimes
versions: Python 2.7, Python 3.6

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32988>
___



[issue30062] datetime in Python 3.6+ no longer respects 'TZ' environment variable

2017-04-12 Thread Adam Williamson

Adam Williamson added the comment:

Hmm, after a bit more poking I found this:

https://docs.python.org/3/library/time.html#time.tzset

"Note

Although in many cases, changing the TZ environment variable may affect the 
output of functions like localtime() without calling tzset(), this behavior 
should not be relied on."

It seems like that's kinda what we're dealing with here. If I extend my tests 
to change TZ, call the test function, then call `time.tzset()` and call the 
test function again, the *second* call to the test function gives the different 
result, i.e. the `time.tzset()` call does what it claims and changes Python's 
conception of the 'current' timezone.

So while that note has been there all along, it seems like the behaviour 
actually changed between 3.5 and 3.6, and a change to 'TZ' is now less likely 
to be respected without a `tzset()` call. But given the doc note, perhaps that 
can't be considered a bug.

anaconda doesn't call `time.tzset()` anywhere at present. It's also 
multi-threaded, so making sure all the threads call `time.tzset()` after any 
thread has changed what the 'current' timezone is will be lots of fun to 
implement, I guess :/
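
The sequence described above (change TZ, then call `time.tzset()`, then ask for a local time) looks roughly like this. It is a Unix-only sketch with a hypothetical helper name, `local_epoch`, and it says nothing about how to coordinate the call across threads.

```python
import os
import time
from datetime import datetime


def local_epoch(tz_name):
    """Set TZ, call time.tzset(), and return the naive local time at timestamp 0."""
    os.environ["TZ"] = tz_name
    time.tzset()  # Unix-only: makes the C library re-read the TZ variable
    return datetime.fromtimestamp(0)


if __name__ == "__main__":
    for tz in ("America/Winnipeg", "Europe/London"):
        print(tz, local_epoch(tz))
```

Note that the helper leaves TZ changed for the rest of the process, which is the behaviour under discussion.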

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30062>
___



[issue30062] datetime in Python 3.6+ no longer respects 'TZ' environment variable

2017-04-12 Thread Adam Williamson

New submission from Adam Williamson:

I can't figure out yet why this is, but it's very easy to demonstrate:

[adamw@adam anaconda (time-log %)]$ python35 
Python 3.5.2 (default, Feb 11 2017, 18:09:24) 
[GCC 7.0.1 20170209 (Red Hat 7.0.1-0.7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import datetime
>>> os.environ['TZ'] = 'America/Winnipeg'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 18, 0)
>>> os.environ['TZ'] = 'Europe/London'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1970, 1, 1, 1, 0)
>>> 

[adamw@adam anaconda (time-log %)]$ python3
Python 3.6.0 (default, Mar 21 2017, 17:30:34) 
[GCC 7.0.1 20170225 (Red Hat 7.0.1-0.10)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import datetime
>>> os.environ['TZ'] = 'America/Winnipeg'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 16, 0)
>>> os.environ['TZ'] = 'Europe/London'
>>> datetime.datetime.fromtimestamp(0)
datetime.datetime(1969, 12, 31, 16, 0)
>>> 

That is, when deciding what timezone to use for operations that involve one, if 
the 'TZ' environment variable was set, Python 3.5 would use the timezone it was 
set to. Python 3.6 does not, it ignores it.

As you can see, if I twiddle the 'TZ' setting and call 
`datetime.datetime.fromtimestamp(0)` repeatedly under Python 3.5, I get 
different results - each one is the wall clock time at the epoch (timestamp 0) 
in the timezone specified as 'TZ'. If I do the same on Python 3.6, the 'TZ' 
setting is ignored and I always get the same result (the wall clock time of 
'the epoch' in Vancouver, which is my real timezone, and which I guess is being 
picked up from /etc/localtime or whatever).

This wound up causing a problem in the Fedora / Red Hat installer, anaconda:

https://bugzilla.redhat.com/show_bug.cgi?id=1433560

The 'current time zone' can be changed in anaconda. Shortly after it starts up, 
it automatically tries to guess the correct time zone via geolocation, and the 
user can also explicitly choose a timezone in the installer interface (or set 
one in a kickstart). Whenever the timezone is set in this way, an underlying 
library (libtimezonemap - https://launchpad.net/timezonemap) sets 'TZ' to the 
chosen timezone. It turns out other code in anaconda relies on Python 
respecting that setting, which Python 3.6 does not do. As a consequence, 
anaconda with Python 3.6 winds up setting the system time incorrectly. Also, 
the timestamps on all its log files are different now, and there may well be 
other consequences I didn't figure out yet.

The same applies to, e.g., `datetime.datetime.now()`: you can perform the same 
experiment with Python 3.5 and 3.6. If you change the 'TZ' env var while 
calling `datetime.datetime.now()` after each change, on Python 3.5, the naive 
datetime object it returns is the current time *in that timezone*. On Python 
3.6, regardless of what 'TZ' is set to, it always gives you the same time.

Is this an intended and/or desired change that we should adjust to somehow? Is 
there another way a running Python process can change what "the current" 
timezone is, for the purposes of datetime calculations like this?
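
One approach that does not depend on the process-wide setting at all is to pass the timezone explicitly and work with aware objects. This is a sketch only, using a fixed UTC offset from the standard library; a real zone such as America/Winnipeg would need a tzinfo implementation with DST rules (pytz, for example), and `epoch_in_offset` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def epoch_in_offset(hours):
    """Return timestamp 0 as an aware datetime at a fixed UTC offset."""
    tz = timezone(timedelta(hours=hours))
    # With tz given, fromtimestamp() ignores TZ and /etc/localtime entirely.
    return datetime.fromtimestamp(0, tz=tz)


if __name__ == "__main__":
    print(epoch_in_offset(-6))
    print(epoch_in_offset(1))
```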

--
components: Library (Lib)
messages: 291574
nosy: adamwill
priority: normal
severity: normal
status: open
title: datetime in Python 3.6+ no longer respects 'TZ' environment variable
versions: Python 3.6, Python 3.7

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30062>
___



[issue29113] modulefinder no longer finds all required modules for Python itself, due to use of __import__ in sysconfig

2016-12-29 Thread Adam Williamson

New submission from Adam Williamson:

I'm not sure if this is really considered a bug or just an unavoidable 
limitation, but as it involves part of the stdlib operating on Python itself, I 
figured it was at least worth reporting.

In Fedora we have a fairly simple little script called python-deps:

https://github.com/rhinstaller/anaconda/blob/master/dracut/python-deps

which is used to figure out the dependencies of a couple of Python scripts used 
in the installer's initramfs environment, so the necessary bits of Python (but 
not the rest of it) can be included in the installer's initramfs.

Unfortunately, with Python 3.6, this seems to be broken for the core of Python 
itself, because of this change:

https://github.com/python/cpython/commit/a6431f2c8cf4783c2fd522b2f6ee04c3c204237f

which changed sysconfig.py from doing "from _sysconfigdata import 
build_time_vars" to using __import__ . I *think* that modulefinder can't cope 
with this use of __import__ and so misses that sysconfig requires 
"_sysconfigdata_m_linux_x86_64-linux-gnu" (or whatever the actual name is on 
your particular platform and arch).

This results in us not including the platform-specific module in the installer 
initramfs, so Python blows up on startup when the 'site' module tries to import 
the 'sysconfig' module.

We could work around this one way or another in the python-deps script, but I 
figured the issue was at least worth an upstream report to see if it's 
considered a significant issue or not.

You can reproduce the problem quite trivially by writing a test script which 
just does, e.g., "import site", and then running the example code from the 
ModuleFinder docs on it:

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('test.py')

print('Loaded modules:')
for name, mod in finder.modules.items():
    print('%s: ' % name, end='')
    print(','.join(list(mod.globalnames.keys())[:3]))

if you examine the output, you'll see that the 'sysconfig' module is included, 
but the platform-specific _sysconfigdata module is not.
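
The limitation can be shown in isolation with a script that has one ordinary import and one `__import__()` call with a computed name. This is a sketch under assumptions: `found_modules` is a hypothetical helper, and `colorsys` and `json` are arbitrary stand-ins (colorsys is chosen because it imports nothing itself), not the modules from the report.

```python
import os
import tempfile
from modulefinder import ModuleFinder

SCRIPT = (
    "import colorsys\n"         # ordinary import statement
    "name = 'js' + 'on'\n"
    "mod = __import__(name)\n"  # dynamic import: just a function call
)


def found_modules(script_source):
    """Run ModuleFinder on a temporary script and return the module names found."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(script_source)
        finder = ModuleFinder()
        finder.run_script(path)
        return set(finder.modules)
    finally:
        os.remove(path)


if __name__ == "__main__":
    modules = found_modules(SCRIPT)
    print("colorsys found:", "colorsys" in modules)
    print("json found:", "json" in modules)
```

ModuleFinder analyses the script without executing it, so a module name that only exists at run time is not visible to it.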

--
components: Library (Lib)
messages: 284304
nosy: adamwill
priority: normal
severity: normal
status: open
title: modulefinder no longer finds all required modules for Python itself, due 
to use of __import__ in sysconfig
versions: Python 3.6

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29113>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

https://github.com/tiran/defusedxml/pull/4 should fix this, I hope.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

https://paste.fedoraproject.org/511245/14824393/ is my cut at a fix for this, 
gonna test it out now.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

Digging some more, it looks like *only* Python 3.3 went so far out of its way 
to hide the pure-Python iterparse() - the code was changed again in 3.4 and it 
doesn't do that any more. So I think a way forward here is to make the code 
that uses _IterParseIterator specific to Python 3.3, and use the Python 2.7 
code (i.e. just use the iterparse() function) for 3.2 and 3.4+.
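
For reference, using the public function directly looks like the sketch below. It shows only the plain standard-library call; it does not reproduce defusedxml's code or any of its protections, and `end_tags` is a hypothetical helper.

```python
import io
import xml.etree.ElementTree as ET


def end_tags(xml_bytes):
    """Return the tag of every element, in the order its end event is reported."""
    stream = io.BytesIO(xml_bytes)
    return [elem.tag for _event, elem in ET.iterparse(stream, events=("end",))]


if __name__ == "__main__":
    print(end_tags(b"<root><a/><b/></root>"))
```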

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

Aha, so thanks to my colleague Patrick Uiterwijk, we see the problem. Since 
Python 3.3, Python doesn't actually use that pure-Python iterparse() function 
if it can instead replace it with a C version:

https://github.com/python/cpython/blob/3.3/Lib/xml/etree/ElementTree.py#L1705

"# Overwrite 'ElementTree.parse' and 'iterparse' to use the C XMLParser"

so the reason defusedxml wants to use _IterParseIterator on Python 3 is because 
if it just uses xml.etree.ElementTree.iterparse() it's getting the 
'accelerated' C implementation, not the pure-Python implementation it wants.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

serhiy: so, the funny thing is this: your fix is ultimately a reversion. Though 
we have to dig way back into the bowels of defusedxml to see this. 
Specifically, to this commit!

https://github.com/tiran/defusedxml/commit/03d4fc6cf246a209c2cf892b33f5b6cf5af4ecbd

that's the point at which Christian introduced a divergence between Python 2 
and Python 3 here, and essentially the same divergence remains between the 
`elif PY3:` and `else: # Python 2.7` blocks now. The Python 2.7 block in 
current defusedxml is in fact the same as your block, because `_iterparse` is 
just the parent `iterparse` function, as discovered by `_get_py3_cls()`.

So before applying your change, I kinda want to understand why Christian 
introduced this divergence in the first place. The commit message claims it's 
because Python 3.3 hid some pure python, but I don't quite understand that: 
https://github.com/python/cpython/blob/3.3/Lib/xml/etree/ElementTree.py looks 
like iterparse() was still perfectly available and usable for this purpose in 
3.3, just as it was in 3.2 and still appears to be in 3.6.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

Adam Williamson added the comment:

Ammar: yep, that's correct. There's code in defused's ElementTree.py - 
_get_py3_cls() - which passes different values to _generate_etree_functions 
based on the Python 3 version.

For Python 3.2+, defused 0.4.1 expects to use the _IterParseIterator class from 
xml.etree.ElementTree, but that got removed in 3.6, so if you just use defused 
0.4.1 with Python 3.6, it fails with an AttributeError as soon as you try to 
import defusedxml.ElementTree at all:

>>> import defusedxml.ElementTree
Traceback (most recent call last):
  File "", line 1, in 
  File "/tmp/defusedxml-0.4.1/defusedxml/ElementTree.py", line 62, in 
_XMLParser, _iterparse, _IterParseIterator, ParseError = _get_py3_cls()
  File "/tmp/defusedxml-0.4.1/defusedxml/ElementTree.py", line 56, in 
_get_py3_cls
_IterParseIterator = pure_pymod._IterParseIterator
AttributeError: module 'xml.etree.ElementTree' has no attribute 
'_IterParseIterator'

Christian made a change to make _get_py3_cls() pass None to 
_generate_etree_functions() so you can at least import defusedxml.ElementTree, 
but he didn't change _generate_etree_functions() itself, so it has no code path 
that handles None; for Python 3.2+ it expects to get a real iterator class, and 
it breaks completely when it tries to call None:

 sh-4.3# echo "" > test.xml
 sh-4.3# python3
>>> import defusedxml.ElementTree
>>> parser = defusedxml.ElementTree.iterparse('test.xml')
Traceback (most recent call last):
  File "", line 1, in 
  File "/tmp/defusedxml-0.4.1/defusedxml/common.py", line 141, in iterparse
return _IterParseIterator(source, events, parser, close_source)
TypeError: 'NoneType' object is not callable
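
Both tracebacks come from assuming that a private attribute exists. A defensive alternative is to probe for it before use, as in this sketch; `iterparse_strategy` is a hypothetical helper for illustration and is not taken from defusedxml.

```python
import xml.etree.ElementTree as ET


def iterparse_strategy():
    """Name which iterparse entry point this interpreter's ElementTree exposes."""
    # Private class that defused 0.4.1 expects on Python 3.2+.
    if getattr(ET, "_IterParseIterator", None) is not None:
        return "private _IterParseIterator class"
    # Public function, documented in the standard library.
    if callable(getattr(ET, "iterparse", None)):
        return "public iterparse() function"
    return "none"


if __name__ == "__main__":
    print(iterparse_strategy())
```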

Serhiy, thanks for the suggestion! We'll try that out.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___



[issue29050] xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml

2016-12-22 Thread Adam Williamson

New submission from Adam Williamson:

The changes made to xml.etree.ElementTree in this commit:

https://github.com/python/cpython/commit/12a626fae80a57752ccd91ad25b5a283e18154ec

break defusedxml, Christian Heimes' library of modified parsers that's 
intended to be safe for parsing untrusted input. As of now, it's not possible 
to have defusedxml working properly with Python 3.6; its ElementTree parsers 
cannot work properly.

Of course, defusedxml is an external library that does 'inappropriate' things 
(like fiddling around with internals of the xml library). So usually this 
should be considered just a problem for defusedxml to deal with somehow, and 
indeed I've reported it there: https://github.com/tiran/defusedxml/issues/3 . 
That report has more details on the precise problem.

I thought it was worthwhile reporting to Python itself as well, however, for a 
specific reason. The Python docs for the xml library explicitly cover and 
endorse the use of defusedxml:

"defusedxml is a pure Python package with modified subclasses of all stdlib XML 
parsers that prevent any potentially malicious operation. Use of this package 
is recommended for any server code that parses untrusted XML data." - 
https://docs.python.org/3.6/library/xml.html#the-defusedxml-and-defusedexpat-packages

so as things stand, the Python 3.6 docs will explicitly recommend people use a 
module which does not work with Python 3.6. Is this considered a serious 
problem?

It also looks to me (though I'm hardly an expert) as if it might be quite 
difficult and ugly to fix this on the defusedxml side, and the 'nicest' fix 
might actually be to tweak Python's xml module back a bit more to how it was in 
< 3.6 (but without losing the optimization from the commit in question) so it's 
easier for defusedxml to get at the internals it needs...but I could well be 
wrong about that.

Thanks!

--
components: XML
messages: 283854
nosy: adamwill
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree in Python 3.6 is incompatible with defusedxml
type: behavior
versions: Python 3.6

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29050>
___