[issue31778] ast.literal_eval supports non-literals in Python 3

2018-01-04 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31778] ast.literal_eval supports non-literals in Python 3

2018-01-04 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset d8ac4d1d5ac256ebf3d8d38c226049abec82a2a0 by Serhiy Storchaka in 
branch 'master':
bpo-31778: Make ast.literal_eval() more strict. (#4035)
https://github.com/python/cpython/commit/d8ac4d1d5ac256ebf3d8d38c226049abec82a2a0


--




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-11-09 Thread Neil Schemenauer

Neil Schemenauer  added the comment:

Just a comment on what I guess is the intended use of literal_eval(), i.e. 
taking a potentially untrusted string and turning it into a Python object.  
Exposing the whole of the Python parser to potential attackers would make me 
very worried.  Parsing code for all of Python syntax is just going to be very 
complicated and there can easily be bugs there.  Generating an AST and then 
walking over it to see if it is safe is also scary.  The "attack surface" is 
too large.  This is similar to the Shellshock bug. If you can trust the 
supplier of the string then okay but I would guess that literal_eval() is going 
to get used for untrusted inputs.

It would be really nice to have something like ast.literal_eval() that could be 
used for untrusted strings.  I would implement it by writing a restricted 
parser.  Keep it extremely simple.  Validate it by heavy code reviews and 
extensive testing (e.g. fuzzing).
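
To make the "restricted parser" idea concrete, here is a minimal sketch. It is not code from this thread or from CPython: the function name `restricted_eval` and the accepted grammar (integers, double-quoted strings without escapes, `None`/`True`/`False`, and lists) are assumptions chosen to keep the example small. It never calls the Python parser, so anything outside that tiny grammar is rejected with a ValueError.

```python
import re

# One token: optional whitespace, then an integer, a simple string,
# a keyword constant, or list punctuation. (Hypothetical grammar.)
_TOKEN = re.compile(r"""
    \s*(?:
        (?P<int>-?[0-9]+)
      | (?P<str>"[^"\\]*")
      | (?P<name>None|True|False)\b
      | (?P<punct>[\[\],])
    )""", re.VERBOSE)

_NAMES = {"None": None, "True": True, "False": False}


def restricted_eval(text):
    """Parse a tiny literal grammar without using the Python parser."""
    pos = 0

    def next_token():
        nonlocal pos
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = match.end()
        return match.lastgroup, match.group(match.lastgroup)

    def parse_value(kind, tok):
        if kind == "int":
            return int(tok)
        if kind == "str":
            return tok[1:-1]  # drop the surrounding quotes
        if kind == "name":
            return _NAMES[tok]
        if tok == "[":
            return parse_list()
        raise ValueError(f"unexpected token {tok!r}")

    def parse_list():
        items = []
        kind, tok = next_token()
        if kind == "punct" and tok == "]":
            return items  # empty list
        while True:
            items.append(parse_value(kind, tok))
            kind, tok = next_token()
            if kind == "punct" and tok == "]":
                return items
            if not (kind == "punct" and tok == ","):
                raise ValueError(f"expected ',' or ']', got {tok!r}")
            kind, tok = next_token()

    value = parse_value(*next_token())
    if text[pos:].strip():
        raise ValueError("trailing input after value")
    return value
```

A real implementation would also need to bound nesting depth, since the recursion here can be exhausted by deeply nested input.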

--
nosy: +nascheme




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-11-09 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

Ping.

--




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-18 Thread Yury Selivanov

Change by Yury Selivanov :


--
nosy:  -yselivanov




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-18 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

PR 4035 makes ast.literal_eval() more strict.
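
As a rough illustration of what "more strict" means for the inputs discussed in this thread, the sketch below runs them through ast.literal_eval() and reports either the value or the exception type. It assumes an interpreter that already includes this change (Python 3.7 or later); the helper name `try_eval` is made up for the example. On such an interpreter, a real number plus or minus an imaginary literal and signed numbers should still evaluate, while the other arithmetic inputs should be rejected.

```python
import ast


def try_eval(source):
    """Return the evaluated value, or the name of the exception raised."""
    try:
        return ast.literal_eval(source)
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__


# Inputs taken from the examples earlier in the thread.
for source in ("1+1j", "-1", "1+1", "1j+1", "2017-10-10", "(1, (0, 1+1+1))"):
    print(f"{source!r:20} -> {try_eval(source)!r}")
```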

--
versions:  -Python 3.6, Python 3.8




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-18 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
keywords: +patch
pull_requests: +4009
stage:  -> patch review




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-18 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

"""
The string or node provided may only consist of the following Python literal 
structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and 
None.
"""

1+1 is not a literal number.

"""
It is not capable of evaluating arbitrarily complex expressions, for example 
involving operators or indexing.
"""

--




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-18 Thread R. David Murray

R. David Murray  added the comment:

"Safely evaluate an expression node or a string containing a Python expression."

The behavior you are citing matches that documentation, as far as I can see.  
1+1 is an expression involving supported literals.

--
nosy: +r.david.murray




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-14 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

The support of parsing addition and subtraction at any level of nesting was 
added by bc95973b51abadc84960e7836ce313f12cf515cf. The commit message and NEWS 
entry don't contain an issue number, thus the rationale of this change is not 
known. Raymond, could you please explain?

--
nosy: +rhettinger, serhiy.storchaka




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-14 Thread David Bieber

David Bieber  added the comment:

# Replies

> Rolling back previous enhancements would break existing code.

I sympathize completely with the need to maintain backward compatibility. And 
if this is the reason that this issue gets treated only as a documentation 
issue, rather than as a behavior issue, I can appreciate that.

If that is the case and literal_eval is not going to only evaluate literals, 
then for my use case I'll need a way to determine from a string whether it 
represents a literal. I can implement this myself using ast.parse and walking 
the resulting tree, looking for non-literal AST nodes. Would such an 
"is_literal" function be more appropriate in the ast module than as a 
one-off function in Python Fire?
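
A minimal sketch of such a check follows. It is only an illustration, not an existing `ast` function and not Python Fire's code: the names `is_literal` and `_is_literal_node` are hypothetical, and it assumes Python 3.8 or later, where all literal values parse to `ast.Constant`. It uses a whitelist of node types, so anything involving an operator (including `-1` and `1+1j`) is reported as a non-literal.

```python
import ast


def _is_literal_node(node):
    """Whitelist check: constants and containers of constants only."""
    if isinstance(node, ast.Constant):
        return True
    if isinstance(node, (ast.Tuple, ast.List, ast.Set)):
        return all(_is_literal_node(elt) for elt in node.elts)
    if isinstance(node, ast.Dict):
        # A key of None marks `**mapping` unpacking, which is not a literal.
        return (all(k is not None and _is_literal_node(k) for k in node.keys)
                and all(_is_literal_node(v) for v in node.values))
    return False


def is_literal(source):
    """Return True if `source` is a literal structure with no operators."""
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except (SyntaxError, ValueError):
        return False
    return _is_literal_node(tree.body)
```

Whether signed numbers and complex literals should count as literals is a design choice; this sketch takes the strictest reading and rejects both.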

> The key promise that literal_eval makes is that it will not permit arbitrary 
> code execution.

I disagree that this is the only key promise, and here's my reasoning. The 
docstring has two sentences, and each one makes a promise:
1. "Safely evaluate an expression node or a string containing a Python 
expression."
2. "The string or node provided may only consist of the following Python 
literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, 
booleans, and None."
(1) says that evaluation is safe -- this is the key promise that you reference. 
(2) is also a promise though, that only certain types are allowed. While one 
could argue that the behavior of the function is not specified for inputs 
violating those criteria, I think the clearly correct thing to do is to raise a 
ValueError if the value doesn't meet the criteria. This is what was done in 
Python 2, where the docstring for literal_eval is these same two sentences 
(modulo the inclusion of bytes and sets). It's my opinion that Python 2's 
behavior better matches the docstring as well as the behavior implied by the 
function's name.


# Additional observations
1. Python 2 _does_ support parsing complex literals, but does not support 
treating e.g. '1+1' as a literal.
ast.literal_eval('1+1j') # works in both Python 2 and Python 3
ast.literal_eval('1j+1') # raises a ValueError in Python 2, returns 1+1j in 
Python 3
ast.literal_eval('1+1') # raises a ValueError in Python 2, returns 2 in Python 3

2. Python 3 supports parsing addition and subtraction at any level of nesting.
ast.literal_eval('(1, (0, 1+1+1))') # raises a ValueError in Python 2, returns 
(1, (0, 3)) in Python 3

In my opinion, Python 2's behavior is correct in these situations since it 
matches the documentation and only evals literals as defined in the 
documentation.

# Source
The relevant code in Python 2.7.3 is 
[here](https://github.com/enthought/Python-2.7.3/blob/69fe0ffd2d85b4002cacae1f28ef2eb0f25e16ae/Lib/ast.py#L67).
 It explicitly allows NUM +/- COMPLEX, but not even COMPLEX +/- NUM. The 
corresponding code for Python 3 is 
[here](https://github.com/python/cpython/blob/ef611c96eab0ab667ebb43fdf429b319f6d99890/Lib/ast.py#L76).
 You'll notice it supports adding and subtracting arbitrary numeric types (int, 
float, complex).


---

Thanks for your replies and for hearing me out on this issue.

--




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-13 Thread Nick Coghlan

Nick Coghlan  added the comment:

I'm marking this as a documentation issue for now, as the operators that 
literal_eval allows are solely those where constant folding support is needed 
to correctly handle complex and negative numbers (as noted in the original 
post):

```
>>> dis.dis("+1")
  1           0 LOAD_CONST               1 (1)
              2 RETURN_VALUE
>>> dis.dis("-1")
  1           0 LOAD_CONST               1 (-1)
              2 RETURN_VALUE
>>> dis.dis("1+1")
  1           0 LOAD_CONST               1 (2)
              2 RETURN_VALUE
>>> dis.dis("1+1j")
  1           0 LOAD_CONST               2 ((1+1j))
              2 RETURN_VALUE
>>> dis.dis("2017-10-10")
  1           0 LOAD_CONST               3 (1997)
              2 RETURN_VALUE
```

So the key promise that literal_eval makes is that it will not permit arbitrary 
code execution, but the docs should make it clearer that it *does* permit 
constant folding for addition and subtraction in order to handle the full range 
of numeric literals.

If folks want to ensure that the input string *doesn't* include a binary 
operator, then that currently needs to be checked separately with ast.parse:

```
>>> type(ast.parse("2+3").body[0].value) is ast.BinOp
True
>>> type(ast.parse("2017-10-10").body[0].value) is ast.BinOp
True
```

For 3.7+, that check could potentially be encapsulated as an 
"allow_folding=True" keyword-only parameter (where the default gives the 
current behaviour, while "allow_folding=False" disables processing of UnaryOp 
and BinOp), but the docs update is the more immediate need.
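
Until something like that exists, the same effect can be approximated with a small wrapper. The sketch below is an assumption-laden illustration: `allow_folding` is only a proposal in this message, not a real parameter, and the function name `literal_eval_no_folding` is hypothetical. It parses the input once, rejects any UnaryOp or BinOp node, and then hands the tree to ast.literal_eval().

```python
import ast


def literal_eval_no_folding(source):
    """literal_eval() variant that refuses any unary or binary operator."""
    tree = ast.parse(source.strip(), mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, (ast.UnaryOp, ast.BinOp)):
            raise ValueError(
                f"operator not allowed in literal: {type(node).__name__}")
    # literal_eval() accepts an already-parsed Expression node.
    return ast.literal_eval(tree)
```

Note that this also rejects `-1` and `1+1j`, which matches the description of "allow_folding=False" above but may be stricter than some callers want.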

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
versions: +Python 3.6




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-13 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

It has been some time since literal_eval literally only evaluated literals.  
'constant_eval' might be a better name now, with the proviso of 'safely, in 
reasonable time'.

>>> from ast import literal_eval as le
>>> le('(1,2,3)')
(1, 2, 3)
>>> le('(1,2, (3,4))')
(1, 2, (3, 4))

I believe there was once a time when a simple tuple would be evaluated, while a 
nested one would not be.

"It is not capable of evaluating arbitrarily complex expressions, for example 
involving operators or indexing."  I do not read this as prohibiting all 
operators, but rather that not all will be accepted.

>>> le(2**2)
...
ValueError: malformed node or string: 4

Exponentiation of ints can take exponential time and can be used for denial of 
service attacks.

>>> le('2017-10-10')
1997

This is correct.  For '2017-10-10' to be a string representing a date, it must 
be quoted as a string in the code.

>>> le("'2017-10-10'")
'2017-10-10'

Rolling back previous enhancements would break existing code, so a deprecation 
period would be required.  But I would be inclined to instead update the doc to 
match the updated code better.  Let's see what others think.

--
nosy: +benjamin.peterson, brett.cannon, ncoghlan, terry.reedy, yselivanov
type: behavior -> enhancement
versions:  -Python 3.4, Python 3.5, Python 3.6




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-12 Thread David Bieber

Change by David Bieber :


--
type:  -> behavior




[issue31778] ast.literal_eval supports non-literals in Python 3

2017-10-12 Thread David Bieber

New submission from David Bieber :

# Overview
ast.literal_eval supports some non-literals in Python 3.
The behavior of this function runs contrary to the documented behavior.


# The Issue
The 
[documentation](https://docs.python.org/3/library/ast.html#ast.literal_eval) 
says of the function "It is not capable of evaluating arbitrarily complex 
expressions, for example involving operators or indexing."

However, literal_eval is capable of evaluating expressions with certain 
operators, in particular the operators "+" and "-".

As has been explained previously, the reason for this is to support "complex" 
literals such as 2+3j. However, this has unintended consequences which I 
believe to be indicative of a bug. Examples of the unintended consequences 
include `ast.literal_eval('1+1') == 2` and `ast.literal_eval('2017-10-10') == 
1997`. I would expect each of these calls to literal_eval to throw a 
ValueError, as the input string is not a literal. Instead, literal_eval 
successfully evaluates the input string, in the latter case giving an 
unexpected result (since the intent of the string is to represent a 21st 
century date.)

This issue arose as a [Python Fire 
issue](https://github.com/google/python-fire/issues/97), where the behavior of 
Python Fire was unexpected for inputs such as those described above (1+1 and 
2017-10-10) only in Python 3. For context, Python Fire is a CLI library which 
uses literal_eval as part of its command line argument parsing procedure.
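
For readers unfamiliar with that pattern, the sketch below shows the general shape of "try literal_eval, fall back to the raw string" argument parsing. It is not Python Fire's actual code; the function name `parse_cli_value` is made up for illustration. With this shape, whether an argument like 2017-10-10 stays a string depends entirely on whether literal_eval accepts or rejects it.

```python
import ast


def parse_cli_value(arg):
    """Interpret a command-line argument as a Python literal if possible."""
    try:
        return ast.literal_eval(arg)
    except (ValueError, SyntaxError):
        # Not a literal: keep the argument as the string the user typed.
        return arg


for arg in ("42", "[1, 2]", "hello", "2017-10-10"):
    print(f"{arg!r:14} -> {parse_cli_value(arg)!r}")
```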

I think the resolution to this issue is having literal_eval raise a ValueError 
if the ast of the input represents anything other than a Python literal, as 
described in the documentation. That is, "The string or node provided may only 
consist of the following Python literal structures: strings, bytes, numbers, 
tuples, lists, dicts, sets, booleans, and None." Additional operations, such as 
the binary operations "+" and "-", unless they explicitly create a complex 
number, should produce a ValueError.

If that resolution is not the direction we take, I also would appreciate 
knowing if there is another built in approach for determining if a string or 
ast node represents a literal.


# Reproducing
The following code snippets produce different behaviors in Python 2 and Python 
3.
```python
import ast
ast.literal_eval('1+1')
```

```python
import ast
ast.literal_eval('2017-10-10')
```


# References
- The Python Fire issue is here: https://github.com/google/python-fire/issues/97
- Prior discussion of a similar issue: https://bugs.python.org/issue22525
- I think this is where complex number support was originally added: 
https://bugs.python.org/issue4907
- In this thread, https://bugs.python.org/issue24146, one commenter explains 
that literal_eval's support for "2+3" is an unintentional side effect of 
supporting complex literals.

--
messages: 304294
nosy: David Bieber
priority: normal
severity: normal
status: open
title: ast.literal_eval supports non-literals in Python 3
versions: Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
