[issue20115] NUL bytes in commented lines

2021-09-06 Thread Guido van Rossum

Guido van Rossum  added the comment:

Serhiy’s comment from 2014-01-04 gives the answer. It’s different reading from 
a file than from a string. And only “python x.py” still reads from a file.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20115] NUL bytes in commented lines

2021-09-06 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

What I missed before is that duplicating the effect of the first two 
interactive entries (no exception) requires escaping the backslash so that the 
source argument for the explicit compile does not have a null. 

compile("'\\0'", '', 'exec')
<code object <module> at 0x0214431CAA20, file "", line 1>
compile("#\\0", '', 'exec')
<code object <module> at 0x0214431CAC30, file "", line 1>

So I did not actually see an exception to the rule.
---

*On Win 10*, I experimented with a version of Armin and Irit's example, without 
and with b'...' and 'wb'.

s = '#\x00\na\nb\n' 
print(len(s))  # 7
with open("f:/Python/a/nulltest.py", 'w') as f:
  f.write(s)
import nulltest

When I ran a local repository build of 3.9, 3.10, or 3.11 with
  f:\dev\3x>python f:/Python/a/nulltest.py
I got Irit's strange NameError instead of the proper ValueError.

When I ran with installed 3.9 or 3.10 with
  py -3.10 -m a.nulltest
I got the null-byte ValueError.

When I ran from IDLE's editor running on either installed or repository python, 
the import gave the null-byte ValueError.

--




[issue20115] NUL bytes in commented lines

2021-09-06 Thread Guido van Rossum


Guido van Rossum  added the comment:

Which part puzzles you?

I see that you tried

>>> #\0

This does not contain a null byte, just three characters: a hash, a backslash, 
and a digit zero.
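
A minimal sketch of the distinction, for readers following along. The helper
name try_compile and the filename "<sketch>" are made up for illustration. The
exception type raised for a NUL has varied between Python versions, so the
sketch catches several types rather than assuming one.

```python
# Text typed at the prompt as  #\0  is three characters; no escape is processed.
typed_at_prompt = "#\\0"
# Inside a Python string literal, \0 is an escape that produces one NUL character.
string_literal = "#\0"


def try_compile(source):
    """Return 'ok' if compile() accepts the source, else the exception name."""
    try:
        compile(source, "<sketch>", "exec")
    except (TypeError, ValueError, SyntaxError) as exc:
        return type(exc).__name__
    return "ok"


print(len(typed_at_prompt), "\x00" in typed_at_prompt)
print(len(string_literal), "\x00" in string_literal)
print(try_compile(typed_at_prompt))
print(try_compile(string_literal))
```

Only the second string carries a NUL into compile(), which is why the comment
typed interactively is accepted while the explicit compile('#\0', ...) call is
not.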

--




[issue20115] NUL bytes in commented lines

2021-09-06 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

The compile() doc currently says "This function raises SyntaxError if the 
compiled source is invalid, and ValueError if the source contains null bytes."  
And indeed, in repository 3.9, 3.10, 3.11,

>>> compile('\0','','exec')
Traceback (most recent call last):
  File "", line 1, in <module>
ValueError: source code string cannot contain null bytes

Ditto when the same is run in a file from IDLE or the command line.  The 
exception to this rule is, sometimes, when the null is in a comment or string 
within the code.

>>> '\0'
'\x00'
>>> #\0
>>> compile('#\0','','single', 0x200)
Traceback (most recent call last):
  File "", line 1, in <module>
ValueError: source code string cannot contain null bytes
>>> compile('"\0"','','single', 0x200)
ValueError: source code string cannot contain null bytes

I am puzzled because "\0" and #\0 in the IDLE shell are sent as strings 
containing the string or comment, to be compiled with the call above in codeop.  
There must be some difference in when \0 is interpreted.

--




[issue20115] NUL bytes in commented lines

2021-06-17 Thread Irit Katriel


Irit Katriel  added the comment:

See also issue1105770.

--




[issue20115] NUL bytes in commented lines

2021-06-16 Thread Guido van Rossum


Guido van Rossum  added the comment:

Yeah, null bytes should just be rejected. If someone comes up with a fix for 
this we'll accept it.

--




[issue20115] NUL bytes in commented lines

2021-06-16 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

https://docs.python.org/3/reference/toplevel_components.html#file-input
says that file input and exec input (should) have the same grammar. This 
implies that the divergence is a bug.

--
nosy: +gvanrossum




[issue20115] NUL bytes in commented lines

2021-06-16 Thread Irit Katriel


Irit Katriel  added the comment:

This is still the same in 3.11. I added another line to the example's file, 
which shows more clearly what's happening:

>>> open('x.py', 'wb').write(b'#\x00\na\nb\n')

% ./python.exe x.py
Traceback (most recent call last):
  File "x.py", line 2, in <module>
    a
NameError: name 'b' is not defined
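
A portable way to repeat this experiment might look like the sketch below. The
function name run_nul_script is made up for illustration, and no particular
error message is assumed, since what the interpreter reports depends on the
Python version in use.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_nul_script(payload=b"#\x00\na\nb\n"):
    """Write payload to a temporary x.py and run it with this interpreter.

    Returns (exit status, captured stderr text).
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "x.py"
        script.write_bytes(payload)  # binary write keeps the NUL byte intact
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            errors="replace",  # do not fail on odd bytes in the traceback
        )
    return result.returncode, result.stderr


if __name__ == "__main__":
    status, stderr = run_nul_script()
    print("exit status:", status)
    print(stderr)
```

Running the file as a script is the code path that reads from a file rather
than from a string, so this is the case where the behaviour under discussion
shows up.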

--
nosy: +iritkatriel
versions: +Python 3.10, Python 3.11, Python 3.9 -Python 2.7, Python 3.3, Python 
3.4




[issue20115] NUL bytes in commented lines

2014-05-12 Thread Jakub Wilk

Changes by Jakub Wilk jw...@jwilk.net:


--
nosy: +jwilk

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20115
___



[issue20115] NUL bytes in commented lines

2014-05-11 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever




[issue20115] NUL bytes in commented lines

2014-05-10 Thread ita1024

ita1024 added the comment:

Do not touch that please

The null bytes are already rejected when forbidden by the encoding (utf-8 for 
example).

Null byte characters in comments are perfectly valid in ISO8859-1 encoding, and 
a few scripts depend on them:
http://ftp.waf.io/pub/release/waf-1.7.16

Parsing the commented lines is also likely to slow down the parser, so keep 
your hands off it please! There are too many regressions already! 
http://bugs.python.org/issue21086

--
nosy: +ita1024




[issue20115] NUL bytes in commented lines

2014-05-10 Thread Alex Gaynor

Changes by Alex Gaynor alex.gay...@gmail.com:


--
nosy: +alex




[issue20115] NUL bytes in commented lines

2014-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I'll try, but I'm not sure this is possible. Some of the C functions used (e.g. 
fgets()) return char* and don't work with strings containing null bytes. Some 
public APIs (e.g. PyParser_SimpleParseString()) work with null-terminated C 
strings.
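
The limitation of null-terminated strings can be illustrated from Python with
ctypes. This is only an analogy for what C code sees, not the parser's actual
code path; the variable names are made up for illustration.

```python
import ctypes

line = b"#\x00\n"

# create_string_buffer copies the bytes and appends a terminating NUL.
buf = ctypes.create_string_buffer(line)

full_bytes = buf.raw  # every byte in the buffer, including the terminator
c_view = buf.value    # what a NUL-terminated reader sees: stops at the first NUL

print(full_bytes)
print(c_view)
```

Any API that takes only a char* has no way to tell where the data really ends,
so the newline after the NUL is invisible to it.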

--




[issue20115] NUL bytes in commented lines

2014-01-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

See also issue13617.

--




[issue20115] NUL bytes in commented lines

2014-01-12 Thread Georg Brandl

Georg Brandl added the comment:

I'm in favor of PyPy's behavior: null bytes anywhere in the source, even in 
comments, usually mean there's something weird or fishy going on with either 
the editor or (if downloaded/copied) the source of the code.

--
nosy: +georg.brandl




[issue20115] NUL bytes in commented lines

2014-01-10 Thread Terry J. Reedy

Terry J. Reedy added the comment:

Python should have a uniform definition of 'Python source' in both the doc and 
in practice in all source code processing functions. Currently, 2. Lexical 
analysis in the Language Manual just says "Python reads program text as 
Unicode code points; the encoding of a source file can be given by an encoding 
declaration and defaults to UTF-8". UTF-8 encodes code point U+0000 as a null 
byte, and this code point is nowhere excluded in the doc. (The definition of 
string literals uses 'source character' without any additional specification, 
so I take it to mean 'Unicode code point'.)

If U+0000 is a legal 'source character', it, as with other control chars not 
given special meaning, should be a SyntaxError unless occurring in a comment or 
string literal. Eval and exec exclude even the latter with 
TypeError: source code string cannot contain null bytes
If null bytes are legal, this is wrong.

Simply truncating lines as done by the CPython parser is wrong whether or not 
U+0000 is legal.

The simplest change would be to change the parser to match exec and add "other 
than U+0000" after "Unicode code points" in the sentence quoted above.
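
As a rough sketch of what a uniform "reject U+0000 anywhere" rule could look
like at the byte level: the helper first_nul below is hypothetical, is not part
of CPython, and only locates the first NUL so that an error could point at it.

```python
# U+0000 encodes to a single zero byte in UTF-8 and in Latin-1 alike.
utf8_nul = "\x00".encode("utf-8")
latin1_nul = "\x00".encode("latin-1")


def first_nul(source):
    """Return (line, column) of the first NUL byte in source, or None.

    Lines are counted from 1 and columns from 0. Comments and string
    literals get no special treatment.
    """
    offset = source.find(b"\x00")
    if offset < 0:
        return None
    line = source.count(b"\n", 0, offset) + 1
    column = offset - (source.rfind(b"\n", 0, offset) + 1)
    return line, column


print(first_nul(b"#\x00\na\n"))
print(first_nul(b"a = 1\n"))
```

Because the check runs before any tokenizing, it gives the same answer whether
the NUL sits in a comment, a string literal, or ordinary code.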

--
nosy: +terry.reedy




[issue20115] NUL bytes in commented lines

2014-01-10 Thread Terry J. Reedy

Terry J. Reedy added the comment:

Armin, what is the different behavior of PyPy?

We should perhaps get Guido's opinion on this issue.

--




[issue20115] NUL bytes in commented lines

2014-01-10 Thread Armin Rigo

Armin Rigo added the comment:

PyPy 2.x accepts null characters in all of import, exec and eval, and complains 
if they occur in non-comment.

PyPy 3.x refuses them in import, which is where this bug report originally 
comes from (someone complained that CPython 3.x accepts them but not PyPy 
3.x, even though this complaint doesn't really make sense as CPython just gets 
very confused by them).  I don't know about exec and eval.

We need a consistent decision for 3.5.  I suppose it's not really worth 
backporting it to CPython 2.7 - 3.3 - 3.4, but it's your choice.  PyPy will 
just follow the lead (or keep its current behavior for 2.x if CPython 2.x is 
not modified).

--




[issue20115] NUL bytes in commented lines

2014-01-04 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage:  -> needs patch
type: compile error -> behavior
versions: +Python 3.3, Python 3.4




[issue20115] NUL bytes in commented lines

2014-01-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Indeed.  The CPython parser reads the first line '#\x00\n' and saves it in the 
buffer. But because C strings are used here (the result of decode_str()), the 
line is truncated to '#'. Since this data does not end with '\n', it is 
considered incomplete, and the next line is read and appended: 
'#' + 'a' -> '#a'. So this line is now commented out.
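
The mechanism described above can be modelled in a few lines of Python. This is
a simplified simulation under the assumptions stated in the comments, not the
tokenizer's real code; the function names are made up for illustration.

```python
def c_string(buf):
    """Model a NUL-terminated view of buf: the bytes before the first NUL."""
    return buf.split(b"\x00", 1)[0]


def simulate_line_reader(data):
    """Return the logical lines a reader would build from data.

    Assumption: each physical line is seen as a C string, and a line that
    does not end with a newline is treated as incomplete, so the next
    physical line is appended to it.
    """
    logical_lines = []
    pending = b""
    for raw in data.splitlines(keepends=True):
        pending += c_string(raw)
        if pending.endswith(b"\n"):
            logical_lines.append(pending)
            pending = b""
    if pending:
        logical_lines.append(pending)  # final line without a newline
    return logical_lines


print(simulate_line_reader(b"#\x00\na\nb\n"))
print(simulate_line_reader(b"#\x00\na"))
```

In this model the line after the NUL-bearing comment is absorbed into the
comment, which matches both the silent run in the original report and the
NameError for 'b' rather than 'a' in the three-line variant.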

--
nosy: +serhiy.storchaka




[issue20115] NUL bytes in commented lines

2014-01-04 Thread Benjamin Peterson

Benjamin Peterson added the comment:

I guess NULL bytes should just be banned.

--




[issue20115] NUL bytes in commented lines

2014-01-04 Thread Armin Rigo

Armin Rigo added the comment:

Fwiw, both exec and eval() ban NUL bytes, which means that there is a strange 
case in which some files can be imported, but not loaded and exec'ed.  So I 
agree with Benjamin.

--




[issue20115] NUL bytes in commented lines

2014-01-03 Thread Armin Rigo

New submission from Armin Rigo:

This is probably the smallest example of a .py file that behaves differently in 
CPython vs PyPy, and for once, I'd argue that the CPython behavior is 
unexpected:

   # make the file:
open('x.py', 'wb').write('#\x00\na')

   # run it:
   python x.py

Expected: either some SyntaxError, or NameError: global name 'a' is not 
defined.  Got: nothing.  It seems that CPython completely ignores the line 
that is immediately after a line with a '#' and a following '\x00'.

--
components: Interpreter Core
messages: 207232
nosy: arigo
priority: low
severity: normal
status: open
title: NUL bytes in commented lines
type: compile error
versions: Python 2.7




[issue20115] NUL bytes in commented lines

2014-01-03 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@gmail.com:


--
nosy: +benjamin.peterson
