[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
resolution:  -> fixed
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16741
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-03 Thread Roundup Robot

Roundup Robot added the comment:

New changeset ecc8512b427d by Serhiy Storchaka in branch '3.3':
Issue #16741: Fix an error reporting in int().
http://hg.python.org/cpython/rev/ecc8512b427d

New changeset 4fd48a807812 by Serhiy Storchaka in branch 'default':
Issue #16741: Fix an error reporting in int().
http://hg.python.org/cpython/rev/4fd48a807812

--
nosy: +python-dev




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-03 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-03 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There is a test in test_unicode which expects a UnicodeError for 
int('\ud800'). Now it fails. Should we fix the test or int()?

--
resolution: fixed -> 
status: closed -> open




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-03 Thread Alexander Belopolsky

Alexander Belopolsky added the comment:

I'd say fix the test.  Raising ValueError is correct in this case.  
UnicodeError was an implementation artifact.
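
A minimal sketch of why this change should be harmless for most callers: UnicodeError is a subclass of ValueError, so code that already catches ValueError handles both the old and the new behaviour. The helper name parse_or_none is hypothetical and used only for illustration.

```python
# UnicodeError derives from ValueError, so "except ValueError" covers both.
assert issubclass(UnicodeError, ValueError)


def parse_or_none(text):
    """Return int(text), or None if text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        # Catches a plain ValueError and, on older builds, a UnicodeError too.
        return None


print(parse_or_none("42"))
print(parse_or_none("\ud800"))
```

Only code that catches UnicodeError specifically would notice the difference.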

--
nosy: +belopolsky




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-08-03 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 7b023134ad83 by Serhiy Storchaka in branch '3.3':
Issue #16741: Remove testing of implementation artifact.
http://hg.python.org/cpython/rev/7b023134ad83

New changeset 1b4772ab420f by Serhiy Storchaka in branch 'default':
Issue #16741: Remove testing of implementation artifact.
http://hg.python.org/cpython/rev/1b4772ab420f

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-07-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

If there are no objections, I'm going to commit the patches soon.

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-07-22 Thread Christian Heimes

Christian Heimes added the comment:

I don't like the idea of changing the behavior of 2.7 so late in its release 
cycle. Benjamin, what's your opinion?

--
nosy: +benjamin.peterson, christian.heimes




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-07-22 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Yeah, let's just fix Python 3.

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-07-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch for 2.7.

--
Added file: http://bugs.python.org/file30916/int_from_str-2.7.patch




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-06-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Patch updated. It now reuses code for bytes->int in longobject.c and 
abstract.c, doesn't raise UnicodeDecodeError for non-utf-8 bytes, and always 
reports an invalid bytes literal as a bytes object.

--
Added file: http://bugs.python.org/file30515/int_from_str-3.3_2.patch




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-05-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Are there any other comments?

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-05-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thanks. For 3.4 I will use the new formatting feature.

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-05-05 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch based on Matthew's patch. It is smaller (+35 lines vs +59) but 
fixes error messages for more cases:

int(b'123\0') -- bytes string with null without base.
int(b'123\xbd') -- non-utf-8 bytes string.
int('123\ud800') -- lone surrogate in unicode string.

Unfortunately it is not easy to backport it to 2.7. PyErr_Format() in 2.7 works 
only with null-terminated strings. I propose to fix this issue on 3.3+ and 
declare it as won't fix for 2.7.
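
The three cases above can be checked with a short script. This is only a sketch, assuming a Python 3 release that includes the fix; it records the exception type and message for each input rather than asserting any particular wording.

```python
# The three inputs listed above: NUL in bytes, non-UTF-8 bytes, lone surrogate.
cases = [b"123\0", b"123\xbd", "123\ud800"]

results = []
for value in cases:
    try:
        int(value)
    except ValueError as exc:
        # type(exc) distinguishes a plain ValueError from a UnicodeError subclass.
        results.append((type(exc).__name__, str(exc)))

for name, message in results:
    print(name, "-", message)
```

With the fix applied, each entry is expected to be a plain ValueError whose message shows the whole literal.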

--
nosy: +chris.jerdonek
versions:  -Python 3.2
Added file: http://bugs.python.org/file30141/int_from_str.patch




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-05-05 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
assignee:  -> serhiy.storchaka




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-05-05 Thread STINNER Victor

STINNER Victor added the comment:

int_from_str.patch:

+    strobj = PySequence_GetSlice(u, 0, 200);
+    if (strobj != NULL) {
+        PyErr_Format(PyExc_ValueError,
+                     "invalid literal for int() with base %d: %R",
+                     base, strobj);
+        Py_DECREF(strobj);
+    }

Oh, this reminds me that #7330 is still open.

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2013-04-19 Thread Martin Morrison

Changes by Martin Morrison m...@ensoft.co.uk:


--
nosy: +isoschiz




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-30 Thread Matthew Barnett

Matthew Barnett added the comment:

I've attached a small additional patch for truncating the UTF-8.

I don't know whether it's strictly necessary, but I don't know that it's 
unnecessary either! (Better safe than sorry.)

--
Added file: http://bugs.python.org/file28492/issue16741#2.patch




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-30 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
nosy: +serhiy.storchaka
versions: +Python 3.4




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-29 Thread Matthew Barnett

Matthew Barnett added the comment:

I've attached a patch.

It now reports an invalid literal as-is:

>>> int("#\N{ARABIC-INDIC DIGIT ONE}")
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    int("#\N{ARABIC-INDIC DIGIT ONE}")
ValueError: invalid literal for int() with base 10: '#١'
>>> int("foo\x00bar")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    int("foo\x00bar")
ValueError: invalid literal for int() with base 10: 'foo\x00bar'

There's a slight difference in that it truncates to 200 codepoints, not 200 
UTF-8 bytes.
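
To illustrate the difference between the two limits, here is a small sketch using a string of 2-byte characters. The 200 limit comes from the message above; the sample string is an assumption chosen for illustration.

```python
# 300 copies of ARABIC-INDIC DIGIT ONE; each one takes 2 bytes in UTF-8.
text = "\N{ARABIC-INDIC DIGIT ONE}" * 300

# Truncating by codepoints keeps 200 characters.
by_codepoints = text[:200]

# Truncating by UTF-8 bytes keeps 200 bytes, which is only 100 of these characters.
by_bytes = text.encode("utf-8")[:200].decode("utf-8")

print(len(by_codepoints), len(by_bytes))
```

For pure ASCII input the two limits coincide, so the difference only shows up with non-ASCII literals.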

--
keywords: +patch
Added file: http://bugs.python.org/file28487/issue16741.patch




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-29 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti, haypo
stage:  -> patch review




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-23 Thread Matthew Barnett

Matthew Barnett added the comment:

It occurred to me that the truncation of the string when building the error 
message could cause a UnicodeDecodeError:

>>> int("1".ljust(199) + "\u0100")
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    int("1".ljust(199) + "\u0100")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 199: unexpected end of data

This is because it's truncating a UTF-8 string, and the truncation is in the 
middle of a multi-byte sequence.
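
The same failure can be reproduced in pure Python, without going through int(). This is a sketch of the mechanism only, not the C code path itself.

```python
# 199 ASCII characters followed by one character that needs 2 bytes in UTF-8.
text = "1".ljust(199) + "\u0100"
encoded = text.encode("utf-8")  # 201 bytes in total

error = None
try:
    # Cutting at 200 bytes keeps the lead byte 0xc4 but drops its continuation byte.
    encoded[:200].decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

Cutting the str itself (text[:200]) before encoding avoids the problem, because a slice of codepoints never splits a character.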

--




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-21 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
versions:  -Python 2.6, Python 3.1




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-21 Thread Matthew Barnett

Matthew Barnett added the comment:

Python takes a long way round when converting strings to int. It does the 
following (I'll be talking about Python 3.3 here):

1. In function 'fix_decimal_and_space_to_ascii', the different kinds of spaces 
are converted to ' ' and the different kinds of digits are converted to their 
equivalents in the ASCII range;

2. The resulting string is converted to UTF-8;

3. The resulting string is passed to 'PyLong_FromString', which expects a 
null-terminated string.

4. If 'PyLong_FromString' is unable to parse the string as an int, it builds an 
error message using the string that was passed into it, which it does by 
converting that string _back_ into Unicode.

As a result of step 4, the string that's reported as the value in the error 
message is _not_ necessarily correct.

For example:

>>> int("\N{ARABIC-INDIC DIGIT ONE}")
1
>>> int("#\N{ARABIC-INDIC DIGIT ONE}")
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    int("#\N{ARABIC-INDIC DIGIT ONE}")
ValueError: invalid literal for int() with base 10: '#1'

And it also means a \x00 and anything after it will be omitted:

>>> int("foo\x00bar")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    int("foo\x00bar")
ValueError: invalid literal for int() with base 10: 'foo'

And in a final point, 'PyLong_FromString' limits the length of the value it 
reports in the error message, and the code that does it includes this line:

    slen = strlen(orig_str) < 200 ? strlen(orig_str) : 200;
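
The effect of strlen() on the reported value can be imitated in Python. This is a rough sketch, assuming the message is built from a NUL-terminated UTF-8 buffer as described above; the helper names c_string_view and old_style_message are hypothetical.

```python
def c_string_view(text):
    """Return the part of text that a NUL-terminated C API would see."""
    raw = text.encode("utf-8")
    # strlen() stops at the first NUL byte, so everything after it is invisible.
    visible = raw.split(b"\x00", 1)[0]
    return visible.decode("utf-8")


def old_style_message(text):
    """Imitate the pre-fix error message built from the C string."""
    return "invalid literal for int() with base 10: %r" % c_string_view(text)


print(old_style_message("foo\x00bar"))
print(old_style_message("\x00abc"))
```

Splitting at a NUL byte is safe in UTF-8 because 0x00 never occurs inside a multi-byte sequence.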

--
nosy: +mrabarnett




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-20 Thread ganges master

New submission from ganges master:

I'm not sure if it's a bug or just an inconvenience, but when a string 
containing \x00 is passed to int/float/etc, they return a misleading exception:

>>> int("abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'abc'
>>> int("\x00abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: ''
>>> float("\x00abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float:

I noticed the code does actually try to handle it:
http://hg.python.org/cpython/file/39803c20c9bf/Objects/intobject.c#l1066

but still, the reported error is very misleading.

--
components: Interpreter Core
messages: 177863
nosy: gangesmaster
priority: normal
severity: normal
status: open
title: `int()`, `float()`, etc think python strings are null-terminated
type: behavior
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3




[issue16741] `int()`, `float()`, etc think python strings are null-terminated

2012-12-20 Thread Benjamin Peterson

Changes by Benjamin Peterson benja...@python.org:


--
nosy: +mark.dickinson
