[issue28927] bytes.fromhex should ignore all whitespace

2017-03-31 Thread Donald Stufft
Changes by Donald Stufft: -- pull_requests: +994

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Terry J. Reedy
Terry J. Reedy added the comment: I think non-ASCII whitespace and digits are YAGNI until we are convinced otherwise by evidence from the field that people are routinely mixing other decimal digits with 'abcdef' as hex numerals. Anyone who does try such a thing can write a wrapper that first
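
For illustration only, here is a minimal sketch of the kind of wrapper the comment alludes to. The archived message is cut off, so this is an assumption about one possible approach, not what the comment went on to describe. The name `fromhex_unicode` is hypothetical. It drops any Unicode whitespace, maps non-ASCII decimal digits to their ASCII equivalents, and then defers to the built-in `bytes.fromhex()`.

```python
import unicodedata


def fromhex_unicode(text):
    """Hypothetical wrapper: normalize the input, then call bytes.fromhex()."""
    cleaned = []
    for ch in text:
        if ch.isspace():
            # Drop every character that str.isspace() treats as whitespace,
            # including non-ASCII ones such as U+00A0.
            continue
        if ch.isdecimal():
            # Map any Unicode decimal digit to the matching ASCII digit.
            ch = str(unicodedata.decimal(ch))
        cleaned.append(ch)
    return bytes.fromhex("".join(cleaned))


# ARABIC-INDIC DIGIT ONE and TWO, a no-break space, then ASCII hex letters.
result = fromhex_unicode("\u0661\u0662\xa0ab")
print(result)
```

Keeping this normalization outside `bytes.fromhex()` leaves the built-in parser ASCII-only, which matches the position taken in the thread.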

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you for your contribution, Robert. -- resolution: -> fixed stage: commit review -> resolved status: open -> closed

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Roundup Robot
Roundup Robot added the comment: New changeset fcc09d9ee7d4 by Serhiy Storchaka in branch 'default': Issue #28927: bytes.fromhex() and bytearray.fromhex() now ignore all ASCII https://hg.python.org/cpython/rev/fcc09d9ee7d4 -- nosy: +python-dev

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: LGTM. -- assignee: -> serhiy.storchaka components: +Interpreter Core stage: patch review -> commit review

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Robert Xiao
Robert Xiao added the comment: New patch with proper line lengths in documentation. -- Added file: http://bugs.python.org/file45967/fromhex.patch

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-19 Thread Robert Xiao
Robert Xiao added the comment: OK, I've attached a new version of the patch with the requested documentation changes (versionchanged and whatsnew). -- Added file: http://bugs.python.org/file45966/fromhex.patch

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-18 Thread Nick Coghlan
Nick Coghlan added the comment: +1 to the above Unicode whitespace discussion - the "ASCII space -> ASCII whitespace" change is relatively straightforward to implement, and clearly beneficial given Robert's point regarding the popularity of multi-line terminal-oriented hexdump formats.

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: No, it is not easy to implement. Currently bytes.fromhex() works only with ASCII strings and can just iterate over a char*. If we add support for non-ASCII whitespace, we should also add support for non-ASCII digits, as in float(). This looks excessive. In case if

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-18 Thread Martin Panter
Martin Panter added the comment: As far as I know, non-ASCII newlines and whitespace are not supported in Python source code, so there is not a big need to support it in bytes.fromhex() either. But since bytes.fromhex() accepts Unicode strings, I think non-ASCII whitespace would be okay if it

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Is it worth also ignoring non-ASCII whitespace, for compatibility with string-to-number conversions and the base64 decoder?
>>> float('\xa01.0')
1.0
>>> base64.decodebytes(b'\xa0YWJj')
b'abc'
Note that not all spaces are ignored. They shouldn't break

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-17 Thread Nick Coghlan
Nick Coghlan added the comment: Ah, the parallel with base64 decoding and embedding encoded data in multi-line string literals is indeed a compelling one - I'd missed that. Given that rationale, +1 from me. Perhaps it would make sense to call that out directly in the documentation? Something
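
A minimal sketch of the use case described here, assuming Python 3.7 or later, where `bytes.fromhex()` skips all ASCII whitespace. The constant name `HEX_DUMP` and its contents are made up for illustration.

```python
# Assumes Python 3.7+; earlier versions raise ValueError on the newlines.
HEX_DUMP = """
    ca fe f0 0d
    de ad be ef
"""

# Newlines and indentation inside the literal are skipped along with spaces.
data = bytes.fromhex(HEX_DUMP)
print(data.hex())
```

This mirrors how base64 payloads are commonly embedded in multi-line string literals and decoded without stripping line breaks first.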

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-17 Thread Robert Xiao
Robert Xiao added the comment: I see your point, Nick. Can I offer a counterpoint? Most of the string parsers operate only on relatively short inputs, like numbers. Numbers in particular are rarely written with inner spaces, so it makes sense not to ignore internal whitespaces. On the other

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-17 Thread Nick Coghlan
Nick Coghlan added the comment: My recollection is that fromhex() ignores spaces to account for particularly common ways of formatting hex numbers as space separated groups:
"CAFE F00D"
"CAFEF00D CAFEF00D CAFEF00D"
"CA FE F0 0D"
etc.
Those show up even in structured hexadecimal

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-16 Thread Robert Xiao
Robert Xiao added the comment: Sorry, I should have clarified that these methods consider *ASCII whitespace* equivalent - just like my proposed patch.

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-16 Thread Robert Xiao
Robert Xiao added the comment: Terry, can you elaborate what you mean by a tradeoff? I feel like such a patch makes .fromhex more consistent with other string methods like .split() and .strip() which implicitly consider all whitespace equivalent. Martin, I've updated the patch to include

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-16 Thread Terry J. Reedy
Terry J. Reedy added the comment: I am a bit dubious about this. There is a tradeoff here between convenience and bug detection. The patch is not strictly necessary.
>>> bytes.fromhex('ab\ncd'.replace('\n', ''))
b'\xab\xcd'
Bytes (and bytearray) .fromhex already ignores spaces. Not
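
The workaround above handles a single character. A sketch of a more general version that works on any Python 3 release follows. The helper name `fromhex_any_whitespace` is hypothetical. Note that `str.split()` with no arguments also splits on non-ASCII whitespace, so this helper is slightly more permissive than the proposed patch.

```python
def fromhex_any_whitespace(text):
    """Hypothetical helper: remove all whitespace, then parse the hex digits."""
    # str.split() with no arguments splits on runs of any whitespace,
    # so joining the pieces removes tabs, newlines and spaces alike.
    return bytes.fromhex("".join(text.split()))


parsed = fromhex_any_whitespace("ab\ncd\t ef\r\n")
print(parsed)
```

The tradeoff mentioned in the comment still applies: stripping whitespace up front means a stray newline in the input is no longer reported as an error.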

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-09 Thread Martin Panter
Martin Panter added the comment: Seems a reasonable feature. The documentation would also need updating. Which specific (whitespace) characters do you propose to ignore? Just ASCII ones, as in bytes.isspace(), or others like b"\xA0" (non-breaking space) and U+2028 (line separator), as in

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-09 Thread Robert Xiao
Robert Xiao added the comment: I used Py_ISSPACE, which uses the .strip() default charset - I think this is a reasonable choice. We don't have to go crazy and support all the Unicode spaces.
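
To make the character set concrete, the sketch below lists the ASCII code points that `bytes.isspace()` reports as whitespace. It assumes that this Python-level check reflects the same ASCII set that Py_ISSPACE tests in C. The variable name is made up for illustration.

```python
# Collect every ASCII code point that bytes.isspace() treats as whitespace.
ascii_whitespace = [chr(c) for c in range(128) if bytes([c]).isspace()]
print(ascii_whitespace)

# The non-ASCII characters raised in the question are whitespace for str,
# but the byte 0xA0 is not whitespace for bytes.
print(b"\xa0".isspace(), "\xa0".isspace(), "\u2028".isspace())
```

The contrast in the second print call shows the gap between the ASCII-only set and the Unicode-aware `str` methods that the question asks about.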

[issue28927] bytes.fromhex should ignore all whitespace

2016-12-09 Thread Robert Xiao
New submission from Robert Xiao: bytes.fromhex ignores space characters now (yay!) but still barfs if fed newlines or tabs:
>>> bytes.fromhex('ab\ncd')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: non-hexadecimal number found in fromhex() arg at position 2
>>>