Roundup Robot added the comment:
New changeset 33a8ef498b1e by Serhiy Storchaka in branch '2.7':
Issue #14850: Now a charmap decoder treats U+FFFE as undefined mapping
http://hg.python.org/cpython/rev/33a8ef498b1e
New changeset 13cd78a2a17b by Serhiy Storchaka in branch '3.2':
Issue #14850: Now
Serhiy Storchaka added the comment:
Fixed. Thank you for your answers, Martin.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14850
___
___
Changes by Serhiy Storchaka storch...@gmail.com:
--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Serhiy Storchaka added the comment:
If no one objects I will commit this next year.
--
Changes by Serhiy Storchaka storch...@gmail.com:
--
assignee:  -> serhiy.storchaka
Changes by Serhiy Storchaka storch...@gmail.com:
--
stage:  -> patch review
Serhiy Storchaka added the comment:
Does anyone have objections against the idea or the implementation of the
patch? Please review.
--
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: +haypo
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file25934/decode_charmap_fffe.patch
Serhiy Storchaka added the comment:
Patch updated to resolve conflict with issue15379. Added tests. Added patches
for 3.2 and 2.7.
--
Added file: http://bugs.python.org/file27387/decode_charmap_fffe-3.3.patch
Added file: http://bugs.python.org/file27388/decode_charmap_fffe-3.2.patch
Changes by Serhiy Storchaka storch...@gmail.com:
--
components: +Unicode
keywords: +needs review
versions: +Python 3.4
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: +ezio.melotti
Martin v. Löwis mar...@v.loewis.de added the comment:
U+FFFE is documented as representing an undefined mapping, see
http://docs.python.org/dev/c-api/unicode.html?highlight=charmap#PyUnicode_DecodeCharmap
So the base string case is correct; the derived string implementation also
needs to
Serhiy Storchaka storch...@gmail.com added the comment:
> What is the use case for passing a string subclass to charmap_decode? Or in
> other words, how did you stumble upon the bug?
I stumbled upon it while rewriting the charmap decoder (issue14874). Now
charmap decoder processes the two cases -- a
Serhiy Storchaka storch...@gmail.com added the comment:
> U+FFFE is documented as representing an undefined mapping,
Yes, using U+FFFE for representing an undefined mapping in strings is
normal, the question was about string subclasses. And if we correct
it for string subclasses, how far we
Martin v. Löwis mar...@v.loewis.de added the comment:
>> U+FFFE is documented as representing an undefined mapping,
> Yes, using U+FFFE for representing an undefined mapping in strings is
> normal, the question was about string subclasses.
What is the question? U+FFFE also represents an undefined
Serhiy Storchaka storch...@gmail.com added the comment:
> What is the question? U+FFFE also represents an undefined mapping in
> string subclasses.
What about classes that do not subclass string but duck-type string by
implementing all string methods? What about list/tuple/array.array of
integers
Martin v. Löwis mar...@v.loewis.de added the comment:
> integers or 1-character strings? What about general mapping? Should
> any of them have 0xFFFE or '\uFFFE' represent an undefined mapping?
The documentation says that the parameter can be a dictionary mapping
byte or a unicode string, which
Serhiy Storchaka storch...@gmail.com added the comment:
> So the answer to your last question is yes. I hope that the answer to
> your other questions follows from that
Thank you, this is the answer to all my questions. I've prepared a patch
to treat U+FFFE in general mapping as “undefined
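The behaviour agreed on in this exchange can be illustrated with a minimal sketch. It assumes an interpreter that already includes the fix for this issue. The helper name decode_one is hypothetical and not part of the patch. Each mapping below gives byte 0x00 a value that should count as "undefined", so the 'replace' error handler is expected to substitute U+FFFD.

```python
import codecs


def decode_one(mapping, errors='replace'):
    """Decode the single byte 0x00 through the given charmap (hypothetical helper)."""
    return codecs.charmap_decode(b'\x00', errors, mapping)


# General (dict) mappings in which byte 0 maps to an "undefined" marker.
results = {
    "str '\\ufffe'": decode_one({0: '\ufffe'}),
    "int 0xFFFE": decode_one({0: 0xFFFE}),
    "None": decode_one({0: None}),
}

for label, value in results.items():
    print(label, ascii(value))

# With errors='strict' an undefined mapping raises instead of substituting.
try:
    decode_one({0: 0xFFFE}, errors='strict')
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True
print("strict raised:", strict_raised)
```

On an interpreter without the fix, the first two entries may decode to '\ufffe' instead.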
Éric Araujo mer...@netwok.org added the comment:
What is the use case for passing a string subclass to charmap_decode? Or in
other words, how did you stumble upon the bug?
--
nosy: +eric.araujo
New submission from Serhiy Storchaka storch...@gmail.com:
codecs.charmap_decode behaves differently when the decode table is a native
string than when it is an instance of a user-defined string subclass.
>>> import codecs
>>> print(ascii(codecs.charmap_decode(b'\x00', 'replace', '\uFFFE')))
('\ufffd', 1)
>>> class S(str): pass
...
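The session above is cut off in this archive. The comparison it sets up can be sketched as a standalone script. This is a reconstruction under stated assumptions, not the reporter's original code: the class S mirrors the one in the report, while the variable names table, native and subclassed are hypothetical.

```python
import codecs


class S(str):
    """str subclass with no added behaviour, as in the report."""


table = '\ufffe'  # decode table whose only entry is the "undefined" marker

# Same bytes, same error handler; only the type of the table differs.
native = codecs.charmap_decode(b'\x00', 'replace', table)
subclassed = codecs.charmap_decode(b'\x00', 'replace', S(table))

print(ascii(native))
print(ascii(subclassed))
print("consistent:", native == subclassed)
```

On an interpreter that includes the fix, the two results should be equal. On an affected version the subclass case is where the difference would show up.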
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: +loewis
Changes by Terry J. Reedy tjre...@udel.edu:
--
nosy: +doerwalter, lemburg