[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-03-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: M.-A. Lemburg wrote: > Raymond Hettinger wrote: >> >> Raymond Hettinger added the comment: >> >>> If you agree, Raymond, I'll backport the patch. >> >> Yes. That will address Antoine's legitimate concern about making other >> backports harder, and it will

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-26 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Raymond Hettinger wrote: > > Raymond Hettinger added the comment: > >> If you agree, Raymond, I'll backport the patch. > > Yes. That will address Antoine's legitimate concern about making other > backports harder, and it will get all the Python's to us

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-26 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: On Fri, Feb 25, 2011 at 03:43:06PM +, Marc-Andre Lemburg wrote: > > Marc-Andre Lemburg added the comment: > > r88586: Normalized the encoding names for Latin-1 and UTF-8 to > 'latin-1' and 'utf-8' in the stdlib. Even though - or maybe exactly bec

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Fri, Feb 25, 2011 at 8:39 PM, Ezio Melotti wrote: .. > I would prefer to see the note added by Alexander in the doc mention *only* > the preferred spellings > (i.e. 'utf-8' and 'iso-8859-1') rather than all the variants that are > actually optimized

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: > If we ever decide to get rid of codec aliases in the core "If".

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Fri, Feb 25, 2011 at 8:29 PM, Antoine Pitrou wrote: .. >> For other spellings like "utf8" or "latin1", I wonder if it would be >> useful to emit a warning/suggestion to use the standard spelling. > > No, it would be a useless annoyance. If we ever de

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Ezio Melotti
Ezio Melotti added the comment: > For other spellings like "utf8" or "latin1", I wonder if it would be > useful to emit a warning/suggestion to use the standard spelling. I would prefer to see the note added by Alexander in the doc mention *only* the preferred spellings (i.e. 'utf-8' and 'iso

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread STINNER Victor
STINNER Victor added the comment: > For other spellings like "utf8" or "latin1", I wonder > if it would be useful to emit a warning/suggestion to use > the standard spelling. Why do you want to emit a warning? utf8 is now as fast as utf-8.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: > For other spellings like "utf8" or "latin1", I wonder if it would be > useful to emit a warning/suggestion to use the standard spelling. No, it would be a useless annoyance.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Éric Araujo
Éric Araujo added the comment: Such warnings about performance seem to me to be the domain of code analysis or lint tools, not the interpreter.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Raymond Hettinger
Raymond Hettinger added the comment: > If you agree, Raymond, I'll backport the patch. Yes. That will address Antoine's legitimate concern about making other backports harder, and it will get all the Python's to use the canonical spelling. For other spellings like "utf8" or "latin1", I wond

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Marc-Andre Lemburg wrote: > > Marc-Andre Lemburg added the comment: > > I guess you could regard the wrong encoding name use as bug - it > slows down several stdlib modules for no apparent reason. > > If you agree, Raymond, I'll backport the patch. We m

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Ezio Melotti
Ezio Melotti added the comment: +1 on the backport.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I guess you could regard the wrong encoding name use as bug - it slows down several stdlib modules for no apparent reason. If you agree, Raymond, I'll backport the patch. -- title: b'x'.decode('latin1') is much slower thanb'x'.decode('latin

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Antoine Pitrou
Antoine Pitrou added the comment: > What's wrong with Marc's commit? He's using the standard names. That's a pretty useless commit and it will make applying patches and backports more tedious, for no obvious benefit. Of course that concern will be removed if Marc-André also backports it to 3.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Closing the ticket again. The problem in question is solved. -- status: open -> closed

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > STINNER Victor added the comment: > >> r88586: Normalized the encoding names for Latin-1 and UTF-8 to >> 'latin-1' and 'utf-8' in the stdlib. > > Why did you do that? We are trying to find a solution together, and you > change

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Raymond Hettinger
Raymond Hettinger added the comment: What's wrong with Marc's commit? He's using the standard names. -- nosy: +rhettinger

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread STINNER Victor
STINNER Victor added the comment: > r88586: Normalized the encoding names for Latin-1 and UTF-8 to > 'latin-1' and 'utf-8' in the stdlib. Why did you do that? We are trying to find a solution together, and you change directly the code without any review. Your commit doesn't solve this issue.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Committed issue11303.diff and doc change in revision 88602. I think the remaining ideas are best addressed in issue11322. > Given that we are starting to have a whole set of such aliases > in the C code, I wonder whether it would be better to make the >

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Marc-Andre Lemburg wrote: > > I don't know who changed the encoding's package normalize_encoding() function > (wasn't me), but it's a really slow implementation. > > The original version used the .translate() method which is a lot faster. I guess that's

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I think we should reset this whole discussion and just go with Alexander's original patch issue11303.diff. I don't know who changed the encoding's package normalize_encoding() function (wasn't me), but it's a really slow implementation. The original vers

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: r88586: Normalized the encoding names for Latin-1 and UTF-8 to 'latin-1' and 'utf-8' in the stdlib.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-25 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: (Not issue related) Ezio and Alexander: after reading your posts and looking back on my code: you're absolutely right. Doing resize(31) is pointless: it doesn't save space (mempool serves [8],16,24,32 there; and: dynamic, normalized coded names don't

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
STINNER Victor added the comment: > more_aggressive_normalization.patch Woops, normalizestring() comment points to itself. normalize_encoding() might also point to the C implementations, at least in a "# comment".

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
STINNER Victor added the comment: >> That won't work, Victor, since it makes invalid encoding >> names valid, e.g. 'utf(=)-8'. > .. but this *is* valid: ... Ah yes, it's because of encodings.normalize_encoding(). It's funny: we have 3 functions to normalize an encoding name, and each function
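As a small illustration of the alias handling discussed in this thread, the sketch below uses the standard `codecs.lookup()` call to check that different spellings resolve to the same codec. The helper name `canonical_name` is made up for this example, and the exact canonical string is simply whatever the running interpreter reports.

```python
import codecs


def canonical_name(spelling):
    """Return the codec name the interpreter reports for a given spelling."""
    # codecs.lookup() goes through the full normalization and alias path
    # and returns a CodecInfo object whose .name is the canonical name.
    return codecs.lookup(spelling).name


if __name__ == "__main__":
    for spelling in ("latin-1", "latin1", "iso-8859-1", "utf-8", "utf8"):
        print(f"{spelling!r:14} -> {canonical_name(spelling)!r}")
```

All of these spellings are accepted; the ticket is only about how quickly the common ones are recognized, not about which ones are valid.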

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: Probably not, but that part should be changed if possible, because it is less efficient than the previous version that was allocating only 11 bytes. The problem here is that the previous version was only changing/removing chars, whereas this might add spaces too,

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: +char lower[strlen(encoding)*2]; Is this valid in C-89?

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: The attached patch is a proof of concept to see if Steffen's proposal might be viable. I wrote another normalize_encoding function that implements the algorithm described in msg129259, adjusted the shortcuts and did some timings. (Note: the function is not teste

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: That's ok by me. And 'happy hacker haypo' was not meant unfriendly, i've only repeated the first response i've ever posted back to this tracker (guess who was very fast at that time :)).

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Éric Araujo
Éric Araujo added the comment: Agreed with Marc-André. It seems too magic and error-prone to do anything else than stripping hyphens and spaces. Steffen: This is a rather minor change in an area that is well known by several developers, so don’t take it personally that Victor went ahead and

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: > That won't work, Victor, since it makes invalid encoding > names valid, e.g. 'utf(=)-8'. That already works in Python (thanks to encodings.normalize_encoding). The problem with the patch is that it makes names like 'iso88591' valid. Normalize to 'iso 8859 1' sh

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: >>> 'abc'.encode('utf(=)-8') b'abc'

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Thu, Feb 24, 2011 at 11:39 AM, Marc-Andre Lemburg wrote: > > Marc-Andre Lemburg added the comment: .. > That won't work, Victor, since it makes invalid encoding > names valid, e.g. 'utf(=)-8'. > .. but this *is* valid: b'abc' -- title: b'x'

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg > wrote: > .. >> I think rather than removing any hyphens, spaces, etc. the >> function should additionally: >> >> * add hyp

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > STINNER Victor added the comment: > > Ooops, I attached the wrong patch. Here is the new fixed patch. That won't work, Victor, since it makes invalid encoding names valid, e.g. 'utf(=)-8'. We really only want to add the functio

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: So happy hacker haypo did it, different however. It's illegal, but since this is a static function which only serves some specific internal strcmp(3)s it may do for the mentioned charsets. I won't boot my laptop this evening.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Thu, Feb 24, 2011 at 11:31 AM, Marc-Andre Lemburg wrote: .. > I think rather than removing any hyphens, spaces, etc. the > function should additionally: > >  * add hyphens whenever (they are missing and) there's switch >   from [a-z] to [0-9] > This w

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
STINNER Victor added the comment: Ooops, I attached the wrong patch. Here is the new fixed patch. Without the patch: >>> import timeit >>> timeit.Timer("'a'.encode('latin1')").timeit() 3.8540711402893066 >>> timeit.Timer("'a'.encode('latin-1')").timeit() 1.4946870803833008 With the patch: >>

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg > wrote: > .. >> On this ticker, we're discussing just one application area: that >> of the builtin short cuts. >> > Fair eno

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > STINNER Victor added the comment: > > I think that the normalization function in unicodeobject.c (only used for > internal functions) can skip any character different than a-z, A-Z and 0-9. > Something like: > import re >

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: As promised, here's the list of places where the wrong Latin-1 encoding spelling is used: Lib//test/test_cmd_line.py: -- for encoding in ('ascii', 'latin1', 'utf8'): Lib//test/test_codecs.py: -- ef = codecs.EncodedFile(f, 'utf-8', 'latin1')

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg wrote: .. > On this ticker, we're discussing just one application area: that > of the builtin short cuts. > Fair enough. I was hoping to close this ticket by simply committing the posted patch, but it

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
Changes by STINNER Victor : Removed file: http://bugs.python.org/file20875/aggressive_normalization.patch

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: That will also accept invalid names like 'iso88591' that are not valid now; 'iso 8859 1' is already accepted.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
STINNER Victor added the comment: Patch implementing my suggestion. -- Added file: http://bugs.python.org/file20875/aggressive_normalization.patch

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
STINNER Victor added the comment: I think that the normalization function in unicodeobject.c (only used for internal functions) can skip any character different than a-z, A-Z and 0-9. Something like: >>> import re >>> def normalize(name): return re.sub("[^a-z0-9]", "", name.lower()) ... >>>
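To make the effect of this suggestion concrete, here is a runnable copy of the one-line `normalize()` from the comment, followed by a hypothetical helper (`show_collisions`, not part of any patch) that groups spellings by their normalized form. It is only a sketch of the behaviour being debated, not the C code in unicodeobject.c.

```python
import re
from collections import defaultdict


def normalize(name):
    # Lowercase, then drop every character that is not a-z or 0-9.
    return re.sub("[^a-z0-9]", "", name.lower())


def show_collisions(spellings):
    """Group the given spellings by normalized form (illustrative helper)."""
    groups = defaultdict(list)
    for spelling in spellings:
        groups[normalize(spelling)].append(spelling)
    return dict(groups)


if __name__ == "__main__":
    names = ["Latin-1", "latin1", "utf(=)-8", "UTF8", "iso 8859 1", "iso88591"]
    for key, members in show_collisions(names).items():
        print(f"{key!r}: {members}")
```

Because all punctuation is discarded, 'iso 8859 1' and 'iso88591' end up under the same key, which is the concern raised elsewhere in the thread about making new spellings valid.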

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: If the first normalization function is flexible enough to match most of the spellings of the optimized encodings, they will all benefit of the optimization without having to go through the long path. (If the normalized encoding name is then passed through, the

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: So, well, a-ha, i will boot my laptop this evening and (try to) write a patch for normalize_encoding(), which will match the standard-conforming LATIN1 and also will continue to support the illegal latin-1 without actually changing the two users PyUni

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Steffen Daode Nurpmeso wrote: > > Steffen Daode Nurpmeso added the comment: > > .. i don't have actually invented this algorithm (but don't ask me where i > got the idea from years ago), i've just implemented the function you see. > The algorithm itsel

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > On Thu, Feb 24, 2011 at 10:30 AM, Ezio Melotti wrote: > .. >> See also discussion on #5902. > > Mark has closed #5902 and indeed the discussion of how to efficiently > normalize

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: P.P.S.: separating alphanumerics is a win for things like, e.g. UTF-16BE: it gets 'utf 16 be' - think about the possible misspellings here and you see this algorithm is a good thing
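The separation described here can be sketched in a few lines of Python. This is not Steffen's C function (which is not shown in this listing); it is an assumed reading of the description: lowercase the name, split it into runs of letters and runs of digits, drop everything else, and join the runs with single spaces. The name `normalize_split` is invented for the sketch.

```python
import re


def normalize_split(name):
    """Lowercase, split into letter runs and digit runs, join with spaces."""
    # Each match is a maximal run of letters or a maximal run of digits;
    # punctuation and whitespace between runs are simply skipped.
    tokens = re.findall(r"[a-z]+|[0-9]+", name.lower())
    return " ".join(tokens)


if __name__ == "__main__":
    for spelling in ("UTF-16BE", "utf16be", "latin1", "Latin-1",
                     "iso-8859-1", "iso88591"):
        print(f"{spelling!r:14} -> {normalize_split(spelling)!r}")
```

Under this reading 'latin1' and 'Latin-1' both become 'latin 1', while 'iso88591' becomes 'iso 88591' and so does not collide with 'iso 8859 1'.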

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: (Everything else is beyond my scope. But normalizing _ to - is possibly a bad idea as far as i can remember the situation three years ago.)

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: .. i don't have actually invented this algorithm (but don't ask me where i got the idea from years ago), i've just implemented the function you see. The algorithm itself avoids some pitfalls in respect to combining numerics and significantly reduces

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: On Thu, Feb 24, 2011 at 10:30 AM, Ezio Melotti wrote: .. > See also discussion on #5902. Mark has closed #5902 and indeed the discussion of how to efficiently normalize encoding names (without changing what is accepted) is beyond the scope of that or the

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread STINNER Victor
Changes by STINNER Victor : -- nosy: +haypo

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Ezio Melotti
Ezio Melotti added the comment: See also discussion on #5902. Steffen, your normalization function looks similar to encodings.normalize_encoding, with just a few differences (it uses spaces instead of dashes, it divides alpha chars from digits). If it doesn't slow down the normal cases (i.e.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: (That is to say, i would do it. But not if _cpython is thrown to trash ,-); i.e. not if there is not a slight chance that it gets actually patched in because this performance issue probably doesn't mean a thing in real life. You know, i'm a slow pro

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Steffen Daode Nurpmeso
Steffen Daode Nurpmeso added the comment: I wonder what this normalize_encoding() does! Here is a pretty standard version of mine which is a bit more expensive but catches many more cases! This is stripped, of course, and can be rewritten very easily to Python's needs (i.e. using char[32]

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > In issue11303.diff, I add similar optimization for encode('latin1') and for > 'utf8' variant of utf-8. I don't think dash-less variants of utf-16 and > utf-32 are common enough

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Jesús Cea Avión
Changes by Jesús Cea Avión : -- nosy: +jcea

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Éric Araujo
Éric Araujo added the comment: +1 for the patch.

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Ezio Melotti
Changes by Ezio Melotti : -- nosy: +ezio.melotti

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: In issue11303.diff, I add similar optimization for encode('latin1') and for 'utf8' variant of utf-8. I don't think dash-less variants of utf-16 and utf-32 are common enough to justify special-casing. -- Added file: http://bugs.python.org/file20

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Éric Araujo
Changes by Éric Araujo : -- nosy: +eric.araujo versions: +Python 3.3

[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

2011-02-23 Thread Alexander Belopolsky
New submission from Alexander Belopolsky : $ ./python.exe -m timeit "b'x'.decode('latin1')" 10 loops, best of 3: 2.57 usec per loop $ ./python.exe -m timeit "b'x'.decode('latin-1')" 100 loops, best of 3: 0.336 usec per loop The reason for this behavior is that 'latin-1' is short-circuite
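The measurement in this report can be repeated from Python with the standard `timeit` module. The sketch below is one way to do it; the helper names `time_decode` and `compare` and the default loop count are choices made for this example, and on an interpreter that already includes the fix the spellings should time about the same.

```python
import timeit


def time_decode(spelling, number=100_000):
    """Total seconds for `number` runs of b'x'.decode(<spelling>)."""
    stmt = f"b'x'.decode({spelling!r})"
    return timeit.timeit(stmt, number=number)


def compare(spellings=("latin1", "latin-1", "utf8", "utf-8"), number=100_000):
    """Time each spelling and return a {spelling: seconds} mapping."""
    return {s: time_decode(s, number) for s in spellings}


if __name__ == "__main__":
    for spelling, seconds in compare().items():
        print(f"{spelling:8} {seconds:.4f} s")
```

No expected timings are given here because they depend on the interpreter version and the machine.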