New submission from Géry Ogam <[email protected]>:
There seems to be an encoding bug in Python 3.6.5 on Windows with the [timezone
constant](https://docs.python.org/3/library/time.html#timezone-constants)
`time.tzname`:
>>> import time
>>> time.tzname
('Paris, Madrid', 'Paris, Madrid (heure d\x92été)')
In the second string (the name of the local *DST* timezone), the escape
sequence `\x92` appears in a *character* string, not in a byte string, so it
denotes the Unicode code point [U+0092 PRIVATE USE 2
(PU2)](https://en.wikipedia.org/wiki/List_of_Unicode_characters), a C1 control
character. The expected code point is [U+2019 RIGHT SINGLE QUOTATION
MARK](https://en.wikipedia.org/wiki/List_of_Unicode_characters), which would
have been displayed as `’` or `\u2019`, giving `'Paris, Madrid (heure
d’été)'`.
This `\x92` obviously comes from the 0x92 byte of the [CP-1252
encoding](https://en.wikipedia.org/wiki/Windows-1252) for the `’` character,
but in `time.tzname` the byte appears to have been mapped directly to the code
point with the same value instead of being decoded as CP-1252.
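
A minimal sketch of the suspected mix-up, assuming the name reaches Python as CP-1252 bytes and is then decoded as if it were Latin-1. The actual decoding path inside `time.tzname` is not confirmed here, and `raw`, `observed`, `expected` and `repair` are hypothetical names used only for illustration. The `repair` round trip is a possible workaround rather than a fix: it raises an error if the string contains characters above U+00FF or bytes that CP-1252 leaves undefined.

```python
# Hypothetical raw bytes for the DST name, as CP-1252 would encode it.
raw = "Paris, Madrid (heure d\u2019été)".encode("cp1252")

# Decoding as Latin-1 maps each byte to the code point with the same value,
# so the byte 0x92 becomes U+0092 and reproduces the reported string.
observed = raw.decode("latin-1")

# Decoding as CP-1252 yields the expected U+2019 instead.
expected = raw.decode("cp1252")


def repair(name: str) -> str:
    """Undo a Latin-1 misdecoding of CP-1252 bytes (illustrative workaround)."""
    return name.encode("latin-1").decode("cp1252")
```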
Indeed, quoting the [‘Lexical
analysis’](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)
chapter from the *Language Reference*:
> In a bytes literal, hexadecimal and octal escapes denote the byte with
> the given value. In a string literal, these escapes denote a Unicode
> character with the given value.
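
The distinction quoted above can be checked directly. This is a small illustration with hypothetical variable names, not part of the original report's session:

```python
byte_literal = b"\x92"  # one byte with the value 0x92
str_literal = "\x92"    # one character, the code point U+0092

# Both escapes carry the value 0x92, but they denote different things.
byte_value = byte_literal[0]
code_point = ord(str_literal)

# The character U+0092 is not the byte 0x92: in UTF-8 it takes two bytes.
utf8_bytes = str_literal.encode("utf-8")

# Only a CP-1252 decode turns the byte 0x92 into U+2019.
quote = byte_literal.decode("cp1252")
```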
----------
components: Library (Lib)
messages: 315181
nosy: maggyero
priority: normal
severity: normal
status: open
title: Encoding issue in the name of the local DST timezone
type: behavior
versions: Python 3.6
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue33259>
_______________________________________