New submission from Géry Ogam <gery.o...@gmail.com>:

There seems to be an encoding bug in Python 3.6.5 on Windows with the [timezone 
constant](https://docs.python.org/3/library/time.html#timezone-constants) 
`time.tzname`:

    >>> import time
    >>> time.tzname
    ('Paris, Madrid', 'Paris, Madrid (heure d\x92été)')

In the second string (the name of the local *DST* timezone), the escape 
sequence `\x92` appears in a *character* string, not in a byte string, so it 
denotes the Unicode code point [U+0092 PRIVATE USE TWO 
(PU2)](https://en.wikipedia.org/wiki/List_of_Unicode_characters), a C1 control 
character. The expected code point is [U+2019 RIGHT SINGLE QUOTATION 
MARK](https://en.wikipedia.org/wiki/List_of_Unicode_characters), which would 
be displayed as `’` or `\u2019`, giving `'Paris, Madrid (heure d’été)'`.

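This `\x92` evidently comes from the byte 0x92, which encodes the `’` 
character in the [CP-1252 encoding](https://en.wikipedia.org/wiki/Windows-1252), 
but the byte has been mishandled somewhere in `time.tzname`: it appears to 
have been mapped to the code point of the same value instead of being decoded 
as CP-1252.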
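
A minimal sketch of the suspected mismatch, assuming the system ANSI code page is CP-1252 and that each raw byte ended up as the code point of the same value (which is what a Latin-1 decode does). The helper `repair_tzname` is hypothetical, written only for illustration; it is not part of the `time` module and is not the fix adopted in CPython:

```python
# Bytes Windows would report for the DST name under CP-1252
# (an assumption made for this illustration).
raw = "Paris, Madrid (heure d’été)".encode("cp1252")

# Mapping each byte to the code point of the same value (Latin-1)
# reproduces the string shown in the report.
broken = raw.decode("latin-1")


def repair_tzname(name, encoding="cp1252"):
    """Hypothetical workaround: recover the raw bytes, then decode them properly."""
    try:
        return name.encode("latin-1").decode(encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The string was not mangled in this way; return it unchanged.
        return name


fixed = repair_tzname(broken)
```

Under these assumptions, `broken` equals the reported value and `fixed` contains U+2019. A string that is already correct passes through unchanged, because `’` cannot be encoded as Latin-1.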

Indeed, quoting the [‘Lexical analysis’](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals) 
chapter from the *Language Reference*:

> In a bytes literal, hexadecimal and octal escapes denote the byte with
> the given value. In a string literal, these escapes denote a Unicode
> character with the given value.
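
The distinction in the quoted passage can be checked directly. This is a short sketch using only the standard library; the variable names are illustrative:

```python
import unicodedata

text = '\x92'   # str literal: one Unicode character, code point U+0092
data = b'\x92'  # bytes literal: one byte with the value 0x92

code_point = ord(text)                  # integer value of the character
category = unicodedata.category(text)   # general category of U+0092
decoded = data.decode('cp1252')         # what byte 0x92 means in CP-1252
decoded_name = unicodedata.name(decoded)
```

Here `category` is `'Cc'` (a control character), whereas `decoded_name` is `'RIGHT SINGLE QUOTATION MARK'`.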

----------
components: Library (Lib)
messages: 315181
nosy: maggyero
priority: normal
severity: normal
status: open
title: Encoding issue in the name of the local DST timezone
type: behavior
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33259>
_______________________________________