New submission from Marc Richter <marc.richter.1...@googlemail.com>:

There's a special letter in German orthography called "eszett" (ß). For 
hundreds of years this letter had no uppercase variant. In 2017, an uppercase 
variant called "capital eszett" (ẞ) was added to the official German 
orthography [1].

Python's .upper() string method still translates this to "SS" (which was 
correct before 2017):

~ $ python3.7.0
Python 3.7.0 (default, Aug 29 2018, 17:15:17) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 'gruß'.upper()
'GRUSS'
>>>

The result of this example should have been 'GRUẞ' instead.
That being said, it's fair to point out that this letter is still quite 
unpopular in Germany; it cannot even be typed on German keyboards yet. 
Anyway, since this has become official orthography, I think Python's job is 
to follow the clear rules rather than common usage.
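
Until the behavior changes, the requested result can be produced by hand. A
minimal sketch, assuming it is acceptable to swap in the capital letter before
uppercasing; the helper name upper_with_capital_eszett is hypothetical and not
part of Python:

```python
def upper_with_capital_eszett(text: str) -> str:
    """Uppercase text, mapping 'ß' (U+00DF) to 'ẞ' (U+1E9E) instead of 'SS'."""
    # Replace the small letter first; str.upper() leaves U+1E9E unchanged,
    # so the capital eszett survives the uppercasing step.
    return text.replace("\u00df", "\u1e9e").upper()


print(upper_with_capital_eszett("gruß"))
```

This only illustrates the desired output for the example above; it does not
change what str.upper() itself returns.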

I'm not sure whether this affects .casefold() as well, since I do not fully 
understand what that method is meant to do.
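
For reference, a short sketch of how the related string methods treat both
letters in Python 3. The variable names are only illustrative; the point is
that .casefold() maps both forms to 'ss' for caseless comparison, so it does
not produce 'ẞ' either:

```python
small = "\u00df"    # ß, LATIN SMALL LETTER SHARP S
capital = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S

results = {
    "small.upper()": small.upper(),
    "small.casefold()": small.casefold(),
    "capital.lower()": capital.lower(),
    "capital.casefold()": capital.casefold(),
}

for call, value in results.items():
    print(call, "->", repr(value))
```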

BR,
Marc Richter


[1]: https://en.wikipedia.org/wiki/Capital_%E1%BA%9E

----------
components: Interpreter Core
messages: 327336
nosy: Marc Richter
priority: normal
severity: normal
status: open
title: string method .upper() converts 'ß' to 'SS' instead of 'ẞ'
type: behavior
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34928>
_______________________________________