Frédéric Grosshans-André <[email protected]> added the comment:
@Gregory P. Smith
unicodedata.numeric, in the standard library, already handles non-ASCII
fractions in many scripts. The current “problem” is that it outputs a float
(even for integers):
>>> unicodedata.numeric('⅔')
0.6666666666666666
The UnicodeData.txt file from the Unicode standard, which is where it takes its
data from, nevertheless contains the corresponding “ASCII fractions”. For
example, below are the lines of this file for two (very) different ways of
encoding two thirds:
2154;VULGAR FRACTION TWO THIRDS;No;0;ON;<fraction> 0032 2044 0033;;;2/3;N;FRACTION TWO THIRDS;;;;
1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;
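As a minimal sketch of how the exact value could be read from such records,
assuming the usual semicolon-separated layout in which the numeric value is the
ninth field (index 8). The helper name parse_numeric_field is hypothetical and
not part of unicodedata:

```python
from fractions import Fraction

# The two UnicodeData.txt records quoted above, each as a single line.
SAMPLE_RECORDS = [
    "2154;VULGAR FRACTION TWO THIRDS;No;0;ON;<fraction> 0032 2044 0033;;;2/3;N;FRACTION TWO THIRDS;;;;",
    "1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;",
]


def parse_numeric_field(record):
    """Return (character, exact value) for one UnicodeData.txt record.

    Hypothetical helper: the value is None when the numeric field is empty.
    """
    fields = record.split(";")
    character = chr(int(fields[0], 16))  # field 0: code point in hex
    numeric = fields[8]                  # field 8: "2/3", "5", or ""
    return character, (Fraction(numeric) if numeric else None)


for record in SAMPLE_RECORDS:
    character, value = parse_numeric_field(record)
    print(f"U+{ord(character):04X} -> {value}")
```

Both records yield Fraction(2, 3), with no float rounding involved.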
Adding an exact value extraction to unicodedata should be doable, either via a
new function or an extra keyword to the unicodedata.numeric function.
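Until something like that exists, a rough workaround with today's standard
library might look like the sketch below. It is not the proposed API: the name
exact_numeric and the max_denominator bound of 1000 are assumptions made for
illustration, and the rounding step only recovers the right value when the true
denominator is below that bound.

```python
import unicodedata
from fractions import Fraction


def exact_numeric(character, max_denominator=1000):
    """Approximate an exact value from the float that unicodedata.numeric returns.

    Hypothetical workaround: the float is converted to a Fraction and snapped
    to the closest fraction whose denominator is at most max_denominator.
    """
    as_float = unicodedata.numeric(character)  # raises ValueError if not numeric
    return Fraction(as_float).limit_denominator(max_denominator)


print(exact_numeric("⅔"))
print(exact_numeric("5"))
```

Reading the value directly from the Unicode data would avoid the guesswork in
the bound.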
The only information that would be lost (but which is unavailable now anyway)
would be for the few codepoints which encode reducible fractions. As of Unicode
13.0, these codepoints are:
* ↉ U+2189 VULGAR FRACTION ZERO THIRDS
* 𐧷 U+109F7 MEROITIC CURSIVE FRACTION TWO TWELFTHS
* 𐧸 U+109F8 MEROITIC CURSIVE FRACTION THREE TWELFTHS
* 𐧹 U+109F9 MEROITIC CURSIVE FRACTION FOUR TWELFTHS
* 𐧻 U+109FB MEROITIC CURSIVE FRACTION SIX TWELFTHS
* 𐧽 U+109FD MEROITIC CURSIVE FRACTION EIGHT TWELFTHS
* 𐧾 U+109FE MEROITIC CURSIVE FRACTION NINE TWELFTHS
* 𐧿 U+109FF MEROITIC CURSIVE FRACTION TEN TWELFTHS
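To illustrate what "lost" means here, the sketch below contrasts an unreduced
numerator/denominator pair with fractions.Fraction, which always normalizes to
lowest terms. The input string "2/12" is an assumed field value chosen to match
the name TWO TWELFTHS, and split_fraction is a hypothetical helper:

```python
from fractions import Fraction


def split_fraction(raw):
    """Return (numerator, denominator) exactly as written, without reducing.

    Hypothetical helper: a plain integer such as "5" is returned as (5, 1).
    """
    numerator, _, denominator = raw.partition("/")
    return int(numerator), int(denominator) if denominator else 1


raw = "2/12"  # assumed raw value for a "TWO TWELFTHS" character
print(split_fraction(raw))  # unreduced pair, keeps the twelfths
print(Fraction(raw))        # normalized by Fraction to lowest terms
```

An API returning a Fraction would report one sixth for such a character, while
one returning the pair would preserve the encoded denominator.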
----------
nosy: +frederic.grosshans
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue43520>
_______________________________________