https://bugs.documentfoundation.org/show_bug.cgi?id=169103

--- Comment #2 from Piotr Osada ---
Additionally, Unicode includes superscript and subscript letters (not just
digits) that are commonly used in scientific notation, phonetic transcription
(IPA), and mathematical expressions. These letters should also be covered by
character normalization.

MOTIVATION:
Scientific documents often mix formatting styles from different sources (manual
typing, AI-generated content, copy-paste from different software), making it
difficult to search for the same chemical formula or physical quantity written
in different ways.

BASIC EXAMPLES:
- Searching for "xn" should match: xⁿ, xₙ
- Searching for "H2O" should match: H₂O
- Searching for "CO2" should match: CO₂, CO²
- Searching for "Ca2+" should match: Ca²⁺
- Searching for "Cp" should match: Cₚ
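
A minimal sketch of the matching requested above, in Python. It assumes that
Unicode compatibility normalization (NFKC) is an acceptable folding step. It is
an illustration only, not a description of how LibreOffice search works, and
the helper names fold and matches are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Fold compatibility characters (superscripts, subscripts) to base forms."""
    return unicodedata.normalize("NFKC", text)


def matches(query: str, document: str) -> bool:
    """Return True if the folded query occurs in the folded document."""
    return fold(query) in fold(document)


if __name__ == "__main__":
    # The query/variant pairs from the BASIC EXAMPLES list.
    examples = [("xn", "xⁿ"), ("xn", "xₙ"), ("H2O", "H₂O"),
                ("CO2", "CO₂"), ("CO2", "CO²"),
                ("Ca2+", "Ca²⁺"), ("Cp", "Cₚ")]
    for query, variant in examples:
        print(query, variant, matches(query, variant))
```

Note that NFKC also folds unrelated compatibility characters (ligatures,
full-width forms, and so on), so a real implementation might prefer a narrower
mapping limited to the <super> and <sub> decompositions.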


USE CASES IN SCIENTIFIC DOCUMENTS:

PHYSICS
Mechanics:
aₜ, aₙ — tangential and normal acceleration
σₓₓ, τₓᵧ — stress tensor components

Thermodynamics:
Cᵥ, Cₚ — heat capacity at constant volume/pressure

CHEMISTRY
Ca²⁺ — calcium cation
[H₃O⁺] — hydronium ion

MATERIALS SCIENCE
Crystal Structure:
t-ZrO₂ — tetragonal zirconia
α-Fe, γ-Fe — iron crystal phases

Material stoichiometry:
ZrO₂₋ₓ — zirconium oxide (non-stoichiometric compound)
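
One caveat for formulas such as the one above, sketched in Python: the
subscript and superscript minus signs (U+208B, U+207B) decompose to MINUS SIGN
(U+2212), not to the ASCII hyphen-minus, so compatibility normalization alone
would not let a typed "ZrO2-x" match. The extra mapping and the name
fold_formula below are assumptions made for illustration, not part of any
existing implementation.

```python
import unicodedata

# Hypothetical extra fold: map MINUS SIGN (U+2212) to ASCII hyphen-minus.
EXTRA_FOLDS = str.maketrans({"\u2212": "-"})


def fold_formula(text: str) -> str:
    """NFKC-fold the text, then map the minus sign to an ASCII hyphen-minus."""
    return unicodedata.normalize("NFKC", text).translate(EXTRA_FOLDS)


if __name__ == "__main__":
    formula = "ZrO\u2082\u208B\u2093"  # ZrO with subscript 2, minus, x
    print(unicodedata.normalize("NFKC", formula) == "ZrO2\u2212x")
    print(fold_formula(formula) == "ZrO2-x")
```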


SUPERSCRIPT LETTERS (MODIFIERS)
Code    Char    Name
U+1D43  ᵃ       Modifier Letter Small A
U+1D47  ᵇ       Modifier Letter Small B
U+1D9C  ᶜ       Modifier Letter Small C
U+1D48  ᵈ       Modifier Letter Small D
U+1D49  ᵉ       Modifier Letter Small E
U+1DA0  ᶠ       Modifier Letter Small F
U+1D4D  ᵍ       Modifier Letter Small G
U+02B0  ʰ       Modifier Letter Small H
U+2071  ⁱ       Superscript Latin Small Letter I
U+02B2  ʲ       Modifier Letter Small J
U+1D4F  ᵏ       Modifier Letter Small K
U+02E1  ˡ       Modifier Letter Small L
U+1D50  ᵐ       Modifier Letter Small M
U+207F  ⁿ       Superscript Latin Small Letter N
U+1D52  ᵒ       Modifier Letter Small O
U+1D56  ᵖ       Modifier Letter Small P
U+02B3  ʳ       Modifier Letter Small R
U+02E2  ˢ       Modifier Letter Small S
U+1D57  ᵗ       Modifier Letter Small T
U+1D58  ᵘ       Modifier Letter Small U
U+1D5B  ᵛ       Modifier Letter Small V
U+02B7  ʷ       Modifier Letter Small W
U+02E3  ˣ       Modifier Letter Small X
U+02B8  ʸ       Modifier Letter Small Y
U+1DBB  ᶻ       Modifier Letter Small Z

SUPERSCRIPT CAPITALS
Code    Char    Name
U+1D2C  ᴬ       Modifier Letter Capital A
U+1D2E  ᴮ       Modifier Letter Capital B
U+1D30  ᴰ       Modifier Letter Capital D
U+1D31  ᴱ       Modifier Letter Capital E
U+1D33  ᴳ       Modifier Letter Capital G
U+1D34  ᴴ       Modifier Letter Capital H
U+1D35  ᴵ       Modifier Letter Capital I
U+1D36  ᴶ       Modifier Letter Capital J
U+1D37  ᴷ       Modifier Letter Capital K
U+1D38  ᴸ       Modifier Letter Capital L
U+1D39  ᴹ       Modifier Letter Capital M
U+1D3A  ᴺ       Modifier Letter Capital N
U+1D3C  ᴼ       Modifier Letter Capital O
U+1D3E  ᴾ       Modifier Letter Capital P
U+1D3F  ᴿ       Modifier Letter Capital R
U+1D40  ᵀ       Modifier Letter Capital T
U+1D41  ᵁ       Modifier Letter Capital U
U+1D42  ᵂ       Modifier Letter Capital W

SUBSCRIPT LETTERS
Code    Char    Name
U+2090  ₐ       Latin Subscript Small Letter A
U+2091  ₑ       Latin Subscript Small Letter E
U+2095  ₕ       Latin Subscript Small Letter H
U+1D62  ᵢ       Latin Subscript Small Letter I
U+2C7C  ⱼ       Latin Subscript Small Letter J
U+2096  ₖ       Latin Subscript Small Letter K
U+2097  ₗ       Latin Subscript Small Letter L
U+2098  ₘ       Latin Subscript Small Letter M
U+2099  ₙ       Latin Subscript Small Letter N
U+2092  ₒ       Latin Subscript Small Letter O
U+209A  ₚ       Latin Subscript Small Letter P
U+1D63  ᵣ       Latin Subscript Small Letter R
U+209B  ₛ       Latin Subscript Small Letter S
U+209C  ₜ       Latin Subscript Small Letter T
U+1D64  ᵤ       Latin Subscript Small Letter U
U+1D65  ᵥ       Latin Subscript Small Letter V
U+2093  ₓ       Latin Subscript Small Letter X

GREEK SUBSCRIPTS
Code    Char    Name
U+1D66  ᵦ       Greek Subscript Small Letter Beta
U+1D67  ᵧ       Greek Subscript Small Letter Gamma
U+1D68  ᵨ       Greek Subscript Small Letter Rho
U+1D69  ᵩ       Greek Subscript Small Letter Phi
U+1D6A  ᵪ       Greek Subscript Small Letter Chi
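
The tables above can be cross-checked against the Unicode Character Database.
The following Python sketch uses the standard unicodedata module to look up a
code point's name and the base letter its <super> or <sub> compatibility
mapping points to. The helper names describe and base_letter are hypothetical,
and the results depend on the Unicode version bundled with the Python
interpreter.

```python
import unicodedata


def describe(code_point: int):
    """Return (character, Unicode name, compatibility decomposition)."""
    ch = chr(code_point)
    return ch, unicodedata.name(ch), unicodedata.decomposition(ch)


def base_letter(code_point: int) -> str:
    """Return the base character that a <super>/<sub> mapping points to."""
    mapping = unicodedata.decomposition(chr(code_point))
    tag, _, target = mapping.partition(" ")
    if tag not in ("<super>", "<sub>"):
        raise ValueError(f"U+{code_point:04X} has no <super>/<sub> mapping")
    return chr(int(target, 16))


if __name__ == "__main__":
    # A few code points taken from the tables above.
    for cp in (0x1D43, 0x207F, 0x1D2C, 0x2090, 0x2C7C, 0x1D66):
        ch, name, mapping = describe(cp)
        print(f"U+{cp:04X}  {ch}  {name}  {mapping}  ->  {base_letter(cp)}")
```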

[16] https://symbl.cc/en/collections/superscript-and-subscript-letters/
[17] https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts
