This proposal (gist:
<https://gist.github.com/JoaoPinheiro/5f226f46c67d235a7039c775a4300800>) is the
result of the discussion in the thread "Prohibit invisible characters in
identifier names"
<http://thread.gmane.org/gmane.comp.lang.swift.evolution/21022>. I hope it's
still in time for inclusion in Swift 3.
Sincerely,
João Pinheiro
Normalize Unicode Identifiers
Proposal: SE-NNNN
<https://gist.github.com/JoaoPinheiro/NNNN-normalize-identifiers.md>
Author: João Pinheiro <https://github.com/joaopinheiro>
Status: Awaiting review
Review manager: TBD
Introduction
This proposal aims to introduce identifier normalization in order to prevent
the unsafe and potentially abusive use of invisible or equivalent
representations of Unicode characters in identifiers.
Swift-evolution thread: Discussion thread
<http://thread.gmane.org/gmane.comp.lang.swift.evolution/21022>
Motivation
Swift supports the use of Unicode in identifiers, but it does not yet
normalize them. As a result, different Unicode representations of the same
character are treated as distinct identifiers.
For example:
let Å = "Angstrom"
let Å = "Latin Capital Letter A With Ring Above"
let Å = "Latin Capital Letter A + Combining Ring Above"
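Because the three declarations above look identical on screen, it may help to spell them out with explicit scalar escapes. The following is only a sketch: the helper name scalarValues is made up for illustration, and it relies on Swift's String equality being defined in terms of Unicode canonical equivalence.

```swift
// The three spellings of "Å" from the example, written with explicit escapes.
let angstromSign = "\u{212B}"   // ANGSTROM SIGN
let precomposed = "\u{00C5}"    // LATIN CAPITAL LETTER A WITH RING ABOVE
let decomposed = "A\u{030A}"    // LATIN CAPITAL LETTER A + COMBINING RING ABOVE

// Hypothetical helper: the code points a string is actually made of.
func scalarValues(_ s: String) -> [UInt32] {
    return s.unicodeScalars.map { $0.value }
}

// The underlying scalar sequences are all different...
print(scalarValues(angstromSign), scalarValues(precomposed), scalarValues(decomposed))

// ...yet as String values they compare equal, since String equality
// uses canonical equivalence.
print(angstromSign == precomposed, precomposed == decomposed)
```

In other words, String values already treat these spellings as the same text, while identifiers made of the same scalars are currently kept apart.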
In addition to that, default-ignorable characters like the Zero Width Space and
Zero Width Non-Joiner (exemplified below) are also currently accepted as valid
parts of identifiers without any restrictions.
let ab = "ab"
let ab = "a + Zero Width Space + b"
func xy() { print("xy") }
func xy() { print("x + <Zero Width Non-Joiner> + y") }
The use of default-ignorable characters in identifiers is problematic, first
because the effects they represent are stylistic or otherwise out of scope for
identifiers, and second because the characters themselves often have no visible
display. It is also possible to misapply these characters such that users can
create strings that look the same but actually contain different characters,
which can create security problems.
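As a rough illustration of how such characters could be spotted, the sketch below scans a candidate name for scalars with the Default_Ignorable_Code_Point property. The function name is hypothetical, and the code assumes a toolchain whose standard library exposes Unicode.Scalar.Properties (Swift 5 or later); it is not how the compiler is implemented.

```swift
// Hypothetical helper: returns the default-ignorable scalars hidden in a name.
func defaultIgnorableScalars(in name: String) -> [Unicode.Scalar] {
    return name.unicodeScalars.filter { $0.properties.isDefaultIgnorableCodePoint }
}

let plain = "ab"
let hidden = "a\u{200B}b"   // "a" + ZERO WIDTH SPACE + "b"

// The two names render the same but only one of them is clean.
print(defaultIgnorableScalars(in: plain).isEmpty)
print(defaultIgnorableScalars(in: hidden).map { String($0.value, radix: 16) })
```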
Proposed solution
Normalize Swift identifiers to Normalization Form C (NFC), the form recommended
for case-sensitive languages in Unicode Standard Annexes 15
<https://gist.github.com/JoaoPinheiro/UAX15> and 31
<https://gist.github.com/JoaoPinheiro/UAX31>, and follow the Normalization
Charts <https://gist.github.com/JoaoPinheiro/NormalizationCharts>.
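To give an idea of what NFC normalization does to the Å example from the motivation, here is a small sketch. It uses Foundation's precomposedStringWithCanonicalMapping, which produces Normalization Form C, as a stand-in for whatever the compiler would do internally; the function name nfcScalars is an assumption made for this example.

```swift
import Foundation

// Hypothetical helper: the scalar values of a name after NFC normalization.
func nfcScalars(_ name: String) -> [UInt32] {
    let normalized = name.precomposedStringWithCanonicalMapping
    return normalized.unicodeScalars.map { $0.value }
}

// All three spellings of "Å" collapse to the single scalar U+00C5,
// so they would name the same identifier after normalization.
let spellings = ["\u{212B}", "\u{00C5}", "A\u{030A}"]
for spelling in spellings {
    print(nfcScalars(spelling).map { String($0, radix: 16) })
}
```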
In addition to that, prohibit the use of default-ignorable characters in
identifiers except in the special cases described in UAX31
<https://gist.github.com/JoaoPinheiro/UAX31>, listed below:
Allow Zero Width Non-Joiner (U+200C) when breaking a cursive connection
Allow Zero Width Non-Joiner (U+200C) in a conjunct context
Allow Zero Width Joiner (U+200D) in a conjunct context
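The exceptions listed above could be approximated along the following lines. This is a deliberately simplified sketch and not the full UAX31 rules: it only accepts a joiner that comes immediately after a virama (canonical combining class 9), which is a rough stand-in for the conjunct context, and it does not attempt the cursive-connection case at all. The function name is hypothetical.

```swift
let zwnj: Unicode.Scalar = "\u{200C}"   // ZERO WIDTH NON-JOINER
let zwj: Unicode.Scalar = "\u{200D}"    // ZERO WIDTH JOINER

// Hypothetical check: true if the name contains a default-ignorable scalar
// outside the simplified "joiner right after a virama" exception.
func hasDisallowedIgnorable(_ name: String) -> Bool {
    var previous: Unicode.Scalar? = nil
    for scalar in name.unicodeScalars {
        if scalar.properties.isDefaultIgnorableCodePoint {
            let isJoiner = scalar == zwnj || scalar == zwj
            // Viramas have canonical combining class 9.
            let afterVirama = previous?.properties.canonicalCombiningClass.rawValue == 9
            if !(isJoiner && afterVirama) {
                return true
            }
        }
        previous = scalar
    }
    return false
}

// Devanagari KA + VIRAMA + ZWJ + SSA: accepted by the simplified rule.
print(hasDisallowedIgnorable("\u{0915}\u{094D}\u{200D}\u{0937}"))
// Latin letters separated by ZWNJ or ZERO WIDTH SPACE: rejected.
print(hasDisallowedIgnorable("x\u{200C}y"), hasDisallowedIgnorable("a\u{200B}b"))
```

A real implementation would need the complete context patterns from UAX31, including the intervening marks and the following letter that this sketch ignores.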
Impact on existing code
This is potentially a source-breaking change for code that uses distinct but
identical-looking identifiers with different Unicode representations. The
likelihood of that happening in actual code is very small, and the problem can
be solved by renaming any identifiers that collide once normalized to new,
non-colliding names.
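A migration aid for finding such collisions might look roughly like the sketch below. It groups identifier spellings by their NFC form, using Foundation's precomposedStringWithCanonicalMapping, and reports the groups with more than one member. The function name and the idea of feeding it a flat list of names are assumptions for illustration only.

```swift
import Foundation

// Hypothetical helper: groups names by their NFC scalar sequence and returns
// the groups in which more than one spelling ends up as the same identifier.
func collisions(among identifiers: [String]) -> [[String]] {
    var groups: [[UInt32]: [String]] = [:]
    for name in identifiers {
        let key = name.precomposedStringWithCanonicalMapping
            .unicodeScalars.map { $0.value }
        groups[key, default: []].append(name)
    }
    return groups.values.filter { $0.count > 1 }
}

// ANGSTROM SIGN and the precomposed letter collide; "count" does not.
let found = collisions(among: ["\u{212B}", "\u{00C5}", "count"])
print(found.count)
```

Each reported group would then need all but one of its members renamed.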
Alternatives considered
The option of ignoring default-ignorable characters in identifiers was also
discussed, but it was considered to be more confusing and less secure than
explicitly treating them as errors.
Unaddressed issues
There was some discussion around the issue of Unicode confusable characters,
but it was considered to be out of scope for this proposal. Unicode confusable
characters are a complicated issue and any possible solutions also come with
significant drawbacks that would require more time and consideration.