I've submitted a draft of the proposal on the thread "Normalize Unicode 
Identifiers <http://thread.gmane.org/gmane.comp.lang.swift.evolution/25126>". 
Please make any comments and recommendations there.

Sincerely,
João Pinheiro


> On 23 Jun 2016, at 18:30, Chris Lattner <[email protected]> wrote:
> 
> 
>> On Jun 23, 2016, at 9:17 AM, João Pinheiro via swift-evolution 
>> <[email protected] <mailto:[email protected]>> wrote:
>> 
>> 
>>> On 21 Jun 2016, at 20:15, Xiaodi Wu via swift-evolution 
>>> <[email protected] <mailto:[email protected]>> wrote:
>>> 
>>> On Tue, Jun 21, 2016 at 1:16 PM, Joe Groff <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> Any discussion about this ought to start from UAX #31, the Unicode 
>>> consortium's recommendations on identifiers in programming languages:
>>> 
>>> http://unicode.org/reports/tr31/ <http://unicode.org/reports/tr31/>
>>> 
>>> Section 2.3 specifically calls out the situations in which ZWJ and ZWNJ 
>>> need to be allowed. The document also describes a stability policy for 
>>> handling new Unicode versions, other confusability issues, and many of the 
>>> other problems with adopting Unicode in a programming language's syntax.
>>> 
>>> That's a fantastic document--a very edifying read. Given Swift's robust 
>>> support for Unicode in its core libraries, it's kind of surprising to me 
>>> that identifiers aren't canonicalized at compile time. From a quick first 
>>> read, faithful adoption of UAX #31 recommendations would address most if 
>>> not all of the confusability and zero-width security issues raised in this 
>>> conversation.
>> 
>> From what I've read of UAX #31 <http://unicode.org/reports/tr31/> it does 
>> seem to address all of the invisible character issues raised in the 
>> discussion. Given their Unicode status as Default_Ignorable_Code_Points, 
>> I believe the best course of action would be to canonicalise identifiers by 
>> allowing invisible characters only where appropriate and ignoring them 
>> everywhere else.
>> 
>> The alternative to ignoring them would be to not canonicalise identifiers 
>> and treat invisible characters as an error instead.
>> 
>> This doesn't address the issue of Unicode confusable characters, but solving 
>> that has additional problems of its own and would probably be better 
>> addressed in a different proposal entirely.
>> 
>> I'd like to start writing the proposal if there is agreement that this would 
>> be the best course of action.
> 
> Sounds great, please do.  Thanks!
> 
> -Chris
> 
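
For readers following the quoted discussion, the two options mentioned there 
(ignore invisible characters while canonicalising, or reject them as an 
error) can be sketched roughly as below. This is only an illustration in 
Python, not the proposal's text and not how the Swift compiler works. The 
names IGNORABLE, canonicalise_identifier and reject_invisible are 
hypothetical, the character set is a small hand-picked subset of 
Default_Ignorable_Code_Point, and the sketch assumes NFC as the normal form. 
It also strips ZWJ and ZWNJ everywhere, so it does not implement the UAX #31 
contexts in which they must be allowed.

```python
import unicodedata

# Hypothetical, hand-picked subset of Default_Ignorable_Code_Point characters.
# A real implementation would take the full list from the Unicode data files.
IGNORABLE = {
    "\u00AD",  # SOFT HYPHEN
    "\u200B",  # ZERO WIDTH SPACE
    "\u200C",  # ZERO WIDTH NON-JOINER (ZWNJ)
    "\u200D",  # ZERO WIDTH JOINER (ZWJ)
    "\u2060",  # WORD JOINER
    "\uFEFF",  # ZERO WIDTH NO-BREAK SPACE
}


def canonicalise_identifier(name: str) -> str:
    """Option 1: drop invisible characters, then normalise (NFC assumed)."""
    visible = "".join(ch for ch in name if ch not in IGNORABLE)
    return unicodedata.normalize("NFC", visible)


def reject_invisible(name: str) -> str:
    """Option 2: leave the identifier alone and treat invisibles as an error."""
    for index, ch in enumerate(name):
        if ch in IGNORABLE:
            raise ValueError(
                f"invisible character U+{ord(ch):04X} at offset {index}"
            )
    return name
```

With the first option, "caf\u00e9" and "cafe\u200B\u0301" end up as the same 
identifier. With the second, the latter is rejected outright.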

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
