I haven't done any analysis, but on first glance it looks like it is based on
http://www.unicode.org/reports/tr31/#Alternative_Identifier_Syntax Mark <https://google.com/+MarkDavis> *— Il meglio è l’inimico del bene —* On Thu, Jun 5, 2014 at 5:46 PM, Jeff Senn <s...@maya.com> wrote: > Has anyone figured out whether character sequences that are non-canonical > (de)compositions but could be recomposed to the same result > are the same identifier or not? > > That is: are identifiers merely sequences of characters or intended to be > comparable as “Unicode strings” (under some sort of compatibility rule)? > > On Jun 5, 2014, at 11:27 AM, Martin v. Löwis <mar...@v.loewis.de> wrote: > > > Am 04.06.14 11:28, schrieb Andre Schappo: > >> The restrictions seem a little like IDNA2008. Anyone have links to > >> info giving a detailed explanation/tabulation of allowed and non > >> allowed Unicode chars for Swift Variable and Constant names? > > > > The language reference is at > > > > > https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html > > > > For reference, the definition of identifier-character is (read each > > line as an alternative) > > > > identifier-character → Digit 0 through 9 > > identifier-character → U+0300–U+036F, U+1DC0–U+1DFF, U+20D0–U+20FF, or > > U+FE20–U+FE2F > > identifier-character → identifier-head > > > > where identifier-head is > > > > identifier-head → Upper- or lowercase letter A through Z > > identifier-head → U+00A8, U+00AA, U+00AD, U+00AF, U+00B2–U+00B5, or > > U+00B7–U+00BA > > identifier-head → U+00BC–U+00BE, U+00C0–U+00D6, U+00D8–U+00F6, or > > U+00F8–U+00FF > > identifier-head → U+0100–U+02FF, U+0370–U+167F, U+1681–U+180D, or > > U+180F–U+1DBF > > identifier-head → U+1E00–U+1FFF > > identifier-head → U+200B–U+200D, U+202A–U+202E, U+203F–U+2040, U+2054, > > or U+2060–U+206F > > identifier-head → U+2070–U+20CF, U+2100–U+218F, U+2460–U+24FF, or > > U+2776–U+2793 > > identifier-head → U+2C00–U+2DFF or U+2E80–U+2FFF > > identifier-head → U+3004–U+3007, U+3021–U+302F, U+3031–U+303F, or > > U+3040–U+D7FF > > identifier-head → U+F900–U+FD3D, U+FD40–U+FDCF, U+FDF0–U+FE1F, or > > U+FE30–U+FE44 > > identifier-head → U+FE47–U+FFFD > > identifier-head → U+10000–U+1FFFD, U+20000–U+2FFFD, U+30000–U+3FFFD, or > > U+40000–U+4FFFD > > identifier-head → U+50000–U+5FFFD, U+60000–U+6FFFD, U+70000–U+7FFFD, or > > U+80000–U+8FFFD > > identifier-head → U+90000–U+9FFFD, U+A0000–U+AFFFD, U+B0000–U+BFFFD, or > > U+C0000–U+CFFFD > > identifier-head → U+D0000–U+DFFFD or U+E0000–U+EFFFD > > > > As the construction principle for this list, they say > > > > "Identifiers begin with an upper case or lower case letter A through Z, > > an underscore (_), a noncombining alphanumeric Unicode character in the > > Basic Multilingual Plane, or a character outside the Basic Multilingual > > Plan that isn’t in a Private Use Area. After the first character, digits > > and combining Unicode characters are also allowed." > > > > Regards, > > Martin > > _______________________________________________ > > Unicode mailing list > > Unicode@unicode.org > > http://unicode.org/mailman/listinfo/unicode > > > _______________________________________________ > Unicode mailing list > Unicode@unicode.org > http://unicode.org/mailman/listinfo/unicode >
_______________________________________________ Unicode mailing list Unicode@unicode.org http://unicode.org/mailman/listinfo/unicode