All right, I have gone through and formulated a set of characters to serve as the core of our operator symbols. I started with [:Sm:] and removed the blocks and subheaders that are not clearly useful as operators (though they may be reincorporated selectively in the future). Then I added the rest of the Arrows block, as well as punctuation symbols that are “operator-like”.
In particular, I kept Swift’s existing ASCII operators, and all of Swift’s Latin-1 operators except for currency signs and the copyright and registered trademark symbols. I also kept most of Swift’s existing General Punctuation operators. The end result is a set of 1,020 operator characters <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%5B%3ASm%3A%5D%0D%0A%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D&g=&i=>, which removes 1,628 symbols 
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D%0D%0A%0D%0A-%5B%5B%3ASm%3A%5D%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D&g=&i=> from Swift’s existing operator set 
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D&g=&i=> and adds just 4 new ones <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%5B%3ASm%3A%5D%0D%0A-%5Cp%7BBlock%3DSuperscripts+And+Subscripts%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Technical%7D%0D%0A-%5Cp%7BBlock%3DGeometric+Shapes%7D%0D%0A-%5Cp%7BBlock%3DMiscellaneous+Symbols%7D%0D%0A-%5Cp%7BBlock%3DAlphabetic+Presentation+Forms%7D%0D%0A-%5Cp%7BBlock%3DSmall+Form+Variants%7D%0D%0A-%5Cp%7BBlock%3DHalfwidth+And+Fullwidth+Forms%7D%0D%0A-%5Cp%7BBlock%3DMathematical+Alphanumeric+Symbols%7D%0D%0A-%5Cp%7BBlock%3DArabic+Mathematical+Alphabetic+Symbols%7D%0D%0A-%5Cp%7Bsubhead%3DVariant+letterforms+and+symbols%7D%0D%0A-%5Cp%7Bsubhead%3DLetterlike+symbol%7D%0D%0A%5Cp%7BBlock%3DArrows%7D%0D%0A%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%5D%0D%0A%5B%C2%A1+%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A6+%C2%A7+%C2%A9+%C2%AB+%C2%AC+%C2%AE+%C2%B0+%C2%B1+%C2%B6+%C2%BB+%C2%BF%5D+-+%5B%C2%A2+%C2%A3+%C2%A4+%C2%A5+%C2%A9+%C2%AE%5D%0D%0A%5Cp%7Bsubhead%3DGeneral+punctuation%7D+-+%5BU%2B203F+U%2B2040+U%2B2045+U%2B2046+U%2B2054%5D%0D%0A%5Cp%7Bsubhead%3DDouble+punctuation+for+vertical+text%7D%0D%0A%5Cp%7Bsubhead%3DArchaic+punctuation%7D+-+%5BU%2B2E31+U%2B2E33+U%2B2E34+U%2B2E3F%5D%0D%0AU%2B214B%5D%0D%0A%0D%0A-%5B%2F+%3D+%5C-+%2B+%21+*+%25+%3C+%3E+%5C%26+%7C+%5C%5E+~+%3F%0D%0AU%2B00A1+-+U%2B00A7%0D%0AU%2B00A9+U%2B00AB+U%2B00AC+U%2B00AE%0D%0AU%2B00B0+-+U%2B00B1%0D%0AU%2B00B6+U%2B00BB+U%2B00BF+U%2B00D7+U%2B00F7%0D%0AU%2B2016+-+U%2B2017%0D%0AU%2B2020+-+U%
2B2027%0D%0AU%2B2030+-+U%2B203E%0D%0AU%2B2041+-+U%2B2053%0D%0AU%2B2055+-+U%2B205E%0D%0AU%2B2190+-+U%2B23FF%0D%0AU%2B2500+-+U%2B2775%0D%0AU%2B2794+-+U%2B2BFF%0D%0AU%2B2E00+-+U%2B2E7F%0D%0AU%2B3001+-+U%2B3003%0D%0AU%2B3008+-+U%2B3030%5D&g=&i=> (⅀ ؆ ؇ ⅋). I left out the “Full Stop” character, to be dealt with by whatever rules we decide upon for dots in operators.

Here is the set expression for the 1,020 characters I have identified as operators:

[[:Sm:]
-\p{Block=Superscripts And Subscripts}
-\p{Block=Miscellaneous Technical}
-\p{Block=Geometric Shapes}
-\p{Block=Miscellaneous Symbols}
-\p{Block=Alphabetic Presentation Forms}
-\p{Block=Small Form Variants}
-\p{Block=Halfwidth And Fullwidth Forms}
-\p{Block=Mathematical Alphanumeric Symbols}
-\p{Block=Arabic Mathematical Alphabetic Symbols}
-\p{subhead=Variant letterforms and symbols}
-\p{subhead=Letterlike symbol}
\p{Block=Arrows}
[/ = \- + ! * % < > \& | \^ ~ ?]
[¡ ¢ £ ¤ ¥ ¦ § © « ¬ ® ° ± ¶ » ¿] - [¢ £ ¤ ¥ © ®]
\p{subhead=General punctuation} - [U+203F U+2040 U+2045 U+2046 U+2054]
\p{subhead=Double punctuation for vertical text}
\p{subhead=Archaic punctuation} - [U+2E31 U+2E33 U+2E34 U+2E3F]
U+214B]

Additionally, I think it is worthwhile to consider including the “Drafting symbols” subheader and most of the “Miscellaneous technical” subheader. This would add 34 more operator characters <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5Cp%7Bsubhead%3DDrafting+symbols%7D%0D%0A%5Cp%7Bsubhead%3DMiscellaneous+technical%7D%0D%0A-%5BU%2B23E8%5D%5D&g=&i=>.

I did not consider non-head operator characters <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=U%2B0300+-+U%2B036F%0D%0AU%2B1DC0+-+U%2B1DFF%0D%0AU%2B20D0+-+U%2B20FF%0D%0AU%2BFE00+-+U%2BFE0F%0D%0AU%2BFE20+-+U%2BFE2F%0D%0AU%2BE0100+-+U%2BE01EF&g=&i=>, which are predominantly combining marks and variation selectors, and should probably stay essentially as they are. Also, I kept the empty set and infinity sign as operators, though we may want to change that.
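
To make the structure of the set expression above concrete, here is a minimal Python sketch of a membership test. It is only an approximation under stated assumptions: the names (`EXCLUDED_BLOCKS`, `is_operator_char`, and so on) are hypothetical, the block ranges are hard-coded from the Unicode block list, and the `subhead` terms are left out entirely because Python's `unicodedata` does not expose chart subheaders. As a result it keeps a few [:Sm:] characters that the real set removes, omits the punctuation subheaders that the real set adds, and will not reproduce the count of 1,020.

```python
import unicodedata

# Blocks subtracted from [:Sm:] (inclusive code point ranges).
EXCLUDED_BLOCKS = [
    (0x2070, 0x209F),    # Superscripts and Subscripts
    (0x2300, 0x23FF),    # Miscellaneous Technical
    (0x25A0, 0x25FF),    # Geometric Shapes
    (0x2600, 0x26FF),    # Miscellaneous Symbols
    (0xFB00, 0xFB4F),    # Alphabetic Presentation Forms
    (0xFE50, 0xFE6F),    # Small Form Variants
    (0xFF00, 0xFFEF),    # Halfwidth and Fullwidth Forms
    (0x1D400, 0x1D7FF),  # Mathematical Alphanumeric Symbols
    (0x1EE00, 0x1EEFF),  # Arabic Mathematical Alphabetic Symbols
]

# The whole Arrows block is added back, whatever the general category.
ARROWS_BLOCK = (0x2190, 0x21FF)

# Explicit character lists copied from the set expression.
ASCII_OPERATORS = set("/=-+!*%<>&|^~?")
LATIN1_OPERATORS = set("¡¢£¤¥¦§©«¬®°±¶»¿") - set("¢£¤¥©®")
EXPLICIT_ADDITIONS = {"\u214B"}  # TURNED AMPERSAND

# NOTE: the \p{subhead=...} terms are NOT modelled here (assumption of this sketch).


def in_any_range(code_point, ranges):
    """Return True if code_point falls inside any inclusive (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def is_operator_char(ch):
    """Approximate membership test for the proposed operator set."""
    code_point = ord(ch)
    if ch in ASCII_OPERATORS or ch in LATIN1_OPERATORS or ch in EXPLICIT_ADDITIONS:
        return True
    if in_any_range(code_point, [ARROWS_BLOCK]):
        return True
    # Math symbols, minus the excluded blocks.
    is_math_symbol = unicodedata.category(ch) == "Sm"
    return is_math_symbol and not in_any_range(code_point, EXCLUDED_BLOCKS)
```

For instance, `is_operator_char("∞")` and `is_operator_char("↕")` return `True` (a math symbol in a kept block, and a non-Sm character in the Arrows block), while `is_operator_char("¢")` and `is_operator_char("⁺")` return `False` (a currency sign, and a math symbol in an excluded block).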
There are a lot more symbols that could potentially become operators (e.g. shapes, currency signs, APL, etc.). However, in light of the prevailing view that we should start conservatively and add more in the future, I believe this set of 1,020 characters is a good place to begin.

Nevin
_______________________________________________
swift-evolution mailing list
https://lists.swift.org/mailman/listinfo/swift-evolution
