Re: [swift-evolution] A path forward on rationalizing unicode identifiers and operators

Taylor Swift via swift-evolution Sat, 30 Sep 2017 19:15:00 -0700

What happens if two public operator declarations conflict?

On Sat, Sep 30, 2017 at 9:10 PM, Jonathan Hull via swift-evolution <
swift-evolution@swift.org> wrote:


> I have a technical question on this:
>
> Instead of parsing these into identifiers & operators, would it be
> possible to parse these into 3 categories: Identifiers, Operators, and
> Ambiguous?
>
> The ambiguous category would be disallowed for the moment, as you say.
> But since they are rarely used, maybe we can allow a declaration (similar
> to how we define operators) that effectively pulls it into one of the other
> categories (not in terms of tokenization, but in terms of how it can be
> used in Swift).  Trying to pull it into both would be a compilation error.
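
To make the idea above concrete, here is a minimal sketch of how such declarations could be tracked. Everything in it is an assumption for illustration: the names Category, CategoryConflict, and AmbiguousCharacterTable are hypothetical, nothing like them exists in Swift, and treating a disagreeing second declaration as an error simply mirrors the "compilation error" wording above.

```swift
// Hypothetical model: a declaration pulls an "ambiguous" character into
// exactly one category. None of these names exist in Swift or its compiler.
enum Category: Equatable {
    case identifier
    case op   // "operator" is a keyword in Swift, so a shorter name is used
}

// Describes a declaration that disagrees with an earlier one.
struct CategoryConflict: Error, Equatable {
    let character: Character
    let existing: Category
    let requested: Category
}

struct AmbiguousCharacterTable {
    // Characters that have been explicitly pulled into a category so far.
    private(set) var assignments: [Character: Category] = [:]

    // Record a declaration. Repeating the same choice is accepted;
    // choosing the other category for the same character throws.
    mutating func declare(_ character: Character, as category: Category) throws {
        if let existing = assignments[character], existing != category {
            throw CategoryConflict(character: character,
                                   existing: existing,
                                   requested: category)
        }
        assignments[character] = category
    }
}

var table = AmbiguousCharacterTable()
do {
    try table.declare("ᵀ", as: .identifier) // one framework's choice
    try table.declare("²", as: .op)         // another framework's choice
    try table.declare("ᵀ", as: .op)         // disagrees with the first declaration
} catch let conflict as CategoryConflict {
    print("conflict on \(conflict.character): already \(conflict.existing), requested \(conflict.requested)")
} catch {
    print("unexpected error: \(error)")
}
```

The sketch only covers declarations that meet in one table. Whether two independently compiled modules that disagree should be rejected, scoped per module, or resolved some other way is left open here.
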
>
> That way, Xiaodi can have a framework which lets her use superscript T as
> an identifier, and I can have one where I use superscript 2 to square
> things.  The obvious/frequently used characters would not be ambiguous, so
> it would only slow down compilation when the rare/ambiguous characters are
> used.
>
> In my mind, this would be the ideal solution, and it could be done in
> stages (with the ambiguous characters just being forbidden for now), but I
> am not sure if it is technically possible.
>
> Thanks,
> Jon
>
> On Sep 30, 2017, at 3:59 PM, Chris Lattner via swift-evolution <
> swift-evolution@swift.org> wrote:
>
>
> The core team recently met to discuss PR609 - Refining identifier and
> operator symbology:
> https://github.com/xwu/swift-evolution/blob/7c2c4df63b1d92a1677461f41bc638f31926c9c3/proposals/NNNN-refining-identifier-and-operator-symbology.md
>
> The proposal correctly observes that the partitioning of unicode
> codepoints into identifiers and operators is a mess in some cases.  It
> really is an outright bug for 🙂 to be an identifier, but ☹️ to be an
> operator.  That said, the proposal itself is complicated and is defined in
> terms of a bunch of unicode classes that may evolve in the “wrong way for
> Swift” in the future.
>
> The core team would really like to get this sorted out for Swift 5, and
> sooner is better than later :-).  Because it seems that this is a really
> hard problem and that perfection is becoming the enemy of good
> <https://en.wikipedia.org/wiki/Perfect_is_the_enemy_of_good>, the core
> team requests the creation of a new proposal with a different approach.
> The general observation is that there are three kinds of characters: things
> that are obviously identifiers, things that are obviously math operators,
> and things that are non-obvious.  Things that are non-obvious can be made
> into invalid code points, and legislated later in follow-up proposals
> if/when someone cares to argue for them.
>
>
> To make progress on this, we suggest a few separable steps:
>
> First, please split out the changes to the ASCII characters (e.g. . and \
> operator parsing rules) into their own (small) proposal, since they are
> unrelated to the Unicode changes, and we can make progress on that proposal
> independently.
>
>
> Second, someone should take a look at the concrete set of unicode
> identifiers that are accepted by Swift 4 and write a new proposal that
> splits them into the three groups: those that are clearly identifiers
> (which become identifiers), those that are clearly operators (which become
> operators), and those that are unclear or don’t matter (these become
> invalid code points).
>
> I suggest that the criteria be based on *utility for Swift code*, not on
> the underlying unicode classification.  For example, the discussion thread
> for PR609 mentions that the T character in “  xᵀ  ” is defined in unicode
> as a Latin “letter”.  Despite that, its use in Swift would clearly be as a
> postfix operator, so we should classify it as an operator.
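
A utility-based classification like the one described above could be pictured as a hand-curated table with a reject-by-default fallback. This is only a sketch under stated assumptions: SymbolClass, curated, and classify are hypothetical names, and the handful of entries are examples taken from this thread, not a proposed list.

```swift
// Hypothetical sketch: every name and table entry here is illustrative,
// not part of any actual proposal or of the Swift compiler.
enum SymbolClass {
    case identifier
    case op        // "operator" is a keyword in Swift, so a shorter name is used
    case invalid
}

// Hand-curated decisions, judged by utility for Swift code rather than
// by the underlying Unicode classification.
let curated: [Unicode.Scalar: SymbolClass] = [
    "\u{1D40}": .op,          // ᵀ: a "letter" in Unicode, but used as a postfix operator
    "\u{03B1}": .identifier,  // α: primarily used as an identifier
    "\u{1F642}": .identifier, // 🙂: faces are identifiers
    "\u{2639}": .identifier,  // ☹: faces are identifiers
    "\u{2211}": .op,          // ∑: clearly a math operator
]

// Anything not explicitly listed is rejected for now; a later proposal
// could add entries without breaking source.
func classify(_ scalar: Unicode.Scalar) -> SymbolClass {
    return curated[scalar] ?? .invalid
}

print(classify("\u{1D40}")) // listed above as an operator
print(classify("\u{2800}")) // a Braille pattern, not listed
```

Keeping the table explicit means the result does not shift if a Unicode class later evolves in a direction that is wrong for Swift.
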
>
> Other suggestions:
>  - Math symbols are operators excepting those primarily used as
> identifiers like “alpha”.  If there are any characters that are used for
> both, this proposal should make them invalid.
>  - While there may be useful ranges for some identifiers (e.g. to handle
> European accented characters), the Emoji range should probably have each
> codepoint independently judged, and currently unassigned codepoints should
> not get a meaning defined for them.
>  - Unicode “faces”, “people”, “animals” etc are all identifiers.
>  - In order to reduce the scope of the proposal, it is a safe default to
> exclude characters that are unlikely to be used by Swift code today,
> including Braille, weird currency symbols, or any set of characters that
> are so broken and useless in Swift 4 that it isn’t worth worrying about.
>  - The proposal is likely to turn a large number of code points into
> rejected characters.  In the discussions, some people will be tempted to
> argue endlessly about individual rejections.  To control that, we can
> require that people point out an example where the character is already in
> use, or where it has a clear application to a domain that is known today:
> the discussion needs to be grounded and practical, not theoretical.
>
>
> Third, if there is interest sometime in the future, we can have subsequent
> proposals that expand the range of accepted code points, motivated by the
> specific application domain that cares about them.  These proposals will
> not be source breaking, so they can happen at any time.
>
>
> Is anyone interested in helping to push this effort forward?
>
> -Chris
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

