Darren Duncan <dar...@darrenduncan.net> wrote on Wed, 24 Aug 2011 11:18:20 PDT:

> Smylers wrote:
>> Could we have underscores and hyphens mean the same thing? That is, Perl
>> 6 always interprets illo-figut and illo_figut as being the same
>> identifier (both for its own identifiers and those minted in programs),
>> with programmers able to use either separator on a whim?

> I oppose this.  Underscores and hyphens should remain distinct.

>> That would seem to be the most human-friendly approach.

> I disagree.  More human friendly is "if it looks different in any way then it
> is different".  (I am not also saying that same-looking things are equal,
> given Unicode's redundancy.)

Your mention of Unicode is pertinent.  With Unicode properties, you are not
supposed to have to worry about these things.  For example, from UTS#18:

    Note: Because it is recommended that the property syntax be lenient
          as to spaces, casing, hyphens and underbars, any of the
          following should be equivalent: \p{Lu}, \p{lu}, \p{uppercase
          letter}, \p{Uppercase Letter}, \p{Uppercase_Letter}, and
          \p{uppercaseletter}


Similarly, since this applies to property names as well as to property
values, these are all the same:

    \p{GC              =Lu}
    \p{gc              =Lu}
    \p{General Category=Lu}
    \p{General_Category=Lu}
    \p{general_category=Lu}
    \p{general-category=Lu}
    \p{GENERAL-CATEGORY=Lu}
    \p{generalcategory =Lu}
    \p{GENERALCATEGORY =Lu}

I'll let you permute the RHS on your own. :)
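
To make the rule concrete, here is a minimal sketch of that kind of loose
matching: fold case and drop spaces, hyphens, and underscores before
comparing.  The helper name loose_key is hypothetical, and this is only a
simplified illustration of the idea, not how Perl or any Unicode library
actually implements it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): reduce a property name
# to a canonical key by lowercasing it and removing whitespace,
# underscores, and hyphens.
sub loose_key {
    my ($name) = @_;
    (my $key = lc $name) =~ s/[\s_-]+//g;
    return $key;
}

# The long spellings of the property name from the list above.
my @spellings = (
    "General Category", "General_Category", "general_category",
    "general-category", "GENERAL-CATEGORY", "generalcategory",
    "GENERALCATEGORY",
);

# Collect the distinct keys; loosely equivalent names collapse together.
my %seen;
$seen{ loose_key($_) } = 1 for @spellings;

print scalar(keys %seen), " distinct key(s): ",
      join(", ", sort keys %seen), "\n";
```

All seven spellings reduce to the same key under this folding, which is the
sense in which they are "the same" property name.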

However, I use the opposite of that sort of loose matching of identifiers
in my own code.  For example, when I make a named character alias, I always
use lowercase so that it looks different from an official one.

    use charnames ":full", ":alias" => {
        e_acute     => "LATIN SMALL LETTER E WITH ACUTE",
        ae          => "LATIN SMALL LETTER AE",
        smcap_ae    => "LATIN LETTER SMALL CAPITAL AE",  # this is a lowercase letter
        AE          => "LATIN CAPITAL LETTER AE",
        oe          => "LATIN SMALL LIGATURE OE",
        smcap_oe    => "LATIN LETTER SMALL CAPITAL OE",  # this is a lowercase letter
        OE          => "LATIN CAPITAL LIGATURE OE",
    };

I don't make "E_ACUTE" and "eacute" also work there.  However, there is a
new ":loose" that does do that, but I suspect I shan't use it, since I use
both "ae" and "AE" differently in existing code.

--tom
