I have created a Hex package called `unicode` 
<https://hex.pm/packages/unicode> that implements part of this 
functionality. Right now it reads *DerivedCoreProperties.txt* and derives 
the different Derived Core Properties from it. It also implements those 
Regexp Compatibility Properties defined at 
http://www.unicode.org/reports/tr18/#Compatibility_Properties that depend 
only on the Derived Core Properties (digit, alphabetic, alphanumeric, 
lower, upper).

The properties in the Regexp Compatibility Properties table are the ones 
that seem the most useful, as these are often the checks someone wants to 
do on strings where Regular Expressions are too slow.
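
For comparison, the regex-based check from the original proposal could be 
wrapped roughly like this. It is only a sketch: the module and function 
names are made up for illustration and are not part of the package.

```elixir
defmodule RegexCheck do
  # Hypothetical helper; the regex is the one quoted in the original proposal.
  # Returns true when the string contains no non-alphanumeric character.
  # Note that an empty string also returns true.
  def alphanumeric?(string) when is_binary(string) do
    not Regex.match?(~r/[^[:alnum:]]/u, string)
  end
end
```

`RegexCheck.alphanumeric?("abc123")` returns `true` and 
`RegexCheck.alphanumeric?("abc 123")` returns `false`.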

I haven't yet touched the properties for which the General Category of the 
character has to be known (maybe we could adapt Isaac's code for this? (-:). 
Right now, the compiled .beam file is roughly 53 kB when function clauses 
are defined for only a few Derived Core Properties (Math, Alphabetic, 
Lowercase, Uppercase); it grows to roughly 280 kB when function clauses are 
defined for the other Derived Core Properties as well.
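
To illustrate why the file size grows with the number of properties, here 
is a minimal sketch of generating function clauses at compile time from 
lines in the *DerivedCoreProperties.txt* format. It is not the package's 
actual code: the module name `UnicodeSketch`, the function names, and the 
four-line embedded excerpt are assumptions made so the example is 
self-contained. A real implementation would read the full file instead.

```elixir
defmodule UnicodeSketch do
  # A tiny hand-copied excerpt in the DerivedCoreProperties.txt line format.
  @sample """
  002B          ; Math # Sm       PLUS SIGN
  0041..005A    ; Uppercase # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
  0061..007A    ; Lowercase # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
  00AA          ; Lowercase # Lo       FEMININE ORDINAL INDICATOR
  """

  # Parse every line into {property, first_codepoint, last_codepoint}.
  ranges =
    for line <- String.split(@sample, "\n", trim: true) do
      [data | _comment] = String.split(line, "#", parts: 2)
      [codepoints, property] = data |> String.split(";") |> Enum.map(&String.trim/1)

      {first, last} =
        case String.split(codepoints, "..") do
          [single] -> {String.to_integer(single, 16), String.to_integer(single, 16)}
          [first, last] -> {String.to_integer(first, 16), String.to_integer(last, 16)}
        end

      {property, first, last}
    end

  # One function per property, one clause per codepoint range.
  for {property, fun} <- [{"Lowercase", :lowercase?}, {"Uppercase", :uppercase?}, {"Math", :math?}] do
    for {^property, first, last} <- ranges do
      def unquote(fun)(<<cp::utf8>>) when cp >= unquote(first) and cp <= unquote(last), do: true
    end

    # Fallback: any other binary (including multi-codepoint strings) is false.
    def unquote(fun)(bin) when is_binary(bin), do: false
  end
end
```

Each range in the data file becomes its own clause, so every additional 
property adds clauses to the compiled module.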


Have a wonderful day,

~Wiebe-Marten

On Tuesday, May 3, 2016 at 9:31:44 PM UTC+2, [email protected] wrote:
>
> I have seen multiple people (in the Elixir Slack group 
> <https://elixir-lang.slack.com/archives/general/p1462294660007855>, on 
> Reddit 
> <https://www.reddit.com/r/elixir/comments/4h4y4e/whats_missing_from_the_elixir_ecosystem/d2nvbwd>)
>  
> during the last couple of days asking for something that checks whether a 
> (possibly long) string contains e.g. only alphanumeric characters.
>
> It is possible to do this using regular expressions right now:
> ~r/[^[:alnum:]]/u
>
> but this is very slow.
>
> My proposal is to add the following boolean functions to the String module:
>
>
>    -  alphabetic?
>    -  numeric?
>    -  alphanumeric?
>    -  whitespace?
>    -  uppercase? 
>    -  lowercase?
>    -  control_character?
>    
>
> Function heads for these functions can probably be best generated by using 
> compile-time macros similar to what other unicode-based functions already 
> use.
>
