I have created a Hex package called `unicode` <https://hex.pm/packages/unicode> that contains part of the functionality. Right now it reads *DerivedCoreProperties.txt* and infers the different Derived Core Properties from that. It also implements those Regexp Compatibility Properties defined at http://www.unicode.org/reports/tr18/#Compatibility_Properties that depend only on the Derived Core Properties (digit, alphabetic, alphanumeric, lower, upper).
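
To give an idea of the approach, here is a minimal sketch (not the package's actual code) of how function clauses could be generated at compile time from lines in the *DerivedCoreProperties.txt* format. The module name `UnicodeSketch`, the helper `all?/2`, and the embedded five-line excerpt are assumptions made for illustration; a real implementation would read the full file instead.

```elixir
defmodule UnicodeSketch do
  # Hypothetical module for illustration; not the API of the `unicode` package.

  # A few lines in the format used by DerivedCoreProperties.txt.
  # Assumption: a real implementation would read the whole file at compile time.
  @excerpt """
  0041..005A    ; Alphabetic # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
  0061..007A    ; Alphabetic # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
  00AA          ; Alphabetic # Lo       FEMININE ORDINAL INDICATOR
  0041..005A    ; Uppercase # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
  0061..007A    ; Lowercase # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
  """

  # Parse every line into {property, first_codepoint, last_codepoint}.
  ranges =
    for line <- String.split(@excerpt, "\n", trim: true) do
      # Drop the trailing "# ..." comment, then split "codepoints ; property".
      [data | _comment] = String.split(line, "#", parts: 2)
      [codepoints, property] = data |> String.split(";") |> Enum.map(&String.trim/1)

      {first, last} =
        case String.split(codepoints, "..") do
          [single] -> {String.to_integer(single, 16), String.to_integer(single, 16)}
          [lo, hi] -> {String.to_integer(lo, 16), String.to_integer(hi, 16)}
        end

      {property, first, last}
    end

  # Generate one function per property, with one clause per codepoint range.
  for {property, property_ranges} <- Enum.group_by(ranges, &elem(&1, 0)) do
    fun = String.to_atom(String.downcase(property) <> "?")

    for {_property, first, last} <- property_ranges do
      def unquote(fun)(<<c::utf8>>) when c >= unquote(first) and c <= unquote(last), do: true
    end

    # Anything that is not a single codepoint inside one of the ranges.
    def unquote(fun)(_other), do: false
  end

  # Checks a whole string by applying a single-codepoint predicate to each codepoint.
  def all?(string, predicate) when is_binary(string) and is_function(predicate, 1) do
    string |> String.codepoints() |> Enum.all?(predicate)
  end
end
```

With this excerpt, `UnicodeSketch.alphabetic?("a")` matches one of the generated clauses, and `UnicodeSketch.all?("Hello", &UnicodeSketch.alphabetic?/1)` checks a whole string without going through the regex engine. Every range becomes its own function clause, so the size of the compiled module grows with the number of ranges covered.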
The properties in the Regexp Compatibility Properties table seem the most useful, as these are often the checks someone wants to do on strings where regular expressions are too slow.

I haven't yet touched the properties for which the General Category of the character has to be known. (We could maybe adapt Isaac's code for this? (-: )

Right now, the compiled .beam file is roughly 53 KB when using only four of the Derived Core Properties (Math, Alphabetic, Lowercase, Uppercase); it grows to roughly 280 KB when function clauses are defined for the other Derived Core Properties as well.

Have a wonderful day,

~Wiebe-Marten

On Tuesday, May 3, 2016 at 9:31:44 PM UTC+2, [email protected] wrote:
> I have seen multiple people (In the Elixir Slack group
> <https://elixir-lang.slack.com/archives/general/p1462294660007855>, on
> Reddit
> <https://www.reddit.com/r/elixir/comments/4h4y4e/whats_missing_from_the_elixir_ecosystem/d2nvbwd>)
> during the last couple of days requiring something that checks if a
> (possibly long) string contains e.g. only alphanumeric characters.
>
> It is possible to do this using regular expressions right now:
> ~r/[^[:alnum:]]/u
>
> but this is very slow.
>
> My proposal is to add the following boolean functions to the String module:
>
> - alphabetic?
> - numeric?
> - alphanumeric?
> - whitespace?
> - uppercase?
> - lowercase?
> - control_character?
>
> Function heads for these functions can probably be best generated by using
> compile-time macros similar to what other unicode-based functions already
> use.
