Re: [elixir-core:5662] Add boolean methods for different unicode character groups (String.alphanumeric?, etc)

Eric Meadows-Jönsson Tue, 03 May 2016 15:25:32 -0700

The problem is that the Unicode module is already big; its .beam file is
one of the largest in Elixir. There are also issues compiling this file on
systems with 512 MB of memory. idna, an Erlang library for Unicode, has
similar issues on systems with low memory. Adding more functions that need
a large number of function clauses will make the issue worse and make the
compiled Elixir we distribute larger.


I think it's better to have this functionality in a library until we can
solve the memory issue, and to keep only the bare necessities for Unicode
support in the stdlib. If we can later move it into the stdlib, it would be
good to have the API figured out and the bugs fixed in a separate library
that can iterate faster.
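
As a rough illustration of the library approach, here is a minimal sketch
of compile-time generated function clauses. The module name CharGroups and
the helper alnum_char?/1 are hypothetical, and the ranges are ASCII-only
placeholders; a real library would presumably derive its ranges from the
Unicode data files.

```elixir
defmodule CharGroups do
  # Hypothetical module; not part of Elixir's stdlib.
  # ASCII-only ranges keep the sketch small (an assumption for illustration).
  @alnum_ranges [{?0, ?9}, {?A, ?Z}, {?a, ?z}]

  # One private function clause is generated per range at compile time.
  for {first, last} <- @alnum_ranges do
    defp alnum_char?(c) when c in unquote(first)..unquote(last), do: true
  end

  # Fallback clause for every codepoint outside the ranges above.
  defp alnum_char?(_), do: false

  # Walk the binary one codepoint at a time, without using a regex.
  def alphanumeric?(<<c::utf8, rest::binary>>) do
    alnum_char?(c) and alphanumeric?(rest)
  end

  # Choice made for this sketch: the empty string counts as alphanumeric.
  def alphanumeric?(<<>>), do: true
end
```

Each additional range adds one more clause, which is where the growth in
compiled size comes from once the full Unicode tables are involved.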

On Tue, May 3, 2016 at 11:29 PM, eksperimental <[email protected]>
wrote:

> I'm not sure all of those functions should be added. There could be too
> many of them, and the set would not be easy to extend.
> But how about a Unicode.info/1 function that returns a tuple with
> information about the character? For example:
> iex> Unicode.info("A")
> {:alphanumeric, :uppercase, :ascii}
>
> It would be easy to extend as we find more information to add, such as
> ISO types and other groups (especially for encodings we are not
> familiar with).
>
> Additionally we could have check?/2 (or some better name probably!)
> iex> Unicode.check?("A", :uppercase)
> true
> iex> Unicode.check?("A", :numeric)
> false
>
>
> On Tue, 3 May 2016 12:31:44 -0700 (PDT)
> [email protected] wrote:
>
> > I have seen multiple people (in the Elixir Slack group
> > <https://elixir-lang.slack.com/archives/general/p1462294660007855>
> > and on Reddit
> > <https://www.reddit.com/r/elixir/comments/4h4y4e/whats_missing_from_the_elixir_ecosystem/d2nvbwd>)
> > during the last couple of days asking for something that checks whether a
> > (possibly long) string contains, e.g., only alphanumeric characters.
> >
> > It is possible to do this using regular expressions right now:
> > ~r/[^[:alnum:]]/u
> >
> > but this is very slow.
> >
> > My proposal is to add the following boolean functions to the String
> > module:
> >
> >
> >    -  alphabetic?
> >    -  numeric?
> >    -  alphanumeric?
> >    -  whitespace?
> >    -  uppercase?
> >    -  lowercase?
> >    -  control_character?
> >
> >
> > Function heads for these functions can probably be best generated by
> > using compile-time macros similar to what other unicode-based
> > functions already use.
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "elixir-lang-core" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elixir-lang-core/20160504042910.57fd86e0.eksperimental%40autistici.org
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Eric Meadows-Jönsson

