On Sat, Nov 5, 2016 at 10:55 AM, Jovan Trujillo
<jovan.trujil...@gmail.com> wrote:
> Hi Aaron,
>    In perlre I read that \w
> "
>
>     \w        [3]  Match a "word" character (alphanumeric plus "_", plus
>                    other connector punctuation chars plus Unicode
>                    marks)
>
> "
>
> So since I didn't know what these 'other' connection punctuation chars are I
> avoided it. Unicode makes things more complicated for me. Do you know?
>

To exclude Unicode and ensure only ASCII matches, use the /a modifier,
e.g., /\w+/a

From perlre:

/a

       is the same as "/u", except that "\d", "\s", "\w", and the Posix
       character classes are restricted to matching in the ASCII range only.
       That is, with this modifier, "\d" always means precisely the digits "0"
       to "9"; "\s" means the five characters "[ \f\n\r\t]"; "\w" means the 63
       characters "[A-Za-z0-9_]"; and likewise, all the Posix classes such as
       "[[:print:]]" match only the appropriate ASCII-range characters.
