[perl.git] branch smoke-me/khw-regcomp, created. v5.23.7-417-g797ead5

Karl Williamson Tue, 16 Feb 2016 12:51:12 -0800

In perl.git, the branch smoke-me/khw-regcomp has been created

<http://perl5.git.perl.org/perl.git/commitdiff/797ead50375ed771eb6db3431ba2f5efb3f4e43f?hp=0000000000000000000000000000000000000000>


        at  797ead50375ed771eb6db3431ba2f5efb3f4e43f (commit)

- Log -----------------------------------------------------------------
commit 797ead50375ed771eb6db3431ba2f5efb3f4e43f
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 20:59:10 2016 -0700

    perlapi: Hide the swash functions
    
    These should be internal only, and we may want to get rid of them
    someday.  Hide their existence so that people who don't already know
    about them won't be tempted to try to use them.

M       embed.fnc

commit 333842d020017775b6f059fbede972f5c0fe5acc
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 20:32:32 2016 -0700

    regcomp.h: Not all ANYOF flags are in use.
    
    So, it's better to not have a mask to include the unused ones.

M       regcomp.h

commit 1864b5847ebf813d95d82e0076e46d74126014d2
Author: Karl Williamson <[email protected]>
Date:   Tue Dec 29 22:48:09 2015 -0700

    regcomp.c: Extract code to a separate function
    
    This is in preparation for the next commit, where it will be called from
    a second place.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c

commit 2863b4c2e39544e40cbac880c2565f6eb3f7a221
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 17:35:22 2016 -0700

    regcomp.c: Rmv unnecessary tests
    
    This tested some flag bits, but these are guaranteed to be set by the
    first test in the 'if'.

M       regcomp.c

commit af40d9ead54b4e27c2ebe2e3a81833c6fee49d85
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:27:20 2016 -0700

    regcomp.c: Save a branch test
    
    This branch will only be true if the answer to the previous branch was
    also true, so can just move it to within that to avoid an unnecessary
    test.

M       regcomp.c

commit 3f8e6c0419be445f4230ce1b300f0e6eb6a7b69e
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:20:43 2016 -0700

    regcomp.c: Clarify -Dr output under /l
    
    It is now redundant to indicate that an ANYOF node is for locale, as the
    regnode type ANYOFL now clearly indicates that.  But also sometimes the
    node is only valid if the runtime locale is a UTF-8 one.  That was not
    clearly indicated.

M       regcomp.c

commit 3725f9f62b6d2001579fee83641efe1de12afa74
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:13:29 2016 -0700

    regcomp.c: Rmv unnecessary -Dr output
    
    The previous commit removed all ambiguity as far as the 2nd [] in the
    -Dr output of a bracketed character class, so we can remove the
    clarification text, which is unnecessary, and clutters up the output.
    Text must still be left in for the case where the expression is
    applicable only when the target being matched against is UTF-8.

M       regcomp.c

commit da6713ca169bc09575f111c4bc4a13afb0fc59a5
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:41:46 2016 -0700

    regcomp.c: -Dr output move
    
    This finishes the process of several commits ago of moving the output of
    what happens when the locale is UTF-8 into the first bracketed class
    expression in -Dr output.  This output thus now is accurate when the
    class is marked as inverted.

M       regcomp.c

commit f2dc6432795b903f4716b699f6824e88286b6de4
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:37:24 2016 -0700

    regcomp.c: -Dr: Add a pipe symbol for clarity
    
    This output of what gets compiled is the OR of the two [] bracketed
    expressions.  Add a '|' to indicate that.  Otherwise, it would legally
    mean one expression followed by the other.

M       regcomp.c

commit 7651d3ab3cbe7909f7ae3232a33eacea2c09fade
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:06:49 2016 -0700

    Explicitly show the chars in -Dr for which UTF-8-ness matters
    
    Prior to this commit, when displaying what a pattern compiles to,
    general text was used to indicate that the characters \x80 to \xFF all
    match when the target being matched is not UTF-8, while some of them
    match under UTF-8 as well.  This commit changes the output to show
    precisely for which ones UTF-8-ness matters.

M       regcomp.c

commit 026a9f21a1d0eddf34ab7f1abc48337a5684fad6
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 14 10:33:31 2016 -0700

    later

M       regcomp.c

commit dd1f9c910b332c91262adfc9a399a6455693bd5d
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 14 10:26:46 2016 -0700

    regcomp.c: Output XXX

M       regcomp.c

commit 5e85ced4910c43e43575bfd0c975570292b71fec
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:25:10 2016 -0700

    regcomp.c: Move some -Dr output
    
    Under -Dr compilation output, there can be multiple [...][...]
    displayed.  Some items are output to show the matches that would be
    valid when the current locale is a UTF-8 one, and they currently aren't
    displayed in the first [...].  But they should be, for the case where
    the class is inverted.  For example /[^aQ]/li should display as
    [^aQ{utf8 locale}Aq].  Not having them in the first [ ] runs afoul of De
    Morgan's laws and could be misleading.
    
    This commit doesn't get them all the way there, but it is the first step
    in doing so.

M       regcomp.c
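
The De Morgan point in the commit above can be illustrated with a small sketch. It is not code from regcomp.c: the helper names (in_set, inverted_class_matches, misread_class_matches) are hypothetical, and it assumes a UTF-8 locale is in effect so that the extra fold characters "Aq" apply. It contrasts the intended reading of [^aQ{utf8 locale}Aq] with what a reader could infer if the UTF-8-locale characters were shown in a separate, non-inverted [...].

```c
#include <assert.h>
#include <string.h>

/* Nonzero if byte c is one of the characters listed in set. */
int in_set(const char *set, unsigned char c)
{
    assert(set != NULL);
    /* strchr() would also "find" the terminating NUL, so exclude c == 0. */
    return c != 0 && strchr(set, c) != NULL;
}

/* Intended meaning of [^aQ{utf8 locale}Aq], assuming a UTF-8 locale:
 * the inversion covers both lists, i.e. NOT (aQ OR Aq). */
int inverted_class_matches(unsigned char c)
{
    return !(in_set("aQ", c) || in_set("Aq", c));
}

/* What a reader could infer from [^aQ] followed by a separate [Aq]:
 * (NOT aQ) OR Aq, which is a different set (hypothetical misreading). */
int misread_class_matches(unsigned char c)
{
    return !in_set("aQ", c) || in_set("Aq", c);
}
```

For the character 'A' the two readings disagree: the inverted class rejects it, while the misread form accepts it. That disagreement is the ambiguity the commit message says it wants to remove.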

commit 7d6b45f05d2baa28b907f7952b830b8c7f5925fc
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:07:32 2016 -0700

    regcomp.c: No need to truncate some -Dr output
    
    When displaying what a /i regex pattern compiled into, in the case of
    some that are based on the current locale, certain matches are known to
    occur when the locale is a UTF-8 one.  These are listed separately from
    the other ones in the display, and there has been code to truncate it if
    it gets too big.  However, it can't ever get too large, as the only
    things in it are the alphabetics in the 0-FF range, as everything above
    that doesn't vary by locale.  So the worst case is not very large.

M       regcomp.c

commit eac3355fa1c8cdc77f9a5336e0d0a4d8f58e17c2
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:00:36 2016 -0700

    regcomp.c: Comments, white-space, add grouping () for clarity

M       regcomp.c

commit b3c87891505d22b5ba3bfc76857ca98c126e8a81
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:02:07 2016 -0700

    Cast correctly to U8, not char
    
    U8 is what the function being called is expecting

M       regcomp.c

commit 9b9e198cfc422244dfb01719dafc7ea9461cab48
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 17:21:28 2016 -0700

    regcomp.c: Simplify a few lines of code
    
    This code had been written before the isMNEMONIC_CNTRL() macro was
    created.  Using the macro simplifies things a little.

M       regcomp.c

commit 51a07529d1b5c5a2f03f69fc819f8f32232594bb
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:51:50 2016 -0700

    regcomp.c: Clean up logic in function
    
    This function uses some crude heuristics to decide whether to make a
    synthetic start class or not.  This commit removes some redundancies.

M       regcomp.c

commit 43b3a8e8591503fbd04a400e792ce6cd9b9fb388
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:03:51 2016 -0700

    Add environment variable for -Dr: PERL_DUMP_RE_MAX_LEN
    
    The regex engine when displaying debugging info, say under -Dr, will elide
    data in order to keep the output from getting too long.  For example,
    the number of code points in all of Unicode matched by \w is quite
    large, and so when displaying a pattern that matches this, only the
    first some number of them are printed, and the rest are truncated,
    represented by "...".
    
    Sometimes, one wants to see more than what the
    compiled-into-the-engine-max shows.  This commit creates code to read
    this environment variable to override the default max lengths.  This
    changes the lengths for everything to the input number, even if they
    have different compiled maximums in the absence of this variable.
    
    I'm not currently documenting this variable, as I don't think it works
    properly under threads, and we may want to alter the behavior in various
    ways as a result of gaining experience with using it.

M       embedvar.h
M       intrpvar.h
M       regcomp.c
M       regcomp.h
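
As a rough sketch of the override described in the commit above (not the actual regcomp.c change), one environment variable can replace whatever compiled-in maximum a caller would otherwise use. Only the variable name PERL_DUMP_RE_MAX_LEN comes from the commit message. The function names and the fallback to the compiled default on unset, empty, zero, or malformed input are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a candidate override.  Falls back to compiled_default when the
 * value is missing or is not a plain positive decimal number (this
 * fallback policy is an assumption, not taken from the commit). */
size_t parse_dump_max_len(const char *value, size_t compiled_default)
{
    char *end = NULL;
    unsigned long parsed;

    /* Require a leading digit: strtoul() alone would accept "-5" or " 7". */
    if (value == NULL || !isdigit((unsigned char)value[0]))
        return compiled_default;

    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed == 0)
        return compiled_default;

    return (size_t)parsed;
}

/* Every caller passes its own compiled-in maximum; if the variable is
 * set to a valid number, all of them get that same number instead. */
size_t dump_max_len(size_t compiled_default)
{
    assert(compiled_default > 0);
    return parse_dump_max_len(getenv("PERL_DUMP_RE_MAX_LEN"),
                              compiled_default);
}
```

Under this sketch, callers with different compiled maximums would all receive the same value once the variable is set, which mirrors the "changes the lengths for everything" behavior described in the message. Reading the environment on every call is a simplification; it does not address the threading concern the author raises.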

commit 2c7be3e53a7f3611248e4d3946c2f9fdc4a8bc1c
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 11 10:25:04 2016 -0700

    regcomp.c: -Dr \xZZ instead of \x{ZZ}
    
    The brackets are unnecessary and clutter the output.

M       regcomp.c

commit cb67236ae27307f68463452ad9e337c8ea3c4a7a
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 11 10:12:57 2016 -0700

    regcomp.c: Fix -Dr bug
    
    It was using a wrong length calculation, which under some circumstances
    caused the output to include extra bytes.  Also I added comments, and
    changed a variable name, so I don't have to figure this out again from
    scratch.

M       regcomp.c

commit 9c05439ce1b2b32caeb29ebbf97fde4983b96348
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:25:35 2016 -0700

    XXX need tests, comments Fix /\p{User-defined}/i

M       regcomp.c
M       t/re/pat_advanced.t

commit bd6e8e74b959d807bb5ba4d158874829b3a1e02d
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:04:36 2016 -0700

    regcomp.c: Use macro to hide complexity
    
    There is an existing macro that does these three lines in one source
    line.

M       regcomp.c

commit 66b1baaef5a9a386fa5f9bed92866c9a2ca795e5
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 13:49:00 2016 -0700

    Don't allow /\N{}/ under 're strict'
    
    This is the one remaining empty {} that was accepted under the
    experimental 'use re "strict"'.

M       embed.fnc
M       embed.h
M       pod/perldelta.pod
M       pod/perldiag.pod
M       proto.h
M       regcomp.c
M       t/re/reg_mesg.t

commit d105faa657594560f4116a4da47f4729f2c12186
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:20:49 2016 -0700

    perlrecharclass: Add some missing info

M       pod/perlrecharclass.pod

commit 5dd4d5811bed63a0c533d875032638fc314985b7
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:35:11 2016 -0700

    PATCH: [perl 127537] /\W/ regression with UTF-8
    
    This bug is apparently uncommon in the field, as I was the one who
    discovered it.  It requires a complemented posix class, like \W or \S,
    in an inverted character class, like [^\Wfoo] in a pattern that also has
    a synthetic start class generated by the regex optimizer for it.
    
    The fix is trivial.

M       pod/perldelta.pod
M       regcomp.c
M       t/re/re_tests

commit 6d26347314b6c6a2fe517443924a5c3bd4e1ef25
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 11:53:50 2016 -0700

    regcomp.c, toke.c: swap functions being inline static
    
    grok_bslash_x() is so large that no compiler will inline it.  Move it to
    dquote.c from dquote_inline.h.  Conversely, move form_octal_warning() to
    dquote_inline.h.  It is so tiny that the function call overhead is scarcely
    smaller than the function body.
    
    This also moves things in embed.fnc so that all these functions are not
    visible outside the few files they are supposed to be used in.

M       dquote.c
M       dquote_inline.h
M       embed.fnc
M       embed.h
M       proto.h

commit faa731084701c5993d3a128fb5e50df3e6f00fcf
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 10 14:29:15 2016 -0700

    XXX partial don't push regex: Add ASCII/NASCII regnodes
    
    These are a little more efficient than using the POSIXA(:ascii:)
    mechanism.

M       pod/perldebguts.pod
M       regcomp.sym
M       regexec.c
M       regnodes.h

commit f0d98c7955eb28400675c2442ae713e58d1a6b62
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 3 13:41:11 2016 -0700

    constant.pm lower memory use

M       dist/constant/lib/constant.pm
-----------------------------------------------------------------------

--
Perl5 Master Repository
