[perl.git] branch smoke-me/khw-regcomp, created. v5.23.7-417-gf7e44c8

Karl Williamson Tue, 16 Feb 2016 11:30:10 -0800

In perl.git, the branch smoke-me/khw-regcomp has been created

<http://perl5.git.perl.org/perl.git/commitdiff/f7e44c8b378454a00689de6778ffe7eeabe3143a?hp=0000000000000000000000000000000000000000>


        at  f7e44c8b378454a00689de6778ffe7eeabe3143a (commit)

- Log -----------------------------------------------------------------
commit f7e44c8b378454a00689de6778ffe7eeabe3143a
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 20:59:10 2016 -0700

    perlapi: Hide the swash functions
    
    These should be internal only, and we may want to get rid of them
    someday.  Hide their existence so that people who don't already know
    about them won't be tempted to try to use them.

M       embed.fnc

commit b63afe405a6a338988fe2f537c896cc86262e043
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 20:32:32 2016 -0700

    regcomp.h: Not all ANYOF flags are in use.
    
    So, it's better to not have a mask to include the unused ones.

M       regcomp.h

commit 778a2478d061bf88af552f05230eceabd931a0c0
Author: Karl Williamson <[email protected]>
Date:   Tue Dec 29 22:48:09 2015 -0700

    regcomp.c: Extract code to a separate function
    
    This is in preparation for the next commit, where it will be called from
    a second place.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c

commit 7bba5830a1ba04f88e96a1836dcbac85cda89d6b
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 17:35:22 2016 -0700

    regcomp.c: Rmv unnecessary tests
    
    This tested some flag bits, but these are guaranteed to be set by the
    first test in the 'if'.

M       regcomp.c

commit 5964444fe879fa0a3647de42582e14e10ed85e4d
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:27:20 2016 -0700

    regcomp.c: Save a branch test
    
    This branch will only be true if the answer to the previous branch was
    also true, so can just move it to within that to avoid an unnecessary
    test.

M       regcomp.c
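
A minimal sketch of the kind of restructuring this commit message describes, not the actual regcomp.c code. The flag names and both functions are hypothetical, and the sketch assumes the second condition can only hold when the first one does.

```c
/* Hypothetical flag bits, invented for this illustration only. */
#define DEMO_FLAG_LOCALE      0x01u
#define DEMO_FLAG_UTF8_LOCALE 0x02u

/* Before: the second 'if' is evaluated on every call, even though it
 * can only succeed when the first one did. */
static int classify_before(unsigned flags)
{
    int result = 0;

    if (flags & DEMO_FLAG_LOCALE)
        result |= 1;
    if ((flags & DEMO_FLAG_LOCALE) && (flags & DEMO_FLAG_UTF8_LOCALE))
        result |= 2;
    return result;
}

/* After: the dependent test is nested, so it is skipped entirely when
 * the outer condition is false. */
static int classify_after(unsigned flags)
{
    int result = 0;

    if (flags & DEMO_FLAG_LOCALE) {
        result |= 1;
        if (flags & DEMO_FLAG_UTF8_LOCALE)
            result |= 2;
    }
    return result;
}
```

Both versions return the same value for every input; the nested form only avoids one comparison on the path where the outer test fails.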

commit 68554e0af843a9d222b88bffdcc188a24dff621c
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:20:43 2016 -0700

    regcomp.c: Clarify -Dr output under /l
    
    It is now redundant to indicate that an ANYOF node is for locale, as the
    regnode type ANYOFL now clearly indicates that.  But also sometimes the
    node is only valid if the runtime locale is a UTF-8 one.  That was not
    clearly indicated.

M       regcomp.c

commit 3eeac7a7cb5c60f6d4a86333ac3e18fa1ca21240
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 16:13:29 2016 -0700

    regcomp.c: Rmv unnecessary -Dr output
    
    The previous commit removed all ambiguity as far as the 2nd [] in the
    -Dr output of a bracketed character class, so we can remove the
    clarification text, which is unnecessary, and clutters up the output.
    Text must still be left in for the case where the expression is
    applicable only when the target being matched against is UTF-8.

M       regcomp.c

commit f4c28635c09635db6252fcf22abdee4b82b7249f
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:41:46 2016 -0700

    regcomp.c: -Dr output move
    
    This finishes the process of several commits ago of moving the output of
    what happens when the locale is UTF-8 into the first bracketed class
    expression in -Dr output.  This output thus now is accurate when the
    class is marked as inverted.

M       regcomp.c

commit 48810df50735caf5ba0fe87e4ea9a7bd7bd23d19
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:37:24 2016 -0700

    regcomp.c: -Dr: Add a pipe symbol for clarity
    
    This output of what gets compiled is the OR of the two [] bracketed
    expressions.  Add a '|' to indicate that.  Otherwise, it would legally
    mean one expression followed by the other.

M       regcomp.c

commit 049c754731a1b50362434f2016fc57a22022ae90
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 15:06:49 2016 -0700

    Explicitly show the chars in -Dr for which UTF-8-ness matters
    
    Prior to this commit, when displaying what a pattern compiles to,
    general text was used to indicate that the characters \x80 to \xFF all
    match when the target being matched is not UTF-8, while some of them
    matched under UTF-8 as well.  This commit changes the output to show
    precisely for which ones UTF-8-ness matters.

M       regcomp.c

commit 1c8498f13ccc407e6bc168c587a810cc7a33962a
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 14 10:33:31 2016 -0700

    later

M       regcomp.c

commit 657de4ab36366cba93eb2aa83d82ab8ee8e4e06a
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 14 10:26:46 2016 -0700

    regcomp.c: Output XXX

M       regcomp.c

commit 3d9c3c4bcb2323353f1c301d92c7c24fbf80f5ba
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:25:10 2016 -0700

    regcomp.c: Move some -Dr output
    
    Under -Dr compilation output, there can be multiple [...][...]
    displayed.  Some items are output to show the matches that would be
    valid when the current locale is a UTF-8 one, and they currently aren't
    displayed in the first [...].  But they should be, for the case where
    the class is inverted.  For example /[^aQ]/li should display as
    [^aQ{utf8 locale}Aq].  Not having them in the first [ ] runs afoul of De
    Morgan's laws and could be misleading.
    
    This commit doesn't get them all the way there, but it is the first step
    in doing so.

M       regcomp.c
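
The De Morgan point in this message can be sketched in a few lines of C. This is an illustration under assumptions, not regcomp.c code: the function names are hypothetical, and the semantics (a and Q always in the class, A and q added only under a UTF-8 locale) are taken from the /[^aQ]/li example above.

```c
#include <stdbool.h>

/* Members of the class regardless of locale. */
static bool in_base(unsigned char c)
{
    return c == 'a' || c == 'Q';
}

/* Members added only when the runtime locale is a UTF-8 one. */
static bool in_utf8_only(unsigned char c)
{
    return c == 'A' || c == 'q';
}

/* Intended meaning of [^aQ{utf8 locale}Aq]: the complement applies to
 * the union of both sets. */
static bool matches_inverted(unsigned char c, bool utf8_locale)
{
    return !(in_base(c) || (utf8_locale && in_utf8_only(c)));
}

/* What a reader could infer if the locale-dependent items were shown
 * in a second, non-inverted bracket: NOT(base) OR utf8-only. */
static bool misleading_reading(unsigned char c, bool utf8_locale)
{
    return !in_base(c) || (utf8_locale && in_utf8_only(c));
}
```

For 'A' under a UTF-8 locale the two functions disagree: the inverted class must not match it, while the misleading reading says it does.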

commit b2a0e25758f677731e96f509ffa3918b7d5a6b74
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:07:32 2016 -0700

    regcomp.c: No need to truncate some -Dr output
    
    When displaying what a /i regex pattern compiled into, in the case of
    some that are based on the current locale, certain matches are known to
    occur when the locale is a UTF-8 one.  These are listed separately from
    the other ones in the display, and there has been code to truncate it if
    it gets too big.  However, it can't ever get too large, as the only
    things in it are the alphabetics in the 0-FF range, as everything above
    that doesn't vary by locale.  So the worst case is not very large.

M       regcomp.c

commit 94dd3f3f3ab531f1cd9d086f81ff4d0976de002c
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 18:00:36 2016 -0700

    regcomp.c: Comments, white-space, add grouping () for clarity

M       regcomp.c

commit 782ac453836e593b0572af5683dda77acfdfcf0e
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:02:07 2016 -0700

    Cast correctly to U8, not char
    
    U8 is what the function being called is expecting

M       regcomp.c
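
The commit message does not show the call site, so the following is only a generic sketch of why the cast type matters. The typedef is a stand-in for perl's own U8, and both functions are hypothetical.

```c
#include <stddef.h>

typedef unsigned char U8;   /* stand-in for perl's U8 in this sketch */

/* Hypothetical callee that, like many byte-oriented routines, expects
 * unsigned bytes.  Comparing a plain 'char' against 0x80 would never
 * succeed on platforms where 'char' is signed. */
static size_t count_high_bytes(const U8 *s, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if (s[i] >= 0x80)
            n++;
    return n;
}

/* The caller holds a plain char buffer and casts to the type the
 * callee actually declares, rather than to some other char type. */
static size_t count_high_in_string(const char *s, size_t len)
{
    return count_high_bytes((const U8 *)s, len);
}
```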

commit 7d772c3f2a700b41d5322c35013ea789def283d8
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 17:21:28 2016 -0700

    regcomp.c: Simplify a few lines of code
    
    This code had been written before the isMNEMONIC_CNTRL() macro was
    created.  Using the macro simplifies things a little.

M       regcomp.c

commit 82be73e79610a7f694fba85140c1fe2ec7773421
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:51:50 2016 -0700

    regcomp.c: Clean up logic in function
    
    This function uses some crude heuristics to decide whether to make a
    synthetic start class or not.  This commit removes some redundancies.

M       regcomp.c

commit 4f208f2f6b366e3205ace32b485fa29e93b1ac70
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:03:51 2016 -0700

    Add environment variable for -Dr: PERL_DUMP_RE_MAX_LEN
    
    The regex engine when displaying debugging info, say under -Dr, will elide
    data in order to keep the output from getting too long.  For example,
    the number of code points in all of Unicode matched by \w is quite
    large, and so when displaying a pattern that matches this, only the
    first some number of them are printed, and the rest are truncated,
    represented by "...".
    
    Sometimes, one wants to see more than what the
    compiled-into-the-engine-max shows.  This commit creates code to read
    this environment variable to override the default max lengths.  This
    changes the lengths for everything to the input number, even if they
    have different compiled maximums in the absence of this variable.
    
    I'm not currently documenting this variable, as I don't think it works
    properly under threads, and we may want to alter the behavior in various
    ways as a result of gaining experience with using it.

M       embedvar.h
M       intrpvar.h
M       regcomp.c
M       regcomp.h
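
A minimal sketch of how an environment variable could override compiled-in display limits, as this message describes. Only the variable name PERL_DUMP_RE_MAX_LEN comes from the commit; the helper names are hypothetical, and falling back to the default on a malformed value is an assumption of this sketch, not documented behavior.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return the override parsed from 'env_value',
 * or 'compiled_default' when the value is absent or not a plain
 * decimal number. */
static size_t dump_max_len(const char *env_value, size_t compiled_default)
{
    char *end;
    unsigned long n;

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (env_value == NULL || !isdigit((unsigned char)env_value[0]))
        return compiled_default;

    errno = 0;
    n = strtoul(env_value, &end, 10);
    if (errno != 0 || *end != '\0')
        return compiled_default;

    return (size_t)n;
}

/* Every call site passes its own compiled default, but all of them
 * receive the same number once the variable is set. */
static size_t dump_max_len_from_env(size_t compiled_default)
{
    return dump_max_len(getenv("PERL_DUMP_RE_MAX_LEN"), compiled_default);
}
```

Separating the parsing from the getenv() call keeps the parsing logic easy to exercise without touching the process environment.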

commit 2c7be3e53a7f3611248e4d3946c2f9fdc4a8bc1c
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 11 10:25:04 2016 -0700

    regcomp.c: -Dr \xZZ instead of \x{ZZ}
    
    The brackets are unnecessary and clutter the output.

M       regcomp.c
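
As a rough illustration of the output change named in the subject line, the sketch below formats a code point without braces when two hex digits suffice. The function name, the uppercase digits, and the braced form for larger values are assumptions of this sketch, not taken from regcomp.c.

```c
#include <stdio.h>

/* Hypothetical formatter: write an escape for 'cp' into 'buf' and
 * return the number of characters that the escape needs. */
static int format_code_point(char *buf, size_t size, unsigned long cp)
{
    if (cp <= 0xFF)
        return snprintf(buf, size, "\\x%02lX", cp);  /* e.g. \xE9 */

    return snprintf(buf, size, "\\x{%lX}", cp);      /* e.g. \x{100} */
}
```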

commit cb67236ae27307f68463452ad9e337c8ea3c4a7a
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 11 10:12:57 2016 -0700

    regcomp.c: Fix -Dr bug
    
    It was using a wrong length calculation, which under some circumstances
    caused the output to include extra bytes.  Also I added comments, and
    changed a variable name, so I don't have to figure this out again from
    scratch.

M       regcomp.c

commit 9c05439ce1b2b32caeb29ebbf97fde4983b96348
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:25:35 2016 -0700

    XXX need tests, comments Fix /\p{User-defined}/i

M       regcomp.c
M       t/re/pat_advanced.t

commit bd6e8e74b959d807bb5ba4d158874829b3a1e02d
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 15 11:04:36 2016 -0700

    regcomp.c: Use macro to hide complexity
    
    There is an existing macro that does these three lines in one source
    line.

M       regcomp.c

commit 66b1baaef5a9a386fa5f9bed92866c9a2ca795e5
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 13:49:00 2016 -0700

    Don't allow /\N{}/ under 're strict'
    
    This is the one remaining empty {} that was accepted under the
    experimental 'use re "strict"'.

M       embed.fnc
M       embed.h
M       pod/perldelta.pod
M       pod/perldiag.pod
M       proto.h
M       regcomp.c
M       t/re/reg_mesg.t

commit d105faa657594560f4116a4da47f4729f2c12186
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:20:49 2016 -0700

    perlrecharclass: Add some missing info

M       pod/perlrecharclass.pod

commit 5dd4d5811bed63a0c533d875032638fc314985b7
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 15:35:11 2016 -0700

    PATCH: [perl 127537] /\W/ regression with UTF-8
    
    This bug is apparently uncommon in the field, as I was the one who
    discovered it.  It requires a complemented posix class, like \W or \S,
    in an inverted character class, like [^\Wfoo] in a pattern that also has
    a synthetic start class generated by the regex optimizer for it.
    
    The fix is trivial.

M       pod/perldelta.pod
M       regcomp.c
M       t/re/re_tests

commit 6d26347314b6c6a2fe517443924a5c3bd4e1ef25
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 13 11:53:50 2016 -0700

    regcomp.c, toke.c: swap functions being inline static
    
    grok_bslash_x() is so large that no compiler will inline it.  Move it to
    dquote.c from dq_inline.c.  Conversely, move form_octal_warning() to
    dq_inline.c.  It is so tiny that the function call overhead is scarcely
    smaller than the function body.
    
    This also moves things in embed.fnc so that all these functions are not
    visible outside the few files they are supposed to be used in.

M       dquote.c
M       dquote_inline.h
M       embed.fnc
M       embed.h
M       proto.h

commit faa731084701c5993d3a128fb5e50df3e6f00fcf
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 10 14:29:15 2016 -0700

    XXX partial don't push regex: Add ASCII/NASCII regnodes
    
    These are a little more efficient than using the POSIXA(:ascii:)
    mechanism.

M       pod/perldebguts.pod
M       regcomp.sym
M       regexec.c
M       regnodes.h

commit f0d98c7955eb28400675c2442ae713e58d1a6b62
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 3 13:41:11 2016 -0700

    constant.pm lower memory use

M       dist/constant/lib/constant.pm
-----------------------------------------------------------------------

--
Perl5 Master Repository
