[perl.git] branch smoke-me/khw-regex, created. v5.17.4-281-gfb484ce

Karl Williamson Sat, 13 Oct 2012 19:18:03 -0700

In perl.git, the branch smoke-me/khw-regex has been created

<http://perl5.git.perl.org/perl.git/commitdiff/fb484ce57776c883d329ed2d85e22b77e922290e?hp=0000000000000000000000000000000000000000>


        at  fb484ce57776c883d329ed2d85e22b77e922290e (commit)

- Log -----------------------------------------------------------------
commit fb484ce57776c883d329ed2d85e22b77e922290e
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 18:17:11 2012 -0600

    regcomp.c: Don't set /i in start class unless /l
    
    There is a deficiency in the optimizer in which it doesn't get rid of
    flags that it should.  One of these is whether it should match under /i.
    Currently it always (perhaps not quite, I don't know) assumes that it
    should match under /i, yielding false positives and slowing things down.
    But a recent commit changed the flag that tells it to do this, so that it
    only gets set if /l is also specified.  There is already existing code to
    work around the optimizer deficiency for /l.  This commit just moves the
    /i flag handling to that existing code, so it won't get invoked unless
    /l is specified.

M       regcomp.c

commit 152f1dc3abb20ad2c5a5460274d66f34d57d3aa6
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 10:00:18 2012 -0600

    regexp.t: Add 'no warnings "utf8";'
    
    This .t works fine unless there are failures that it tries to output,
    and the handle hasn't been opened using utf8.  Because we aren't sure if
    that operation works, just turn off warnings.

M       t/re/regexp.t

commit 9acf8664c559ac0278089e7ef5735f69dc83d6b9
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 09:52:42 2012 -0600

    utf8.h: Correct some values for EBCDIC
    
    It occurred to me that EBCDIC has different maximums for the number of
    bytes a character can occupy.  This moves the definition in utf8.h to
    within an #ifndef EBCDIC, and adds the correct values to utfebcdic.h

M       utf8.h
M       utfebcdic.h

commit 01291d4b6228f961f316413776bf5e3b2771d0a3
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 09:20:11 2012 -0600

    regex: White-space, comment only; no code changes
    
    This outdents code that just had its containing block removed, and
    reflows its comments to fill 79 columns; and does some other white space
    adjustments, plus a typo in a comment.

M       regcomp.c
M       regexec.c
M       sv.c

commit 75622c5754c9604aa4015d822eb20cfcde91e244
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 09:15:37 2012 -0600

    regex: Rename macro to reflect its narrowed use
    
    This macro is now only used under locale; its other use has now been
    removed.  Change the name to reflect its only use.

M       regcomp.c
M       regcomp.h
M       regexec.c

commit 2c4c6afadc3a5098363b1f1f3e68c15374d662b3
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 09:07:05 2012 -0600

    regex: Splice out no longer used array element
    
    A recent commit removed all uses of an array element in the middle of an
    array.  This moves up the elements that followed it.

M       regcomp.c
M       regexec.c

commit ae5247c09a68157ae08d338c41cd02c3b3d38d5d
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 08:49:26 2012 -0600

    regex: Remove old code that tried to handle multi-char folds
    
    A recent commit has changed the algorithm used to handle multi-character
    folding in bracketed character classes.  The old code is no longer
    needed.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c
M       regcomp.sym
M       regexec.c
M       regnodes.h

commit a65fd577e041093dd8e91d91c51f78b93402d85a
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 12 11:42:38 2012 -0600

    regcomp.c: Fix-up indentation; no code changes
    
    Indent a newly-formed block

M       regcomp.c

commit 03222e4baeb60fcc48d3d5519fc412d6ca319d3a
Author: Karl Williamson <[email protected]>
Date:   Thu Oct 11 21:49:31 2012 -0600

    PATCH: [perl #89774] multi-char fold + its fold in char class
    
    The design for handling characters that fold to multiple characters when
    the former are encountered in a bracketed character class is defective.
    The ticket reads, "If a bracketed character class includes a character
    that has a multi-char fold, and it also includes the first character of
    that fold, the multi-char fold will never be matched; just the first
    character of the fold.".   Thus, in the class /[\0-\xff]/i, \xDF will
    never be matched, because its fold is 'ss', the first character of
    which, 's', is also in the class.
    
    The reason the design is defective is that it doesn't allow for
    backtracking and trying the other options.
    
    This commit solves this by effectively rewriting the above to be
    / (?: \xdf | [\0-\xde\xe0-\xff] ) /xi.  And so the backtracking gets
    handled automatically by the regex engine.

M       embedvar.h
M       intrpvar.h
M       pod/perldelta.pod
M       pod/perlre.pod
M       pod/perlrecharclass.pod
M       regcomp.c
M       sv.c
M       t/re/re_tests
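
The rewrite described in the [perl #89774] commit message can be modelled
outside of Perl. The following is only a rough sketch in Python, not the
code in regcomp.c. Python's re module does not implement multi-character
case folds, so the fold of \xDF is spelled out by hand as the literal 'ss';
that literal and the helper name matches_whole are assumptions made for
illustration. The sketch shows that a lone class can only consume one
character, while the alternation lets the engine try the two-character
fold first and backtrack to the single-character class if the rest of the
pattern then fails.

```python
import re

# A bracketed class on its own consumes exactly one character, so it can
# never match the two-character fold 'ss' as a unit.
class_only = re.compile(r"[\x00-\xff]", re.IGNORECASE)

# Model of the rewritten form: the character with a multi-char fold is
# pulled out of the class into its own alternative.  The 'ss' branch is
# written by hand here (assumption), because Python does not fold \xDF
# to 'ss' the way Perl's /i does.
rewritten = re.compile(r"(?:\xdf|ss|[\x00-\xde\xe0-\xff])", re.IGNORECASE)

# Same alternation followed by a literal 's', to force backtracking:
# on the input 'ss' the 'ss' branch is tried first, the trailing 's'
# then fails, and the engine falls back to the class branch.
rewritten_then_s = re.compile(
    r"(?:\xdf|ss|[\x00-\xde\xe0-\xff])s", re.IGNORECASE
)


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for name, pattern, text in [
        ("class_only", class_only, "ss"),
        ("rewritten", rewritten, "ss"),
        ("rewritten", rewritten, "\xdf"),
        ("rewritten_then_s", rewritten_then_s, "ss"),
    ]:
        print(name, repr(text), matches_whole(pattern, text))
```

Because the alternatives are ordinary regex branches, no special-case
code is needed to retry the shorter match; the engine's normal
backtracking covers it, which is the point the commit message makes.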

commit 7f62429d27ea2645a9d3f340a322da39bf200309
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 12 11:24:34 2012 -0600

    regen/mk_invlists.pl: Make list for multi-fold chars
    
    This causes charclass_invlists.h to have a new list of all the
    characters whose fold is a sequence of more than one character.

M       charclass_invlists.h
M       regen/mk_invlists.pl

commit b6546165754863fd8eb3bd2363c69047fd24e059
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 12 09:10:10 2012 -0600

    mktables: Add table for chars with multi-char fold
    
    This will be used in a later commit

M       lib/unicore/mktables

commit ebefcf635ddccaae8224cd44688daefae51165a0
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 13 08:31:29 2012 -0600

    regcomp.c: Rename a macro, fix-up comments
    
    This very recently introduced macro's name could be clearer, and it can
    be used in another place, and the comment concerning that is slightly
    inaccurate.

M       regcomp.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
