[perl.git] branch smoke-me/khw-regex, created. v5.17.2-93-g4cb9e33

Karl Williamson Wed, 25 Jul 2012 13:03:47 -0700

In perl.git, the branch smoke-me/khw-regex has been created

<http://perl5.git.perl.org/perl.git/commitdiff/4cb9e33486d4738ef7443d971ff2853e89a04f95?hp=0000000000000000000000000000000000000000>


        at  4cb9e33486d4738ef7443d971ff2853e89a04f95 (commit)

- Log -----------------------------------------------------------------
commit 4cb9e33486d4738ef7443d971ff2853e89a04f95
Author: Karl Williamson <[email protected]>
Date:   Wed Jul 25 12:31:27 2012 -0600

    for smoke

M       embed.fnc
M       embed.h
M       lib/unicore/mktables
M       pod/perlebcdic.pod
M       proto.h
M       regcomp.c
M       regexec.c
M       t/re/pat_advanced.t
M       t/test.pl
M       utf8.h

commit c25ebd17d87927454dfb2e45b58f0b86ccca99e7
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 13:09:38 2012 -0600

    XXX squash_with_next

M       embedvar.h
M       handy.h
M       intrpvar.h
M       regcomp.c
M       sv.c

commit 35eda35c5828ea5a5456394e05ed39742cc2908a
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:55:42 2012 -0600

    Generate tables for chars that aren't in final fold pos
    
    This starts with the existing table that mktables generates that lists
    all the characters in Unicode that occur in multi-character folds, and
    aren't in the final positions of any such fold.
    
    It generates data structures with this information to make it quickly
    available to code that wants to use it.  Future commits will use these
    tables.

M       charclass_invlists.h
M       handy.h
M       l1_char_class_tab.h
M       regen/mk_PL_charclass.pl
M       regen/mk_invlists.pl

commit 263721dc09ffeede8fd1e535037f19f16f8cf403
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:44:55 2012 -0600

    regen/mk_invlists: Add mode to generate above-Latin1 only
    
    This change adds the ability to specify that an output inversion list is
    to contain only those code points that are above Latin-1.  Typically,
    the Latin-1 ones will be accessed from some other means.

M       regen/mk_invlists.pl
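
As an illustration of the "above-Latin1 only" mode described in the commit
above, here is a minimal Python sketch, not the actual regen/mk_invlists.pl
code. It assumes the usual inversion-list layout: a sorted list of code
points at which set membership toggles, starting with the first member. The
names in_invlist, above_latin1, and LATIN1_MAX are hypothetical.

```python
from bisect import bisect_right

LATIN1_MAX = 0xFF  # highest Latin-1 code point


def in_invlist(invlist, cp):
    """Return True if cp is in the set the inversion list describes.

    An odd number of toggle points at or below cp means cp is inside a range.
    """
    return bisect_right(invlist, cp) % 2 == 1


def above_latin1(invlist):
    """Return an inversion list holding only the members above Latin-1."""
    first = LATIN1_MAX + 1
    idx = bisect_right(invlist, first)
    tail = list(invlist[idx:])
    # If `first` falls inside a range, that range now starts at `first`.
    return [first] + tail if idx % 2 == 1 else tail


# Ranges: A-Z, U+00C0..U+017F, U+0400..U+04FF
sample = [0x41, 0x5B, 0xC0, 0x180, 0x400, 0x500]
trimmed = above_latin1(sample)
```

Here the range that straddles the Latin-1 boundary is clipped so that it
begins at U+0100, and the purely Latin-1 range is dropped.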

commit 8fd1e4fb6c2c8959d3844a0b00f3490298718246
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:38:41 2012 -0600

    Unicode::UCD::prop_invlist() Allow to return internal property
    
    This creates an optional undocumented parameter to this function to
    allow it to return the inversion list of an internal-only Perl property.
    This will be used by other functions in Perl, but should not be
    documented, as we don't want to encourage the use of internal-only
    properties, which are subject to change or removal without notice.

M       lib/Unicode/UCD.pm

commit cc142cd3c2c2eefb032fbb2ef233b6eb2a2640cf
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:37:52 2012 -0600

    mktables: Add comment to gen'd data file

M       lib/unicore/mktables

commit 6da9b8f4bb85be8c197d9668bf8b6bf87cd0a1f9
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:22:41 2012 -0600

    mktables: grammar in comments

M       lib/unicore/mktables

commit 6b1b9f9bdb6072c24bac03fc4416891ccadfadeb
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 12:20:42 2012 -0600

    regen/mk_PL_charclass.pl: Remove obsolete code
    
    Octals are no longer checked via this mechanism.

M       regen/mk_PL_charclass.pl

commit c03b40697da7610feb4ef9933052541d77d86501
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 11:51:43 2012 -0600

    regcomp.c: Make invlist_search() usable from re_comp.c
    
    This was a static function which I couldn't get to be callable from the
    debugging version of regcomp.c.  This makes it public, but known only
    in the regcomp.c source file.  It changes the name to begin with an
    underscore so that if someone cheats by adding preprocessor #defines,
    they still have to call it with the name that convention indicates is a
    private function.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c

commit a445b16824eff7203919f37267fcc3671a97d823
Author: Karl Williamson <[email protected]>
Date:   Mon Jun 18 11:41:18 2012 -0600

    perlop: clarify wording

M       pod/perlop.pod

commit 07c391951500cc26e066d64c484398962d2db86b
Author: Karl Williamson <[email protected]>
Date:   Sat Jun 16 20:02:07 2012 -0600

    regcomp.c: Rename static fcn to better reflect its purpose
    
    This function handles \N of any ilk, not just named sequences.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c

commit 5163624ca6288d8483ba37287628bf8fe6be0522
Author: Karl Williamson <[email protected]>
Date:   Sat Jun 16 19:55:15 2012 -0600

    regcomp.c: Make comment more accurate

M       regcomp.c

commit 14c3c8525e0a80cb5b51f20487bdded3624dbf54
Author: Karl Williamson <[email protected]>
Date:   Sat Jun 16 19:52:12 2012 -0600

    regcomp.c: Can now do /u instead of forcing to utf8
    
    Now that there is a /u modifier, a regex doesn't have to be in UTF-8 in
    order to force Unicode semantics.  Change this relic from the past.

M       regcomp.c

commit 6178a596d1d4a8dcd3bcea284db85150d2d265ff
Author: Karl Williamson <[email protected]>
Date:   Wed Jun 6 15:02:43 2012 -0600

    regcomp.c: Comments update
    
    This adds some comments and white-space lines, and updates other
    comments to account for the fact that trie handling has changed since
    they were written.

M       regcomp.c

commit 3b005b02cf84b5d5dd46eb74ee2e754b4f74a932
Author: Karl Williamson <[email protected]>
Date:   Mon May 28 10:49:37 2012 -0600

    regcomp.c: Remove variable whose value needed just once
    
    Previous commits have removed all but one instance of using this
    variable, so just use the expression it equates to.

M       regcomp.c

commit 06b8328217313d1f8b3aa5a6c988ea75cb204552
Author: Karl Williamson <[email protected]>
Date:   Mon May 28 10:42:03 2012 -0600

    regcomp.c: White-space only
    
    This indents and outdents to compensate for newly formed and orphan
    blocks, respectively, and reflows comments to fit in 80 columns.

M       regcomp.c

commit e67833e34649e7ba56337f22d4aad708c726daf2
Author: Karl Williamson <[email protected]>
Date:   Sun May 27 01:08:46 2012 -0600

    regcomp.c: Trade stack space for time
    
    Pass 1 of regular expression compilation merely calculates the size it
    will need. (Note that Yves and I both think this is very suboptimal
    behavior.)  Nothing is written out during this pass, but sizes are
    just incremented.  The code in regcomp.c all knows this, and skips
    writing things in pass 1.  However, when folding, code in other files is
    called which doesn't have this size-only mode, and always writes its
    results out.  Currently, regcomp handles this by passing to that code a
    temporary buffer allocated for the purpose.  In pass1, the result is
    simply ignored; in pass2, the results are copied to the correct final
    destination.
    
    We can avoid that copy by making the temporary buffer large enough to
    hold the whole node, and in pass1, use it instead of the node.  The
    non-regcomp code writes to the same relative spot in the buffer that it
    will use for the real node.  In pass2 the real destination is used, and
    the fold gets written directly to the correct spot.
    
    Note that this increases the size pushed onto the stack, but code is
    ripped out as well.
    
    However, the main reason I'm doing this is not this speed-up; it is
    because it is needed by future commits to fix a bug.

M       regcomp.c
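
The buffer scheme described in the commit above can be sketched roughly as
follows. This is a simplified Python model, not the regcomp.c code: fold_into
stands in for the external folding code that always writes its result,
NODE_CAPACITY is an assumed node size, and Python's str.casefold is used as
the fold. All names are hypothetical.

```python
NODE_CAPACITY = 255  # assumed maximum number of characters in one node


def fold_into(buf, offset, ch):
    """Stand-in for fold code with no size-only mode: it always writes."""
    folded = ch.casefold()
    buf[offset:offset + len(folded)] = list(folded)
    return len(folded)


def emit_folded(text, target):
    """Write the fold of each character at its final relative offset."""
    pos = 0
    for ch in text:
        pos += fold_into(target, pos, ch)
    return pos


def compile_literal(text):
    """Two-pass model; assumes the folded text fits in NODE_CAPACITY."""
    # Pass 1: size only.  A node-sized scratch buffer takes the place of
    # the real node, so the same code path runs and the writes are ignored.
    scratch = [""] * NODE_CAPACITY
    size = emit_folded(text, scratch)

    # Pass 2: the real node is the target, so each fold lands directly in
    # its final position with no intermediate copy.
    node = [""] * size
    emit_folded(text, node)
    return "".join(node)
```

For example, compile_literal("Straße") yields "strasse". The cost is the
larger scratch buffer in pass 1, which corresponds to the extra stack space
the commit message mentions.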

commit 5737363226e2a7826322f730d82b50f3cfb43fd0
Author: Karl Williamson <[email protected]>
Date:   Sun May 27 01:04:39 2012 -0600

    regcomp.c: Use mnemonic not numeric constant
    
    Future commits will add other uses of this number.

M       regcomp.c

commit a40335a726361a9e92d8e141efae42e789c3a095
Author: Karl Williamson <[email protected]>
Date:   Sat May 26 22:19:22 2012 -0600

    regcomp.c: Resolve EBCDIC inconsistency in favor of simpler code
    
    This code has assumed that to_uni_fold() returns its folds in Unicode
    (i.e.  Latin1) rather than native EBCDIC.  Other code in the core
    assumes the opposite.  One has to change.  I'm changing this one, as the
    issues should be dealt with at the lowest level possible, which is in
    to_uni_fold().  Since we don't currently have an EBCDIC platform to test
    on, making sure that it all hangs together will have to be deferred
    until such time as we do.
    
    By doing this we make this code simpler and faster.  The fold has
    already been calculated; we just need to copy it to the final place
    (done in pass2).

M       regcomp.c

commit b49ca2d363d72c4b0d8f145a120b83f98e375567
Author: Karl Williamson <[email protected]>
Date:   Sat May 26 21:39:32 2012 -0600

    regcomp.c: Use function instead of repeating its code
    
    A new flag to to_uni_fold() causes it to do the same work that this code
    does, so just call it.

M       regcomp.c

commit 88ec19533fb0a74c36ab07676ef67758842e0789
Author: Karl Williamson <[email protected]>
Date:   Sat May 26 14:19:18 2012 -0600

    regcomp.c: Remove (almost) duplicate code
    
    A previous commit opened the way to refactor this so that the two
    fairly lengthy code blocks that are identical (except for changing the
    variable <len>) can have one of them removed.

M       regcomp.c

commit 57ac1bd7077929b803c91df635fde03bd3387227
Author: Karl Williamson <[email protected]>
Date:   Thu May 24 22:14:04 2012 -0600

    regcomp.c: Refactor so can remove duplicate code
    
    This commit prepares the way for a later commit to remove a chunk of
    essentially duplicate code.  It does this at the cost of an extra
    test of a boolean each time through the loop.  But, it saves calculating
    the fold unless necessary, a potentially expensive operation.  When the
    next input is a quantifier, that calculated fold is discarded, unused.
    This commit avoids doing that calculation when the next input is a
    quantifier.

M       regcomp.c
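
The idea in the commit above, testing for a following quantifier before
computing the fold, can be illustrated with a small Python sketch. It is a
loose model under stated assumptions, not the regcomp.c loop: the quantifier
set is simplified, a quantified character is assumed to end the current
literal run so that it can go into its own node, and literal_run is a
hypothetical name.

```python
QUANTIFIERS = set("*+?{")  # simplified: characters that can start a quantifier


def literal_run(pattern, start, fold=str.casefold):
    """Collect a folded literal run beginning at `start`.

    Returns the folded text and the index at which parsing stopped.
    """
    out = []
    i = start
    while i < len(pattern) and pattern[i] not in QUANTIFIERS:
        nxt = pattern[i + 1] if i + 1 < len(pattern) else ""
        # Cheap boolean test first: a quantified character is left for its
        # own node, so folding it here would be wasted work.
        if nxt in QUANTIFIERS and out:
            break
        out.append(fold(pattern[i]))
        i += 1
    return "".join(out), i
```

With the pattern "abC+d", the run stops before "C", and the fold function is
called only for "a" and "b".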

commit 5cf91b4d07ad9c7831deccd2d5599320d52f83c9
Author: Karl Williamson <[email protected]>
Date:   Thu May 24 21:39:58 2012 -0600

    Revert "regcomp.c: Move duplicated code to inline function"
    
    This reverts commit 1ceb3049131abe6184db5a55104a620ffea6958d.

M       regcomp.c

commit 531b4d113fb48ee44d463e298be5e5d7138d15e7
Author: Karl Williamson <[email protected]>
Date:   Sun May 6 08:10:33 2012 -0600

    regcomp.c: Move duplicated code to inline function
    
    This simply extracts the code to one function with only required
    ancillary changes.  Later commits will clean things up.

M       regcomp.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
