In perl.git, the branch smoke-me/khw-ucd has been created

<http://perl5.git.perl.org/perl.git/commitdiff/f191d4185c958318ea276fc9a5cbb7ce6a19e53b?hp=0000000000000000000000000000000000000000>

        at  f191d4185c958318ea276fc9a5cbb7ce6a19e53b (commit)

- Log -----------------------------------------------------------------
commit f191d4185c958318ea276fc9a5cbb7ce6a19e53b
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 28 12:44:29 2012 -0700

    utf8.c: white-space only
    
    This adds an indent now that the code is in a newly created block

M       utf8.c

commit 6318886ed727557da4f73d95f245ec6e1128a3b8
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 28 10:47:25 2012 -0700

    utf8.c: Use the new compact case mapping tables
    
    This changes the Perl core when looking up the
    upper/lower/title/fold-case of a code point to use the newly created
    more compact tables.  Currently the look-up is done by a linear search,
    and the new tables are 54-61% of the size of the old ones, so that on
    average searches are that much shorter
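
To make the cost argument above concrete, here is a minimal sketch of a linear search through a range table that stores deltas. It is written in Python purely for illustration. The table layout, the name LOWER_DELTAS and the function map_code_point are assumptions made for this example, not the format mktables writes or the code in utf8.c.

```python
# Hypothetical in-memory table: (first, last, delta) ranges sorted by code point.
# This is NOT the file format that mktables generates.
LOWER_DELTAS = [
    (0x41, 0x5A, 32),   # A-Z map to a-z
    (0xC0, 0xD6, 32),   # U+00C0..U+00D6 map to U+00E0..U+00F6
]

def map_code_point(table, cp):
    """Linear search; return the mapped code point, or cp if no range covers it."""
    for first, last, delta in table:
        if cp < first:
            break               # table is sorted, so no later range can match
        if cp <= last:
            return cp + delta   # the stored delta is added to the code point
    return cp
```

With fewer entries to walk, the loop ends sooner on average, which is the saving the commit message describes.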

M       utf8.c

commit 7730c35ea26c103cd07fc214fad2ef77d50cd241
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 28 09:51:58 2012 -0700

    mktables: Generate some delta tables
    
    This commit has the effect of changing the non-legacy tables for the lc,
    uc, tc, and fc properties to use maps of deltas from the code points
    instead of the code points themselves, thus shortening them
    significantly, and hence the time required to search through them.
    
    Note that these tables are new, and currently used only by Unicode::UCD.
    A future commit will change the Perl core to use them.

M       lib/Unicode/UCD.pm
M       lib/Unicode/UCD.t
M       lib/unicore/mktables

commit 7d96b3aea7a4ed926607d02ede5ea8f6b86eae38
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 28 09:26:29 2012 -0700

    mktables: Change generated file comment
    
    All the files that mktables generates for external-to-core use have now
    been changed so that the code explicitly requests, for each of them, the
    comment saying that the file is for external use but that using it is
    deprecated.  That means that any file that hasn't been explicitly set
    this way should instead get the comment saying that it is for internal
    use only.

M       lib/unicore/mktables

commit 5ad03ab051d90ad51a529be9aade9346d4e3c7ee
Author: Karl Williamson <[email protected]>
Date:   Fri Jan 27 11:33:51 2012 -0700

    mktables: Preserve old format in some tables
    
    Future commits will cause tables that map to code points to, in general,
    use deltas instead.  This ensures that files that contain tables and
    have been mentioned publicly in the past continue to have their current
    contents and format, so that applications that read them (such as
    Unicode::Normalize) are unaffected.

M       lib/unicore/mktables

commit 456a214a64e606d23febd688816b78c5c68abe41
Author: Karl Williamson <[email protected]>
Date:   Fri Jan 27 11:26:03 2012 -0700

    Unicode::UCD: pod and comment nits

M       lib/Unicode/UCD.pm

commit 56adcde11defe379793e124f010da2bb2cba63a7
Author: Karl Williamson <[email protected]>
Date:   Fri Jan 27 10:50:47 2012 -0700

    mktables: Allow generation of delta tables
    
    Delta tables are those in which the mapping is not stored as-is, but is
    modified to be the delta between the actual mapping and the code point
    it is for.  This allows for smaller tables that are faster to search and
    require less memory to store.
    
    For example, consider the lower case mapping of A=>a, B=>b, ... Z=>z.
    Prior to this patch, this requires 26 entries in the table; now it
    requires just one.  This is because A=65 and a=97.  We store 97-65=32.
    And 32 is the same delta for each of A-Z, so we can store these as a
    single range each with the same value, 32.
    
    The delta tables tend to be half as large as the non-delta ones, or even
    smaller.
    
    This just enables the feature.  No tables currently use it.  For that,
    other changes in Unicode::UCD need to be coordinated.
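
The A-Z example above can be reproduced with a short sketch. It is in Python for illustration only, since mktables itself is Perl. The function name build_delta_ranges and the list-based range layout are assumptions made for this example, not mktables internals.

```python
def build_delta_ranges(mapping):
    """Collapse {code_point: mapped_code_point} into [first, last, delta] ranges.

    Adjacent code points that share the same delta are merged into one range.
    """
    ranges = []
    for cp in sorted(mapping):
        delta = mapping[cp] - cp
        if ranges and ranges[-1][1] == cp - 1 and ranges[-1][2] == delta:
            ranges[-1][1] = cp              # extend the current range
        else:
            ranges.append([cp, cp, delta])  # start a new range
    return ranges

# 26 separate entries (A=>a ... Z=>z) before the conversion ...
lower = {cp: cp + 32 for cp in range(ord("A"), ord("Z") + 1)}
# ... and a single range holding the delta 97-65=32 afterwards.
ranges = build_delta_ranges(lower)
```

Here ranges holds one entry covering 65 through 90 with the value 32.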

M       lib/unicore/mktables

commit 4fe183eae1d0fe68ced3a0531150f348431d7586
Author: Karl Williamson <[email protected]>
Date:   Thu Jan 26 21:27:30 2012 -0700

    mktables: White-space, comments only
    
    A previous commit has added two nested blocks surrounding the affected
    code.  This looks like a big change, but it is in fact only white space
    plus reflowing things to fit in an 80 column window, plus slight changes
    to comments.
    
    I verified that there were no code changes by using a diff command that
    can ignore leading white space changes, and hence gave a more accurate
    difference listing

M       lib/unicore/mktables

commit 8ad97c97b2f42fbf7797d14e9e3b98e3b0cd0d5d
Author: Karl Williamson <[email protected]>
Date:   Thu Jan 26 21:01:33 2012 -0700

    mktables: Refactor if-else series
    
    This is a slight refactoring to avoid using 'next' in the loop, and to
    surround things with a bare block.  Future commits will want to
    run common code at the bottom of the loop, including a redo of the bare
    block.

M       lib/unicore/mktables

commit 91b67dda5250b7c7bb3bd1575df9ae77a32aeec4
Author: Karl Williamson <[email protected]>
Date:   Mon Jan 23 12:43:42 2012 -0700

    Unicode::UCD::prop_invmap(): Use regex to get trie
    
    This should speed up this test slightly

M       lib/Unicode/UCD.pm

commit cfc57f2347d54aff74aff4de333846c5c1d64ff8
Author: Karl Williamson <[email protected]>
Date:   Sun Jan 22 08:50:24 2012 -0700

    mktables: Don't generate no-longer used tables
    
    Previous commits have removed all uses of these tables, so they are no
    longer needed.

M       lib/unicore/mktables

commit ca6a2b3529849b388308bf3ef181b4cb054eec75
Author: Karl Williamson <[email protected]>
Date:   Sun Jan 22 08:35:34 2012 -0700

    Unicode::UCD: Rmv uses of no-longer needed tables
    
    Previous commits have expanded what's in the full case mapping tables
    to include the simple maps as well.  Thus the specially constructed
    tables need no longer be used, leading to simplification.

M       lib/Unicode/UCD.pm

commit feca1a35e0cdf17b0c45f04b0c91fc6eeea4d194
Author: Karl Williamson <[email protected]>
Date:   Sun Jan 22 08:27:11 2012 -0700

    UCD.t: white space only
    
    outdent now that surrounding block is removed

M       lib/Unicode/UCD.t

commit 182a0bdf2ef5ce74a2a8568acb5778ceef266653
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 20:04:51 2012 -0700

    mktables: Include simple mappings in full tables
    
    This changes the case change mapping tables to include the simple
    mappings.  This was done in 5.14 for the case folding table.  The full
    mappings are contained, as before, in a hash.  Now the simple mappings
    they override (when doing multi-char case changing) are added to the
    main body of the table, to the already existing simple mappings that
    aren't overridden.
    
    If the caller wants to do full mapping, it should look first in the
    hash, and only if not found, look in the main body.  If the caller wants
    only simple mapping, it ignores the hash.
    
    This is already how the code in utf8.c that reads these tables is
    constructed.
    
    The .t is modified to take into account that these code points are now
    in the main table body.
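
The lookup order described above (hash first for full mapping, main body only for simple mapping) can be sketched as follows. This is Python for illustration only. FULL_LOWER, MAIN_LOWER and lower_map are hypothetical names, and the two dictionaries stand in for the generated table; they are not its real format. The sample entry is U+0130, whose simple lower case mapping is U+0069 and whose full mapping is U+0069 U+0307.

```python
# Hypothetical stand-ins for one generated table.
FULL_LOWER = {0x130: [0x69, 0x307]}      # the hash: multi-char (full) mappings
MAIN_LOWER = {0x41: 0x61, 0x130: 0x69}   # main body: simple mappings, now
                                         # including the overridden ones

def lower_map(cp, full=True):
    """Return the mapping of cp as a list of code points."""
    if full and cp in FULL_LOWER:
        return list(FULL_LOWER[cp])      # full mapping takes precedence
    return [MAIN_LOWER.get(cp, cp)]      # simple mapping, or cp unchanged
```

A caller wanting only simple mappings passes full=False and never consults the hash.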

M       lib/Unicode/UCD.t
M       lib/unicore/mktables

commit 0755060e0a05c328fb5f712d57ab2dfe4ce78db2
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 15:27:00 2012 -0700

    mktables: Add duplicate tables
    
    This is for backwards compatibility.  Future commits will change these
    tables that are generated by mktables to be more efficient.  But the
    existence of them was advertised in v5.12 and v5.14, as something a Perl
    program could use because the Perl core did not provide access to their
    contents.  We can't change the format of those without some notice.
    
    The solution adopted is to have two versions of the tables: one, kept
    under the original file name, retains the original format; the other is
    free to change formats at will.
    
    This commit just creates copies of the original, with the same format.
    Later commits will change the format to be more efficient.
    
    We state in v5.16 that using these files is now deprecated, as the
    information is now available through Unicode::UCD in a stable API.  But
    we don't test for whether someone is opening and reading these files; so
    the deprecation cycle should be somewhat long;  they will be unused, and
    the only drawbacks to having them are some extra disk space and the time
    spent in having to generate them at Perl build time.
    
    This commit also changes the Perl core to use the original tables, so
    that the new format can be gradually developed in a series of patches
    without having to cut over the whole thing at once.

M       lib/unicore/mktables
M       utf8.c

commit 63704a7c5dbeaa80c8b35c9513ebb5c5f202dbd3
Author: Karl Williamson <[email protected]>
Date:   Thu Jan 26 11:30:37 2012 -0700

    mktables: avoid some extra work
    
    The object is already known to us as the loop variable, so no need to
    derive it again; and change the loop variable name and one other
    variable name to distinguish the table as being the full map one from
    the simple map one

M       lib/unicore/mktables

commit 7f5a3d57583c2f962d9f9175fda0f386a7e7357e
Author: Karl Williamson <[email protected]>
Date:   Thu Jan 26 09:52:26 2012 -0700

    mktables: Allow non-standard initializations of properties
    
    Some property tables have multiple values per code point.  These include
    the final Name-equivalent property in which some code points have more
    than one synonym; and the full case changing property tables that are
    supersets of the simple case changing tables, in which some code points
    have a full mapping that differs from the simple mapping.
    
    Prior to this patch, these could not be initialized simply using the
    Initialize parameter to the constructor, as it was unable to handle
    multiple values per code point.
    
    This also preserves the range type.

M       lib/unicore/mktables

commit b179c4398f5244b74260cdb8eb088c767302805e
Author: Karl Williamson <[email protected]>
Date:   Mon Jan 23 09:23:16 2012 -0700

    mktables: Comments, white-space and typo in message text only

M       lib/unicore/mktables

commit b3411c68af543906423eebccd93b499945860bfa
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 12:57:41 2012 -0700

    mktables: Refactor populating simple case folding tables
    
    These three tables are handled alike; this creates a loop to execute the
    same instructions on each of them.  Currently there is so little to do,
    that it wouldn't be worth it, except that future commits will add
    complications, and this makes those easier to handle.
    
    There is now a test that the input data is sane, and instead of
    overwriting a value in a table with a known identical value, we skip
    that.  This doesn't save much effort, because most of the work is
    looking up the value (which we can now check sanity for), but again will
    be useful for future commits.

M       lib/unicore/mktables

commit 40395caf55659468a177725a1b7b80ad662ebfcf
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 13:19:15 2012 -0700

    mktables: Assume a leading zero means hex format
    
    When calculating the format of a table, assume that a decimal number
    never has leading zeros, so that a leading zero means hex.
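
A minimal sketch of such a heuristic, in Python for illustration only, is given below. The function name guess_format, the returned labels, and the treatment of a bare "0" and of values containing A-F are assumptions made for this example; the commit message specifies only the leading-zero rule.

```python
import re

def guess_format(value):
    """Guess whether one table value is 'hex', 'decimal' or a 'string'."""
    if re.fullmatch(r"0[0-9A-Fa-f]+", value):
        return "hex"        # leading zero: taken to mean hex, e.g. "0041"
    if re.fullmatch(r"[0-9]+", value):
        return "decimal"    # digits only, no leading zero, e.g. "65"
    if re.fullmatch(r"[0-9A-Fa-f]+", value):
        return "hex"        # contains A-F, so it cannot be decimal
    return "string"
```

In this sketch a lone "0" is classed as decimal, because it has no digits after the zero.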

M       lib/unicore/mktables

commit ce155ec91b0b0903b0f59d4f638114107fbc2a9f
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 12:47:01 2012 -0700

    mktables: Don't populate the _stc table
    
    This table was used only by Unicode::UCD which no longer uses it, and it
    turns out that the data in it are redundant.  This is in preparation for
    refactoring and removal of the table altogether.

M       lib/unicore/mktables

commit fe319fc7150504e7fd454235a0c9c2536870e349
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 12:37:28 2012 -0700

    Unicode::UCD: Don't read _stc table
    
    It turns out that currently in Unicode 6.0, this table is redundant.
    This prepares for removing it altogether.

M       lib/Unicode/UCD.pm

commit fb0dcb7b8e4452ce9f9dad00b8b1df9c7ef2020e
Author: Karl Williamson <[email protected]>
Date:   Sat Jan 21 09:21:40 2012 -0700

    mktables: Subroutine call needs to be fully qualified
    
    As it is calling something in a different package

M       lib/unicore/mktables
-----------------------------------------------------------------------

--
Perl5 Master Repository
