[perl.git] branch smoke-me/khw-core created. v5.27.11-77-g743b6ca19e

Karl Williamson Sun, 06 May 2018 08:11:05 -0700

In perl.git, the branch smoke-me/khw-core has been created

<https://perl5.git.perl.org/perl.git/commitdiff/743b6ca19e5c571abec4cc8b5e0877ea30af537b?hp=0000000000000000000000000000000000000000>


        at  743b6ca19e5c571abec4cc8b5e0877ea30af537b (commit)

- Log -----------------------------------------------------------------
commit 743b6ca19e5c571abec4cc8b5e0877ea30af537b
Author: Karl Williamson <k...@cpan.org>
Date:   Sun May 6 09:08:06 2018 -0600

    t/porting/regen.t: Add test for new uni_keywords.h

commit 30cb9b0fa4b3514254f5f39a016cd2ac94e08d34
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 22:07:55 2018 -0600

    regen/mk_invlists.pl: Fix outdated comments

commit 9dc5476bf6017820fd9a88fcde48a8961486850d
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 21:21:45 2018 -0600

    regen/mk_invlists.pl: use re 'qr/aa'
    
    This makes sure that all patterns in this file are compiled under /aa.
    Doing this can catch bugs.  The bug the previous commit fixes would have
    been caught if we did this.

commit a4783fb2193e2c3dab8296d30cb4ea657d0451f1
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 20:46:21 2018 -0600

    regen/mk_invlists.pl: Fix chicken and egg problem
    
    The problem here is that it was using a regular expression pattern to
    determine if a code point is the integer 0.  When a new Unicode release
    comes along and adds a new block of decimals, this routine should be run
    before the interpreter is compiled for real.  And the pattern won't know
    about the new block, so this would fail.
    
    Solve the problem by using only Unicode::UCD to discover this info, and
    not a pattern.

commit 1158b614b7a19bf10b1740bd47b6a4e2e3d5fb35
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 19:53:18 2018 -0600

    mktables: Add, change some comments

commit a390add465d5df4ce84256ae8b402e51d4fa61eb
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 12:13:37 2018 -0600

    utf8.c: Use a more generic enum instead of explicit ptr
    
    This changes, where possible, the reference to an inversion list, from
    its specific name, to using an enum value (or a #define to an enum
    value) which is an offset into a list of inversion lists.
    
    This seems slightly more robust to me, as we don't have to know the
    precise name of the table, but can use an enum which may have #define's
    for it to create synonyms.  Some versions of Unicode may not have the
    precise name, but regen/mk_invlists.pl creates synonyms where possible,
    so the chances of it being undefined go down.
    
    Currently there is an inconsistency in the tables' names.  Some recent
    ones all begin with 'PL_'.  That was when I thought these tables were
    all going to be public.  But then it turned out that they could just be
    defined in one file (utf8.c), so the prefix is probably unnecessary.
    Older tables didn't have that, and haven't changed.  I'm not sure how it
    will or should turn out.
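
The enum-as-offset idea in the commit above can be illustrated with a minimal sketch. Every name here (prop_index, PROP_ALPHA, prop_table, prop_contains) and the toy ASCII-only data are assumptions made for illustration; none of them are the identifiers or tables that utf8.c or regen/mk_invlists.pl actually use.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical, chosen for illustration only. */
typedef enum {
    PROP_ALPHABETIC,
    PROP_DIGIT,
    PROP_COUNT
} prop_index;

/* A synonym is just another name for the same slot in the table. */
#define PROP_ALPHA PROP_ALPHABETIC

/* Toy inversion lists: each entry is a code point where membership flips. */
static const unsigned long alphabetic_invlist[] = { 0x41, 0x5B, 0x61, 0x7B };
static const unsigned long digit_invlist[]      = { 0x30, 0x3A };

typedef struct {
    const unsigned long *list;
    size_t len;
} invlist;

/* The enum value is an offset into this list of inversion lists. */
static const invlist prop_table[PROP_COUNT] = {
    [PROP_ALPHABETIC] = { alphabetic_invlist, 4 },
    [PROP_DIGIT]      = { digit_invlist, 2 },
};

/* cp is in the set when an odd number of boundaries are <= cp. */
static int prop_contains(prop_index which, unsigned long cp)
{
    assert(which < PROP_COUNT);
    const invlist *il = &prop_table[which];
    size_t below = 0;
    while (below < il->len && il->list[below] <= cp)
        below++;
    return (int)(below & 1);
}
```

A caller writes prop_contains(PROP_ALPHA, cp) without knowing which array backs the property, which is the robustness the commit message describes.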

commit dff12247092e76f6af7388eeabf5e0880c8c4ce8
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 12:01:27 2018 -0600

    utf8.c: Reorder some initialization code
    
    This puts the code into various related groups.

commit a9022eee5e4a513e078bc8fc3e97ea8100b199cf
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 11:38:18 2018 -0600

    utf8.c: Fix \p{} to work on old Unicodes
    
    This change to use one #define instead of a synonym causes the code to
    work unchanged on any Unicode version.  The synonym isn't defined in
    very old Unicodes, so this wouldn't compile for them.

commit 2ee52ad108a5adb49858ef16a001fc3db76bb52e
Author: Karl Williamson <k...@cpan.org>
Date:   Sat May 5 11:28:09 2018 -0600

    utf8.c: qr/\p{}/ Handle Unihan numeric properties
    
    The Unihan database is not shipped with perl due to its size.  But we
    allow someone to copy its files into the unicore directory and recompile
    perl in order to get access to its properties.  Some of those properties
    are numeric, which, like the nv property, require special handling in
    utf8.c.  This commit adds that handling.

commit be0a00edd128ba8dbb96dc82f0de34adc5236851
Author: Karl Williamson <k...@cpan.org>
Date:   Fri May 4 22:25:54 2018 -0600

    mktables: Handle cjkiicore properly
    
    This property is not normally compiled by perl, but an installation may
    choose to use it.  It was failing some tests because this is a special
    property that is like a perl dual-var.  It is both binary, and
    non-binary, and commit 346f9bfbe12 forgot that.

commit 1caeedf042be23d2105bc83261c80fdcbbd6206b
Author: Karl Williamson <k...@cpan.org>
Date:   Fri May 4 21:26:31 2018 -0600

    PATCH: [perl #133175] script run free from wrong pool panic
    
    Setting the pointer to NULL after freeing signals the code in later
    iterations that it has been freed already.
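
The general pattern the message describes looks roughly like the sketch below. It is a generic illustration, not the actual patch for [perl #133175]; free_and_clear and release_in_loop are hypothetical names, and plain malloc/free stand in for perl's own allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free *slot and mark it as already freed. */
static void free_and_clear(char **slot)
{
    assert(slot != NULL);
    free(*slot);
    *slot = NULL;   /* later iterations see NULL and skip the free */
}

/* Walks a loop that may reach the cleanup path several times and
 * returns how many real frees happened. */
static int release_in_loop(char **slot, int iterations)
{
    int frees = 0;
    for (int i = 0; i < iterations; i++) {
        if (*slot != NULL) {
            free_and_clear(slot);
            frees++;
        }
    }
    return frees;
}
```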

commit 045b2702237f86159a6997fed6a95163dc5ebfe1
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 17:26:42 2018 -0600

    regen/mk_invlists.pl: Fix-ups for early Unicode versions
    
    In some of these, certain properties aren't defined yet, so have no
    entries.  Just add a check for that, and compensate.

commit 2d6d50a0b32ce180cab03c5384ba1d1f534d0549
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 16:42:29 2018 -0600

    regcomp.c: Simplify
    
    Under /a pattern matching, the matches of the [:posix:] classes are
    restricted to the ASCII range.  Previously, in a time/space trade-off
    that favored space, we created the list of matching characters at
    pattern compilation time by ANDing the full-range Posix class with the
    set of ASCII characters.
    
    But now, the tables for just the ASCII-range classes are generated
    anyway, so there's no need to do that compilation-time intersection.
    This slightly simplifies the code.
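
For readers unfamiliar with the intersection being dropped, here is a rough sketch of restricting an inversion list to the ASCII range. The function name restrict_to_ascii and the array representation are assumptions for illustration; regcomp.c uses its own inversion-list routines, which this does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_LIMIT 0x80UL

/* Hypothetical helper: copy the part of inversion list `in` that lies
 * below 0x80 into `out`, which must have room for n + 1 entries.
 * Returns the number of entries written. */
static size_t restrict_to_ascii(const unsigned long *in, size_t n,
                                unsigned long *out)
{
    size_t k = 0;
    assert(in != NULL && out != NULL);
    while (k < n && in[k] < ASCII_LIMIT) {
        out[k] = in[k];
        k++;
    }
    /* An odd count means a range was still open at 0x80: close it there. */
    if (k & 1)
        out[k++] = ASCII_LIMIT;
    return k;
}
```

With ASCII-only tables generated ahead of time, a step like this no longer has to run at pattern compilation.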

commit eaf915a33ac6b66097fbd6f5a5c0f872cf0cc67f
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 15:47:11 2018 -0600

    mktables: Add guard against Unicode breakage
    
    This adds a check that a new Unicode version doesn't create a rational
    number that is too close to a current rational for our existing
    floating point precision.  Should this happen, we can increase the
    precision we use.
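
One way such a guard could work is sketched below: format every value at the working precision and report whether two different values collapse to the same string. The function name has_precision_collision and the use of a "%.*e" format are assumptions; the message does not say how mktables performs the check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: returns 1 if two distinct values in `vals` format
 * identically with `digits_after_point` digits in %e notation. */
static int has_precision_collision(const double *vals, size_t n,
                                   int digits_after_point)
{
    char a[40], b[40];
    assert(vals != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        snprintf(a, sizeof a, "%.*e", digits_after_point, vals[i]);
        for (size_t j = i + 1; j < n; j++) {
            if (vals[i] == vals[j])
                continue;           /* same value, not a collision */
            snprintf(b, sizeof b, "%.*e", digits_after_point, vals[j]);
            if (strcmp(a, b) == 0)
                return 1;
        }
    }
    return 0;
}
```

If a collision shows up at the current precision, rerunning the check with more digits shows how far the precision would have to be raised.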

commit eebe620ad6c29a0eaa2ea2827b56dcd248458978
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 15:24:19 2018 -0600

    Add tests for qr/\p{}/
    
    This adds tests for nv=integer, where 'integer' is expressed in %e.

commit 65a7e777edd2c114c05e7e40adacdb0f5e5b0d3b
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Apr 30 19:05:54 2018 -0600

    utf8.c: Handle qr!\p{nv=6/8}!
    
    I thought this worked before, but it turns out it never did.  This
    commit allows the rational number specified in looking up the Numeric
    Value property to not be in lowest possible terms.  Unicode even
    furnishes some of its data in non-lowest form, so we should accept
    this.

commit 809b6b625641ed8ab3e9d1d0e21543a70892c60e
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Apr 30 10:39:46 2018 -0600

    utf8.c: Use \p{nv=float}
    
    Now that the float data is available to us (in the previous commit), we
    can take advantage of it, and avoid swash creation.
    
    We just use the perl atof() to convert the input string to an NV, and
    then convert back to a string, but in guaranteed canonical form.  Then
    we look that up.

commit d46e9c5da76a5ccdc8d1c7b7a8a3c9aaa1db0dc6
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Apr 26 12:29:54 2018 -0600

    regen/mk_invlists.pl: Add \p{nv=float} data
    
    The previous commit revised how nv=float is handled.  This commit adds
    data for handling that to charclass_invlists.h, so that the next commit
    can use that and avoid swash creation.

commit 8f05e6682fb6f6b36d6ed9eda8e21bd40c100c77
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Apr 29 21:08:37 2018 -0600

    Revise \p{nv=float} lookup
    
    The Numeric Value property allows one to find all code points that have
    a certain numeric value.  An example would be to match against any
    character in any of the world's scripts which is effectively equivalent
    to the digit zero.
    
    It is documented that we accept either integers (like \p{nv=9}) or
    rationals (like \p{nv=1/2}).  But we also accept floating point
    representations in case a conversion to numeric has happened.  I think
    it is right that we not document these and their vagaries.  One reason
    is that Unicode might someday create a new rational number that, to the
    precision we currently accept, is indistinguishable from an existing
    one, so that we would have to increase the precision.
    
    But there was a bug I introduced years ago.  I thought that, for a
    float to be considered to match a close rational, 3 significant digits
    of precision would be needed, like .667 to match 2/3.  That still
    seems reasonable.  But I didn't implement that concept.  Instead, prior
    to this commit, it was 3 (not necessarily significant) digits, so that
    for 1/160, it would match .001.
    
    This commit corrects that, and makes the lookup simpler.  mktables will
    use sprintf %e to get the number normalized and having the 3 significant
    digits required.  At runtime, a floating number is normalized using the
    same format, and the result looked up in a hash.  This eliminates the
    need to worry about matching within some epsilon.
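
The normalization can be illustrated in C with the same printf family. The "%.2e" format (one digit before the point, two after) is an assumption about how 3 significant digits would be requested, and normalize_nv and same_key are hypothetical names; the real code lives in mktables and utf8.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format `value` in %e form with 3 significant digits into `buf`. */
static void normalize_nv(double value, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "%.2e", value);
}

/* Non-zero when two values normalize to the same lookup key. */
static int same_key(double a, double b)
{
    char ka[32], kb[32];
    normalize_nv(a, ka, sizeof ka);
    normalize_nv(b, kb, sizeof kb);
    return strcmp(ka, kb) == 0;
}
```

Here .667 and 2/3 produce the same key, while .001 and 1/160 (0.00625) do not, which matches the corrected behavior described above. Because both sides are compared as strings, no epsilon is involved.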
    
    Further simplifications in utf8_heavy.pl are achieved by making a more
    precise definition as to what an acceptable number looks like, so we
    don't have to check later to see if what matched really was one.

commit e85cc6749f4f936c9f8a79e2b69a92ffc81bc6ad
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 21:18:59 2018 -0600

    regen/mk_invlists.pl: Add to list of props to keep together
    
    Using the same idea as pp_hot.c, this script tries to keep the Unicode
    properties actually used by perl together, so that paging in one is
    likely to page in others.  A few were omitted prior to this commit.

commit db6e1c8d2febafb0cb0a3bf95c73ee7a990e73b4
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Apr 26 02:08:53 2018 -0600

    regen/mk_invlists.pl: Create synonyms for perl props
    
    This allows our code to not have to be so precise as to which alias for
    a property it uses.

commit 0bd2e10de9e2742b6693f3f530c35e0e0997a913
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Apr 26 02:02:05 2018 -0600

    regen/mk_invlists.pl: Prefer certain property names
    
    This sorts various properties to be first, so that their names will be
    used instead of others.  This gives more stability to the core using
    particular names: a new version of the Unicode standard is less likely
    to come up with a different name, which, if it did, the core would have
    to change to use it.
    
    The preferred names are available in all Unicode versions.

commit e2bff3f50b19eac6a4d9399e28a408752818492f
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Apr 26 01:55:34 2018 -0600

    regen/mk_invlists.pl:  Add comment

commit 27ed9f2c200ef58e907156bcb75f621f77062921
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 20:58:47 2018 -0600

    regen/mk_invlists.pl: Remove some unnecessary #if's
    
    Things aren't actually getting switched here, so no need for them.

commit 165eb7d2cebe5425d959575d23a4b14a183e8912
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 16:53:07 2018 -0600

    regen/mk_invlists.pl: Change die into warning
    
    I found an instance in compiling early Unicode releases where this
    circumstance is legitimate.

commit 1fa927891ac65108a5dcdf334a2117db0fd56861
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Apr 30 10:06:14 2018 -0600

    utf8.c: Use mnemonic variable name

commit 79ccbc27822114986daa681d60477613e3f9eff8
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 16:51:22 2018 -0600

    utf8.c: Fix typo in comment

commit 400c63b7af065f29f610f9d01bc419803192fc3b
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 16:36:09 2018 -0600

    regen/mk_invlists.pl: Slight speed up
    
    Instead of checking each time if an element already exists in an array
    before adding it, just add it, and afterwards remove all redundant ones.
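
The add-then-deduplicate approach can be sketched as follows. The original is Perl code in regen/mk_invlists.pl; this C version with the hypothetical name sort_unique only illustrates the idea, and it sorts the array as a side effect, which the commit message does not say the real code does.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison for qsort(), written to avoid overflow. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort `items` and drop duplicates in place; returns the new length.
 * Callers append freely first and call this once at the end. */
static size_t sort_unique(int *items, size_t n)
{
    assert(items != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(items, n, sizeof items[0], cmp_int);
    size_t out = 1;
    for (size_t i = 1; i < n; i++) {
        if (items[i] != items[out - 1])
            items[out++] = items[i];
    }
    return out;
}
```

This replaces a membership scan on every insertion with a single pass after all insertions.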

commit 89cebcf3f936d57870ac5b6d1a1c81a12575a88b
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Apr 29 21:14:48 2018 -0600

    utf8.c: Use variable instead of repeating expression
    
    Set a variable to the result of this expression which is used in
    multiple places.

commit 73b559ca79e4db0d3bb97eedfa3b5ec15c76017c
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 16:20:50 2018 -0600

    Remove support for qr/\p{_CanonDCIJ}/
    
    This is the third and final obsolete property that is being removed in 3
    sequential commits.  The property is not used in cpan, and is being
    removed as part of the cleanup instigated because another of the 3 would
    require extra code to handle if we were to keep it around.

commit fc4634d310fe592d81eb37ea4fab005bca7e6eac
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 14:21:04 2018 -0600

    Remove support for qr/\p{_Comb_Above}/
    
    This property is no longer used in the core, nor in cpan, and is marked
    as for core use only, not necessarily stable.  I have kept it around
    because it was work to remove it, but now the revamping of the property
    lookup scheme was causing failures with a similar property, and the
    previous commit removed that one.
    
    There are just three of these properties, and I think it's time to
    remove support for all three.  The next commit will do the same for the
    third one.

commit 832ef61c1992178f259f0b2a809e48049ae27268
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 15:07:14 2018 -0600

    Remove qr/\p{_Case_Ignorable}/
    
    This property is no longer used in the core, nor in cpan, and is marked
    as for core use only, not necessarily stable.  I have kept it around
    because it was work to remove it, but now the revamping of the property
    lookup scheme was causing failures with it, when compiling on early
    Unicode releases.  That could be fixed with extra work, but simply
    removing it also fixes the problem and avoids future maintenance
    costs.

commit 3f5c80b30e8d4b39848092342fb85a885ec23c79
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 13:27:34 2018 -0600

    qr/\p{...}/: Rmv redundant text from warning msg detail
    
    This text is emitted when compiling a pattern using a deprecated
    property.  The text is added detail to the main text of the message
    (which isn't changing), and is redundant because it just says it's
    deprecated, and the main message already says that.

commit 3abb0d51aca7c14ecf09dddcbdff99047e2dc4c0
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Apr 25 12:49:19 2018 -0600

    Unicode::UCD: Avoid uninit message
    
    I found a case where this array can be empty, so add a test for that to
    avoid trying to look at the first (non-existent) element.

commit 2e22d131ecdebaf5dbb9989144b504ea1717e9dc
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 22:02:21 2018 -0600

    regen/mph.pl: Add comment to generated code
    
    That code is uni_keywords.h

commit a1bc151896b0085b96b271807c121fc568f5e98b
Author: Karl Williamson <k...@cpan.org>
Date:   Fri Apr 20 11:37:20 2018 -0600

    regen/charset_translations.pl: #if indent is 2 spaces
    
    It was instead making it 3,7,11...

commit 9d2c511a683d948c0bb441b0746aa25904357649
Author: Karl Williamson <k...@cpan.org>
Date:   Fri Apr 20 11:26:27 2018 -0600

    Make the SCX enums public
    
    These enums are scheduled to be used outside the files that they now are
    defined in.

commit 6b11eb64d8b9a4bbc645d0b8a2a33c9f71f4e933
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 17:46:03 2018 -0600

    regen/mk_invlists.pl: Omit unnecessary #if's
    
    In places, #endifs were unconditionally added followed by the
    same #ifdef they just ended.

commit e9fecd5cdb4309ccd8534bfb2041a633a79c1b99
Author: Karl Williamson <k...@cpan.org>
Date:   Fri Apr 20 10:59:40 2018 -0600

    regen/mk_invlists.pl: uni_keywords.c no longer exists
    
    So no need to do an #ifdef for it.

commit b17fe1e1bc673bf480c24f2cb1a390b75e889c4f
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 17:00:13 2018 -0600

    regen/mk_invlists.pl: depends on mk_PL_charclass.pl
    
    The previous 2 commits show that this script is subtly dependent on
    mk_PL_charclass.pl.  Make that explicit.

commit c83f3e932b67430b6a26dacf8558837064d66b16
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 16:55:40 2018 -0600

    regen/mk_PL_charclass.pl: White-space only
    
    Outdent code that had its surrounding block removed

commit 31aba025ab59e5093a40adeb656cbba0914f3cb6
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 16:47:58 2018 -0600

    regen/mk_PL_charclass.pl: Revamp
    
    The change in 5.28 to having precompiled Unicode properties leaves this
    program with a chicken-and-egg problem.  Prior to this commit, it used
    those properties to construct its output, relying on them to be using
    the latest Unicode data, but the code that generates the tables from
    that data uses the output of this program, with potentially disastrous
    results.
    
    This commit changes to use the data itself, through Unicode::UCD.

commit abb222f5fa50d491645c4e65c57241ecdc24d6a2
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Apr 24 15:30:05 2018 -0600

    regen/mk_PL_charclass.pl: sort output table
    
    This makes it easier to verify that future commits don't change
    anything.

commit b9e05e5b69eba85761699fad9724fccf9d8f922a
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 16:11:39 2018 -0600

    numeric.c: White-space only
    
    Outdent after the previous commit removed an enclosing block

commit d27b3c3e8b6b018f11801187efc2342f8290e8bb
Author: Karl Williamson <k...@cpan.org>
Date:   Tue May 1 14:23:23 2018 -0600

    grok_atoUV: allow non-C strings and document
    
    This changes the internal function grok_atoUV() to not require its input
    to be NUL-terminated.  That means the existing calls to it must be
    changed to set the ending position before calling it, as some did
    already.
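
Parsing an unsigned integer from a buffer that is bounded by an end pointer rather than a NUL can be sketched like this. parse_uint_range is a hypothetical helper written for illustration; it is not grok_atoUV() and does not reproduce that function's signature or its exact acceptance rules.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: parse decimal digits in [s, end) without ever
 * reading *end.  Returns false on empty input, a non-digit, or overflow. */
static bool parse_uint_range(const char *s, const char *end,
                             unsigned long *out)
{
    unsigned long value = 0;
    assert(s != NULL && end != NULL && out != NULL);
    if (s >= end)
        return false;
    for (; s < end; s++) {
        if (*s < '0' || *s > '9')
            return false;
        unsigned long digit = (unsigned long)(*s - '0');
        if (value > (ULONG_MAX - digit) / 10)
            return false;       /* value * 10 + digit would overflow */
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}
```

The caller fixes the end position before the call, which is the obligation the commit message places on existing call sites.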
    
    This function is recommended for use in a couple of pods, but it wasn't
    documented in perlintern.  This commit does that as well.

commit 64f27c42a5394d1a58684ea72063ce725b52eeae
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Apr 30 10:46:01 2018 -0600

    Create my_atof3()
    
    This is like my_atof2(), but with an extra argument signifying the
    length of the input string to parse.  If that length is 0, it uses
    strlen() to determine it.
    
    Then my_atof2() just calls my_atof3() with a zero final parameter.
    And this commit just uses the bulk of the current my_atof2() as the core
    of my_atof3().  Changes were needed however, because it relied on
    NUL-termination in a number of places.
    
    This allows one to convert a string that isn't necessarily
    NUL-terminated to an NV.
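
The relationship between the two entry points can be sketched as below. parse_double_n and parse_double are hypothetical stand-ins, not my_atof3() and my_atof2(): they have different signatures, rely on the C library's strtod(), and truncate long input, all of which are simplifications assumed for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: parse the first `len` bytes of `s` as a
 * double.  A `len` of 0 means "use strlen(s)". */
static double parse_double_n(const char *s, size_t len)
{
    char buf[64];
    assert(s != NULL);
    if (len == 0)
        len = strlen(s);        /* caller promises NUL termination */
    if (len >= sizeof buf)
        len = sizeof buf - 1;   /* sketch only: truncate long input */
    memcpy(buf, s, len);        /* never reads past s[len - 1] */
    buf[len] = '\0';
    return strtod(buf, NULL);
}

/* The older NUL-terminated interface becomes a thin wrapper. */
static double parse_double(const char *s)
{
    return parse_double_n(s, 0);
}
```

One consequence of using 0 as the "use strlen()" marker is that an empty, non-NUL-terminated input cannot be expressed through the length argument.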

commit 716d5b7246bf2d7dc430c22b0ad7a74423d1f296
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Apr 30 11:48:46 2018 -0600

    embed.fnc: Fix my_atof2() entry
    
    This was using the incorrect formal parameter name.  It did not generate
    an error because the function declares a variable with the incorrect
    name, so that this was actually asserting on the wrong thing.

-----------------------------------------------------------------------

-- 
Perl5 Master Repository
