In perl.git, the branch khw/ebcdic has been created

<http://perl5.git.perl.org/perl.git/commitdiff/8c3631e66cb042fe0218898ea872d946ed8c98a5?hp=0000000000000000000000000000000000000000>

        at  8c3631e66cb042fe0218898ea872d946ed8c98a5 (commit)

- Log -----------------------------------------------------------------
commit 8c3631e66cb042fe0218898ea872d946ed8c98a5
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 19:51:06 2013 -0600

    XXX skip fold_grind

M       t/re/fold_grind.t

commit 934db412d9ae3561431ea34726b0f39ae8e661eb
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 15:44:59 2013 -0600

    XXX t/TEST: Avoid SIGPIPEs

M       t/TEST

commit 2e17a7dfd4f882a07853673fcbd2045dd50156ed
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 15:49:08 2013 -0600

    XXX Temporarily test normalization

M       cpan/Unicode-Normalize/t/fcdc.t
M       cpan/Unicode-Normalize/t/form.t
M       cpan/Unicode-Normalize/t/func.t
M       cpan/Unicode-Normalize/t/illegal.t
M       cpan/Unicode-Normalize/t/norm.t
M       cpan/Unicode-Normalize/t/null.t
M       cpan/Unicode-Normalize/t/partial1.t
M       cpan/Unicode-Normalize/t/partial2.t
M       cpan/Unicode-Normalize/t/proto.t
M       cpan/Unicode-Normalize/t/split.t
M       cpan/Unicode-Normalize/t/test.t
M       cpan/Unicode-Normalize/t/tie.t

commit ba33f4801f787811d6755609d693dade4ab86883
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 14:06:50 2013 -0600

    op/index.t: Fix tests for EBCDIC
    
    Commit 8a38a836 erroneously translates literals into the native
    encoding, causing a double translation, which is garbage.

M       t/op/index.t

commit eb0caa5938588a48a3200bb4d5b55778700c53b8
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 20:43:38 2013 -0600

    op/chop.t: Fix for EBCDIC
    
    One test is skipped because the code point is not representable on
    EBCDIC platforms.  Another test is modified to work on EBCDIC.

M       t/op/chop.t

commit 4e5ddac7a375018ec36c714adf1654959e0ee45b
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 19:56:50 2013 -0600

    t/op/lc.t: Fix to work under EBCDIC
    
    This had code that attempted this, but it was wrong.  The conversion to
    EBCDIC must be done before the \U, or similar.

M       t/op/lc.t

commit 2f4245b5614220c610cf2bdde317ce3f9e466560
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 16:37:04 2013 -0600

    XXX fix this later based on comment

M       dist/IO/t/io_utf8argv.t

commit 124dfb336bdd28d9afbb775ab1fa606a1c2524c8
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 15:33:55 2013 -0600

    Skip some tests under EBCDIC
    
    EBCDIC won't work on these because of inherent differences from ASCII

M       t/porting/customized.t
M       t/porting/manifest.t

commit 11229d42f806816d435c6d13e86cc413d9837e76
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 15:04:14 2013 -0600

    porting/bincompat.t: Skip under EBCDIC
    
    because the sorting order is different

M       t/porting/bincompat.t

commit 97f0086a7de6b22de58ba496eafe1bce98d133e9
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 14:59:50 2013 -0600

    t/re/regex_sets.t: So will pass under EBCDIC

M       t/re/regex_sets.t

commit 88399154c8729ac73025de203d6187d08528e276
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 14:59:26 2013 -0600

    t/porting/bincompat.t: Typo in comment

M       t/porting/bincompat.t

commit 32ce59a8a95029f711b8b182b16d20bd823f449e
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 13:09:09 2013 -0600

    XXX fix \x{too large}

M       dist/IO/IO.xs
M       doop.c
M       inline.h
M       pp.c
M       pp_pack.c
M       regcomp.c
M       sv.c
M       toke.c
M       utf8.c
M       utf8.h

commit 6cd52ca028b49fe1c992aa34d4767fdbedcf7fc6
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 17:59:59 2013 -0600

    mktables: Fix typo in comment
    
    This used a real CTRL-X, instead of $^X

M       lib/unicore/mktables

commit 72a17c7994901e6d0debbb8063882b501e0c61db
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:16:08 2013 -0600

    utf8.c: Fix so UTF-16 to UTF-8 conversion works under EBCDIC

M       utf8.c

commit 9cebae80c6fb07b3c361e4a1f212035d5cbfbdd6
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:14:34 2013 -0600

    utf8.h, utfebcdic.h: Add #define

M       utf8.h
M       utfebcdic.h

commit fb48cebf74ecc9feb2143d9575a9dab4dd51b57c
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:11:25 2013 -0600

    utf8.c: Use mnemonics instead of hex numbers

M       utf8.c

commit 002953ccdf79eabb9d4ebacf93e9d33e77b3d136
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 20 22:15:58 2013 -0600

    lib/Unicode/UCD.t: Allow to run under EBCDIC,

M       lib/Unicode/UCD.t

commit a7da53806ae5ab2fe747fbca61b01b9d714e9527
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 15:27:31 2013 -0600

    t/op/quotemeta.t: EBCDIC fixes

M       t/op/quotemeta.t

commit ab6a74bf3887e9771e8bd1a784dd09d09aa2a9d4
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:32:55 2013 -0600

    t/re/fold_grind.t: Fixes for EBCDIC

M       t/re/fold_grind.t

commit 1d567712905275d08234ff905769afca070df5e5
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:21:09 2013 -0600

    t/lib/charnames/alias: Fix some EBCDIC problems

M       t/lib/charnames/alias

commit bac753642226e98ea24a10c6c01398f8d6ede7f3
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:20:24 2013 -0600

    t/uni/class.t: Make work on EBCDIC

M       t/uni/class.t

commit ab0b8ebd0feda959e4348dc9d870cacaba487f92
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:01:57 2013 -0600

    feature/unicode_strings.t: Fix to work on EBCDIC

M       lib/feature/unicode_strings.t

commit 109f5a2d6969c4d7bb746937462d509c01d9ca9c
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:33:59 2013 -0600

    XXX lib/strict.t: Skip for now
    
    this is to see if this is what is causing the tests to not finish.

M       lib/strict.t

commit b3c3d6bfdbc53d4e671d403ef0b9d759f1206b1f
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:12:30 2013 -0600

    regen/regcharclass.pl: make more EBCDIC friendly
    
    One of the possible inputs to this process is a string.  This clarifies
    that it must be specified in Unicode characters, and adds code to
    translate it to native, if necessary.

M       regen/regcharclass.pl

commit 1d70a4d2eccd095471f9f82bcb634907378deb80
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:10:46 2013 -0600

    XXX regen/regcharclass.pl: maybe temp comment out utf8_char

M       regen/regcharclass.pl

commit 0e5956b106ff1ed38566625caf801380bbd0fc7e
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 22:23:47 2013 -0600

    XXX temporary regcharclass.h
    
    This has parts that were generated on z/OS, and look good; the ones that
    are still problematic are omitted;
    
    f

M       regcharclass.h

commit d445b1883da94ba106c3f99cbe47cd306145e8aa
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:09:53 2013 -0600

    XXX temporary comment out multi-char folds

M       regcomp.c
M       regen/regcharclass.pl
M       regexec.c
M       t/re/fold_grind.t
M       t/re/reg_fold.t

commit b6f2efad927f6c6194a3e6ed30251dbf75b12c1f
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 22:00:29 2013 -0600

    XXX temp skip perl5db.t

M       lib/perl5db.t

commit f342c336cda468b9963a211ce4ac73fa699ef7e5
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 11:45:06 2013 -0600

    pp.c: White-space only
    
    Make a ternary operation more clear

M       pp.c

commit db5dcc1ca991ce81d5d053679dbe11698a69f20f
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 11:43:42 2013 -0600

    Fix valid_utf8_to_uvchr() for EBCDIC

M       utf8.c

commit b6f5a6b6c88d043a30f64e2087738324f530deb8
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 17 21:42:20 2013 -0600

    t/test.pl: Add comment about EBCDIC

M       t/test.pl

commit 66b67846140086354660e07adf737bd1532cb808
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 17 17:39:33 2013 -0600

    XXX makedepend.SH: Why does 255 work and 250 not?

M       makedepend.SH

commit b975040e0c85f3aa35686e93855751f0eec54038
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:48:22 2013 -0600

    XXX regen/mk_PL_charclass.pl: Make EBCDIC friendly
    
    need more of a commit message

M       regen/mk_PL_charclass.pl

commit c8412df2dc07cc732d37ffdf2fd9e50300359ee0
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:44:44 2013 -0600

    XXX make various things more EBCDIC friendly
    
    Adds trailing white space errors
    Need to know what to do about ^A meaning 0x1, and M-foo meaning meta

M       lib/DB.pm
M       lib/dumpvar.pl
M       lib/perl5db.pl
M       lib/sigtrap.pm

commit 2d353d20dea20a4f67fba4af7267cfa72fd26e23
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:41:57 2013 -0600

    XXX charnames.t: Make more EBCDIC friendly
    
    Why need utf8::unicode_to_native

M       lib/charnames.t

commit f84792cc468aa38a0f03a3e911a53709399aa695
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:41:15 2013 -0600

    XXX: Fixup commit message.
    
    Fix UTF8_ACUUMULATE, utf8.c

M       utf8.c
M       utf8.h

commit 45bb2cd85dab5a6241f319e10c1f35e9edd502cd
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 16:52:45 2013 -0600

    regcomp.c: Fix bug in EBCDIC
    
    The POSIXA and NPOSIXA regnodes need to set the bits on only the ASCII
    code points, but under EBCDIC those code points are 0-127.

M       regcomp.c

commit eabe6d3121839d2f07a7bfe19d60090e1f43db71
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 12:26:15 2013 -0600

    hints/os390.sh: Suppress bogus compiler message

M       hints/os390.sh

commit 2e31633c43e5bb0f5e7dfde4f4634c2cfcbfa86e
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 11:57:24 2013 -0600

    re/charset.t: Allow to work on EBCDIC
    
    This just converts the hard-coded character numbers to native, so will
    work on any platform.

M       t/re/charset.t

commit 2e0a5cf83f1b0d4146737b18ef491f331551d306
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 11:50:35 2013 -0600

    XS-APItest/t/handy.t: Change output message
    
    On EBCDIC platforms, the output is not in terms of \N{U+}; change text
    to \x{ }

M       ext/XS-APItest/t/handy.t

commit 0865dde6c519a53c9f1a6998ada14704f5b148c1
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 21:44:16 2013 -0600

    XXX Dumper.xs: Don't know why this stopped compiling

M       dist/Data-Dumper/Dumper.xs

commit 662d1194df6eefbbc96487f4d89001d7ad3288ee
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:22:28 2013 -0600

    toke.c: Fix an ASCII-platform dependency

M       toke.c

commit 749948f9cc69c2c2b9382f29f6d5658aac286d18
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:20:23 2013 -0600

    toke.c: Simplify some code
    
    We don't have to test separately for lower vs uppercase here, as
    upper/lower case A-Z and a-z are not intermixed in the gaps in A-Z and
    a-z under EBCDIC.

M       toke.c

commit 70747320d4c021160ee117764039c616da649249
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:18:12 2013 -0600

    genpacksizetables.pl: Correct comment typo

M       genpacksizetables.pl

commit 29e76c9403614b6e00389f0ea8bb409dae4972a1
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:17:39 2013 -0600

    APItest/t/handy.t: Make EBCDIC-friendly

M       ext/XS-APItest/t/handy.t

commit 2d9e9b61ed244ea7ca2ff8cdf875779514278168
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:16:14 2013 -0600

    Data-Dumper: Make EBCDIC-friendly

M       dist/Data-Dumper/Dumper.xs

commit 13052d59c41162667e726da92f52257494fe75b0
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:14:31 2013 -0600

    sv.c: Make less ASCII-centric

M       sv.c

commit 748859ec6ed2097fb8b49965a3a1ce7d36d709fc
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:07:52 2013 -0600

    lib/charnames.t: Make some tests work under EBCDIC

M       lib/charnames.t

commit f28b9f1384dfc875f8ec1ceb94c40c23b79c61b9
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:05:46 2013 -0600

    dump.c: Make less ASCII-centric:
    
    This has the added advantage of being clearer as to what is going on.

M       dump.c

commit c9e2fab704b87758b5eca28720d449a8cdb357ac
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:02:52 2013 -0600

    hv.c: Stop being ASCII-centric
    
    This uses macros which work cross-platform.  This has the added advantge
    that it is much clearer what is going on.

M       hv.c

commit 12deaf49accc6d33b1083c15ccc69fed23beedd9
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 12 22:34:17 2013 -0600

    t/TEST: Don't bail if fails in t/base unless minitest
    
    In order to completely compile Perl, many modules must have been parsed
    and compiled, so if there is a full perl, we know that things basically
    work.  The purpose of bailing out is that if these supposedly very base
    level functionality tests don't work, there's no point in continuing.
    But over the years, tests of more esoteric functionality have been
    added here, and if one of them doesn't work, it still could be that Perl
    pretty much does work.
    
    I believe it would be best to move such non-basic tests elsewhere, but
    that's work, and hasn't bitten us much so far; this change lessens the
    severity of the biting even more.  Where it will really bite is if
    things are so bad that a full perl binary can't be compiled, and we are
    trying to figure out why using minitest.

M       t/TEST

commit 82620cf60a6274cf74c3ffe6ad838e36d80badc6
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 11 15:11:10 2013 -0600

    Added Porting/reorder_charclass_invlists.pl
    
    This program is used too bootstrap perl onto a non-ASCII platform with
    no pre-existing perl.

M       MANIFEST
A       Porting/reorder_charclass_invlists.pl

commit c280a33045a5ed8295dc767bf4725b981fae5282
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 22:17:31 2013 -0600

    XXX See if changing \xE2 to \xE1 causes lex.t to work for EBCDIC
    
    \xE2 is 'S' in EBCDIC, and so is going to be legal.  \xE1 is not an
    ASCII equivalent.

M       t/base/lex.t

commit 2d2faceb8dfe7fa91d1018b69d8320e4a8908150
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 8 11:01:32 2013 -0700

    XXX EBCDIC header files

M       charclass_invlists.h
M       l1_char_class_tab.h
M       unicode_constants.h

commit 2b403a25d58faf2bc37b600a23226d172ee16118
Author: John Goodyear <[email protected]>
Date:   Sat Mar 2 12:31:25 2013 -0700

    XXX Temporary for z/OS long long support

M       Configure
M       hints/os390.sh

commit b5c3fe1ce2fe1f5c33c2a574e36389c40ac01c8d
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 13:11:07 2013 -0600

    XXX Temporary comment out ParseXS check
    
    this is to get things to compile for now

M       dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS.pm

commit 297b5854f4ba9e7ac9d3e1740d963977b8a5c0e2
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 11:34:10 2013 -0600

    XXX Collate, Normalize: Allow to compile under EBCDIC

M       cpan/Unicode-Collate/Collate.pm
M       cpan/Unicode-Collate/mkheader
M       cpan/Unicode-Normalize/Normalize.pm
M       cpan/Unicode-Normalize/mkheader

commit 30661b2a7e1e2b41e5dfae3107f8285e06421f2b
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 9 21:57:38 2013 -0700

    XXX dquote_static.c: Silence wrong warning on EBCDIC
    
    Unsure of whether to add the 2nd !isCNTRL_L1 to silence return trip,
    which should be a separate commit anyway.
    
    This silences an inappropriate warning that doesn't happen on ASCII
    platforms.  CTRL-T maps to 0x14 on both ASCII and EBCDIC platforms.  But
    0x14 is a C1 control on EBCDIC, a C0 on ASCII.  Therefore the test that
    it's a control should include both C0 and C1, which isCNTRL_L1() does.
    
    Also has a white-space change, outdenting a line so it doesn't wrap in
    an 80 column window.

M       dquote_static.c

commit a6d2095203e9645f0bf8dc263eecc38c9133818d
Author: Karl Williamson <[email protected]>
Date:   Thu Mar 7 12:08:41 2013 -0700

    utfebcdic.h: Change 'unsigned char' to U8
    
    This is for consistency with the rest of Perl

M       utfebcdic.h

commit 507466062fabd214fe2d5337b57c892699494129
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 8 08:11:38 2013 -0700

    regen/regcharclass.pl: Make more EBCDIC-friendly
    
    This commit changes the code generated by the macros so that they work
    right out-of-the-box on non-ASCII platforms for non-UTF-8 inputs.  THEY
    ARE WRONG for UTF-8, but this is good enough to get perl bootstrapped
    onto the target platform, and regcharclass.pl can be run there,
    generating macros correct UTF-8.

M       regcharclass.h
M       regen/regcharclass.pl

commit b1ac448036b9d03a2bf3648331d209b9454ef7c5
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 6 21:30:01 2013 -0700

    utfebcdic.h: Add (UV) cast
    
    The operand of this macro is implicitly a UV.  Make sure that it is.

M       utfebcdic.h

commit 349b42566f89409917b8e0f30dbc675f950281fa
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 6 17:04:58 2013 -0700

    handy.h: Allow bootstrapping to non-ASCII platform
    
    This adds a bunch of macros and moves things around to support
    conditional compilation when Configure is called with
    -DBOOTSTRAP_CHARSET.  Doing so causes the usual macros that are
    table-driven to not be used, since the table may not be valid when
    bringing Perl up for the first time on a non-ASCII platform.
    
    This allows it to compile using the platform's native C library ctype
    functions, which should work enough to compile miniperl, and allow the
    table to be changed to be valid.  Then Configure can be re-run to not
    bootstrap, and normal compilation can proceed

M       handy.h
M       inline.h

commit 61c99f606fe04771e2667dc361ed249a902f0ade
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 13:43:26 2013 -0700

    gv.c: Remove EBCDIC dependency

M       gv.c

commit 03fda7817c07a35b91bad506fd03443eb25c49ed
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 13:00:47 2013 -0700

    toke.c: Remove EBCDIC dependency

M       toke.c

commit 786c3815c5af584f35c1d6a905fce2ca5271588c
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 09:14:25 2013 -0700

    toke.c: Remove character set dependency
    
    Instead of hard-coding the bit patterns that comprise the Byte Order
    Mark in the UTF-8 or UTF-EBCDIC encodings, use the generated ones for
    the current platform.
    
    This removes some EBCDIC-only code.

M       toke.c

commit a6a2187fff671044da32b8568cba6c870a357dfc
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 09:10:27 2013 -0700

    unicode_constants.h: Add #defines for Byte Order Mark
    
    These will be used in future commits

M       regen/unicode_constants.pl
M       unicode_constants.h

commit 4e9412556c4c31a36dc35e428cd5b84fac369c81
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 15:04:18 2013 -0700

    XXX: Find a cleaner way. Handle missing is_UTF8_CHAR_utf8_safe
    
    This macro may not be present, and is currently used exclusively in
    IS_UTF8_CHAR, which itself may be undefined, and code should cope with
    that.  This is a work-around until a better solution is found.

M       utf8.c
M       utf8.h

commit 4c60e91de523b5dc6374cc6a68a5452e4d4e93a3
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 14:09:04 2013 -0700

    Add Porting tool for help with non-ASCII platforms
    
    Porting/reorder_l1_char_class_tab.pl is used to bootstrap Perl onto a
    non-ASCII platform with no working Perl.

M       MANIFEST
A       Porting/reorder_l1_char_class_tab.pl
M       regen/mk_PL_charclass.pl

commit 7056c99221c128cf97690a4cf8d8c7bc803be7f2
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 13:06:58 2013 -0700

    inline.h: Reorder functions
    
    The comment implied that the functions below it in the file were
    deprecated, but in fact only the next two functions were.  This
    clarifies that and moves them so they are the final ones in the file

M       inline.h

commit 40d73dedff80f3ccc71b51cf74210098b0cd45c0
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:33:42 2013 -0700

    utfebcdic.h: Add comment

M       utfebcdic.h

commit 0e1c5617732f3784e3296f05d647f589a66aaf72
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:12:11 2013 -0700

    utf8.h: Clean up START_MARK definition and use
    
    The previous definition broke good encapsulation rules.  UTF_START_MARK
    should return something that fits in a byte; it shouldn't be the caller
    that does this.  So the mask is moved into the definition.  This means
    it can apply only to the portion that creates something larger than a
    byte.  Further, the EBCDIC version can be simplified, since 7 is the
    largest possible number of bytes in an EBCDIC UTF8 character.

M       utf8.h
M       utfebcdic.h

commit aaffe3e0a2d7125a5753a8025dbff10b6b14d572
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:05:26 2013 -0700

    utf8.h: Move #includes
    
    These two files were only being #included for non-ebcdic compiles; they
    should be included always.

M       utf8.h

commit 1a5e34c4666e551b78141be95e3cf3587d45163f
Author: John Goodyear <[email protected]>
Date:   Sat Mar 2 11:49:14 2013 -0700

    utfebcdic.h: Remove extra parameter expansions
    
    These two macros were improperly expanding the parameters as well as
    defining the operation, leading to compile errors.

M       utfebcdic.h

commit f6fce40c80d8273765d7301419efa7210a202390
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 1 08:28:52 2013 -0700

    utf8.h: Simplify UTF8_EIGHT_BIT_foo on EBCDIC
    
    These macros were previously defined in terms of UTF8_TWO_BYTE_HI and
    UTF8_TWO_BYTE_LO.  But the EIGHT_BIT versions can use the less general
    and simpler NATIVE_TO_LATN1 instead of NATIVE_TO_UNI because the input
    domain is restricted in the EIGHT_BIT.  Note that on ASCII platforms,
    these both expand to the same thing, so the difference matters only on
    EBCDIC.

M       utf8.h

commit 5b226e050960d73b6558662ae2c97ccfd54c192c
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 28 09:25:27 2013 -0700

    XXX temp:  show makedepend cerr

M       makedepend.SH

commit 5c5fd5d1f9d189b6bc46e233aac1a572a6b90a82
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 21:59:11 2013 -0700

    makedepend.SH: Split too long lines; properly join
    
    I had thought that a continuation introduced a space.  But no,
    a continuation can happen in the middle of a token.
    
    And this splits lines that are getting very long to avoid preprocessor
    limitations.

M       makedepend.SH

commit ff22cf260abba64771f943d6984f0f9988f334dd
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 15:51:28 2013 -0700

    makedepend.SH: White-space only
    
    Align continuation backslashes

M       makedepend.SH

commit 2c15a2a936a0d3cdd4c25fc13fee8c2336108168
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 14:39:28 2013 -0700

    makedepend.SH: Remove some unnecessary white space
    
    Multi-line preprocessor directives are now joined into single lines.
    This can create lines too long for the preprocessor to handle.  This
    commit removes blanks adjoining comments that get deleted.  This makes
    things somewhat less likely to exceed the limit.
    
    This commit also fixes several [] which were meant to each match a tab
    or a blank, but editors converted the tabs to blanks

M       makedepend.SH

commit 4048977b9f58527a57b7b291c17dce5f4de1159e
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 14:30:51 2013 -0700

    makedepend.SH: Retain '/**/' comments
    
    These comments may actually be necessary.

M       makedepend.SH

commit 65e16f59f688201424c6d6a8e8f46a9a80659ffa
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 08:38:19 2013 -0700

    handy.h: Remove extraneous parens

M       handy.h

commit 3fde07d2347ace252ffc9de9f5c6776c431b5b89
Author: Andy Dougherty <[email protected]>
Date:   Wed Feb 27 13:06:07 2013 -0500

    Disable gcc-style function attributes on z/OS.
    
    John Goodyear <[email protected]> reports that the z/OS C compiler
    supports the attribute keyword, but not exactly the same as gcc.
    Instead of a "warning", the compiler emits an "INFORMATIONAL" message
    that Configure fails to detect.  Until Configure is fixed, just disable
    the attributes altogether.
    
    John Goodyear

M       hints/os390.sh

commit 838c1dd97dfb29e7e14080f154b0ecd321a759e5
Author: Andy Dougherty <[email protected]>
Date:   Wed Feb 27 09:12:13 2013 -0500

    Change os390 custom cppstdin script to use fgrep.
    
    Grep appears to be limited to 2048 characters, and truncates
    the output for cppstin.  Fgrep apparently doesn't have that limit.
    Thanks to John Goodyear <[email protected]> for reporting this.

M       hints/os390.sh

commit f70b3f9e49925d7604450d7b9f7e900f287e2c89
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:45:19 2013 -0700

    utf8.c: Use more clearly named macro
    
    In the case of invariants these two macros should do the same thing,
    but it seems to me that the latter name more clearly indicates what is
    going on.

M       utf8.c

commit 18d724da27c02bf7e1daa9930c54082deb889482
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:35:12 2013 -0700

    Add macro OFFUNISKIP
    
    This means use official Unicode code point numbering, not native.  Doing
    this converts the existing UNISKIP calls in the code to refer to native
    code points, which is what they meant anyway.  The terminology is
    somewhat ambiguous, but I don't think will cause real confusion.
    NATIVESKIP is also introduced for situations where it is important to be
    precise.

M       toke.c
M       utf8.c
M       utf8.h
M       utfebcdic.h

commit ed061a43a9a3519efdf8afea130f55952c3d1723
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:22:19 2013 -0700

    toke.c: white space only

M       toke.c

commit 89e236dd42e3a9b6df771574f781bd65a7422500
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 12:08:50 2013 -0700

    utf8.c: Deprecate two functions
    
    This is to force any code that has been using these functions to change.
    Since the Unicode tables are now stored in native order, these functions
    should only rarely be needed.
    
    However, the functionality of these is needed, and in actuality, on
    ASCII platforms, the native functions are #defined to these.  So what
    this commit does is rename the functions to something else, and create
    wrappers with the old names, so that anyone using them will get the
    deprecation.

M       embed.fnc
M       embed.h
M       mathoms.c
M       proto.h
M       toke.c
M       utf8.c
M       utf8.h

commit 45df15e7c07c624d50d3b580e541c85b5db15b0a
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 11:26:09 2013 -0700

    Deprecate uvuni_to_utf8()
    
    Code should almost never be dealing with non-native code points

M       embed.fnc
M       embed.h
M       proto.h
M       toke.c
M       utf8.c
M       utf8.h

commit c5e36c9e77e87f565076c6b863be4220ca2962e0
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 11:02:33 2013 -0700

    Deprecate utf8_to_uni_buf()
    
    Now that the tables are stored in native order, there is almost no need
    for code to be dealing in Unicode order.

M       embed.fnc
M       proto.h
M       utf8.c

commit 57b5fafa629e8af3bf665a68d3884b84ae2bec08
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 09:00:18 2013 -0700

    makedepend.SH: Comment out unnecessary code
    
    This causes problems currently for z/OS.  But, since we don't know why
    it was there, I'm leaving it in as a placeholder.

M       makedepend.SH

commit 99a91382afd746f6ca02b16c3085dce0f5b97e89
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 20:26:44 2013 -0700

    Deprecate valid_utf8_to_uvuni()
    
    Now that all the tables are stored in native format, there is very
    little reason to use this function; and those who do need this kind of
    functionality should be using the bottom level routine, so as to make it
    clear they are doing nonstandard stuff.

M       embed.fnc
M       proto.h
M       utf8.c

commit e6cb508aaf4bc6db649ebc0b2ebc810e8029e17b
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 20:14:26 2013 -0700

    utf8.c: Swap which fcn wraps the other
    
    This is in preparation for the current wrapee becoming deprecated

M       embed.fnc
M       embed.h
M       proto.h
M       utf8.c
M       utf8.h

commit d074a1772727a7514d2ecd38acce971ba4d348ef
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 19:29:34 2013 -0700

    utf8.c: Skip a no-op
    
    Since the value is invariant under both UTF-8 and not, we already have
    it in 'uv'; no need to do anything else to get it

M       utf8.c

commit fb51f468200aebfe411759bd4db0182c17285d0a
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 19:26:50 2013 -0700

    utf8.c: Move comment to where makes more sense

M       utf8.c

commit c02f992e1eacd22e6fe0c9e1db0d1049692c4091
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:30:10 2013 -0700

    APItest: Test native code points, instead of Unicode

M       ext/XS-APItest/APItest.pm
M       ext/XS-APItest/APItest.xs
M       ext/XS-APItest/t/utf8.t

commit 3352cc3b3de83b30d7b0e75a9afcd2646a675546
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:25:08 2013 -0700

    XXX CPAN Normalize
    
    This converts Unicode::Normalize to use the native tables that are used
    by Perl starting in XXX, while using the Unicode-ordered ones that were
    used before then.
    
    Another alternative would be to have mktables generate just these tables
    in Unicode ordering.

M       cpan/Unicode-Normalize/Normalize.xs

commit 67296a525e9bb5138681cd2ff5ffc5f4225b7ed0
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:22:55 2013 -0700

    XXX CPAN prob wrong Collate
    
    This changes to implicity usenative code points.  This is likely wrong,
    as the module comes with its own data, that are probably in terms of
    Unicode

M       cpan/Unicode-Collate/Collate.xs

commit 6d1be47bb6999be66b0c8b891e9586adc5014679
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:12:53 2013 -0700

    XXX CPAN Encode.xs
    
    Use core function if available.  This will insulate this code from any
    future changes.

M       cpan/Encode/Encode.xs

commit 13c166c07b0f22bd4ed38a46b9513d0d029de894
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:04:24 2013 -0700

    XXX CPAN and unsure Encode

M       cpan/Encode/Encode.xs
M       cpan/Encode/Unicode/Unicode.xs

commit 4e0b74f8768363ae98071708587377a7f2483e77
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:00:47 2013 -0700

    XXX CPAN Encode.xs: fix indent

M       cpan/Encode/Encode.xs

commit 5e8ccee83d2a456abc591a70b5df42c6edafafa2
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 17:23:15 2013 -0700

    Don't refer to U+XXXX when mean native
    
    These messages say the output number is Unicode, but it is really
    native, so change to saying is 0xXXXX.

M       regen/regcharclass_multi_char_folds.pl
M       regexec.c

commit d9e1e889c59bf2c573b085d7f199bad9a8cd4aa8
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:43:59 2013 -0700

    Convert some uvuni() to uvchr()
    
    All the tables are now based on the native character set, so using
    uvuni() in almost all cases is wrong.

M       cygwin/cygwin.c
M       doop.c
M       op.c
M       pp_pack.c
M       regcomp.c
M       regexec.c
M       toke.c
M       utf8.c

commit 058bbca773482f3d7bb7c8c01411f1b41da7c97c
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:25:47 2013 -0700

    handy.h: White space only

M       handy.h

commit 4dd5322213be0e855e05d7d0834ae1457c91f75c
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:19:49 2013 -0700

    t/test.pl: Allow native/latin1 string conversions to work on utf8.
    
    These functions no longer have the hard-coded definitions in them,
    but now end up resolving to internal functions, so that new encodings
    could be added and these would automatically understand them.
    
    Instead of using tr///, these now go character by character and
    converting to/from ord, which is slower, but allows them to operate on
    utf8 strings.
    
    Peephole optimization should make these essentially no-ops on ascii
    platforms.

M       t/test.pl

commit 932b0316ab1a1ea24eed7119819d5b957e9f7a95
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:05:55 2013 -0700

    t/test.pl: Simplify ord to/from native fcns
    
    This commit changes these functions from converting to/from a string to
    calling utf8:: functions which operate on ordinals instead.

M       t/test.pl

commit aef3a5780dead383b47161d4b97c9e7cb9f6125a
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 15:35:38 2013 -0700

    Make casing tables native
    
    These are final tables that haven't been converted to native character
    set casing.

M       perl.h
M       utfebcdic.h

commit 54cd5ae25172e8ec7d961ff6a968758238b9e62a
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 15:32:30 2013 -0700

    utfebcdic.h: Remove trailing spaces

M       utfebcdic.h

commit adc50d1afb9eb27f62411ba15c8e20a4db1ce546
Author: Karl Williamson <[email protected]>
Date:   Fri Feb 22 18:55:26 2013 -0700

    EBCDIC has the unicode bug too
    
    We have not had a working modern Perl on EBCDIC for some years.  When I
    started out, comments and code led me to conclude erroneously that
    natively it supported semantics for all 256 characters 0-255.  It turns
    out that I was wrong; it natively (at least on some platforms) has the
    same rules (essentially none) for the characters which don't correspond
    to ASCII onees, as the rules for these on ASCII platforms.
    
    A previous commit for 5.18 changed the docs about this issue.  This
    current commit forces ASCII rules on EBCDIC platforms (even should there
    be one that natively uses all 256).  To get all 256, the same things
    like 'use feature "unicode_strings"' must now be done.

M       handy.h

commit e3bfc0e5f8413f1578d4e83e94178d38b5e58087
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:47:52 2013 -0700

    handy.h: Solve a failure to compile problem under EBCDIC
    
    handy.h is included in files that don't include perl.h, and hence not
    utf8.h.  We can't rely therefore on the ASCII/EBCDIC conversion
    macros being available to us.  The best way to cope is to use the native
    ctype functions.  Most, but not all, of the macros in this commit
    currently resolve to use those native ones, but a future commit will
    change that.

M       handy.h

commit f1e66be89c668764953607b19b5a122fd9af537f
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:35:12 2013 -0700

    handy.h: Simplify some macro definitions
    
    Now, only one of the macros relies on magic numbers (isPRINT), leading
    to clearer definitions.

M       handy.h

commit 8792a29f9a2ee27bcf20cd15adc05983f2ec9350
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:26:49 2013 -0700

    handy.h: Combine macros that are same in ASCII, EBCDIC
    
    These 4 macros can have the same RHS for their ASCII and EBCDIC
    versions, so no need to duplicate their definitions
    
    This also enables the EBCDIC versions to not have undefined expansions
    when compiling without perl.h

M       handy.h

commit 46dd209e4da8bf8ffc4dbc794698e8794528de45
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 10:39:48 2013 -0700

    Deprecate NATIVE_TO_NEED and ASCII_TO_NEED
    
    These macros are no longer called in the Perl core.  This commit turns
    them into functions so that they can use gcc's deprecation facility.
    
    I believe these were defective right from the beginning, and I have
    struggled to understand what's going on.  From the name, it appears
    NATIVE_TO_NEED taks a native byte and turns it into UTF-8 if the
    appropriate parameter indicates that.  But that is impossible to do
    correctly from that API, as for variant characters, it needs to return
    two bytes.  It could only work correctly if ch is an I8 byte, which
    isn't native, and hence the name would be wrong.
    
    Similar arguments for ASCII_TO_NEED.
    
    The function S_append_utf8_from_native_byte(const U8 byte, U8** dest)
    does what I think NATIVE_TO_NEED intended.

M       embed.fnc
M       mathoms.c
M       proto.h
M       toke.c
M       utf8.h
M       utfebcdic.h

commit a0b5c74da1f9c5849abec6afbd1cb9661ec9e9a7
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 10:26:43 2013 -0700

    Remove remaining calls of NATIVE_TO_NEED
    
    These calls are just copying the input to the output byte by byte.
    There is no need to worry about UTF-8 or not, as the output is just an
    exact copy of the input

M       toke.c

commit e5a4bb5592a44d08893f1961a15c514ec412b57f
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 08:12:15 2013 -0700

    toke.c: Remove some NATIVE_TO_NEED calls
    
    I believe NATIVE_TO_NEED is defective, and will remove it in a future
    commit.  But, just in case I'm wrong, I'm doing it in small steps so
    bisects will show the culprit.  This removes the calls to it where the
    parameter is clearly invariant under UTF-8 and UTF-EBCDIC, and so the
    result can't be other than just the parameter.

M       toke.c

commit 9a9ce86c3f016da7528c1dabd6d90d5949fde3d4
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 08:22:07 2013 -0700

    toke.c: in [A-Za-z] use macros that exclude non-ASCII alphas
    
    This code is attempting to deal with the problem of holes in the ranges
    a-z and A-Z in EBCDIC.  Prior to this patch, it accepeted things like A
    WITH GRAVE, etc, which shouldn't have the special processing to deal
    with the holes

M       toke.c

commit 8e7f014dcfc57c64ba3559d0c1be68c5f05d8674
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 19 15:13:19 2013 -0700

    Use real illegal UTF-8 byte
    
    The code here was wrong in assuming that \xFF is not legal in UTF-8
    encoded strings.  It currently doesn't work due to a bug, but that may
    eventually be fixed: [perl #116867].  The comments are also wrong that
    all bytes are legal in UTF-EBCDIC.
    
    It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
    (C2, C3, and C4 as well in UTF-EBCDIC), as they would be the start byte
    of an illegal overlong sequence.
    
    This creates a #define for an illegal byte using one of the real illegal
    ones, and changes the code to use that.
    
    No test is included due to #116867.

M       op.c
M       toke.c
M       utf8.h

commit 1aa0e2614a0a970eea70ea05e467628d79b82cff
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 14:00:13 2013 -0700

    toke.c: Don't remap \N{} for EBCDIC
    
    Everything is now in native,

M       toke.c

commit ba177aaa2614f3866a3f90a4c33715c85fa520dd
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 13:50:45 2013 -0700

    toke.c: Remove remapping for EBCDIC for octal
    
    The code prior to this commit converted something like \04 into its
    EBCDIC equivalent only in double-quoted strings.  This was not done in
    patterns, and so gave inconsistent results.  The correct thing to do
    should be to do the native thing, what someone who works on a platform
    would think \04 do.  Platform independent characters are available
    through \N{}, either by name or by U+.
    
    The comment changed by this was wrong, as in some cases it was native,
    and in some cases Unicode.

M       toke.c

commit f675ca0fba3f9f3126e54a7050f56375a02159b3
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 13:47:13 2013 -0700

    Remove EBCDIC remappings
    
    Now that the tables are stored in native format, we shouldn't be doing
    remapping.
    
    Note that this assumes that the Latin1 casing tables are stored in
    native order; not all of this has been done yet.

M       handy.h
M       perly.c
M       pp.c
M       regcomp.c
M       regexec.c
M       utf8.c

commit 05bf6c23cc557da2285c02669e591675b74f8156
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 12:46:05 2013 -0700

    Add and use macro to return EBCDIC
    
    The conversion from UTF-8 to code point should generally be to the
    native code point.  This adds a macro to do that, and converts the
    core calls to the existing macro to use the new one instead.  The old
    macro is retained for possible backwards compatibility, though it
    probably should be deprecated.

M       handy.h
M       pp.c
M       regcomp.c
M       regexec.c
M       toke.c
M       utf8.c
M       utf8.h

commit b4e5fa92124601aa80e6ccc8e4eaa4b90ba8b3ec
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 09:18:06 2013 -0700

    charnames: fix nit in comment

M       lib/_charnames.pm

commit a15a37f165320051138b7d2cb19cb106e1b3e011
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 16 11:05:44 2013 -0700

    charnames: Make work in EBCDIC
    
    Now that mktables generates native tables, the only thing that was
    needed was to make U+ mean Unicode instead of native.

M       lib/_charnames.pm
M       lib/charnames.pm

commit 04dcb5efe9d698409d22d543bfdb989cf7e17460
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 16 09:35:56 2013 -0700

    Unicode::UCD: Work on non-ASCII platforms
    
    Now that mktables generates native tables, it is a fairly simple matter
    to get Unicode::UCD to work on those platforms.

M       lib/Unicode/UCD.pm

commit d2488f7a16f46b1341eccbb067031d97749d6147
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 14 22:16:38 2013 -0700

    mktables: Generate native code-point tables
    
    The output tables for mktables are now in the platform's native
    character set.  This means there is no change for ASCII platforms, but
    is a change for EBCDIC ones.
    
    Since we currently don't have any EBCDIC test platforms, I tested this
    by faking it out to generate EBCDIC data, and then eye-balled the
    results.
    
    Code that didn't realize there was a potential difference between EBCDIC
    and non-EBCDIC platforms will now start to work; code that tried to do
    the right thing under these circumstances will no longer work.  Fixing
    that comes in later commits.

M       lib/unicore/mktables

commit 16275570dc0e9ad7d1c23138f8118b9f4125be0a
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 14 10:50:00 2013 -0700

    Fix some EBCDIC problems
    
    These spots have native code points, so should be using the macros for
    native code points, instead of Unicode ones.

M       regcomp.c
M       sv.c
M       toke.c

commit 01c1f79a43ce7368423d61f218786a1cca0c1031
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 13 22:10:19 2013 -0700

    Remove unnecessary temp variable in converting to UTF-8
    
    These areas of code included a temporary that is unnecessary.

M       inline.h
M       regcomp.c
M       sv.c

commit 55ba60080d38bb813bb22fe319467eafe98112bf
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 13 22:00:55 2013 -0700

    utf8.h: Correct macros for EBCDIC
    
    These macros were incorrect for EBCDIC.  The 3 step process given in
    utfebcdic.h wasn't being followed.

M       utf8.h

commit d2afd318fa11731c9e968f35e02034740424df2d
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 9 21:23:30 2013 -0700

    Extract common code to an inline function
    
    This fairly short paradigm is repeated in several places; a later commit
    will improve it.

M       embed.fnc
M       embed.h
M       inline.h
M       pp_pack.c
M       proto.h
M       sv.c
M       toke.c
M       utf8.c

commit e0e84feef603ad0bf7fa332e27ef3dcdb4e219ba
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 21:35:57 2013 -0700

    Don't use EBCDIC macro for a C language escape
    
    C recognizes '\a' (for BEL); just use that instead of a look-up.
    
    regen/unicode_constants.pl could be used to generate the character for
    the ESC (set in surrounding code), but I didn't do that because of
    potential bootstrapping problems when porting to an EBCDIC platform
    without a working perl.  (The other characters generated in that .pl are
    less likely to cause problems when compiling perl.)

M       regcomp.c
M       toke.c

commit 4faa2ff47cb5c3ce0d31eb373ab3a5cf7f9cb9b5
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 19:53:38 2013 -0700

    Use byte domain EBCDIC/LATIN1 macro where appropriate
    
    The macros like NATIVE_TO_UNI will work on EBCDIC, but operate on the
    whole Unicode range.  In the locations affected by this commit, it is
    known that the domain is limited to a single byte, so the simpler ones
    whose names contain LATIN1 may be used.
    
    On ASCII platforms, all the macros are null, so there is no effective
    change.

M       handy.h
M       regcomp.c
M       utf8.c

commit 2ebefc4e3b20845d89122c3cc40f7efad25a6bb8
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 14:31:09 2013 -0700

    Use new clearer named #defines
    
    This converts several areas of code to use the more clearly named macros
    introduced in a recent commit

M       op.c
M       toke.c
M       utf8.c
M       utf8.h
M       utfebcdic.h

commit 2fe70efa4720587860d851dbafe90d57b648180e
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 13:52:31 2013 -0700

    utf8.h, utfebcdic.h: Create less confusing #defines
    
    This commit creates macros whose names mean something to me, and I don't
    find confusing.  The older names are retained for backwards
    compatibility.  Future commits will fix bugs I introduced from
    misunderstanding the meaning of the older names.
    
    The older names are now #defined in terms of the newer ones, and moved
    so that they are only defined once, valid for both ASCII and EBCDIC
    platforms.

M       utf8.h
M       utfebcdic.h

commit 67009e5c43781b9023ea7743975b508c987730db
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 4 14:22:02 2013 -0700

    pp_ctl.c: Use isCNTRL instead of hard-coded mask
    
    This is clearer and portable to EBCDIC.

M       pp_ctl.c

commit 0614f01d8e1a49a3e2f8dc194fa7b9d137588798
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:51:05 2013 -0700

    utf8.c: is_utf8_char_slow() should use native length
    
    What is passed is the actual length of the native utf8 character.  What
    this was calculating was the length it would be if it were a Unicode
    character, and then compares, apples to oranges.

M       utf8.c
-----------------------------------------------------------------------

--
Perl5 Master Repository

Reply via email to