[perl.git] branch khw/ebcdic, created. v5.17.10-223-g6233fba

Karl Williamson Sun, 07 Apr 2013 10:11:47 -0700

In perl.git, the branch khw/ebcdic has been created

<http://perl5.git.perl.org/perl.git/commitdiff/6233fba3d2d49e563f4ea2572f8f68448056bef2?hp=0000000000000000000000000000000000000000>


        at  6233fba3d2d49e563f4ea2572f8f68448056bef2 (commit)

- Log -----------------------------------------------------------------
commit 6233fba3d2d49e563f4ea2572f8f68448056bef2
Author: Karl Williamson <[email protected]>
Date:   Sun Apr 7 10:45:14 2013 -0600

    t/op/goto.t: Generalize for EBCDIC

M       t/op/goto.t

commit 15178cec1b0d4739f319636ed2627238a6d5b56a
Author: Karl Williamson <[email protected]>
Date:   Sun Apr 7 10:44:42 2013 -0600

    XXX Try uni/fold.t

M       regcomp.c
M       regexec.c
M       t/uni/fold.t

commit da1e9097abadf0dc22bac43a07ec9bd749ec320b
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 21:03:44 2013 -0600

    regcomp.c: White-space only, wrap comment to fit

M       regcomp.c

commit 5d010ceec9f7d342300f81bd025b59ec0fce9cf1
Author: Karl Williamson <[email protected]>
Date:   Wed Apr 3 20:15:17 2013 -0600

    t/re/pat.t: Generalize for EBCDIC

M       t/re/pat.t

commit d5d84ec8c076b3e28ff351f2c0027321e4b76745
Author: Karl Williamson <[email protected]>
Date:   Wed Apr 3 21:56:02 2013 -0600

    XXX t/op/pack.t: Generalize for EBCDIC
    
    One open question: what to do about uuencode

M       t/op/pack.t

commit 9c34958392a6374abbb62b96caa3bb60fa7bf9bc
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 12:56:52 2013 -0600

    regcomp.c: In EBCDIC [i-j] exclude also ASCII
    
    i and j are not adjacent in EBCDIC.  This excluded any alphabetic
    characters between them, but allowed other ASCII ones.
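
    A minimal sketch of the underlying problem, not the regcomp.c code:
    in EBCDIC the lowercase letters sit in three separate runs, so 'i'
    is 0x89 and 'j' is 0x91, with non-letter positions in between.  The
    helper names below are hypothetical and the byte values are
    hard-coded EBCDIC positions, so the sketch behaves the same on any
    platform.

```c
#include <assert.h>

/* Hard-coded EBCDIC positions of a-z: three separate runs
 * (a-i, j-r, s-z).  These are EBCDIC values, not native chars. */
static int ebcdic_is_lower(unsigned char b)
{
    return (b >= 0x81 && b <= 0x89)
        || (b >= 0x91 && b <= 0x99)
        || (b >= 0xA2 && b <= 0xA9);
}

/* Hypothetical helper: does the class [lo-hi] contain b?
 * When both endpoints are lowercase letters, only lowercase
 * letters inside the numeric span count as members. */
static int ebcdic_lower_range_has(unsigned char lo, unsigned char hi,
                                  unsigned char b)
{
    if (b < lo || b > hi)
        return 0;
    if (ebcdic_is_lower(lo) && ebcdic_is_lower(hi))
        return ebcdic_is_lower(b);
    return 1;
}

static void ebcdic_range_self_check(void)
{
    assert(ebcdic_lower_range_has(0x89, 0x91, 0x89));   /* 'i' */
    assert(ebcdic_lower_range_has(0x89, 0x91, 0x91));   /* 'j' */
    assert(!ebcdic_lower_range_has(0x89, 0x91, 0x8A));  /* gap byte */
}
```

    With this rule the seven-position numeric span 0x89..0x91 matches
    exactly two bytes.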

M       regcomp.c
M       t/re/pat_advanced.t

commit 64b0a4552758f0d39a8080e12eec10896b449682
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 12:54:42 2013 -0600

    utf8.c: Don't use slower general-purpose function
    
    There is a macro that accomplishes the same task for a two byte UTF-8
    encoded character, and avoids the overhead of the general purpose
    function call.
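
    For illustration only, a two-byte decode can be a single
    expression.  The macro name below is an assumption, not the one
    used in utf8.c, and the bit layout shown is ASCII-platform UTF-8;
    UTF-EBCDIC packs its bits differently.

```c
#include <assert.h>

/* Illustrative macro for ASCII-platform UTF-8 only: take the low five
 * bits of the start byte and the low six bits of the continuation. */
#define DEMO_TWO_BYTE_UTF8_TO_CP(hi, lo) \
    ((((unsigned)(hi) & 0x1Fu) << 6) | ((unsigned)(lo) & 0x3Fu))

static void two_byte_self_check(void)
{
    /* U+00E9 is encoded as C3 A9 in UTF-8. */
    assert(DEMO_TWO_BYTE_UTF8_TO_CP(0xC3, 0xA9) == 0xE9);
    /* U+07FF, the largest two-byte code point, is DF BF. */
    assert(DEMO_TWO_BYTE_UTF8_TO_CP(0xDF, 0xBF) == 0x7FF);
}
```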

M       utf8.c

commit e5c07b886feb79e2d9e47f471a5fa48422ccad17
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 12:53:07 2013 -0600

    utf8.c: Don't do ++ in macro parameter
    
    The formal parameter gets evaluated multiple times on an EBCDIC
    platform, thus incrementing more than the intended once.
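
    A small sketch of the hazard, using a hypothetical macro rather
    than the one in utf8.c: any macro that mentions its parameter more
    than once repeats the side effects of the argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro that mentions its parameter twice, the way a
 * platform-specific translation macro might. */
#define DEMO_CLAMP_ASCII(c) (((c) < 0x80) ? (c) : '?')

/* Buggy: for an ASCII byte, *p++ is evaluated twice, so p advances
 * by two and the macro yields the second byte, not the first. */
static size_t advance_with_side_effect(const unsigned char *s)
{
    const unsigned char *p = s;
    unsigned char out = (unsigned char)DEMO_CLAMP_ASCII(*p++);
    (void)out;
    return (size_t)(p - s);
}

/* Fixed: read the byte, increment separately, then apply the macro. */
static size_t advance_once(const unsigned char *s)
{
    const unsigned char *p = s;
    unsigned char c = *p;
    unsigned char out;
    p++;
    out = (unsigned char)DEMO_CLAMP_ASCII(c);
    (void)out;
    return (size_t)(p - s);
}

static void macro_side_effect_self_check(void)
{
    assert(advance_with_side_effect((const unsigned char *)"ab") == 2);
    assert(advance_once((const unsigned char *)"ab") == 1);
}
```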

M       utf8.c

commit ac1fd0a63d42677bb5a29b3c0300071848647d32
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 12:50:48 2013 -0600

    utf8.c: Use macro instead of duplicating code
    
    There is a macro that accomplishes this task, and is easier to read.

M       utf8.c

commit fdd7e90c503be4775e917ee5f84d7234b8722abe
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 10:15:05 2013 -0600

    t/io/bom.t: Fix to run under EBCDIC

M       t/io/bom.t

commit 37b79eec25e558439c78cf793bcb001ea954ace4
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 23:34:50 2013 -0600

    t/uni/overload.t: EBCDIC fixes

M       t/uni/overload.t

commit 653b5719044dc86a44b9f051f6f7f78371c54e31
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 23:34:13 2013 -0600

    t/uni/method.t: EBCDIC fixes

M       t/uni/method.t

commit 04f4a4e18d34959d4a1920809ea7eb474cdea2a0
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 23:33:28 2013 -0600

    t/op/utf8magic.t: EBCDIC fixes

M       t/op/utf8magic.t

commit ca31c671d36616093668b567fd794097e96073ff
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 23:32:57 2013 -0600

    t/op/evalbytes.t: EBCDIC fixes

M       t/op/evalbytes.t

commit 4cd8343c45816293973d002161b405e96da3ee78
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 16:20:20 2013 -0600

    lib/utf8.pm: Fix pod verbatim line wrap

M       lib/utf8.pm
M       t/porting/known_pod_issues.dat

commit 4bb248d97567e0dd474880a0f1a31c87d7632861
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 13:27:42 2013 -0600

    t/op/length.t: EBCDIC fixes

M       t/op/length.t

commit d0c8b69d6191acf930291df9f7ce560292e90ba4
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 13:01:54 2013 -0600

    t/op/utfhash.t: XXX Add debug

M       t/op/utfhash.t

commit 35b2dcc20cf842f700bd93f3e55446a61830c788
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 12:21:21 2013 -0600

    Data-Dumper/Dumper.pm: Fix for EBCDIC

M       dist/Data-Dumper/Dumper.pm

commit d9266a49879d2e3a51825168fba5e933e9597019
Author: Karl Williamson <[email protected]>
Date:   Fri Apr 5 12:15:58 2013 -0600

    Dumper.xs: Don't translate character twice
    
    utf8_to_uvchr() already returns the native code point; no need to
    convert again.  This code is only executed on Perls before 5.15

M       dist/Data-Dumper/Dumper.xs

commit 360f7577f6ffcf698c8e6762577ab495821c9036
Author: Karl Williamson <[email protected]>
Date:   Sat Apr 6 20:39:22 2013 -0600

    dist/IO/t/io_utf8argv.t: Generalize and enable EBCDIC
    
    Infrastructure now exists to have this test run on EBCDIC platforms.

M       dist/IO/t/io_utf8argv.t

commit 38a5c52000de12f8d5dea539d2c4ad17925988ad
Author: Karl Williamson <[email protected]>
Date:   Wed Apr 3 21:59:16 2013 -0600

    utf8.h: Clarify comments

M       utf8.h

commit 9d20e811c58118a9bc4b02ef85e944d72ad3db43
Author: Karl Williamson <[email protected]>
Date:   Wed Apr 3 19:06:52 2013 -0600

    XXX CPAN cpan/Test/lib/Test.pm: Fixes for EBCDIC

M       cpan/Test/lib/Test.pm

commit ba3c8d2c4ba4256734f9a9eb4502a4c0ea07623f
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 22:29:16 2013 -0600

    t/re/pat_re_eval.t: Some EBCDIC fixes

M       t/re/pat_re_eval.t

commit e44448b579db03fa0a95fa00c395d0d6a713a6f5
Author: Karl Williamson <[email protected]>
Date:   Tue Apr 2 07:11:19 2013 -0600

    t/test.pl:  Add fcn for UTF-EBCDIC conversion
    
    This adds the function byte_utf8a_to_utf8n().  This takes the bytes that
    form a UTF-8 string and converts them to the bytes that form that string
    on the native platform.

M       t/test.pl

commit ea8897c37034fd35726fb289b4c8de808b9d8bac
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 22:28:43 2013 -0600

    dist/Storable/t/utf8.t: Fix to run under EBCDIC

M       dist/Storable/t/utf8.t

commit 97d8870aa8853a438fd39e9288933cfc827a4a10
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 22:28:08 2013 -0600

    t/uni/variables.t: Fix to run under EBCDIC

M       t/uni/variables.t

commit 62e0378a74b66ae8b8b46c7174331a2d2dd20448
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 21:08:20 2013 -0600

    t/op/split.t: EBCDIC fixes

M       t/op/split.t

commit fbf7d0605e643f958ef9e3460770ffafb922e0d8
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 20:43:03 2013 -0600

    re/pat_advanced.t: EBCDIC fixes
    
    This includes no longer skipping some EBCDIC tests that formerly were
    skipped, since we now have testing infrastructure that makes this easy.

M       t/re/pat_advanced.t

commit dd643d85544e1dc6a69df9b606ce90c536dc492e
Author: Karl Williamson <[email protected]>
Date:   Mon Apr 1 20:01:04 2013 -0600

    t/io/utf8.t: EBCDIC fixes

M       t/io/utf8.t

commit 040751c39116a513f60ce2fc55ddb9a546cdb7a4
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 30 21:13:38 2013 -0600

    Unicode::UCD.pm: Nits

M       lib/Unicode/UCD.pm

commit 7069c335eaf5e3f871046517c950706db2ee395c
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 30 12:32:09 2013 -0600

    t/uni/fold.t: EBCDIC fixes

M       t/uni/fold.t

commit f013b3dfaaa85050aaec9f48e66a81a756e46f54
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 29 15:22:28 2013 -0600

    XXX t/op/tiehandle.t: skip for now; deep recursion

M       t/op/tiehandle.t

commit 13b1af4e1b75cd80de9ad863554b4499745d78fe
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 29 14:56:16 2013 -0600

    XXX better commit msg utf8.c: Avoid unnecessary UTF-8 conversions
    
    This changes the code so that converting to UTF-8 is avoided unless
    necessary.  For such inputs, the conversion back from UTF-8 is also
    avoided.  The cost of doing this is that the first swatches are combined
    into one that contains the values for all characters 0-255, instead of
    having multiple swatches.  That means when first calculating the swatch
    it calculates all 256, instead of 128 (160 on EBCDIC).
    
    This also fixes an EBCDIC bug in which characters in this range were
    being translated twice.

M       utf8.c

commit 06df919f56139f7596aaf5be3476cec0b0476155
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 29 13:34:59 2013 -0600

    utf8.c: No need to check for UTF-8 malformations
    
    This function assumes that the input is well-formed UTF-8, even though
    until this commit, the prefatory comments didn't say so.  The API does
    not pass the buffer length, so there is no way it could check for
    reading off the end of the buffer.  One code path already calls
    valid_utf8_to_uvchr(); this changes the remaining code path to correspond.
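
    To show what "assumes well-formed input" means in practice, here
    is a hedged sketch of a decoder that trusts its input.  It is not
    valid_utf8_to_uvchr(); the function name is hypothetical and it
    handles only ASCII-platform UTF-8 of up to four bytes.  Like the
    API described above, it takes no buffer length, so it cannot
    detect a read past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one character from s, trusting that s points at well-formed
 * UTF-8.  No continuation-byte or bounds checks are performed. */
static uint32_t decode_trusted_utf8(const unsigned char *s, size_t *retlen)
{
    size_t len, i;
    uint32_t cp;

    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else                            { len = 4; cp = s[0] & 0x07; }

    /* Each continuation byte contributes its low six bits. */
    for (i = 1; i < len; i++)
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);

    if (retlen)
        *retlen = len;
    return cp;
}

static void trusted_decode_self_check(void)
{
    static const unsigned char euro[] = { 0xE2, 0x82, 0xAC };
    size_t len = 0;
    assert(decode_trusted_utf8(euro, &len) == 0x20AC);
    assert(len == 3);
}
```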

M       utf8.c

commit b2d20945409f18635a8825c46703026c9a1af298
Author: Karl Williamson <[email protected]>
Date:   Thu Mar 28 19:56:39 2013 -0600

    utf8.c: Remove redundant assignment.
    
    This variable is always set just below.

M       utf8.c

commit b2579f51da5889206d17362f1a8663d1cfb50bba
Author: Karl Williamson <[email protected]>
Date:   Thu Mar 28 17:19:16 2013 -0600

    XXX enable _invlist_dump;

M       embed.fnc
M       embed.h
M       proto.h

commit b0274520f06dcbbc2d7aa7f941a70a9ff19dc127
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 8 11:01:32 2013 -0700

    XXX EBCDIC header files

M       charclass_invlists.h
M       l1_char_class_tab.h
M       regcharclass.h
M       unicode_constants.h

commit 3b7d1593bf29820a5cf2cb94a5a04874a265e956
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 12:26:15 2013 -0600

    hints/os390.sh: Suppress bogus compiler message

M       hints/os390.sh

commit 3199a043e132462e17cf58688363632128fab8ba
Author: John Goodyear <[email protected]>
Date:   Sat Mar 2 12:31:25 2013 -0700

    XXX Temporary for z/OS long long support

M       Configure
M       hints/os390.sh

commit b7a7286ac19598dc2d2fd0d20d72bfe09b0c4a9a
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 18:17:28 2013 -0600

    Add test that to/from native character set works
    
    For non-ASCII systems, there are character set translation tables.  This
    makes sure the two accessible ones are inverses of each other.  If not,
    nothing can be expected to work right.
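
    The new test is written in Perl; the check it performs can be
    sketched in C as follows.  The function name is hypothetical, and
    the tables in the self-check are a toy permutation, not a real
    code page.

```c
#include <assert.h>

/* Returns 1 if b[a[i]] == i and a[b[i]] == i for every byte value. */
static int tables_are_inverses(const unsigned char a[256],
                               const unsigned char b[256])
{
    int i;
    for (i = 0; i < 256; i++) {
        if (b[a[i]] != i || a[b[i]] != i)
            return 0;
    }
    return 1;
}

static void inverse_self_check(void)
{
    unsigned char to_native[256], to_unicode[256];
    int i;

    /* Toy permutation standing in for a real translation table
     * (multiplying by an odd number modulo 256 is a bijection). */
    for (i = 0; i < 256; i++)
        to_native[i] = (unsigned char)((i * 7 + 3) & 0xFF);
    for (i = 0; i < 256; i++)
        to_unicode[to_native[i]] = (unsigned char)i;
    assert(tables_are_inverses(to_native, to_unicode));

    /* Corrupt one entry: the round trip now fails. */
    to_unicode[0] = to_unicode[1];
    assert(!tables_are_inverses(to_native, to_unicode));
}
```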

M       MANIFEST
A       t/base/translate.t

commit 1a9fef82d9f45c85aa1887d413ea5897c0593aa2
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 16:55:55 2013 -0600

    lib/feature/bundle: Fix some things to pass under EBCDIC

M       t/lib/feature/bundle

commit 6f5c3df55a0c39e5f1914539d1c06f8e0f501600
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 16:08:04 2013 -0600

    XS-APItest/t/fetch_pad_names.t: Skip if EBCDIC
    
    This could be ported, but there's a lot of stuff to convert; would need
    a function to convert byte strings that form legal UTF-8 into legal
    UTF-EBCDIC

M       ext/XS-APItest/t/fetch_pad_names.t

commit d1cef3029a590582cc9989f0d81ac82d377a3de8
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 12:11:59 2013 -0600

    XXX Temporary lib/charnames.t, comment out to see if gets further

M       lib/charnames.t

commit 7fd727f691eed4a27deb5724604fa5853b359ae9
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 12:05:53 2013 -0600

    XXX ext/XS-APItest/t/utf8.t: Fix so passes EBCDIC
    
    This involves skipping much of the tests.  Reexamine later

M       ext/XS-APItest/t/utf8.t

commit f4594486b2ba3c7b47c9b0ded7eed7b69071376a
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 11:27:06 2013 -0600

    ext/re/t/re_funcs_u.t: Fix to work under EBCDIC

M       ext/re/t/re_funcs_u.t

commit 20902a99304fdaf9949da00b9b734f60f6518079
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 11:11:22 2013 -0600

    XXX dist/IO/t/io_utf8argv.t: Temporarily skip if EBCDIC

M       dist/IO/t/io_utf8argv.t

commit 9f766a2029c669230825ee2d3b4673bfbac8aad1
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 10:33:44 2013 -0600

    t/op/print.t: Skip an EBCDIC test
    
    This could be written (the values would probably change depending on the
    code page), but the code that would get exercised is unlikely to vary
    depending on character set.

M       t/op/print.t

commit 474e6f41b658af9d9641bcd1273d4c0e9e55a937
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 19:51:06 2013 -0600

    XXX skip folding tests

M       t/re/fold_grind.t
M       t/re/reg_fold.t
M       t/uni/fold.t

commit a9054db42d8ed16f810b0f80d6dfd7835c0b9051
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 15:44:59 2013 -0600

    XXX t/TEST: Avoid SIGPIPEs

M       t/TEST

commit 5e6696cd5c07c70db43143ca794cc56f8ddb4471
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 15:49:08 2013 -0600

    XXX Temporarily test normalization

M       cpan/Unicode-Normalize/t/fcdc.t
M       cpan/Unicode-Normalize/t/form.t
M       cpan/Unicode-Normalize/t/func.t
M       cpan/Unicode-Normalize/t/illegal.t
M       cpan/Unicode-Normalize/t/norm.t
M       cpan/Unicode-Normalize/t/null.t
M       cpan/Unicode-Normalize/t/partial1.t
M       cpan/Unicode-Normalize/t/partial2.t
M       cpan/Unicode-Normalize/t/proto.t
M       cpan/Unicode-Normalize/t/split.t
M       cpan/Unicode-Normalize/t/test.t
M       cpan/Unicode-Normalize/t/tie.t

commit 71bf6258b58aec0c3ac191fe3e209301d75e90a0
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 26 14:06:50 2013 -0600

    op/index.t: Fix tests for EBCDIC
    
    Commit 8a38a836 erroneously translates literals into the native
    encoding, causing a double translation, which is garbage.

M       t/op/index.t

commit 9864434c134a154369aecb3e02777909f661e7db
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 20:43:38 2013 -0600

    op/chop.t: Fix for EBCDIC
    
    One test is skipped because the code point is not representable on
    EBCDIC platforms.  Another test is modified to work on EBCDIC.

M       t/op/chop.t

commit ca647d69fcfa14dbfe23df1174d679747b301b85
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 19:56:50 2013 -0600

    t/op/lc.t: Fix to work under EBCDIC
    
    This had code that attempted this, but it was wrong.  The conversion to
    EBCDIC must be done before the \U, or similar.

M       t/op/lc.t

commit 832f77daa12ad57defba7422e6dbd3e9e1c105bc
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 15:33:55 2013 -0600

    Skip some tests under EBCDIC
    
    EBCDIC won't work on these because of inherent differences from ASCII

M       t/porting/customized.t
M       t/porting/manifest.t

commit 0fa346f8014e51190e9b9d52ceeb8c524736ee3b
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 15:04:14 2013 -0600

    porting/bincompat.t: Skip under EBCDIC
    
    because the sorting order is different

M       t/porting/bincompat.t

commit 1c21ea7e921436447351932f53a255e2eb39d262
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 14:59:50 2013 -0600

    t/re/regex_sets.t: So will pass under EBCDIC

M       t/re/regex_sets.t

commit 44b71db81e6a0d63e1a224bec909c42673fc51f2
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 14:59:26 2013 -0600

    t/porting/bincompat.t: Typo in comment

M       t/porting/bincompat.t

commit d5c1a29e9fa8a02d4a54ae31420fc5b5b4a32df4
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 25 13:09:09 2013 -0600

    XXX fix \x{too large}

M       dist/IO/IO.xs
M       doop.c
M       inline.h
M       pp.c
M       pp_pack.c
M       regcomp.c
M       sv.c
M       toke.c
M       utf8.c
M       utf8.h

commit 46df074fa9c8ec9efa9b052a8227b4e03a3ea53e
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 17:59:59 2013 -0600

    mktables: Fix typos in comments
    
    One of these fixes is for where a real CTRL-X was specified, instead of
    $^X

M       lib/unicore/mktables

commit 98c3260272b9f937ec83a26be5d17b45580be43f
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:16:08 2013 -0600

    utf8.c: Fix so UTF-16 to UTF-8 conversion works under EBCDIC

M       utf8.c

commit 06310c02e3d101aa751e11e2214dd3af15709b06
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:14:34 2013 -0600

    utf8.h, utfebcdic.h: Add #define

M       utf8.h
M       utfebcdic.h

commit 94af1df41d29e69aab6e30a017fd10be7fe0598f
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 24 13:11:25 2013 -0600

    utf8.c: Use mnemonics instead of hex numbers

M       utf8.c

commit 345cbfa6abba1b7d993d11d6ac2f98efdae16436
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 20 22:15:58 2013 -0600

    lib/Unicode/UCD.t: Allow to run under EBCDIC

M       lib/Unicode/UCD.t

commit aced7b961bff3bb33aebe3fc5fc5eccee4e0cf19
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 15:27:31 2013 -0600

    t/op/quotemeta.t: EBCDIC fixes

M       t/op/quotemeta.t

commit e6df3ea174e4a58c5cd264f7c8f5c6eb3ec5d8a5
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:32:55 2013 -0600

    t/re/fold_grind.t: Fixes for EBCDIC

M       t/re/fold_grind.t

commit 18dc25499869d662752080d6477adb8c487fe59d
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:21:09 2013 -0600

    t/lib/charnames/alias: Fix some EBCDIC problems

M       t/lib/charnames/alias

commit c2f78ec4be42262f71b0ef8618525fb54a6884c3
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:20:24 2013 -0600

    t/uni/class.t: Make work on EBCDIC

M       t/uni/class.t

commit 202268258405d157ac591ef085cf3b03c53d9ad4
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 11:01:57 2013 -0600

    feature/unicode_strings.t: Fix to work on EBCDIC

M       lib/feature/unicode_strings.t

commit b23599b9e4e0abf32dfb87857959e651471907c8
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:10:46 2013 -0600

    XXX rebase regen/regcharclass.pl: make more EBCDIC friendly
    
    XXX regen/regcharclass.pl: maybe temp comment out utf8_char
    One of the possible inputs to this process is a string.  This clarifies
    that it must be specified in Unicode characters, and adds code to
    translate it to native, if necessary.

M       regen/regcharclass.pl
M       regen/regcharclass_multi_char_folds.pl

commit dfba29bc4304057da121eb03bbf55ca100040434
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 19 10:09:53 2013 -0600

    XXX temporary comment out multi-char folds

M       regcomp.c
M       regen/regcharclass.pl
M       regexec.c
M       t/re/fold_grind.t
M       t/re/reg_fold.t

commit 94b25724b27e76a6bb4febe57b42207b677dac89
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 22:00:29 2013 -0600

    XXX temp skip perl5db.t

M       lib/perl5db.t

commit 0215e2328fdf5be586cb6da6372ed5ad60b052b8
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 11:45:06 2013 -0600

    pp.c: White-space only
    
    Make a ternary operation more clear

M       pp.c

commit b4a2d6d091ef022af81f018e3032da3fa820e79f
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 18 11:43:42 2013 -0600

    Fix valid_utf8_to_uvchr() for EBCDIC

M       utf8.c

commit ac17078a961c63a033555c8a8dfc71ea2c811d59
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 17 21:42:20 2013 -0600

    t/test.pl: Add comment about EBCDIC

M       t/test.pl

commit d495b3d39bff3c045e55a44e648c14a995c4afaf
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 17 17:39:33 2013 -0600

    XXX makedepend.SH: Why does 255 work and 250 not?

M       makedepend.SH

commit fcb5c53411031bb0a92c649d7d7734ed9a84bb2a
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:48:22 2013 -0600

    XXX regen/mk_PL_charclass.pl: Make EBCDIC friendly
    
    need more of a commit message

M       regen/mk_PL_charclass.pl

commit 40330f54423a977dcd89b40034c7de934580ba2c
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:44:44 2013 -0600

    XXX make various things more EBCDIC friendly
    
    Adds trailing white space errors
    Need to know what to do about ^A meaning 0x1, and M-foo meaning meta

M       lib/DB.pm
M       lib/dumpvar.pl
M       lib/perl5db.pl
M       lib/sigtrap.pm

commit f76e75eaac5ab77eef3394ada05e51947a3c4d30
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:41:57 2013 -0600

    XXX charnames.t: Make more EBCDIC friendly
    
    Why need utf8::unicode_to_native

M       lib/charnames.t

commit c1b27470d7452c360a378c9a75682137361a4314
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 22:41:15 2013 -0600

    XXX: Fixup commit message.
    
    Fix UTF8_ACCUMULATE, utf8.c

M       utf8.c
M       utf8.h

commit c0f1b5a52deca1befe163f03db2b0962553b9dfe
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 16 16:52:45 2013 -0600

    regcomp.c: Fix bug in EBCDIC
    
    The POSIXA and NPOSIXA regnodes need to set the bits on only the ASCII
    code points, but under EBCDIC those code points are not 0-127.

M       regcomp.c

commit d128281629dced4698a9dd3b475d8a487322381f
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 11:57:24 2013 -0600

    re/charset.t: Allow to work on EBCDIC
    
    This just converts the hard-coded character numbers to native, so will
    work on any platform.

M       t/re/charset.t

commit 40462a69a18c4107242bed16e9ac4afbc32dae18
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 15 11:50:35 2013 -0600

    XS-APItest/t/handy.t: Change output message
    
    On EBCDIC platforms, the output is not in terms of \N{U+}; change text
    to \x{ }

M       ext/XS-APItest/t/handy.t

commit d7abfc49b072d7afb5a8841351afcc8c77383edd
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 21:44:16 2013 -0600

    XXX Dumper.xs: Don't know why this stopped compiling

M       dist/Data-Dumper/Dumper.xs

commit d3bbc80747a84da12a2fc69165cca2145a3ee1c0
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:22:28 2013 -0600

    toke.c: Fix an ASCII-platform dependency

M       toke.c

commit ec13d1a62777369e78df6066decc02e38467ce72
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:20:23 2013 -0600

    toke.c: Simplify some code
    
    We don't have to test separately for lower vs uppercase here, as
    upper/lower case A-Z and a-z are not intermixed in the gaps in A-Z and
    a-z under EBCDIC.

M       toke.c

commit 71f2031d7e55baf46f5973e640d58aa454146d9e
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:18:12 2013 -0600

    genpacksizetables.pl: Correct comment typo

M       genpacksizetables.pl

commit 5e8ea6ade0e6445df69d88c0fcc50fa96b54b2e1
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:17:39 2013 -0600

    APItest/t/handy.t: Make EBCDIC-friendly

M       ext/XS-APItest/t/handy.t

commit 255eb60296a4dad776da402225fbc056d9cc0296
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:16:14 2013 -0600

    Data-Dumper: Make EBCDIC-friendly

M       dist/Data-Dumper/Dumper.xs

commit 79d707132ee56ebf0c8f158034941ab384a68f3e
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:14:31 2013 -0600

    sv.c: Make less ASCII-centric

M       sv.c

commit 1fc6d61df436e6dc0bec5ec83f770143c0fcd880
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:07:52 2013 -0600

    lib/charnames.t: Make some tests work under EBCDIC

M       lib/charnames.t

commit 4223b3f6d79ceab0fede41ad6374e510bcda6184
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:05:46 2013 -0600

    dump.c: Make less ASCII-centric
    
    This has the added advantage of being clearer as to what is going on.

M       dump.c

commit a9841bd40a98fdf7044535404c93a3c36adaeca8
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 13 16:02:52 2013 -0600

    hv.c: Stop being ASCII-centric
    
    This uses macros which work cross-platform.  This has the added advantage
    that it is much clearer what is going on.

M       hv.c

commit d4ec85eca667b4baf69380542e19e6e90fe413a1
Author: Karl Williamson <[email protected]>
Date:   Tue Mar 12 22:34:17 2013 -0600

    t/TEST: Don't bail if fails in t/base unless minitest
    
    In order to completely compile Perl, many modules must have been parsed
    and compiled, so if there is a full perl, we know that things basically
    work.  The purpose of bailing out is that if these supposedly very base
    level functionality tests don't work, there's no point in continuing.
    But over the years, tests of more esoteric functionality have been
    added here, and if one of them doesn't work, it still could be that Perl
    pretty much does work.
    
    I believe it would be best to move such non-basic tests elsewhere, but
    that's work, and hasn't bitten us much so far; this change lessens the
    severity of the biting even more.  Where it will really bite is if
    things are so bad that a full perl binary can't be compiled, and we are
    trying to figure out why using minitest.

M       t/TEST

commit 8bb92d85563a8326013577c527940aa91bdb2511
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 11 15:11:10 2013 -0600

    Added Porting/reorder_charclass_invlists.pl
    
    This program is used to bootstrap perl onto a non-ASCII platform with
    no pre-existing perl.

M       MANIFEST
A       Porting/reorder_charclass_invlists.pl

commit ae3882c57c1a3da19ae5af5db001c1053bb0dfda
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 22:17:31 2013 -0600

    t/base/lex.t: Use char suitable for both ASCII and EBCDIC
    
    \xE2 is 'S' in EBCDIC, and so is going to be legal.  \xDF is an alpha
    which has no ASCII equivalent in either character set

M       t/base/lex.t

commit 4822f66ae51056f900b881264fd9ccf706dfa2a0
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 13:11:07 2013 -0600

    XXX Temporary comment out ParseXS check
    
    this is to get things to compile for now

M       dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS.pm

commit bb8d149a82c3480b5c9f8c307d1d1728ecd983f9
Author: Karl Williamson <[email protected]>
Date:   Sun Mar 10 11:34:10 2013 -0600

    XXX Collate, Normalize: Allow to compile under EBCDIC

M       cpan/Unicode-Collate/Collate.pm
M       cpan/Unicode-Collate/mkheader
M       cpan/Unicode-Normalize/Normalize.pm
M       cpan/Unicode-Normalize/mkheader

commit b56ee4d36759a3d32cda2ea36cb8253830b67a8e
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 9 21:57:38 2013 -0700

    XXX dquote_static.c: Silence wrong warning on EBCDIC
    
    Unsure of whether to add the 2nd !isCNTRL_L1 to silence return trip,
    which should be a separate commit anyway.
    
    This silences an inappropriate warning that doesn't happen on ASCII
    platforms.  CTRL-T maps to 0x14 on both ASCII and EBCDIC platforms.  But
    0x14 is a C1 control on EBCDIC, a C0 on ASCII.  Therefore the test that
    it's a control should include both C0 and C1, which isCNTRL_L1() does.
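
    A sketch of the combined test, assuming the native byte has already
    been mapped to its Unicode code point.  These helper names are
    hypothetical stand-ins and do not reproduce the handy.h
    definitions.

```c
#include <assert.h>

/* All three take a Unicode code point in the Latin-1 range,
 * not a native byte. */
static int demo_is_ascii_cntrl(unsigned cp)
{
    return cp < 0x20 || cp == 0x7F;       /* C0 controls plus DEL */
}

static int demo_is_c1(unsigned cp)
{
    return cp >= 0x80 && cp <= 0x9F;      /* C1 controls */
}

/* Accepts a control from either set, so the answer does not depend
 * on which set a given control lands in on the platform. */
static int demo_is_cntrl_l1(unsigned cp)
{
    return demo_is_ascii_cntrl(cp) || demo_is_c1(cp);
}

static void cntrl_self_check(void)
{
    assert(demo_is_cntrl_l1(0x14));       /* a C0 control */
    assert(demo_is_cntrl_l1(0x9D));       /* a C1 control */
    assert(!demo_is_ascii_cntrl(0x9D));   /* an ASCII-only test misses it */
    assert(!demo_is_cntrl_l1('A'));
}
```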
    
    Also has a white-space change, outdenting a line so it doesn't wrap in
    an 80 column window.

M       dquote_static.c

commit 43b100ec61426704fb46b72d6ad29ebb1e08ac3f
Author: Karl Williamson <[email protected]>
Date:   Thu Mar 7 12:08:41 2013 -0700

    utfebcdic.h: Change 'unsigned char' to U8
    
    This is for consistency with the rest of Perl

M       utfebcdic.h

commit 10eab09a2918695435fc3e16c46f89c57d3e45f5
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 8 08:11:38 2013 -0700

    regen/regcharclass.pl: Make more EBCDIC-friendly
    
    This commit changes the code generated by the macros so that they work
    right out-of-the-box on non-ASCII platforms for non-UTF-8 inputs.  THEY
    ARE WRONG for UTF-8, but this is good enough to get perl bootstrapped
    onto the target platform, and regcharclass.pl can be run there,
    generating macros correct for UTF-8.

M       regcharclass.h
M       regen/regcharclass.pl

commit 2cf52515cbdd63aa4297ad6d7ff05b6e5c1fe6e8
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 6 21:30:01 2013 -0700

    utfebcdic.h: Add (UV) cast
    
    The operand of this macro is implicitly a UV.  Make sure that it is.

M       utfebcdic.h

commit 115ec0947258d54530cbde264d889c85f041d03b
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 6 17:04:58 2013 -0700

    handy.h: Allow bootstrapping to non-ASCII platform
    
    This adds a bunch of macros and moves things around to support
    conditional compilation when Configure is called with
    -DBOOTSTRAP_CHARSET.  Doing so causes the usual macros that are
    table-driven to not be used, since the table may not be valid when
    bringing Perl up for the first time on a non-ASCII platform.
    
    This allows it to compile using the platform's native C library ctype
    functions, which should work enough to compile miniperl, and allow the
    table to be changed to be valid.  Then Configure can be re-run to not
    bootstrap, and normal compilation can proceed
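
    The shape of the conditional compilation can be sketched as below.
    Only BOOTSTRAP_CHARSET comes from this message; the demo_ names
    and the toy table are assumptions, and the real handy.h macros and
    generated tables are more involved.

```c
#include <assert.h>
#include <ctype.h>

#ifdef BOOTSTRAP_CHARSET
/* Bootstrap build: rely on the C library's ctype classification. */
#  define demo_isDIGIT(c) (isdigit((unsigned char)(c)) != 0)
#else
/* Normal build: look the byte up in a table indexed by native byte.
 * Character constants as designators keep this toy table valid on
 * any character set; a generated table has no such guarantee. */
static const unsigned char demo_digit_tab[256] = {
    ['0'] = 1, ['1'] = 1, ['2'] = 1, ['3'] = 1, ['4'] = 1,
    ['5'] = 1, ['6'] = 1, ['7'] = 1, ['8'] = 1, ['9'] = 1
};
#  define demo_isDIGIT(c) (demo_digit_tab[(unsigned char)(c)] != 0)
#endif

static void bootstrap_self_check(void)
{
    assert(demo_isDIGIT('5'));
    assert(!demo_isDIGIT('a'));
}
```

    Compiling the sketch with and without -DBOOTSTRAP_CHARSET exercises
    both branches with the same callers.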

M       handy.h
M       inline.h

commit 38a1ad18646a935705bf2cc952bb18a024418e2f
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 13:43:26 2013 -0700

    gv.c: Remove EBCDIC dependency

M       gv.c

commit 45deb8c65659ceede4759d16cf9055fcc20440f4
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 13:00:47 2013 -0700

    toke.c: Remove EBCDIC dependency

M       toke.c

commit a0f0426ca5ea301f96be31bafaa284b00e6dac67
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 09:14:25 2013 -0700

    toke.c: Remove character set dependency
    
    Instead of hard-coding the bit patterns that comprise the Byte Order
    Mark in the UTF-8 or UTF-EBCDIC encodings, use the generated ones for
    the current platform.
    
    This removes some EBCDIC-only code.
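
    As a sketch of the approach: callers compare against a named
    constant and never spell out the bytes themselves.  DEMO_BOM_UTF8
    is an assumed stand-in for a generated #define and holds the
    ASCII-platform bytes; a UTF-EBCDIC build would generate different
    bytes under the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for a generated constant: the UTF-8 encoding of
 * U+FEFF on an ASCII platform. */
#define DEMO_BOM_UTF8 "\xEF\xBB\xBF"

/* Returns 1 if the buffer begins with the platform's BOM bytes. */
static int starts_with_bom(const char *s, size_t len)
{
    const size_t bom_len = sizeof(DEMO_BOM_UTF8) - 1;
    return len >= bom_len && memcmp(s, DEMO_BOM_UTF8, bom_len) == 0;
}

static void bom_self_check(void)
{
    assert(starts_with_bom("\xEF\xBB\xBFhi", 5));
    assert(!starts_with_bom("hi", 2));
    assert(!starts_with_bom("\xEF\xBB", 2));   /* too short */
}
```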

M       toke.c

commit a38f84671c22c20b46b88b20308adf0577d4c13b
Author: Karl Williamson <[email protected]>
Date:   Mon Mar 4 09:10:27 2013 -0700

    unicode_constants.h: Add #defines for Byte Order Mark
    
    These will be used in future commits

M       regen/unicode_constants.pl
M       unicode_constants.h

commit 3a82020be7530bdc2247ae32b885127f8d61357e
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 15:04:18 2013 -0700

    XXX: Find a cleaner way. Handle missing is_UTF8_CHAR_utf8_safe
    
    This macro may not be present, and is currently used exclusively in
    IS_UTF8_CHAR, which itself may be undefined, and code should cope with
    that.  This is a work-around until a better solution is found.

M       utf8.c
M       utf8.h

commit 1d0b6fbebc6c6b3675d17c3b6d8fced7d1775a5f
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 14:09:04 2013 -0700

    Add Porting tool for help with non-ASCII platforms
    
    Porting/reorder_l1_char_class_tab.pl is used to bootstrap Perl onto a
    non-ASCII platform with no working Perl.

M       MANIFEST
A       Porting/reorder_l1_char_class_tab.pl
M       regen/mk_PL_charclass.pl

commit 5750995194ee0d4432439abb1da9707822f66b01
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 13:06:58 2013 -0700

    inline.h: Reorder functions
    
    The comment implied that the functions below it in the file were
    deprecated, but in fact only the next two functions were.  This
    clarifies that and moves them so they are the final ones in the file

M       inline.h

commit 2e9f648cfe3b4b2237287d9fce497dd094ff308d
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:33:42 2013 -0700

    utfebcdic.h: Add comment

M       utfebcdic.h

commit fb5f4abc414aeae1b9b98e6ed898f6a128b20fa5
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:12:11 2013 -0700

    utf8.h: Clean up START_MARK definition and use
    
    The previous definition broke good encapsulation rules.  UTF_START_MARK
    should return something that fits in a byte; it shouldn't be the caller
    that does this.  So the mask is moved into the definition.  This means
    it can apply only to the portion that creates something larger than a
    byte.  Further, the EBCDIC version can be simplified, since 7 is the
    largest possible number of bytes in an EBCDIC UTF8 character.
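
    The encapsulation point can be sketched as follows.  This is not the
    macro from utf8.h; utf_start_mark is a hypothetical helper that assumes
    the usual UTF-8 style lead-byte pattern (N leading one bits for an
    N-byte sequence) and applies the byte mask inside the definition, so
    that callers never have to.

```python
def utf_start_mark(length: int) -> int:
    """Lead-byte marker for a multi-byte sequence of `length` bytes (2..7).

    The & 0xFF lives here, inside the definition, so the result always
    fits in a byte and callers do not need to mask it themselves.
    """
    if not 2 <= length <= 7:
        raise ValueError("length must be between 2 and 7")
    return (0xFE << (7 - length)) & 0xFF


for n in range(2, 8):
    print(n, hex(utf_start_mark(n)))
```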

M       utf8.h
M       utfebcdic.h

commit 120775fdf6f15766ccd0d3bac438dbb3fd9bf83f
Author: Karl Williamson <[email protected]>
Date:   Sat Mar 2 12:05:26 2013 -0700

    utf8.h: Move #includes
    
    These two files were only being #included for non-ebcdic compiles; they
    should be included always.

M       utf8.h

commit e792ed69081754accb03fdaf0f53f6101b77f7f7
Author: John Goodyear <[email protected]>
Date:   Sat Mar 2 11:49:14 2013 -0700

    utfebcdic.h: Remove extra parameter expansions
    
    These two macros were improperly expanding the parameters as well as
    defining the operation, leading to compile errors.

M       utfebcdic.h

commit 54b4b1fff85e95fc889559d63ac45b74a820d398
Author: Karl Williamson <[email protected]>
Date:   Fri Mar 1 08:28:52 2013 -0700

    utf8.h: Simplify UTF8_EIGHT_BIT_foo on EBCDIC
    
    These macros were previously defined in terms of UTF8_TWO_BYTE_HI and
    UTF8_TWO_BYTE_LO.  But the EIGHT_BIT versions can use the less general
    and simpler NATIVE_TO_LATIN1 instead of NATIVE_TO_UNI because the input
    domain is restricted in the EIGHT_BIT.  Note that on ASCII platforms,
    these both expand to the same thing, so the difference matters only on
    EBCDIC.
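
    For readers unfamiliar with the two-byte form, here is a minimal sketch
    of splitting an eight-bit code point (0x80..0xFF) into its UTF-8 high
    and low bytes on an ASCII platform.  The function names are
    hypothetical, and the EBCDIC translation step that the commit is
    actually about is deliberately left out.

```python
def eight_bit_hi(c: int) -> int:
    """First byte of the two-byte UTF-8 form of code point c (0x80..0xFF)."""
    return 0xC0 | (c >> 6)


def eight_bit_lo(c: int) -> int:
    """Second (continuation) byte of the two-byte UTF-8 form of c."""
    return 0x80 | (c & 0x3F)


c = 0xE9  # LATIN SMALL LETTER E WITH ACUTE
print(hex(eight_bit_hi(c)), hex(eight_bit_lo(c)))
```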

M       utf8.h

commit 0732344840a921a40bb521202f2c8fdfd4c51dd1
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 28 09:25:27 2013 -0700

    XXX temp:  show makedepend cerr

M       makedepend.SH

commit f831bcad0b836cf0e4c6c947f8bc394018f47791
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 21:59:11 2013 -0700

    makedepend.SH: Split too long lines; properly join
    
    I had thought that a continuation introduced a space.  But no,
    a continuation can happen in the middle of a token.
    
    And this splits lines that are getting very long to avoid preprocessor
    limitations.
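
    The joining rule can be sketched like this.  It is a stand-alone
    illustration, not the shell/sed code in makedepend.SH, and
    join_continuations is a hypothetical name.  The point is only that no
    space is inserted where a backslash-newline is removed.

```python
def join_continuations(lines):
    """Join each line ending in a backslash with the line that follows.

    Nothing is inserted at the join, because a continuation may fall in
    the middle of a token.
    """
    joined = []
    pending = ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash, keep accumulating
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        joined.append(pending)  # input ended on a continuation line
    return joined


print(join_continuations(["#define FOO 12\\", "34", "int x;"]))
```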

M       makedepend.SH

commit b8ad076b93b82f165b8d77d67f3f83cf33a287d1
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 15:51:28 2013 -0700

    makedepend.SH: White-space only
    
    Align continuation backslashes

M       makedepend.SH

commit ffe5bf78173577c9644720084fcf80403a21cdae
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 14:39:28 2013 -0700

    makedepend.SH: Remove some unnecessary white space
    
    Multi-line preprocessor directives are now joined into single lines.
    This can create lines too long for the preprocessor to handle.  This
    commit removes blanks adjoining comments that get deleted.  This makes
    things somewhat less likely to exceed the limit.
    
    This commit also fixes several [] classes, each meant to match a tab or
    a blank, in which editors had converted the tabs to blanks.

M       makedepend.SH

commit 681fe3d3c0690b850f651ab4e1023a87622a91cb
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 14:30:51 2013 -0700

    makedepend.SH: Retain '/**/' comments
    
    These comments may actually be necessary.

M       makedepend.SH

commit 78b3ea63a4d5a035389de982d226e5ff1df8b686
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 27 08:38:19 2013 -0700

    handy.h: Remove extraneous parens

M       handy.h

commit 01701dc697321e0d7efd3bea6c1dd330dee480c3
Author: Andy Dougherty <[email protected]>
Date:   Wed Feb 27 13:06:07 2013 -0500

    Disable gcc-style function attributes on z/OS.
    
    John Goodyear <[email protected]> reports that the z/OS C compiler
    supports the attribute keyword, but not exactly the same as gcc.
    Instead of a "warning", the compiler emits an "INFORMATIONAL" message
    that Configure fails to detect.  Until Configure is fixed, just disable
    the attributes altogether.
    
    John Goodyear

M       hints/os390.sh

commit 5e8e05cf8c5bbba4f7275cecdc0ff95ff35e3f0e
Author: Andy Dougherty <[email protected]>
Date:   Wed Feb 27 09:12:13 2013 -0500

    Change os390 custom cppstdin script to use fgrep.
    
    Grep appears to be limited to 2048 characters, and truncates
    the output for cppstdin.  Fgrep apparently doesn't have that limit.
    Thanks to John Goodyear <[email protected]> for reporting this.

M       hints/os390.sh

commit 102dfc0ae67f1f807a9bc669fbc27b7560ea0d5d
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:45:19 2013 -0700

    utf8.c: Use more clearly named macro
    
    In the case of invariants these two macros should do the same thing,
    but it seems to me that the latter name more clearly indicates what is
    going on.

M       utf8.c

commit 50fbf9ea49517360158648d2b3d1447b618353f2
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:35:12 2013 -0700

    Add macro OFFUNISKIP
    
    This means use official Unicode code point numbering, not native.  Doing
    this converts the existing UNISKIP calls in the code to refer to native
    code points, which is what they meant anyway.  The terminology is
    somewhat ambiguous, but I don't think will cause real confusion.
    NATIVESKIP is also introduced for situations where it is important to be
    precise.
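
    To make "official Unicode code point numbering" concrete, the sketch
    below computes how many bytes a Unicode code point occupies in standard
    UTF-8.  off_uni_skip is a hypothetical stand-in, not the macro itself,
    and the thresholds are those of UTF-8 on ASCII platforms; UTF-EBCDIC
    uses different boundaries that are not modelled here.

```python
def off_uni_skip(code_point: int) -> int:
    """Number of bytes needed to encode a Unicode code point in UTF-8."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    if code_point <= 0x10FFFF:
        return 4
    raise ValueError("beyond the Unicode range; not handled in this sketch")


for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(hex(cp), off_uni_skip(cp))
```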

M       toke.c
M       utf8.c
M       utf8.h
M       utfebcdic.h

commit 5aa3cc30e727e9e422bad81825e02a2f20f00e88
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:22:19 2013 -0700

    toke.c: white space only

M       toke.c

commit 45ba074748d9a27ebac0909868e44bd82fc42eeb
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 12:08:50 2013 -0700

    utf8.c: Deprecate two functions
    
    This is to force any code that has been using these functions to change.
    Since the Unicode tables are now stored in native order, these functions
    should only rarely be needed.
    
    However, the functionality of these is needed, and in actuality, on
    ASCII platforms, the native functions are #defined to these.  So what
    this commit does is rename the functions to something else, and create
    wrappers with the old names, so that anyone using them will get the
    deprecation.

M       embed.fnc
M       embed.h
M       mathoms.c
M       proto.h
M       toke.c
M       utf8.c
M       utf8.h

commit a058a2fedd7ffec8cc32e7c14ed84b1a41a1ee70
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 11:26:09 2013 -0700

    Deprecate uvuni_to_utf8()
    
    Code should almost never be dealing with non-native code points

M       embed.fnc
M       embed.h
M       proto.h
M       toke.c
M       utf8.c
M       utf8.h

commit 776c97add5a8bf0a18d5e43e8d1ff39b6195d63b
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 11:02:33 2013 -0700

    Deprecate utf8_to_uni_buf()
    
    Now that the tables are stored in native order, there is almost no need
    for code to be dealing in Unicode order.

M       embed.fnc
M       proto.h
M       utf8.c

commit a739a2306155b0fc8e3d3553018c171098f86935
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 09:00:18 2013 -0700

    makedepend.SH: Comment out unnecessary code
    
    This causes problems currently for z/OS.  But, since we don't know why
    it was there, I'm leaving it in as a placeholder.

M       makedepend.SH

commit b32a8f9833c0f36777bdbf213f031808b7e8eb9f
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 20:26:44 2013 -0700

    Deprecate valid_utf8_to_uvuni()
    
    Now that all the tables are stored in native format, there is very
    little reason to use this function; and those who do need this kind of
    functionality should be using the bottom level routine, so as to make it
    clear they are doing nonstandard stuff.

M       embed.fnc
M       proto.h
M       utf8.c

commit d9de181e8ba8fb58ee457669f7ac464d3b068ce4
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 20:14:26 2013 -0700

    utf8.c: Swap which fcn wraps the other
    
    This is in preparation for the current wrapee becoming deprecated

M       embed.fnc
M       embed.h
M       proto.h
M       utf8.c
M       utf8.h

commit 2e5a005a9d0b2f55f902319b135145f248aa89ea
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 19:29:34 2013 -0700

    utf8.c: Skip a no-op
    
    Since the value is invariant under both UTF-8 and not, we already have
    it in 'uv'; no need to do anything else to get it

M       utf8.c

commit 876c474e6492607693c5bfcf928f0a76e1c558ab
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 19:26:50 2013 -0700

    utf8.c: Move comment to where makes more sense

M       utf8.c

commit b43e922c51c41efbe90eba61a44d178393183e03
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:30:10 2013 -0700

    APItest: Test native code points, instead of Unicode

M       ext/XS-APItest/APItest.pm
M       ext/XS-APItest/APItest.xs
M       ext/XS-APItest/t/utf8.t

commit 619fd86f9f0f5171df49ebc4c5d1b2cbf4357b53
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:25:08 2013 -0700

    XXX CPAN Normalize
    
    This converts Unicode::Normalize to use the native tables that are used
    by Perl starting in XXX, while using the Unicode-ordered ones that were
    used before then.
    
    Another alternative would be to have mktables generate just these tables
    in Unicode ordering.

M       cpan/Unicode-Normalize/Normalize.xs

commit 58c39498f76cd37d15b4936f16716f8c24b15917
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:22:55 2013 -0700

    XXX CPAN prob wrong Collate
    
    This changes to implicitly use native code points.  This is likely wrong,
    as the module comes with its own data, that are probably in terms of
    Unicode

M       cpan/Unicode-Collate/Collate.xs

commit 1eedd84881f4d9df564c7ce84dfb72d2eb882eae
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:12:53 2013 -0700

    XXX CPAN Encode.xs
    
    Use core function if available.  This will insulate this code from any
    future changes.

M       cpan/Encode/Encode.xs

commit f7336f14e881fee14cd37019538edaad8ff95bfd
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:04:24 2013 -0700

    XXX CPAN and unsure Encode

M       cpan/Encode/Encode.xs
M       cpan/Encode/Unicode/Unicode.xs

commit 610bc91f836c9c27e444b5eb17d9201032d2d996
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 25 17:00:47 2013 -0700

    XXX CPAN Encode.xs: fix indent

M       cpan/Encode/Encode.xs

commit b1fecde536e3793246b6953768c036d646199d3d
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 17:23:15 2013 -0700

    Don't refer to U+XXXX when mean native
    
    These messages say the output number is Unicode, but it is really
    native, so change them to say 0xXXXX instead.

M       regen/regcharclass_multi_char_folds.pl
M       regexec.c

commit 763741d01a303106d511e9a8ddf970bd136b66f3
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:43:59 2013 -0700

    Convert some uvuni() to uvchr()
    
    All the tables are now based on the native character set, so using
    uvuni() in almost all cases is wrong.

M       cygwin/cygwin.c
M       doop.c
M       op.c
M       pp_pack.c
M       regcomp.c
M       regexec.c
M       toke.c
M       utf8.c

commit 0e42912ce19a53e983be85957ba4afaad8ecbd18
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:25:47 2013 -0700

    handy.h: White space only

M       handy.h

commit 744d71578035e681aed6bda0b0c6470706437237
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:19:49 2013 -0700

    t/test.pl: Allow native/latin1 string conversions to work on utf8.
    
    These functions no longer have the hard-coded definitions in them,
    but now end up resolving to internal functions, so that new encodings
    could be added and these would automatically understand them.
    
    Instead of using tr///, these now go character by character, converting
    to/from ord, which is slower but allows them to operate on utf8
    strings.
    
    Peephole optimization should make these essentially no-ops on ascii
    platforms.
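
    The character-by-character approach might look roughly like the sketch
    below.  It is not the code in t/test.pl: Python's cp037 codec is used as
    a stand-in for an EBCDIC native character set, and both
    CP037_TO_LATIN1 and native_to_latin1_string are hypothetical names.
    Code points above 255 are passed through untouched, which is what lets
    the function work on strings containing wide characters.

```python
# Map each native (cp037) ordinal 0..255 to its Latin-1 ordinal.
CP037_TO_LATIN1 = [ord(ch) for ch in bytes(range(256)).decode("cp037")]


def native_to_latin1_string(s: str) -> str:
    """Convert ordinal by ordinal; leave code points above 255 unchanged."""
    out = []
    for ch in s:
        o = ord(ch)
        out.append(chr(CP037_TO_LATIN1[o]) if o < 256 else ch)
    return "".join(out)


# "Hello" as it would be numbered on a cp037 platform, plus U+263A.
native = "Hello".encode("cp037").decode("latin-1") + "\u263a"
print(native_to_latin1_string(native))
```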

M       t/test.pl

commit 659a93973bb2b10f1912d6f311a8b511f2930246
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 16:05:55 2013 -0700

    t/test.pl: Simplify ord to/from native fcns
    
    This commit changes these functions from converting to/from a string to
    calling utf8:: functions which operate on ordinals instead.

M       t/test.pl

commit 54c7edf844ec2da5ffcb92e679ad5012d876a9a9
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 15:35:38 2013 -0700

    Make casing tables native
    
    These are the final tables that hadn't been converted to native
    character set casing.

M       perl.h
M       utfebcdic.h

commit c09fb926b77c77b91609dbaffe5c07ab77a126d5
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 24 15:32:30 2013 -0700

    utfebcdic.h: Remove trailing spaces

M       utfebcdic.h

commit 2961c372dc8bc36c931963ed8c33e0717c3d62f0
Author: Karl Williamson <[email protected]>
Date:   Fri Feb 22 18:55:26 2013 -0700

    EBCDIC has the unicode bug too
    
    We have not had a working modern Perl on EBCDIC for some years.  When I
    started out, comments and code led me to conclude erroneously that
    natively it supported semantics for all 256 characters 0-255.  It turns
    out that I was wrong; it natively (at least on some platforms) has the
    same rules (essentially none) for the characters which don't correspond
    to ASCII ones, as the rules for these on ASCII platforms.
    
    A previous commit for 5.18 changed the docs about this issue.  This
    current commit forces ASCII rules on EBCDIC platforms (even should there
    be one that natively uses all 256).  To get all 256, the same things
    like 'use feature "unicode_strings"' must now be done.

M       handy.h

commit c48235f79f17bb0c855239e7fff225fd946b89c9
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:47:52 2013 -0700

    handy.h: Solve a failure to compile problem under EBCDIC
    
    handy.h is included in files that don't include perl.h, and hence not
    utf8.h.  We can't rely therefore on the ASCII/EBCDIC conversion
    macros being available to us.  The best way to cope is to use the native
    ctype functions.  Most, but not all, of the macros in this commit
    currently resolve to use those native ones, but a future commit will
    change that.

M       handy.h

commit 76fc6cff2bd38519ea5db69d874ed4ce8cc82a82
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:35:12 2013 -0700

    handy.h: Simplify some macro definitions
    
    Now, only one of the macros relies on magic numbers (isPRINT), leading
    to clearer definitions.

M       handy.h

commit cbc7630d3c61e1a2938b02f2aa6b84c697ec897a
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 21 13:26:49 2013 -0700

    handy.h: Combine macros that are same in ASCII, EBCDIC
    
    These 4 macros can have the same RHS for their ASCII and EBCDIC
    versions, so no need to duplicate their definitions
    
    This also enables the EBCDIC versions to not have undefined expansions
    when compiling without perl.h

M       handy.h

commit 1589bfea03a9f7a99a51dad52eac85ff770f15d5
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 10:39:48 2013 -0700

    Deprecate NATIVE_TO_NEED and ASCII_TO_NEED
    
    These macros are no longer called in the Perl core.  This commit turns
    them into functions so that they can use gcc's deprecation facility.
    
    I believe these were defective right from the beginning, and I have
    struggled to understand what's going on.  From the name, it appears
    NATIVE_TO_NEED takes a native byte and turns it into UTF-8 if the
    appropriate parameter indicates that.  But that is impossible to do
    correctly from that API, as for variant characters, it needs to return
    two bytes.  It could only work correctly if ch is an I8 byte, which
    isn't native, and hence the name would be wrong.
    
    Similar arguments for ASCII_TO_NEED.
    
    The function S_append_utf8_from_native_byte(const U8 byte, U8** dest)
    does what I think NATIVE_TO_NEED intended.
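
    The API problem can be seen in a small sketch.  It assumes an ASCII
    platform where the native byte is a Latin-1 code point, and
    append_utf8_from_byte is a hypothetical analogue of the function named
    above, not its implementation.  A variant byte grows to two bytes under
    UTF-8, so an interface that returns a single byte cannot represent the
    result, whereas one that appends to a destination buffer can.

```python
def append_utf8_from_byte(byte: int, dest: bytearray) -> int:
    """Append the UTF-8 form of one Latin-1 byte; return bytes written."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    encoded = chr(byte).encode("utf-8")  # 1 byte if invariant, else 2
    dest.extend(encoded)
    return len(encoded)


buf = bytearray()
print(append_utf8_from_byte(0x41, buf))  # invariant: 'A'
print(append_utf8_from_byte(0xE9, buf))  # variant: e with acute
print(buf.hex())
```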

M       embed.fnc
M       mathoms.c
M       proto.h
M       toke.c
M       utf8.h
M       utfebcdic.h

commit 1400e8248f416109237db9410cfff34864f92adc
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 10:26:43 2013 -0700

    Remove remaining calls of NATIVE_TO_NEED
    
    These calls are just copying the input to the output byte by byte.
    There is no need to worry about UTF-8 or not, as the output is just an
    exact copy of the input

M       toke.c

commit 053c9978f6ed4455ade513b3c8fbb596768de7f3
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 08:12:15 2013 -0700

    toke.c: Remove some NATIVE_TO_NEED calls
    
    I believe NATIVE_TO_NEED is defective, and will remove it in a future
    commit.  But, just in case I'm wrong, I'm doing it in small steps so
    bisects will show the culprit.  This removes the calls to it where the
    parameter is clearly invariant under UTF-8 and UTF-EBCDIC, and so the
    result can't be other than just the parameter.

M       toke.c

commit f093ddfeb4656816696f6bf39223ce8c6f27936d
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 20 08:22:07 2013 -0700

    toke.c: in [A-Za-z] use macros that exclude non-ASCII alphas
    
    This code is attempting to deal with the problem of holes in the ranges
    a-z and A-Z in EBCDIC.  Prior to this patch, it accepted things like A
    WITH GRAVE, etc, which shouldn't have the special processing to deal
    with the holes
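
    The holes can be located with a short sketch.  Python's cp037 codec is
    used here as one example EBCDIC code page (an assumption; other code
    pages may differ), and native_ords is a hypothetical helper.

```python
import string


def native_ords(chars: str, codec: str = "cp037"):
    """Native ordinals of the given characters under an EBCDIC codec."""
    return list(chars.encode(codec))


letters = string.ascii_lowercase
ords = native_ords(letters)

# Pairs of consecutive letters whose native ordinals are not adjacent.
gaps = [
    (letters[i], letters[i + 1])
    for i in range(len(letters) - 1)
    if ords[i + 1] - ords[i] != 1
]
print(gaps)

# Characters whose native ordinals fall strictly between 'i' and 'j'.
i_ord, j_ord = native_ords("ij")
print(bytes(range(i_ord + 1, j_ord)).decode("cp037"))
```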

M       toke.c

commit 6ac562dff8c272c1602e9137a44faf4933f42c8a
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 19 15:13:19 2013 -0700

    Use real illegal UTF-8 byte
    
    The code here was wrong in assuming that \xFF is not legal in UTF-8
    encoded strings.  It currently doesn't work due to a bug, but that may
    eventually be fixed: [perl #116867].  The comments are also wrong that
    all bytes are legal in UTF-EBCDIC.
    
    It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
    (C2, C3, and C4 as well in UTF-EBCDIC), as they would be the start byte
    of an illegal overlong sequence.
    
    This creates a #define for an illegal byte using one of the real illegal
    ones, and changes the code to use that.
    
    No test is included due to #116867.
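
    The claim about C0 and C1 can be checked against a strict UTF-8
    decoder.  The sketch below uses Python's built-in codec, which is
    stricter than Perl's extended UTF-8 and says nothing about \xFF or
    about UTF-EBCDIC; is_well_formed_utf8 is a hypothetical helper.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """True if data decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Lead bytes actually produced for every two-byte code point.
lead_bytes = {chr(cp).encode("utf-8")[0] for cp in range(0x80, 0x800)}
print(hex(min(lead_bytes)), hex(max(lead_bytes)))

# C0 and C1 could only start overlong sequences, which are rejected.
print(is_well_formed_utf8(b"\xc0\x80"), is_well_formed_utf8(b"\xc1\xbf"))
```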

M       op.c
M       toke.c
M       utf8.h

commit b6099d4715263461dfb17691cf68b68841b7e2dc
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 14:00:13 2013 -0700

    toke.c: Don't remap \N{} for EBCDIC
    
    Everything is now in native.

M       toke.c

commit 4b4b876aeb048e7b1d994b986b31bf243194c411
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 13:50:45 2013 -0700

    toke.c: Remove remapping for EBCDIC for octal
    
    The code prior to this commit converted something like \04 into its
    EBCDIC equivalent only in double-quoted strings.  This was not done in
    patterns, and so gave inconsistent results.  The correct thing to do
    should be to do the native thing, what someone who works on a platform
    would think \04 does.  Platform independent characters are available
    through \N{}, either by name or by U+.
    
    The comment changed by this was wrong, as in some cases it was native,
    and in some cases Unicode.

M       toke.c

commit 4d9d37e40ca28800eda029c753efe1c02f738685
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 13:47:13 2013 -0700

    Remove EBCDIC remappings
    
    Now that the tables are stored in native format, we shouldn't be doing
    remapping.
    
    Note that this assumes that the Latin1 casing tables are stored in
    native order; not all of this has been done yet.

M       handy.h
M       perly.c
M       pp.c
M       regcomp.c
M       regexec.c
M       utf8.c

commit 139f6539cb379b0d640199b50f198255b6e07ccc
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 12:46:05 2013 -0700

    Add and use macro to return EBCDIC
    
    The conversion from UTF-8 to code point should generally be to the
    native code point.  This adds a macro to do that, and converts the
    core calls to the existing macro to use the new one instead.  The old
    macro is retained for possible backwards compatibility, though it
    probably should be deprecated.

M       handy.h
M       pp.c
M       regcomp.c
M       regexec.c
M       toke.c
M       utf8.c
M       utf8.h

commit 7b989e3d99b4c06d544cb8e3cdf155e4ff5655e6
Author: Karl Williamson <[email protected]>
Date:   Sun Feb 17 09:18:06 2013 -0700

    charnames: fix nit in comment

M       lib/_charnames.pm

commit e80f61fa088e245507330036938f1e3dc917521b
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 16 11:05:44 2013 -0700

    charnames: Make work in EBCDIC
    
    Now that mktables generates native tables, the only thing that was
    needed was to make U+ mean Unicode instead of native.

M       lib/_charnames.pm
M       lib/charnames.pm

commit fe8dd5c82432caf44b937670f7fd07ab48328a7d
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 16 09:35:56 2013 -0700

    Unicode::UCD: Work on non-ASCII platforms
    
    Now that mktables generates native tables, it is a fairly simple matter
    to get Unicode::UCD to work on those platforms.

M       lib/Unicode/UCD.pm

commit 094ca70468db79857f74249346cbafdde7a7b30a
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 27 17:01:24 2013 -0600

    Unicode::UCD: Typo in comment

M       lib/Unicode/UCD.pm

commit 04af8d3a338c2ed62e0fd5d2b9047a340f326b1b
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 14 22:16:38 2013 -0700

    mktables: Generate native code-point tables
    
    The output tables for mktables are now in the platform's native
    character set.  This means there is no change for ASCII platforms, but
    is a change for EBCDIC ones.
    
    Since we currently don't have any EBCDIC test platforms, I tested this
    by faking it out to generate EBCDIC data, and then eye-balled the
    results.
    
    Code that didn't realize there was a potential difference between EBCDIC
    and non-EBCDIC platforms will now start to work; code that tried to do
    the right thing under these circumstances will no longer work.  Fixing
    that comes in later commits.

M       lib/unicore/mktables

commit 795da5fc07bd14ca9478b5a7ed5eee2c9b0697c6
Author: Karl Williamson <[email protected]>
Date:   Tue Apr 2 21:36:28 2013 -0600

    mktables: Move table creation code
    
    This code is moved later in the process.  This is in preparation for
    mktables generating tables in the native character set.  By moving it to
    later, the translation to native has already been done, and special
    coding need not be done.
    
    This also caught 7 code points that were omitted somehow in the previous
    logic

M       lib/unicore/mktables

commit 81f55bf310221da9de8304a86c68e25b92c6d141
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 14 10:50:00 2013 -0700

    Fix some EBCDIC problems
    
    These spots have native code points, so should be using the macros for
    native code points, instead of Unicode ones.

M       regcomp.c
M       sv.c
M       toke.c

commit e2e171427e53d82658c16a53cc62fad54fcbe820
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 13 22:10:19 2013 -0700

    Remove unnecessary temp variable in converting to UTF-8
    
    These areas of code included a temporary that is unnecessary.

M       inline.h
M       regcomp.c
M       sv.c

commit 91af0e5f5e9ea8afff5bcbcd4dff2e4fa386261a
Author: Karl Williamson <[email protected]>
Date:   Wed Feb 13 22:00:55 2013 -0700

    utf8.h: Correct macros for EBCDIC
    
    These macros were incorrect for EBCDIC.  The 3 step process given in
    utfebcdic.h wasn't being followed.

M       utf8.h

commit cf0b3d53d1717fa7ab8bd278b57e41ea838fa1b6
Author: Karl Williamson <[email protected]>
Date:   Sat Feb 9 21:23:30 2013 -0700

    Extract common code to an inline function
    
    This fairly short paradigm is repeated in several places; a later commit
    will improve it.

M       embed.fnc
M       embed.h
M       inline.h
M       pp_pack.c
M       proto.h
M       sv.c
M       toke.c
M       utf8.c

commit 0e14e3cfce13253610f183a299a0df363ac6be77
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 21:35:57 2013 -0700

    Don't use EBCDIC macro for a C language escape
    
    C recognizes '\a' (for BEL); just use that instead of a look-up.
    
    regen/unicode_constants.pl could be used to generate the character for
    the ESC (set in surrounding code), but I didn't do that because of
    potential bootstrapping problems when porting to an EBCDIC platform
    without a working perl.  (The other characters generated in that .pl are
    less likely to cause problems when compiling perl.)

M       regcomp.c
M       toke.c

commit 3596b4803dc64be73c7e068a5f9a106f958454bf
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 19:53:38 2013 -0700

    Use byte domain EBCDIC/LATIN1 macro where appropriate
    
    The macros like NATIVE_TO_UNI will work on EBCDIC, but operate on the
    whole Unicode range.  In the locations affected by this commit, it is
    known that the domain is limited to a single byte, so the simpler ones
    whose names contain LATIN1 may be used.
    
    On ASCII platforms, all the macros are null, so there is no effective
    change.

M       handy.h
M       regcomp.c
M       utf8.c

commit 182b7aafa93d891e9675982439c1a119854c7d50
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 14:31:09 2013 -0700

    Use new clearer named #defines
    
    This converts several areas of code to use the more clearly named macros
    introduced in a recent commit

M       op.c
M       toke.c
M       utf8.c
M       utf8.h
M       utfebcdic.h

commit 83ece809008695e5b97a8ccd2383fef91ace4e5f
Author: Karl Williamson <[email protected]>
Date:   Thu Feb 7 13:52:31 2013 -0700

    utf8.h, utfebcdic.h: Create less confusing #defines
    
    This commit creates macros whose names mean something to me, and I don't
    find confusing.  The older names are retained for backwards
    compatibility.  Future commits will fix bugs I introduced from
    misunderstanding the meaning of the older names.
    
    The older names are now #defined in terms of the newer ones, and moved
    so that they are only defined once, valid for both ASCII and EBCDIC
    platforms.

M       utf8.h
M       utfebcdic.h

commit 6b337ea1833ad3390c23cccc297a6af32dcf057e
Author: Karl Williamson <[email protected]>
Date:   Mon Feb 4 14:22:02 2013 -0700

    pp_ctl.c: Use isCNTRL instead of hard-coded mask
    
    This is clearer and portable to EBCDIC.
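
    Why a bit-pattern test is not portable can be sketched as follows.  The
    exact mask that pp_ctl.c used is not shown in this message, so the
    range test below is only an assumed example of an ASCII-specific
    shortcut.  cp037 again stands in for an EBCDIC code page, and both
    function names are hypothetical.

```python
import unicodedata


def is_cntrl_by_range(native_byte: int) -> bool:
    """ASCII-specific shortcut (assumed): controls are 0x00..0x1F."""
    return native_byte < 0x20


def is_cntrl_portable(native_byte: int, codec: str) -> bool:
    """Classify by what the byte means in the given character set."""
    ch = bytes([native_byte]).decode(codec)
    return unicodedata.category(ch) == "Cc"


# ESC has a different native ordinal under cp037 than under ASCII.
esc_native = "\x1b".encode("cp037")[0]
print(hex(esc_native))
print(is_cntrl_by_range(esc_native), is_cntrl_portable(esc_native, "cp037"))
```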

M       pp_ctl.c

commit c333e568d5d38008bf520917b8ece84582ecd7c3
Author: Karl Williamson <[email protected]>
Date:   Tue Feb 26 13:51:05 2013 -0700

    utf8.c: is_utf8_char_slow() should use native length
    
    What is passed is the actual length of the native utf8 character.  What
    this was calculating was the length it would be if it were a Unicode
    character, and then compares, apples to oranges.

M       utf8.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
