In perl.git, the branch smoke-me/khw-encode has been created

<http://perl5.git.perl.org/perl.git/commitdiff/2cb6bc585363b6864a47bf98c7e914eb75c14bcc?hp=0000000000000000000000000000000000000000>

        at  2cb6bc585363b6864a47bf98c7e914eb75c14bcc (commit)

- Log -----------------------------------------------------------------
commit 2cb6bc585363b6864a47bf98c7e914eb75c14bcc
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 1 22:12:51 2016 -0600

    Fix wrong UTF-8 overflow error on 32-bit platforms
    
    Commit 2b5e7bc2e60b4c4b5d87aa66e066363d9dce7930 changed the algorithm
    for detecting overflow during decoding UTF-8 into code points.  However,
    on 32-bit platforms, this change caused it to claim some things overflow
    that really don't.  All such are overlong malformations, which are
    normally forbidden, but not necessarily.  This commit fixes that.

M       embed.fnc
M       embed.h
M       ext/XS-APItest/t/utf8.t
M       proto.h
M       utf8.c
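
To illustrate the kind of problem described in the commit message above, here is a minimal sketch in C. It is not the code from utf8.c. It assumes a 32-bit code point type and a hypothetical helper, decode_seq32(), that is given the full length of one multi-byte sequence (up to 7 bytes, as in Perl's extended UTF-8). The point it shows is that overflow is judged from the accumulated value rather than from the sequence length, so an overlong 7-byte sequence whose payload is small is not reported as overflowing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result codes for this sketch only; these names are hypothetical. */
enum { DECODE_OK = 0, DECODE_OVERFLOW = 1, DECODE_MALFORMED = 2 };

/* Decode one multi-byte sequence of n bytes (2..7) into a 32-bit value.
 * Overflow is detected just before each shift, so leading zero payload
 * bits (an overlong encoding) never trigger it. */
static int decode_seq32(const unsigned char *s, size_t n, uint32_t *cp)
{
    uint32_t acc;
    size_t i;

    assert(s != NULL && cp != NULL);
    if (n < 2 || n > 7)
        return DECODE_MALFORMED;

    /* Payload bits of the start byte: 5 bits for n == 2 ... 0 bits for n == 7. */
    acc = s[0] & (0xFFu >> (n + 1));

    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return DECODE_MALFORMED;      /* not a continuation byte */
        if (acc > (UINT32_MAX >> 6))
            return DECODE_OVERFLOW;       /* next shift would lose bits */
        acc = (acc << 6) | (uint32_t)(s[i] & 0x3F);
    }
    *cp = acc;
    return DECODE_OK;
}
```

With this approach the 7-byte sequence FE 80 80 80 80 80 81 decodes to code point 1 (an overlong form), while FE 84 80 80 80 80 80 is reported as overflowing 32 bits.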

commit 4c11ad7d7cf31fd4d122698bd90b62bf612c10ac
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 1 22:13:21 2016 -0600

    APItest/t/utf8.t: Correct to uppercase in print
    
    This worked so long as we didn't have hex digits A-F.

M       ext/XS-APItest/t/utf8.t

commit ba7dcf3cef00d829bd5f1dbc2db5200027ac5759
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 05:03:37 2016 -0600

    XXX For EBCDIC debug

M       utf8.c

commit 362722ff71dbaf2c0ec6f5389aa053df609aff8f
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 1 22:23:47 2016 -0600

    customized

M       t/porting/customized.dat

commit e8b873011d23ec4009c544b9bc70d1b5a09b1a8d
Author: Karl Williamson <[email protected]>
Date:   Tue Oct 18 14:09:43 2016 -0600

    pali

M       cpan/Encode/Encode.xs

commit 6e3944ecd4ea46e7d81ab5ed66d69e13b586d52a
Author: Karl Williamson <[email protected]>
Date:   Wed Oct 12 20:33:29 2016 -0600

    later

M       utf8.h

commit 3f9cd74a0a34066976bb28d6a1097a682c4aac1b
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 15 09:09:07 2016 -0600

    XXX incomplete: Add sv_utf8_decode_flags

M       embed.fnc
M       embed.h
M       proto.h
M       sv.c
M       sv.h

commit 0a8882b7edbcd7137308763bebd3a3ac35d50e76
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 14 22:40:23 2016 -0600

    customized

M       t/porting/customized.dat

commit fa29054c6e3c86b70b271078ed32bb63ade306ba
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:20:52 2016 -0600

    Use core REPLACEMENT CHARACTER definition
    
    This allows the code to now work on EBCDIC as well.

M       cpan/Encode/Encode/encode.h

commit 16eb15ffa1708c07601fa10a90e835ca63ec0b50
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:16:00 2016 -0600

    XXX commit msg: Encode.xs: Rmv unused function

M       cpan/Encode/Encode.xs

commit fb0dbe7928ee3f2ffb3293faa9f7282edde92870
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:39 2016 -0600

    Encode.xs: white-space only

M       cpan/Encode/Encode.xs

commit b4c618aec6d0189b98ee325268d3c47950fb7355
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:06 2016 -0600

    XXX maybe more in commit msg: Speed up Encode UTF-8 validation checking
    
    This replaces the current scheme for checking UTF-8 validity by one
    in which normal processing doesn't require having to decode the UTF-8
    into code points.  The copying of characters individually from the input
    to the output is changed to be a single operation for each entire span
    of valid input at once.
    
    Thus in the normal case, what ends up happening is a tight loop to
    check the validity, and then a memmove of the entire input to the
    output, then return.
    
    If an error is found, it copies all the valid input before the error,
    then handles the character in error, then positions to the next input
    position, and repeats the whole process starting from there.
    
    It uses the functionality available from the Perl 5 core to look at
    just the bytes that comprise the UTF-8 to make the determination,
    converting to code points only those that are defective somehow in
    order to display them in warnings and error messages.
    
    Thus, this does not need to know about the intricacies of UTF-8
    malformations, relying on the core to handle this.
    
    This cannot be pushed to CPAN until Devel::PPPort has been updated to
    implement all the functions now needed.

M       cpan/Encode/Encode.pm
M       cpan/Encode/Encode.xs
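
The validate-then-copy scheme described in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in Encode.xs. The helper seq_len() is a hypothetical, simplified stand-in for the core's validity checks (it looks only at the lead byte and continuation bytes, and does not reject overlongs or surrogates). Error handling is reduced to writing U+FFFD and skipping one byte, which is one possible policy among several.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified check: length of the sequence starting at s,
 * or 0 if it is not well-formed or is truncated by end. */
static size_t seq_len(const unsigned char *s, const unsigned char *end)
{
    size_t n, i;

    if (s[0] < 0x80) return 1;
    else if ((s[0] & 0xE0) == 0xC0) n = 2;
    else if ((s[0] & 0xF0) == 0xE0) n = 3;
    else if ((s[0] & 0xF8) == 0xF0) n = 4;
    else return 0;

    if ((size_t)(end - s) < n) return 0;
    for (i = 1; i < n; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return n;
}

/* Copy src to dst one valid span at a time.  Each offending byte is
 * replaced by U+FFFD (an assumption made for this sketch).
 * dst must have room for 3 * len bytes.  Returns the bytes written. */
static size_t copy_valid_spans(unsigned char *dst,
                               const unsigned char *src, size_t len)
{
    static const unsigned char repl[3] = {0xEF, 0xBF, 0xBD};
    const unsigned char *s = src;
    const unsigned char *end = src + len;
    unsigned char *d = dst;

    while (s < end) {
        const unsigned char *span = s;
        size_t n;

        /* Tight loop: step over valid sequences without decoding them. */
        while (s < end && (n = seq_len(s, end)) != 0)
            s += n;

        /* One copy for the whole valid span. */
        memmove(d, span, (size_t)(s - span));
        d += s - span;

        /* Handle the byte in error, then resume after it. */
        if (s < end) {
            memmove(d, repl, sizeof repl);
            d += sizeof repl;
            s++;
        }
    }
    return (size_t)(d - dst);
}
```

For fully valid input the outer loop runs once: one validation pass followed by a single memmove of the entire input.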
-----------------------------------------------------------------------

--
Perl5 Master Repository
