In perl.git, the branch smoke-me/khw-encode has been created

<http://perl5.git.perl.org/perl.git/commitdiff/2b232ed1ce442e9ea50a5b2453cfd53173ca77ee?hp=0000000000000000000000000000000000000000>

        at  2b232ed1ce442e9ea50a5b2453cfd53173ca77ee (commit)

- Log -----------------------------------------------------------------
commit 2b232ed1ce442e9ea50a5b2453cfd53173ca77ee
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 08:46:53 2016 -0600

    XS-APItest/t/utf8.t: Test with longest possible overlong
    
    As part of testing, certain malformations are perturbed to also be
    overlong, to check that the combination is properly handled.  To do
    this, the code takes a test case and calculates an overlong that is
    longer than it.  However, if the test case is already as long as the
    overlong would be, this can't be done, and the test is skipped.  This
    commit now uses a longer overlong than previously (the maximum
    possible) so that fewer tests have to be skipped.

M       ext/XS-APItest/t/utf8.t
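
To illustrate what "an overlong that is longer" means, here is a minimal C sketch (not the Perl code in utf8.t). It encodes a code point in exactly `len` bytes using the classic 2-to-6-byte UTF-8 bit layout, padding with leading zero bits. The function name `encode_with_len` and the 6-byte ceiling are assumptions made for the example; Perl's extended UTF-8 allows longer sequences, and the result is only overlong when `len` exceeds the shortest encoding of the code point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write code point cp as exactly len bytes
 * (2..6) in the classic UTF-8 layout. Returns len on success, or 0 if
 * len is out of range or cp does not fit in that many bytes. */
int encode_with_len(uint32_t cp, int len, unsigned char *out)
{
    int i;
    int bits;

    assert(out != 0);
    if (len < 2 || len > 6)
        return 0;

    /* The start byte carries (7 - len) payload bits; each continuation
     * byte carries 6. */
    bits = (7 - len) + 6 * (len - 1);
    if ((cp >> bits) != 0)
        return 0;

    /* Fill continuation bytes from the right, 6 bits at a time. */
    for (i = len - 1; i > 0; i--) {
        out[i] = (unsigned char)(0x80 | (cp & 0x3F));
        cp >>= 6;
    }

    /* Start byte: len leading 1 bits, a 0 bit, then what is left. */
    out[0] = (unsigned char)(((0xFF << (8 - len)) & 0xFF) | cp);
    return len;
}
```

Under these assumptions, asking for U+0041 in 2 bytes gives the overlong C1 81, and asking for it in 6 bytes gives FC 80 80 80 81 81. A test case that is already 6 bytes long leaves no room for a longer overlong in this layout, which is the situation the commit describes as having to be skipped.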

commit 24b9ea9efb57e390d1662b6fe2a7909acb69cbbd
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 08:44:43 2016 -0600

    XS-APItest/t/utf8.t: White-space only

M       ext/XS-APItest/t/utf8.t

commit ce0c9378fad5fa98af70f1121c198c7f29761c27
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 08:42:38 2016 -0600

    XS-APItest/t/utf8.t: Fix EBCDIC bug
    
    This number needs to be adjusted for EBCDIC platforms

M       ext/XS-APItest/t/utf8.t

commit 2c36d7f40639db7241be5e06fae4daff0e9b769d
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 08:36:56 2016 -0600

    XS-APItest/t/utf8.t: Move a common expression to $var
    
    The maximum byte length of a single code point's UTF-8 representation
    is used in a bunch of places.  Calculate it once.

M       ext/XS-APItest/t/utf8.t

commit d7099f298bc4a657b9bd5aca87419c065176f8f8
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 08:31:09 2016 -0600

    XS-APItest/t/utf8.t: Fix wrong test on EBCDIC
    
    The I8 string doesn't work the same as UTF-8, as it only takes 5 bits
    from each continuation byte instead of 6.

M       ext/XS-APItest/t/utf8.t
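
The 5-bit versus 6-bit difference can be seen in a small C sketch (an illustration only, not code from the test file). It assumes the UTF-EBCDIC I8 layout in which continuation bytes have the form 101xxxxx, against 10xxxxxx in UTF-8. The function name `decode_seq` and its `cont_bits` parameter are hypothetical, and no validation is done.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode one multi-byte sequence of len bytes
 * (2..4), taking cont_bits payload bits from each continuation byte:
 * 6 for UTF-8, 5 for I8. Assumes the input is well formed. */
uint32_t decode_seq(const unsigned char *s, int len, int cont_bits)
{
    uint32_t mask = (1u << cont_bits) - 1;
    uint32_t cp;
    int i;

    assert(s != 0 && len >= 2 && len <= 4);

    /* Payload of the start byte: drop the len leading 1 bits and the
     * 0 bit that follows them. */
    cp = s[0] & (0xFFu >> (len + 1));

    for (i = 1; i < len; i++)
        cp = (cp << cont_bits) | (s[i] & mask);

    return cp;
}
```

With these assumptions, U+0100 is C4 80 in UTF-8 (decoded with `cont_bits` of 6) but C8 A0 in I8 (decoded with `cont_bits` of 5), so a test string built for one scheme does not carry over unchanged to the other.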

commit 885bdc7f97303962bdb4b985c2a6855be5fec51f
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 05:03:37 2016 -0600

    XXX For EBCDIC debug

M       utf8.c

commit db52803e1d0aaa935087812ff9de23097ed06acb
Author: Karl Williamson <[email protected]>
Date:   Tue Oct 18 14:09:43 2016 -0600

    pali

M       cpan/Encode/Encode.xs

commit ee6f151c413ac3f8ac3c057a104362b95a16c0dc
Author: Karl Williamson <[email protected]>
Date:   Wed Oct 12 20:33:29 2016 -0600

    later

M       utf8.h

commit d1d6eb574c621c54c1142fd92c401ba9573d8c78
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 15 09:09:07 2016 -0600

    XXX incomplete: Add sv_utf8_decode_flags

M       embed.fnc
M       embed.h
M       proto.h
M       sv.c
M       sv.h

commit e5af3cab40f1af3cecf217354460706b20c7e8fe
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 14 22:40:23 2016 -0600

    customized

M       t/porting/customized.dat

commit 06a50a8c7aaf2f3a6ac646e62910fc9d9b4a1063
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:20:52 2016 -0600

    Use core REPLACEMENT CHARACTER definition
    
    This allows the code to now work on EBCDIC as well.

M       cpan/Encode/Encode/encode.h

commit 7306f2d8504d76c65a625bfe00c05df066225831
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:16:00 2016 -0600

    XXX commit msg: Encode.xs: Rmv unused function

M       cpan/Encode/Encode.xs

commit 7b3b2fe7348dc3abe96d0e0cfa28aa919a1978fe
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:39 2016 -0600

    Encode.xs: white-space only

M       cpan/Encode/Encode.xs

commit dbef7dbf59c4a73c7db6e66206da544db210d796
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:06 2016 -0600

    XXX maybe more in commit msg: Speed up Encode UTF-8 validation checking
    
    This replaces the current scheme for checking UTF-8 validity by one
    in which normal processing doesn't require having to decode the UTF-8
    into code points.  The copying of characters individually from the input
    to the output is changed to be a single operation for each entire span
    of valid input at once.
    
    Thus in the normal case, what ends up happening is a tight loop to
    check the validity, and then a memmove of the entire input to the
    output, then return.
    
    If an error is found, it copies all the valid input before the error,
    then handles the character in error, then positions to the next input
    position, and repeats the whole process starting from there.
    
    It uses the functionality available from the Perl 5 core to look at
    just the bytes that comprise the UTF-8 to make the determination,
    converting to code points only those that are somehow defective, in
    order to display them in warnings and error messages.
    
    Thus, this does not need to know about the intricacies of UTF-8
    malformations, relying on the core to handle this.
    
    This cannot be pushed to CPAN until Devel::PPPort has been updated to
    implement all the functions now needed.

M       cpan/Encode/Encode.pm
M       cpan/Encode/Encode.xs
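
The validate-then-copy scheme described in the commit message above can be sketched in standalone C as follows. This is not the Encode.xs code: `seq_len`, `valid_prefix`, and `copy_validated` are hypothetical names, the validator is a simplified strict UTF-8 check rather than the Perl core functions, and replacing each bad byte with U+FFFD is just one assumed way of "handling the character in error".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the well-formed sequence at s,
 * or 0 if malformed. Simplified strict UTF-8 (no overlongs, no
 * surrogates, nothing above U+10FFFF). Requires s < end. */
static size_t seq_len(const unsigned char *s, const unsigned char *end)
{
    size_t avail = (size_t)(end - s);
    unsigned char c = s[0];
    size_t len, i;

    if (c < 0x80) return 1;
    else if (c >= 0xC2 && c <= 0xDF) len = 2;
    else if (c >= 0xE0 && c <= 0xEF) len = 3;
    else if (c >= 0xF0 && c <= 0xF4) len = 4;
    else return 0;              /* stray continuation or bad start */

    if (avail < len) return 0;  /* truncated sequence */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;

    if (c == 0xE0 && s[1] < 0xA0) return 0;  /* overlong 3-byte */
    if (c == 0xED && s[1] > 0x9F) return 0;  /* surrogate */
    if (c == 0xF0 && s[1] < 0x90) return 0;  /* overlong 4-byte */
    if (c == 0xF4 && s[1] > 0x8F) return 0;  /* above U+10FFFF */
    return len;
}

/* Tight loop: look only at bytes, never build code points. Returns
 * the length of the longest valid prefix of s[0..n). */
static size_t valid_prefix(const unsigned char *s, size_t n)
{
    const unsigned char *p = s, *end = s + n;

    while (p < end) {
        size_t len = seq_len(p, end);
        if (len == 0) break;
        p += len;
    }
    return (size_t)(p - s);
}

/* Copy src to dst one valid span at a time. On an error, emit U+FFFD,
 * skip one input byte, and resume. dst needs room for 3 * n bytes in
 * the worst case. Returns the number of bytes written. */
size_t copy_validated(unsigned char *dst, const unsigned char *src, size_t n)
{
    static const unsigned char repl[3] = { 0xEF, 0xBF, 0xBD };
    size_t in = 0, out = 0;

    assert(dst != NULL && src != NULL);
    while (in < n) {
        size_t ok = valid_prefix(src + in, n - in);

        memmove(dst + out, src + in, ok);   /* whole span in one go */
        in += ok;
        out += ok;
        if (in < n) {                       /* stopped on an error */
            memmove(dst + out, repl, sizeof repl);
            out += sizeof repl;
            in += 1;
        }
    }
    return out;
}
```

For fully valid input the loop body runs once: one scan, one `memmove`, then return, which is the normal case the commit message describes. Only when `valid_prefix` stops early does the per-error path run.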
-----------------------------------------------------------------------

--
Perl5 Master Repository
