In perl.git, the branch smoke-me/khw-encode has been created

<http://perl5.git.perl.org/perl.git/commitdiff/ae65d1bbd033b4b19dcb9cc3d3acc48d166deffc?hp=0000000000000000000000000000000000000000>

        at  ae65d1bbd033b4b19dcb9cc3d3acc48d166deffc (commit)

- Log -----------------------------------------------------------------
commit ae65d1bbd033b4b19dcb9cc3d3acc48d166deffc
Author: Karl Williamson <[email protected]>
Date:   Fri Nov 25 08:35:30 2016 -0700

    smoke

M       utf8.c

commit 41635c05b0c8e8314c775297e913e726c5b159f7
Author: Karl Williamson <[email protected]>
Date:   Sat Oct 29 08:47:07 2016 -0600

    XXX don't push lex hints

M       t/op/lex.t

commit 0bc9a0cc931976ca6e4054502ee5d9fcb30c5506
Author: Karl Williamson <[email protected]>
Date:   Wed Nov 23 13:27:43 2016 -0700

    Add isFOO_utf8_safe() macros
    
    The original API assumed that we could keep malformed UTF-8 out by use
    of gatekeepers, but that is currently impossible.  This commit adds
    "safe" versions to macros for determining if a UTF-8 sequence represents
    an alphabetic, a digit, etc.  Each new macro has an extra parameter
    pointing to the end of the sequence, so that looking beyond the input
    string can be avoided.
    
    The macros aren't currently completely safe, as they don't test that
    there is at least a single valid character in the input, except by an
    assertion in DEBUGGING builds.  This is because typically they are
    called in code that makes that assumption, and frequently tests the
    current character for one thing or another.  While debugging this and
    future commits, the assertion showed some current errors where that
    assumption turned out to be false.
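
The idea of passing an end pointer can be sketched in standalone C as follows. This is only an illustration under stated assumptions, not the macros added to handy.h: the names `sketch_utf8_len` and `sketch_is_digit_utf8_safe` are hypothetical, and the digit test covers just ASCII digits and U+0660..U+0669. The `assert` stands in for the DEBUGGING-only check described above; with `NDEBUG` defined it disappears, and an empty input would then be read out of bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: expected length of a UTF-8 sequence judged from
 * its first byte, or 0 if that byte cannot start a sequence. */
static size_t sketch_utf8_len(unsigned char first)
{
    if (first < 0x80) return 1;
    if (first >= 0xC2 && first <= 0xDF) return 2;
    if (first >= 0xE0 && first <= 0xEF) return 3;
    if (first >= 0xF0 && first <= 0xF4) return 4;
    return 0;
}

/* Hypothetical "safe" classifier: true if the character starting at s is
 * an ASCII digit or one of U+0660..U+0669.  The extra parameter e points
 * just past the end of the input, so no byte at or beyond e is read. */
static bool sketch_is_digit_utf8_safe(const unsigned char *s,
                                      const unsigned char *e)
{
    size_t len;

    assert(s < e);  /* like the DEBUGGING-only assertion: need >= 1 byte */

    len = sketch_utf8_len(*s);
    if (len == 0 || (size_t)(e - s) < len)
        return false;           /* bad start byte, or sequence truncated */

    if (len == 1)
        return *s >= '0' && *s <= '9';
    if (len == 2)               /* U+0660..U+0669 encode as D9 A0..D9 A9 */
        return s[0] == 0xD9 && s[1] >= 0xA0 && s[1] <= 0xA9;
    return false;
}
```

A caller would pass the buffer end, for example `sketch_is_digit_utf8_safe(p, buf + buflen)`, so a sequence cut off at the end of the buffer yields false rather than a read past the end.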

M       embed.fnc
M       embed.h
M       handy.h
M       proto.h
M       utf8.c
M       utf8.h

commit 351c32af4e2c3c6984c6734663f9dfd9927ab538
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 20:17:22 2016 -0700

    customized

M       t/porting/customized.dat

commit ba187e343bccf3f08edd2cad46e9b68758369182
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 15 09:09:07 2016 -0600

    XXX incomplete: Add sv_utf8_decode_flags

M       embed.fnc
M       embed.h
M       proto.h
M       sv.c
M       sv.h

commit dc56f3283e43db0b2747b5aa9a31087625cd3593
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 1 22:23:47 2016 -0600

    customized

M       t/porting/customized.dat

commit bb0779e0ae176efb1731b8ad2284e79561d707b4
Author: Karl Williamson <[email protected]>
Date:   Tue Oct 18 14:09:43 2016 -0600

    pali

M       cpan/Encode/Encode.xs

commit a79801dbf44bc3e48a791909de52086c37b29e4c
Author: Karl Williamson <[email protected]>
Date:   Wed Oct 12 20:33:29 2016 -0600

    later

M       utf8.h

commit d93c237cd842553d93fea06165b6708a35e6e036
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 14 22:40:23 2016 -0600

    customized

M       t/porting/customized.dat

commit b629b2f8bb07e1b64883c8f9474c66743b75e48d
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:20:52 2016 -0600

    Use core REPLACEMENT CHARACTER definition
    
    This allows the code to now work on EBCDIC as well.

M       cpan/Encode/Encode/encode.h

commit 35a2f0566e61eff4e8df8019a65780b7e599b8bd
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:16:00 2016 -0600

    XXX commit msg: Encode.xs: Rmv unused function

M       cpan/Encode/Encode.xs

commit 2bea84d3ebb553f3767bffc50d5f8172d5b974d1
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:39 2016 -0600

    Encode.xs: white-space, comment only
    
    This removes some trailing white space, indents various blocks
    properly according to perl standards, adds a comment, and fixes the
    grammar in another.

M       cpan/Encode/Encode.xs

commit acf53d426a9c6651b38aa51f7fa2a7642bed9dc2
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:06 2016 -0600

    XXX maybe more in commit msg: Speed up Encode UTF-8 validation checking
    
    This replaces the current scheme for checking UTF-8 validity by one
    in which normal processing doesn't require having to decode the UTF-8
    into code points.  The copying of characters individually from the input
    to the output is changed to be a single operation for each entire span
    of valid input at once.
    
    Thus in the normal case, what ends up happening is a tight loop to
    check the validity, and then a memmove of the entire input to the
    output, then return.
    
    If an error is found, it copies all the valid input before the error,
    then handles the character in error, then positions to the next input
    position, and repeats the whole process starting from there.
    
    It uses the functionality available from the Perl 5 core to look at
    just the bytes that comprise the UTF-8 to make the determination,
    converting to code points only those that are defective somehow in
    order to display them in warnings and error messages.
    
    Thus, this does not need to know about the intricacies of UTF-8
    malformations, relying on the core to handle this.
    
    This cannot be pushed to CPAN until Devel::PPPort has been updated to
    implement all the functions now needed.
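
The validate-then-copy loop described in this message can be sketched in standalone C as follows. It is a sketch under assumptions, not the code in Encode.xs: `sketch_valid_char_len` is a hypothetical, simplified stand-in for the core's validity check (it does not reject surrogates or every overlong form), and replacing each offending byte with U+FFFD is just one possible way of "handling the character in error".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified validity check: returns the byte length of the
 * well-formed UTF-8 character at s (never reading at or past e), or 0 if
 * the bytes there are malformed or truncated.  Looks only at bytes; it
 * never decodes to a code point. */
static size_t sketch_valid_char_len(const unsigned char *s,
                                    const unsigned char *e)
{
    size_t len, i;

    if (*s < 0x80) return 1;
    else if (*s >= 0xC2 && *s <= 0xDF) len = 2;
    else if (*s >= 0xE0 && *s <= 0xEF) len = 3;
    else if (*s >= 0xF0 && *s <= 0xF4) len = 4;
    else return 0;

    if ((size_t)(e - s) < len) return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return len;
}

/* Copies src to dst one valid span at a time.  Each malformed byte is
 * replaced by U+FFFD (an assumption made for this sketch).  dst must have
 * room for 3 * srclen bytes.  Returns the number of bytes written. */
static size_t sketch_copy_valid_utf8(unsigned char *dst,
                                     const unsigned char *src,
                                     size_t srclen)
{
    static const unsigned char repl[3] = { 0xEF, 0xBF, 0xBD };
    const unsigned char *s = src;
    const unsigned char *e = src + srclen;
    unsigned char *d = dst;

    assert(dst != NULL && src != NULL);

    while (s < e) {
        const unsigned char *span = s;
        size_t len;

        /* Tight loop: step over valid characters without decoding. */
        while (s < e && (len = sketch_valid_char_len(s, e)) != 0)
            s += len;

        /* One copy for the entire valid span. */
        memmove(d, span, (size_t)(s - span));
        d += s - span;

        if (s < e) {            /* stopped on an error: handle it ...  */
            memcpy(d, repl, sizeof repl);
            d += sizeof repl;
            s++;                /* ... and resume at the next position */
        }
    }
    return (size_t)(d - dst);
}
```

For fully valid input the outer loop runs once: one validation pass, one `memmove`, then return, which is the normal case the message describes.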

M       cpan/Encode/Encode.pm
M       cpan/Encode/Encode.xs

commit 0cafc4d1987dd12e8d8a7b7d52ca1500004a8705
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 05:03:37 2016 -0600

    XXX For EBCDIC debug

M       utf8.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
