In perl.git, the branch smoke-me/khw-encode has been created

<http://perl5.git.perl.org/perl.git/commitdiff/5a04c69d9c49225341bfb6fe178d64b1404b5a99?hp=0000000000000000000000000000000000000000>

        at  5a04c69d9c49225341bfb6fe178d64b1404b5a99 (commit)

- Log -----------------------------------------------------------------
commit 5a04c69d9c49225341bfb6fe178d64b1404b5a99
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 20:30:56 2016 -0700

    APItest/t/utf8.t: White space only
    
    This indents the new block formed by the previous commit.  However,
    since the indentation is getting too much, it also changes the indents
    for all the nested for loops to 2 spaces to allow room on the line.

M       ext/XS-APItest/t/utf8.t

commit 034a668e4d2d34198394d306f06b341cae1bb84c
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 20:17:22 2016 -0700

    customized

M       t/porting/customized.dat

commit 6a583cc1f485ad8d4219ed57bdd5ff1eb751c4ac
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 17:47:35 2016 -0700

    Split diagnostics for two UTF-8 malformations
    
    Some UTF-8 sequences may have multiple malformations.  Commit
    2b5e7bc2e60b4c4b5d87aa66e066363d9dce7930 tried to make sure that all
    possible ones are raised, instead of abandoning searching after one is
    found.  Since then, I have realized that there is yet another case of
    two malformations in which only one or the other is returned.
    
    An input buffer may be too short to fully express the code point it
    purports to.  This can be determined by the first byte of the UTF-8
    sequence indicating a longer sequence is required than the space
    available.  But that shortened sequence can also contain the premature
    start of another character before the point where the input runs out.
    This commit causes both malformations to be raised, instead of the
    previous behavior of noting just one.
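
    The two conditions described above can be illustrated with a minimal
    sketch.  This is not the code in utf8.c; the flag names and the
    functions expected_len() and check_malformations() are hypothetical,
    and only standard 1-4 byte UTF-8 start bytes are assumed.  It shows
    both problems being reported together for one input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names for illustration; not the ones used by utf8.c. */
enum { MALF_SHORT = 1, MALF_NON_CONTINUATION = 2 };

/* Sequence length promised by the first byte, or 0 if it is not a
 * start byte of a 1-4 byte sequence. */
static size_t expected_len(unsigned char first)
{
    if (first < 0x80)           return 1;
    if ((first & 0xE0) == 0xC0) return 2;
    if ((first & 0xF0) == 0xE0) return 3;
    if ((first & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Examine the sequence starting at s, of which only avail bytes exist.
 * Returns the OR of every malformation found, rather than stopping at
 * the first one. */
static unsigned check_malformations(const unsigned char *s, size_t avail)
{
    unsigned flags = 0;
    size_t need, limit, i;

    assert(s != NULL);
    if (avail == 0)
        return 0;               /* nothing to examine in this sketch */

    need = expected_len(s[0]);
    if (need == 0)
        return 0;               /* other malformations are out of scope */

    /* The first byte asks for more bytes than the buffer holds. */
    if (need > avail)
        flags |= MALF_SHORT;

    /* Within the bytes that are present, a non-continuation byte means
     * another character begins prematurely. */
    limit = need < avail ? need : avail;
    for (i = 1; i < limit; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            flags |= MALF_NON_CONTINUATION;
            break;
        }
    }
    return flags;
}
```

    For example, the two bytes F0 41 promise a four-byte character but
    supply only two, and the second byte already starts a new character,
    so both flags are set.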

M       ext/XS-APItest/t/utf8.t
M       t/op/utf8decode.t
M       utf8.c

commit f1cb3da01473dcbfd383951a3b2088e81df7e1a8
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 18:14:45 2016 -0700

    APItest/t/utf8.t: Partially refactor to use table data
    
    This removes kludgy code that was trying, given a partial
    character, to determine if there are enough bytes present to guarantee that
    the whole character must belong to a class of characters or not.  Now
    the necessary length to make that determination has instead manually
    been placed in a table, so it can be looked up.  In doing so, I
    corrected one length that was failing on EBCDIC.

M       ext/XS-APItest/t/utf8.t

commit 5bf2125c209a4fdec31f7548ab5be64a39f89884
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 18:07:31 2016 -0700

    APItest/t/utf8.t: Fix test
    
    It turns out that this test has two malformations, and should only have
    one; a future commit will remove the masking of the 2nd one.

M       ext/XS-APItest/t/utf8.t

commit ad2a68e6f74115e1f81a0b17682908f38e87f0ec
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 18:01:21 2016 -0700

    APItest/t/utf8.t: Comments only

M       ext/XS-APItest/t/utf8.t

commit 16b826e9d1f06e8c0514aaccfa10d2b43d7a6f4c
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 17:55:10 2016 -0700

    APItest/t/utf8.t: Add some indentation to diagnostics
    
    This is so they don't interrupt reading the output when there are
    errors.

M       ext/XS-APItest/t/utf8.t

commit 9ea5554d26bcd1c9af8f6d0f1b3c8b36fbf1ee3c
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 22 13:15:18 2016 -0700

    utf8.c: Clarify warning message.
    
    This warning was changed recently in the 5.25 series, and has not been
    in a stable release.

M       ext/XS-APItest/t/utf8.t
M       lib/utf8.t
M       t/op/lex.t
M       t/op/utf8decode.t
M       utf8.c

commit 81e259f729791f9dbb0cba210a6eb344f2e6a7e5
Author: Karl Williamson <[email protected]>
Date:   Mon Nov 21 14:59:47 2016 -0700

    APItest/t/utf8.t: Simplify expression slightly

M       ext/XS-APItest/t/utf8.t

commit 25a48de46d7d077d1bac4a963d07f236aa029fd4
Author: Karl Williamson <[email protected]>
Date:   Sun Nov 20 07:56:40 2016 -0700

    APItest/t/handy.t: Output details if test fails
    
    There should be no warnings generated, but if there are, we want to see
    what they were.

M       ext/XS-APItest/t/handy.t

commit 0c7d737ffd8cc8becbdbaf94d2627240502b5b77
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 15 09:09:07 2016 -0600

    XXX incomplete: Add sv_utf8_decode_flags

M       embed.fnc
M       embed.h
M       proto.h
M       sv.c
M       sv.h

commit 7339ac10881a902c0b7777b1801fe4d7699fc9a7
Author: Karl Williamson <[email protected]>
Date:   Tue Nov 1 22:23:47 2016 -0600

    customized

M       t/porting/customized.dat

commit 79203adb2b13741585378a97d2ccf02ba87051e4
Author: Karl Williamson <[email protected]>
Date:   Tue Oct 18 14:09:43 2016 -0600

    pali

M       cpan/Encode/Encode.xs

commit 7e649a0a0c000ae33fed56865a02127e93b2ea6d
Author: Karl Williamson <[email protected]>
Date:   Wed Oct 12 20:33:29 2016 -0600

    later

M       utf8.h

commit c89fc8c252e6ad9351c83e6bb2b18f6ba57a7295
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 14 22:40:23 2016 -0600

    customized

M       t/porting/customized.dat

commit 69015b32dfa88911c82c2d5e948cab2fe82673d6
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:20:52 2016 -0600

    Use core REPLACEMENT CHARACTER definition
    
    This allows the code to now work on EBCDIC as well.

M       cpan/Encode/Encode/encode.h

commit 12005725e31c7105bda2e86739c0cbd750889d4d
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:16:00 2016 -0600

    XXX commit msg: Encode.xs: Rmv unused function

M       cpan/Encode/Encode.xs

commit 1f5da59b7986c06d68ebffde0f49e04cef0fe522
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:39 2016 -0600

    Encode.xs: white-space, comment only
    
    This removes some trailing white space, indents various blocks
    properly according to perl standards, adds a comment, and fixes the
    grammar in another.

M       cpan/Encode/Encode.xs

commit f1fe0f20838dc854ba99f295d66783fe56fe5232
Author: Karl Williamson <[email protected]>
Date:   Thu Sep 1 12:12:06 2016 -0600

    XXX maybe more in commit msg: Speed up Encode UTF-8 validation checking
    
    This replaces the current scheme for checking UTF-8 validity by one
    in which normal processing doesn't require having to decode the UTF-8
    into code points.  The copying of characters individually from the input
    to the output is changed to be a single operation for each entire span
    of valid input at once.
    
    Thus in the normal case, what ends up happening is a tight loop to
    check the validity, and then a memmove of the entire input to the
    output, then return.
    
    If an error is found, it copies all the valid input before the error,
    then handles the character in error, then positions to the next input
    position, and repeats the whole process starting from there.
    
    It uses the functionality available from the Perl 5 core to look at
    just the bytes that comprise the UTF-8 to make the determination,
    converting to code points only those that are defective somehow, in
    order to display them in warnings and error messages.
    
    Thus, this does not need to know about the intricacies of UTF-8
    malformations, relying on the core to handle this.
    
    This cannot be pushed to CPAN until Devel::PPPort has been updated to
    implement all the functions now needed.
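
    The span-at-a-time scheme described above can be sketched as follows.
    This is an illustration only, not the code in Encode.xs: the helper
    names char_len(), valid_span() and copy_valid_spans() are
    hypothetical, the validity check is a hand-written strict UTF-8 test
    standing in for the core's functions, and substituting U+FFFD for
    each offending byte is an assumed error policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of one well-formed UTF-8 character at s,
 * or 0 if the bytes there are malformed or truncated.  Only the bytes
 * are inspected; no code point is computed. */
static size_t char_len(const unsigned char *s, size_t avail)
{
    size_t need, i;

    if (avail == 0) return 0;
    if (s[0] < 0x80) return 1;
    if (s[0] >= 0xC2 && s[0] <= 0xDF)      need = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF) need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) need = 4;
    else return 0;                      /* stray continuation or bad start */

    if (avail < need) return 0;         /* input too short */
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;

    if (s[0] == 0xE0 && s[1] < 0xA0) return 0;  /* overlong */
    if (s[0] == 0xED && s[1] > 0x9F) return 0;  /* surrogate */
    if (s[0] == 0xF0 && s[1] < 0x90) return 0;  /* overlong */
    if (s[0] == 0xF4 && s[1] > 0x8F) return 0;  /* above U+10FFFF */
    return need;
}

/* Tight loop: length of the leading run of valid characters. */
static size_t valid_span(const unsigned char *s, size_t len)
{
    size_t pos = 0, n;

    while (pos < len && (n = char_len(s + pos, len - pos)) != 0)
        pos += n;
    return pos;
}

/* Copy in to out one valid span at a time.  On an error, emit U+FFFD,
 * skip one byte, and resume from there.  out must not overlap in and
 * must have room for 3 * len bytes.  Returns the bytes written. */
static size_t copy_valid_spans(const unsigned char *in, size_t len,
                               unsigned char *out)
{
    static const unsigned char repl[3] = { 0xEF, 0xBF, 0xBD };
    size_t pos = 0, written = 0;

    assert(in != NULL && out != NULL);
    while (pos < len) {
        size_t span = valid_span(in + pos, len - pos);

        memmove(out + written, in + pos, span);  /* whole span at once */
        written += span;
        pos += span;

        if (pos < len) {                /* stopped on a defective byte */
            memcpy(out + written, repl, sizeof repl);
            written += sizeof repl;
            pos++;
        }
    }
    return written;
}
```

    With entirely valid input, the loop body runs once: one validity
    scan, one memmove, then return.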

M       cpan/Encode/Encode.pm
M       cpan/Encode/Encode.xs

commit e3000d7e9ad6eb78cbe90d1b8ddb53cc3e4ab41e
Author: Karl Williamson <[email protected]>
Date:   Fri Oct 28 05:03:37 2016 -0600

    XXX For EBCDIC debug

M       utf8.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
