In perl.git, the branch smoke-me/davem/sprintf has been created

<http://perl5.git.perl.org/perl.git/commitdiff/7f755ba08fd207c35cb39986ecac37bf4308d98b?hp=0000000000000000000000000000000000000000>

        at  7f755ba08fd207c35cb39986ecac37bf4308d98b (commit)

- Log -----------------------------------------------------------------
commit 7f755ba08fd207c35cb39986ecac37bf4308d98b
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 15:57:29 2017 +0100

    add some sprintf benchmarks

M       t/perf/benchmarks

commit b6c6629513d29d5ddf1d20dcc00376de7a4c62c5
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 15:12:46 2017 +0100

    Perl_sv_vcatpvfn_flags: rename a label
    
    s/donevalidconversion/done_valid_conversion/
    
    so it's a bit easier to read.

M       sv.c

commit 75e4204fa6634a6f973c991775314359c4f8716f
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 15:07:10 2017 +0100

    sv_vcatpvfn_flags and wrappers: s/svmax/sv_count/
    
    Rename the 'svmax' parameter of
    
        Perl_sv_vcatpvfn_flags(),
        Perl_sv_vcatpvfn(),
        Perl_sv_vsetpvfn(),
    
    to 'sv_count'.
    
    'max' often implies N-1 (e.g. svargs[0]..svargs[svmax]), whereas it's
    actually the number of SV args passed to the functions.

M       embed.fnc
M       proto.h
M       sv.c

commit 38c4a31e2c6113a900131266720a50ae752e3384
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 14:47:11 2017 +0100

    Perl_sv_vcatpvfn_flags: handle mixed utf8 better
    
    Once the output string gets upgraded to utf8 (e.g. due to a utf8 %s
    argument), any remaining appending of plain (non-%) parts of the
    format string becomes very inefficient. It basically creates an
    SV out of the next format chunk, upgrades that SV to utf8, then
    appends the upgraded buffer.
    
    This commit makes it just append the format chunk byte by byte, upgrading
    on the fly if that byte is !NATIVE_BYTE_IS_INVARIANT.
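
    A minimal sketch of the byte-by-byte idea, not the code from sv.c. It
    assumes an ASCII platform, where "invariant" means a byte below 0x80,
    and treats the format chunk as Latin-1. The helper name and its
    signature are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copy `len` Latin-1 bytes from `src` to `dst`,
 * where `dst` is the tail of a buffer that already holds UTF-8 text.
 * Bytes below 0x80 are copied unchanged; bytes 0x80..0xFF are expanded
 * into a two-byte UTF-8 sequence. `dst` must have room for 2 * len
 * bytes. Returns the number of bytes written. */
static size_t append_latin1_as_utf8(unsigned char *dst,
                                    const unsigned char *src, size_t len)
{
    unsigned char *d = dst;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            *d++ = c;                                   /* invariant byte */
        }
        else {
            *d++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    return (size_t)(d - dst);
}
```

    No temporary string is built and upgraded here; each byte is written
    straight into the output in its final form.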

M       sv.c

commit d7ce96b2773de2657d46aed2b368cc327c317b8e
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 13:51:45 2017 +0100

    add S_sv_catpvn_simple() for use by sprintf
    
    Currently Perl_sv_vcatpvfn_flags() uses an unrolled sv_catpvn_nomg()
    to append floating point formats, a call to sv_catpvn_nomg() to append
    non-% parts of the format, and a few other non-performance-critical
    calls to sv_catpvn_nomg().
    
    Move the unrolled code block into an inline static function, and make
    the non-% appending use it too.
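
    As a rough illustration of moving an unrolled append into a static
    inline helper, here is a sketch on a plain C buffer. The struct and
    function names are hypothetical and unrelated to perl's SV API, and
    size overflow is not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; invariant: cap >= len. */
struct strbuf {
    char  *p;
    size_t len;
    size_t cap;
};

/* Append n bytes of s, growing the buffer if needed, and keep the
 * result NUL-terminated. Returns 0 on success, -1 if realloc fails. */
static inline int strbuf_append(struct strbuf *b, const char *s, size_t n)
{
    if (b->cap - b->len < n + 1) {
        size_t newcap = b->len + n + 1;
        char *np = realloc(b->p, newcap);
        if (!np)
            return -1;
        b->p = np;
        b->cap = newcap;
    }
    memcpy(b->p + b->len, s, n);
    b->len += n;
    b->p[b->len] = '\0';
    return 0;
}
```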

M       sv.c

commit 35499a0fbc915f6de88299b0747918750159786b
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 13:08:12 2017 +0100

    Perl_sv_vcatpvfn_flags: re-indent a code block
    
    whitespace only

M       sv.c

commit ba7eafd35d2ea99eaabe2134bf5e7732071a5926
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 13:00:52 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate p var
    
    It has a 1500-line scope, and is equal to fmtstart-1 for most of the
    time.
    
    This also allows us to 'const'ify some variables better.

M       embed.fnc
M       proto.h
M       sv.c

commit 80a4ec8bbe2de8bd7a65d6c2146240c32b6d8194
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 12:23:32 2017 +0100

    Perl_sv_vcatpvfn_flags: clarify GCC bug comments
    
    In particular it wasn't clear what bug was being worked around, nor that
    '#13488' referred to a GNU ticket rather than a perl ticket.
    
    This bug was fixed back in 2004, but the workaround is fairly harmless, so
    I've left it as-is.

M       sv.c

commit 21b781e099bb1aa38a01429bd185569ae2d30144
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 11:57:11 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify alt handling
    
    only do calculations for alt (#) formatting in the branches which use it

M       sv.c

commit 97b2d67004339098751af7b20e46016e86da04e5
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 11:41:41 2017 +0100

    Perl_sv_vcatpvfn_flags: rename 'p' var 's'
    
    In the 'append' block of code at the end of the loop, don't re-use the
    widely-scoped 'p' pointer; instead use a tightly-scoped var
    (named 's' so it doesn't clash with p, which is still valid in an outer
    scope).

M       sv.c

commit 2e835c2e748c1a2d0049ea49f6734dc94b13d809
Author: David Mitchell <[email protected]>
Date:   Fri Jun 2 09:51:40 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify format appending
    
    The bit at the end of the main loop has a whole bunch of conditionals
    along the lines of
    
        if (gap && !left)
            append gap
        if (esignlen && !fill)
            append esignbuf
        if (zeros)
            append zeroes
        if (elen)
            append ebuf
        if (gap && left)
            append gap
    
    This involves many tests along the main code path to cope with all the
    possibilities (e.g. if left, gap is output before ebuf, otherwise after)
    
    Instead split it into a couple of major branches with duplication between
    the branches, but requiring few tests along any one code path.
    
    For example, sprintf("%5d", -1) formerly required 9 branches, 1 for loop,
    and 1 memset(). It now requires 2 branches and 3 for loops.
    
    I've removed memset()s and replaced them with for loops. For the short
    padding typically used (e.g. "%9d" rather than "%8192d") a loop is faster.
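
    The shape of that restructuring can be sketched as follows. This is a
    simplified illustration, not the code from sv.c: the function name and
    parameters are hypothetical, and only three layouts are covered. Each
    layout tests at most two flags and then runs three short for loops.

```c
#include <stddef.h>

/* Hypothetical sketch. Writes one formatted element to `out` and
 * returns the number of bytes written.
 *   sign/signlen: sign or prefix characters, e.g. "-"
 *   body/bodylen: the converted digits
 *   gap:          number of padding characters needed
 *   left:         non-zero for the '-' (left-justify) flag
 *   zero_fill:    non-zero for the '0' flag */
static size_t emit_element(char *out,
                           const char *sign, size_t signlen,
                           const char *body, size_t bodylen,
                           size_t gap, int left, int zero_fill)
{
    char *d = out;
    size_t i;

    if (left) {
        /* sign, body, trailing spaces */
        for (i = 0; i < signlen; i++) *d++ = sign[i];
        for (i = 0; i < bodylen; i++) *d++ = body[i];
        for (i = 0; i < gap; i++)     *d++ = ' ';
    }
    else if (zero_fill) {
        /* sign, zero padding, body */
        for (i = 0; i < signlen; i++) *d++ = sign[i];
        for (i = 0; i < gap; i++)     *d++ = '0';
        for (i = 0; i < bodylen; i++) *d++ = body[i];
    }
    else {
        /* leading spaces, sign, body */
        for (i = 0; i < gap; i++)     *d++ = ' ';
        for (i = 0; i < signlen; i++) *d++ = sign[i];
        for (i = 0; i < bodylen; i++) *d++ = body[i];
    }
    return (size_t)(d - out);
}
```

    For the "%5d" with -1 case, emit_element(buf, "-", 1, "1", 1, 3, 0, 0)
    writes the five bytes "   -1".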

M       sv.c

commit e628fd4e3e6b11f51d75489f60c852afac126261
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 16:05:59 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate a wrap check
    
    This is one case where it can never wrap, so don't check.

M       sv.c

commit 3484ce2dc8996a41b450d916beeb6578e257618c
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 12:46:23 2017 +0100

    Perl_sv_vcatpvfn_flags: simpler special formats
    
    At the top of Perl_sv_vcatpvfn_flags(), certain fixed formats are
    special-cased: "", "%s", "%-p", "%.0f".
    
    Simplify the code which handles these. In particular, don't try to issue
    "missing" or "redundant" arg warnings there. Instead, check for the
    correct number of args as part of the test for whether this can be
    special-cased, and if not, fall through to the general code in the main
    body of the function to handle that format and issue any warnings.
    
    This makes the code a lot simpler. It also now detects the redundant arg
    in printf("%.0f",1,2).
    
    The code is now also more efficient - it tries to check for things like
    pat[0] == '%' only once, rather than re-checking for every special-case
    variant it's trying.

M       sv.c
M       t/op/sprintf.t

commit db2f0662a037b9dc13cfafaba3771202e53dba19
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 11:55:47 2017 +0100

    Perl_sv_vcatpvfn_flags: simpler redundant arg test
    
    5.24.0 added a new warning:
    
        Redundant argument in printf at ....
    
    That warning is issued if there are more args than format elements.
    However, it may also warn for an invalid format - e.g. for something like
    printf("%Z%d", 1,2) you get both
    
        Invalid conversion in printf: "%Z" at ...
        Redundant argument in printf at ...
    
    Personally I think once one part of the format has been determined to be
    invalid, it's hard for perl to second-guess in what way the format was
    invalid, and thus to be able to conclude that there is in fact a redundant
    arg.
    
    So this commit suppresses any "redundant" warning once an "invalid"
    warning has been issued.
    
    Doing this makes it possible to simplify the code and remove the
    used_explicit_ix variable.
    
    Apart from warnings, used_explicit_ix was only used in %p to check for
    'simple' special forms - but that code checks for a trailing '$' character
    anyway, so that test was redundant.

M       sv.c
M       t/op/sprintf.t
M       t/op/sprintf2.t

commit 804c4dcdd29c4512cb0bd05b7f7f045ee6ae1f08
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 11:29:35 2017 +0100

    Perl_sv_vcatpvfn_flags: fix comment typo

M       sv.c

commit a9075552c089351f42608870e15c66e179aeff3b
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 11:27:20 2017 +0100

    Perl_sv_vcatpvfn_flags: add comment about wrap

M       sv.c

commit a7a8ab2c65d6b74f6d884c014a9ac66f183ccf8b
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 11:08:27 2017 +0100

    Perl_sv_vcatpvfn_flags: only do utf8 in radix code
    
    For floating point formats, the output can only be utf8 if the radix point
    is utf8. Currently the radix point code sets the is_utf8 variable, then
    later, in the main floating-point code path, it tests is_utf8 and
    upgrades the output string to utf8.
    
    Instead, just do the upgrade directly in the radix code block.

M       sv.c

commit 834bc982b41187237120adf8ded799981acd4da1
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 11:00:26 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify radix len adding
    
    Assume the length of the radix point is a constant 1 (i.e. length('.'))
    and only increment float_need further if we're in a locale.

M       sv.c

commit ab5eedeb7af9cae5e6191a90eb7683065168522a
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 10:52:12 2017 +0100

    sprintf %a/%A more sanity checks
    
    For the code which generates hexadecimal floating-point formats,
    add extra sanity checks against buffer overruns.

M       sv.c

commit 00d97013759bd93cc87d06632e04929608bec60e
Author: David Mitchell <[email protected]>
Date:   Thu Jun 1 10:32:36 2017 +0100

    S_hextract(): fix #if indentation
    
    a complex set of nested #if/#else/#endif's had incorrect and confusing
    indentation.
    
    whitespace-only change

M       sv.c

commit f5ec36347fb260ba38368d723116f981cd2de9ee
Author: David Mitchell <[email protected]>
Date:   Wed May 31 12:35:34 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify some wrap checks
    
    Skip doing some overflow checks when we know it can't overflow.

M       sv.c

commit 78a7e49cd083dc8d00681579e874da1cb359519c
Author: David Mitchell <[email protected]>
Date:   Wed May 31 11:59:48 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify float_need calc
    
    Include another constant addition in the initial assignment, to eliminate
    a later wrap check.

M       sv.c

commit 2f8ef833a9252306864de9fd50568c3603151e37
Author: David Mitchell <[email protected]>
Date:   Wed May 31 11:15:15 2017 +0100

    S_format_hexfp(): s/int/STRLEN/
    
    In the helper function that sprintf's %a/%A hex floating point values,
    the calculation of the number of zeros to pad with should be in terms of
    STRLEN rather than int.
    
    A bit academic unless someone ever tries to print a hex f/p value with a
    precision > 2Gb digits.

M       sv.c

commit 15dc17ac906981b9907e36b4ff4a818757aa92fa
Author: David Mitchell <[email protected]>
Date:   Wed May 31 09:47:27 2017 +0100

    op/infnan.t: skip unportable tests
    
    sprintf size modifiers L and q aren't available on all platform sizes,
    so skip them.

M       t/op/infnan.t

commit dd0de300a9a4bdcf6b3fc6e950f3647a2e6e93a1
Author: David Mitchell <[email protected]>
Date:   Tue May 30 16:11:37 2017 +0100

    Perl_sv_vcatpvfn_flags: add inits to silence gcc
    
    Add a couple of unnecessary variable initialisers, to keep gcc's "this
    variable might be used uninitialised - then again it might not - in fact I
    don't really know what I'm talking about, but I've decided to annoy you
    with it anyway" warning at bay.

M       sv.c

commit 485a9edae4c8cf41dca4d9c4f26de6485f4f69a4
Author: David Mitchell <[email protected]>
Date:   Tue May 30 15:55:29 2017 +0100

    Perl_sv_vcatpvfn_flags: avoid wrap on precision
    
    Where the precision is specified literally in the format string,
    the integer precision value could wrap. Instead, make it croak with
    
        Integer overflow in format string
    
    As in other recent commits, the upper limit is set at 1/4 of the maximum
    STRLEN value.

M       sv.c
M       t/op/sprintf.t
M       t/op/sprintf2.t

commit c5644ed95bcf1ab49a7e1df6dcff426de7cb4cbd
Author: David Mitchell <[email protected]>
Date:   Tue May 30 15:27:00 2017 +0100

    Perl_sv_vcatpvfn_flags: s/int/STRLEN/g
    
    There were a few residual places that used int loop counters, e.g. to
    prepend N '0's to a number. Since the N's are of type STRLEN, make the
    loop counters STRLEN too.
    
    It's a bit academic since you're unlikely to have a number needing >2Gb
    worth of zero padding, but it makes things consistent and easier to audit.
    
    At this point I believe that any remaining usage of int / I32 / U32 in
    Perl_sv_vcatpvfn_flags() is legitimate.

M       sv.c

commit 211056f87c22bfba6ec5b3397ac66342cc52d406
Author: David Mitchell <[email protected]>
Date:   Tue May 30 15:11:24 2017 +0100

    Perl_sv_vcatpvfn_flags: %n: avoid wrap
    
    It's a bit academic, but in principle if a string was longer than 2Gb
    chars, the length as set by %n could wrap. So use the correct type(s).

M       sv.c

commit 27fb98e8e99b2eb243b890514373b814fd88873d
Author: David Mitchell <[email protected]>
Date:   Tue May 30 13:45:35 2017 +0100

    Perl_sv_vcatpvfn_flags: width/precis arg wrap
    
    When the width or precision is specified via an argument rather than
    literally, check whether the value wraps.
    
    Formerly, something like
    
        $w = 0x100000005;
        printf "%*s", $w, "abc";
    
    might print "  abc" or similar, depending on platform.
    
    Now it croaks with "Integer overflow in format string".
    
    I did wonder whether it should just warn instead, but:
    
    1) over-large literal widths/precisions already croak.
    2) Code that has wild field specifiers like that is already likely
       to crash with an out-of-memory error.
    3) At least this croak is trappable via eval - OOM isn't.
    
    I also set the maximum allowed value to be 1/4 of the largest
    pointer-sized unsigned value, to give a safety margin for possible
    wrapping later.
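
    To illustrate why an unchecked width can turn into a small number,
    here is a sketch contrasting silent narrowing with an explicit range
    check. Both function names are hypothetical, the limit is passed in
    so the example does not depend on the platform's pointer size, and the
    real code croaks rather than returning an error code.

```c
#include <stdint.h>

/* Hypothetical sketch. Accept a width or precision taken from an
 * argument only if it does not exceed `limit`. */
static int checked_width(uint64_t requested, uint64_t limit, uint64_t *out)
{
    if (requested > limit)
        return -1;          /* "Integer overflow in format string" */
    *out = requested;
    return 0;
}

/* What silent narrowing to 32 bits does to the same value. */
static uint32_t narrowed_width(uint64_t requested)
{
    return (uint32_t)requested;     /* keeps only the low 32 bits */
}
```

    With these definitions, narrowed_width(0x100000005) yields 5, which is
    how a huge width could end up printing "  abc", whereas checked_width()
    with a limit of UINT32_MAX / 4 rejects the same value.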

M       sv.c
M       t/op/sprintf2.t

commit bf06df40ee4e5f686ef390192cdbf9b3b8d961ad
Author: David Mitchell <[email protected]>
Date:   Mon May 29 17:06:06 2017 +0100

    Perl_sv_vcatpvfn_flags: move vector initialisation
    
    Move the generation of vecstr/veclen/vec_utf8 into the
    vector-initialisation block, rather than being part of the general
    'get next arg' block.
    
    Also, stop vecsv being in scope for the whole of the loop block, and make
    it two separate tightly-scoped vars (with different purposes).

M       sv.c

commit 764a20bdc1d1e122e36fc847652543194e130d98
Author: David Mitchell <[email protected]>
Date:   Mon May 29 16:53:06 2017 +0100

    Perl_sv_vcatpvfn_flags: warn on missing %v arg
    
    The explicit arg variant, e.g. %3$vd, didn't give 'missing arg' warning.

M       sv.c
M       t/op/sprintf.t
M       t/op/sprintf2.t

commit 8c9d2e8d17c0b5ceba97983606723133d1bfc526
Author: David Mitchell <[email protected]>
Date:   Mon May 29 16:20:17 2017 +0100

    Perl_sv_vcatpvfn_flags: warn on missing width arg
    
    Formerly it didn't warn when the width value was obtained from the next or
    specified arg, and there wasn't such an arg.

M       sv.c
M       t/op/sprintf.t

commit 0f10182ee49d54c8d0cbca2ce1c8e41eb6696e2c
Author: David Mitchell <[email protected]>
Date:   Mon May 29 16:11:01 2017 +0100

    Eliminate FETCH_VCATPVFN_ARGUMENT macro
    
    This can be simplified so much now that it might as well just be expanded
    in situ for its 3 uses.

M       sv.c

commit d60a97ec8e3a9313dc5478bde5910c5090d9a6cf
Author: David Mitchell <[email protected]>
Date:   Mon May 29 16:01:26 2017 +0100

    Perl_sv_vcatpvfn_flags: re-indent block
    
    whitespace-only

M       sv.c

commit 4505e23456bd065a9eda6fcc9ae3b9971cb10b9b
Author: David Mitchell <[email protected]>
Date:   Mon May 29 15:27:18 2017 +0100

    Perl_sv_vcatpvfn_flags: unify %v vers obj handling
    
    Currently sv_vcatpvfn_flags() has special handling of the arg under %v
    when the arg is a version object, but only via the perlish interface
    (argsv and svmax). This commit extends that handling to the C-ish
    interface (args).
    
    There seems no good reason not to, and it simplifies the code.

M       sv.c

commit 19f17bbdeaee98bd3e86552a8894d3f75dd7300a
Author: David Mitchell <[email protected]>
Date:   Mon May 29 13:49:42 2017 +0100

    Perl_sv_vcatpvfn_flags: unify args handling
    
    Several places do something along the lines of:
    
        if (explicit arg index)
            FETCH_VCATPVFN_ARGUMENT(...., svargs[ix-1])
        else
            FETCH_VCATPVFN_ARGUMENT(...., svargs[svix++])
    
    For each of these, reduce the duplicate code by changing the above to
    (approximately)
    
        ix = ix ? ix - 1 : svix++;
        FETCH_VCATPVFN_ARGUMENT(...., svargs[ix])
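
    The index selection on its own can be sketched in plain C. This is an
    illustration only: the function name is hypothetical, explicit_ix
    stands for the 1-based index from a "%N$" specifier (0 when absent),
    and *next plays the role of svix.

```c
#include <stddef.h>

/* Hypothetical sketch: pick the 0-based argument slot to use.
 * An explicit 1-based index wins and leaves the running cursor alone;
 * otherwise the cursor's current value is used and then advanced. */
static size_t pick_arg_index(size_t explicit_ix, size_t *next)
{
    return explicit_ix ? explicit_ix - 1 : (*next)++;
}
```

    The caller then has a single fetch from the chosen slot, instead of
    one fetch per branch.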

M       sv.c

commit 83e418612e2c2094675d7477f9915918eecff8f0
Author: David Mitchell <[email protected]>
Date:   Mon May 29 11:16:49 2017 +0100

    sv_vcatpvfn() family: make svmax arg Size_t
    
    It was formerly I32. It should be unsigned since you can't have a negative
    number of args. And although you're unlikely to call sprintf with more
    than 0x7fffffff args, it makes it more consistent with other APIs which
    we've been gradually expanding to 64-bit/ptrsize. It also makes the
    code internal to Perl_sv_vcatpvfn_flags more consistent, when
    dealing with explicit arg index formats like "%10$s". This function still
    has a mix of STRLEN (for string lengths) and Size_t (for arg indexes)
    but they are aliases for each other.
    
    I made Perl_do_sprintf()'s len arg SSize_t rather than Size_t, since
    it typically gets called with ptr diff arithmetic. Not sure if this is
    being overly cautious.

M       doop.c
M       embed.fnc
M       pod/perlguts.pod
M       proto.h
M       sv.c

commit e7ac322799335a64efde06f10a32acdb28d103cd
Author: David Mitchell <[email protected]>
Date:   Mon May 29 09:59:16 2017 +0100

    S_expect_number(): return STRLEN not I32
    
    This static function is used by Perl_sv_vcatpvfn_flags() to read in
    a width or explicit argument number. It currently returns an I32 result
    (and croaks if the number exceeds the maximum possible I32 value).
    
    Change it to return STRLEN, and to croak on the value being greater than
    max(STRLEN) / 4.
    
    This doesn't make a lot of difference in practice, since no code is ever
    going to be able to successfully create a formatted string that large
    without running out of memory anyway. But by making it unsigned and of the
    same type used elsewhere in sv_vcatpvfn_flags(), it simplifies auditing
    the code for possible wrapping/truncating etc.
    
    The limit at which it croaks with "Integer overflow in format string"
    has changed as follows:
    
                         previously                   now
        32-bit system    0x7fffffff            0x3fffffff
        32/64bit system  0x7fffffff            0x3fffffff
        64bit system     0x7fffffff    0x3fffffffffffffff
    
    Setting the limit as 1/4 max rather than 1/2 max is just a safety
    net to help avoid wraps/overflows elsewhere.
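
    A minimal sketch of a digit parser with this kind of limit, assuming
    SIZE_MAX / 4 as the cap. It is not S_expect_number() itself: the name
    and return convention are hypothetical, and it returns an error where
    the real function croaks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: parse the decimal digits at *pp into *result.
 * On success returns 0 and advances *pp past the digits. Returns -1,
 * leaving *pp and *result untouched, if the value would exceed
 * SIZE_MAX / 4. */
static int parse_number(const char **pp, size_t *result)
{
    const size_t limit = SIZE_MAX / 4;
    const char *p = *pp;
    size_t n = 0;

    while (*p >= '0' && *p <= '9') {
        size_t digit = (size_t)(*p - '0');
        /* n * 10 + digit > limit, tested without overflowing */
        if (n > (limit - digit) / 10)
            return -1;
        n = n * 10 + digit;
        p++;
    }
    *pp = p;
    *result = n;
    return 0;
}
```

    Because the test is done before the multiplication, the accumulator
    never wraps, whatever the input length.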

M       embed.fnc
M       proto.h
M       sv.c
M       t/op/sprintf.t
M       t/op/sprintf2.t

commit 3e80ed5abb5ce80283f7980213064d025fa82eeb
Author: David Mitchell <[email protected]>
Date:   Sun May 28 18:07:14 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify 'c' var
    
    Make it so that it's now *always* the format type ('s', 'd' etc).
    Don't bother initialising it, and *don't* use it as a temporary
    buffer (eptr = &c), so it can be stored in a register.

M       sv.c

commit 3bb4a8912e392b84dfa6ff9c0360e6ac829a7b14
Author: David Mitchell <[email protected]>
Date:   Sun May 28 17:59:47 2017 +0100

    Perl_sv_vcatpvfn_flags: reduce scope of 'iv' var

M       sv.c

commit 05f4807ba0d5291fadd4f1a3b525d8c09fb75efe
Author: David Mitchell <[email protected]>
Date:   Sun May 28 17:52:25 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate 'epix' var
    
    Or rather, reduce its scope to a small block and rename to 'ix'.

M       sv.c

commit 0b68b37ec5759dfef45fcc40f7e54030a303448e
Author: David Mitchell <[email protected]>
Date:   Sun May 28 17:49:10 2017 +0100

    S_expect_number() re-indent code
    
    .. following previous commit. Whitespace only.

M       sv.c

commit 34b33f5a5211a49528b4e5c8c43b33e992e52326
Author: David Mitchell <[email protected]>
Date:   Sun May 28 17:43:36 2017 +0100

    sprintf: move 1..9 test out of S_expect_number()
    
    Currently Perl_sv_vcatpvfn_flags() does several checks for "is the next
    part of the format a number starting with a '1'..'9'?" It does this by
    calling S_expect_number(), which returns 0 if not, or the value of the
    number otherwise. For a simple format specifier, this results in multiple
    fruitless calls to S_expect_number.
    
    This commit makes the caller of S_expect_number responsible
    for checking for the presence of 1..9.

M       sv.c

commit 87677428f4ffa279ae959dca29abdb8a585c043c
Author: David Mitchell <[email protected]>
Date:   Sat May 27 00:57:47 2017 +0100

    Perl_sv_vcatpvfn_flags: more %v optimisation
    
    Only do the code for appending the vector separator in the vector branch.
    In particular, don't size the SvGROW for dotstrlen outside of %v.
    This makes the %v code a bit slower but everything else a bit faster.

M       sv.c

commit f0d83af38810d9a863477c29a455cc4df69e1c53
Author: David Mitchell <[email protected]>
Date:   Sat May 27 00:17:35 2017 +0100

    Perl_sv_vcatpvfn_flags: test for valid %vX once
    
    Rather than testing for !vectorize in every conversion case which doesn't
    support %v, test once for supported types in the if (vectorize) branch.
    
    That way code which doesn't use %v never has to test for it.

M       sv.c

commit 448e0fc4afdd488571079c29e60d814002c6a449
Author: David Mitchell <[email protected]>
Date:   Sat May 27 00:07:48 2017 +0100

    Perl_sv_vcatpvfn_flags: join two if blocks
    
    convert if (x); if (!x); into a single if/else

M       sv.c

commit df3f474762774f53a0250589135892176ce679b9
Author: David Mitchell <[email protected]>
Date:   Sat May 27 00:00:10 2017 +0100

    Perl_sv_vcatpvfn_flags: delay vector arg get
    
    Move the block of code which retrieves the SV which the %v will iterate
    over, from just before the /* SIZE */ block to just after. Since that
    block doesn't do anything with args or svargs, this should make no
    functional difference - but it will allow the next commit to coalesce
    if (x); if (!x); into a single if/else.
    
    Apart from cutting and pasting the code block, no other changes have been
    made to it.

M       sv.c

commit f28efb65ce724f92fabffeafeab8659f1a9dd976
Author: David Mitchell <[email protected]>
Date:   Fri May 26 23:49:58 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate VECTORIZE_ARGS
    
    This macro is only used once. Just expand it.

M       sv.c

commit 76d35b666e5337600482e0ddbf07727111545409
Author: David Mitchell <[email protected]>
Date:   Fri May 26 23:42:07 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate ewix local var
    
    It's now only used within one code block.

M       sv.c

commit 5214c217f8b7a762bdc0fa0d83d2259381fe1e97
Author: David Mitchell <[email protected]>
Date:   Fri May 26 23:34:25 2017 +0100

    Perl_sv_vcatpvfn_flags: remove 'asterisk' var
    
    There was only one remaining use of this local var: in %p, to distinguish
    between explicit and implicit width specifier, e.g. %*p or %1$p, vs %2p.
    This can be done by just checking whether the char before the p was a '*'
    or '$'.

M       sv.c

commit e87084655730ebf53e76a1ed7d8ae4d6960774ee
Author: David Mitchell <[email protected]>
Date:   Fri May 26 23:20:22 2017 +0100

    Perl_sv_vcatpvfn_flags: further simplify %v logic
    
    For the common case with no * or v, there now are only 2 test-and-branch
    (! '*', ! 'v') rather than 3 (! '*', ! 'v', !asterisk)
    
    This works by putting the *v handling code in the * branch

M       sv.c

commit 54e53107eab9db6323546a6b1becc08a08270a6e
Author: David Mitchell <[email protected]>
Date:   Fri May 26 22:45:02 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate evix local var

M       sv.c

commit f508145c35ff905916cb214d96b1c4ced9c68814
Author: David Mitchell <[email protected]>
Date:   Fri May 26 22:26:58 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify v/asterisk code
    
    The previous commit's rearrangement of the v and * code now allows us to:
    1) eliminate the 'vectorarg' bool variable, which is set but no longer
       used;
    2) join two adjacent "if (asterisk)" and "if (!asterisk)" blocks into a
       single if/else.

M       sv.c

commit 3612e3464d18377996faecf4c30afb0e7ab892b7
Author: David Mitchell <[email protected]>
Date:   Fri May 26 22:19:44 2017 +0100

    Perl_sv_vcatpvfn_flags: move %*v handling earlier
    
    Where the v flag appears, and it has a non-default separator, i.e.
    *v or *NNN$v, retrieve the next or NNNth arg (which defines the separator)
    earlier - as soon as we encounter the v flag. This should in theory make
    no functional difference since no args are processed between those two
    points (so no chance of us stealing something else's arg).
    
    Doing it earlier makes the conditions simpler (we don't have to check for
    (vectorize && vectorarg) later).
    
    The whole code block has been moved as-is with no changes apart from
    whitespace.

M       sv.c

commit 10fb7fc9fc4e85fc73677933abc5a1b21418824a
Author: David Mitchell <[email protected]>
Date:   Fri May 26 18:19:11 2017 +0100

    Perl_sv_vcatpvfn_flags: move Inf handling for ints
    
    Integer-like format types handle Inf/Nan specially. Currently the code to
    handle this is in the main execution path, guarded by
    
        if (strchr("BbcDdiOouUXx", c)) ...
    
    After the previous few commits reorganised the int-arg getting code, this
    block can now be moved into an int-only section, so not slowing down
    other format types.
    
    There should be no functional changes.
    
    I've added some comments to the %c branch explaining why it's a special
    case.

M       sv.c

commit 66be1c6a2dfcfa754b68864c32f292c8c8f97ba7
Author: David Mitchell <[email protected]>
Date:   Fri May 26 17:23:09 2017 +0100

    Perl_sv_vcatpvfn_flags: unify int arg fetching
    
    There are two big blocks of code that do signed and unsigned 'get next int
    arg' processing. Combine them (sort of).
    
    Previously it was a bit like
    
        case 'd':
        case 'i':
            base = 10;
            if (vectorize)
                uv = ...
            else if (arg)
                iv = ...
            else
                iv = SvIV_nomg(argsv);
            if (!vectorize)
                uv = f(iv) for some f.
            goto integer;
    
        case 'x' base = 16; goto uns_integer;
        case 'u' base = 10; goto uns_integer;
        ...
        uns_integer:
            if (vectorize)
                uv = ...
            else if (arg)
                uv = ...
            else
                uv = SvUV_nomg(argsv);
    
        integer:
            ... do stuff with base and uv ...
    
    Now it's more like
    
        case 'd': base = -10; goto get_int_arg_val;
        case 'i': base = -10; goto get_int_arg_val;
        case 'x': base =  16; goto get_int_arg_val;
        case 'u': base =  10; goto get_int_arg_val;
    
        get_int_arg_val:
    
            if (vectorize)
                uv = ...
            else if (base < 0) {
                /* signed int type */
                base = -base;
                if (arg)
                    iv = ...
                else
                    iv = SvIV_nomg(argsv);
                uv = f(iv) for some f.
            }
            else {
                /* unsigned int type */
                if (arg)
                    uv = ...
                else
                    uv = SvUV_nomg(argsv);
            }
    
        integer:
            ... do stuff with base and uv ...
    
    Note that in particular the vectorize block of code is no longer
    duplicated. This will also allow the next commit to handle Inf/overload
    just after the 'get_int_arg_val' label rather than doing it before the
    main switch and slowing down the non-integer format types.
    
    Should be no functional changes
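
    The "negative base means signed" convention can be sketched in
    standalone C. This is an illustration under simplifying assumptions,
    not the code from sv.c: both function names are hypothetical, the
    argument is modelled as a raw 64-bit value, and vectorize handling is
    left out.

```c
#include <stdint.h>

/* Hypothetical sketch: map a conversion character to its base.
 * A negative value marks a signed conversion. */
static int conv_base(char c)
{
    switch (c) {
    case 'd': case 'i': return -10;
    case 'u':           return  10;
    case 'x':           return  16;
    case 'o':           return   8;
    default:            return   0;     /* not an integer conversion */
    }
}

/* Shared "get int arg" step: turn the raw argument into a magnitude
 * plus a sign flag, and normalise *base to a positive value. */
static uint64_t arg_magnitude(int *base, int64_t raw, int *negative)
{
    *negative = 0;
    if (*base < 0) {
        /* signed int type */
        *base = -*base;
        if (raw < 0) {
            *negative = 1;
            /* unsigned negation is well defined even for INT64_MIN */
            return (uint64_t)0 - (uint64_t)raw;
        }
        return (uint64_t)raw;
    }
    /* unsigned int type: reinterpret the bits */
    return (uint64_t)raw;
}
```

    After this step, the code that formats digits only needs the positive
    base and the unsigned magnitude, whichever conversion was requested.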

M       sv.c

commit 9e7a96f4fd70828467b69177ed1ceea267caeeb2
Author: David Mitchell <[email protected]>
Date:   Fri May 26 16:39:30 2017 +0100

    Perl_sv_vcatpvfn_flags: move %c handling to ints
    
    %c is in some ways like integer formats - we treat the arg as an integer
    (with '0+' overloading and Inf/Nan handling), but then at the end convert
    it into a 1 char string rather than sequence of 0..9's.
    
    Move the %c code partially into the main integer handling block of
    code; this will shortly allow us to unify the SV-as-integer handling code.

M       sv.c

commit 6928e96f1e0ea85f1a32b45a2b848407a992d89b
Author: David Mitchell <[email protected]>
Date:   Fri May 26 16:05:18 2017 +0100

    Perl_sv_vcatpvfn_flags: %p and Inf/Nan
    
    sprintf("%p", 0+Inf) should print the address of an SV, not the literal
    string "Inf". Ditto NaN.
    
    Similarly, sprintf("%p", $x) should print the address of the $x SV,
    not triggering a tie fetch or overload method call, nor using the address
    of any SV returned by such calls.

M       sv.c
M       t/op/infnan.t
M       t/op/sprintf2.t
M       t/op/tie_fetch_count.t

commit 699602ef4a78af0f73d9f5230005304ff6f59ae9
Author: David Mitchell <[email protected]>
Date:   Thu May 25 12:09:52 2017 +0100

    Perl_sv_vcatpvfn_flags: make 'fill' var a boolean
    
    Currently the 'fill' local variable is a char, but it only ever holds the
    values ' ' or '0'. Make it into a boolean flag instead.

M       sv.c

commit aa9d259c102c066b0802240a76e6ff2f79ed6556
Author: David Mitchell <[email protected]>
Date:   Thu May 25 11:56:44 2017 +0100

    Perl_sv_vcatpvfn_flags: do %p specials in %p case
    
    There are currently a few special-cased %p variants (but only when called
    from C, not from perl) such as %-p, %2p etc. Currently these are handled
    specially at the top of main format-element loop, which penalises every
    format type. Instead move the handling into the "case 'p'" branch of the
    main switch. Which seems more logical, as well as more efficient.
    
    I've also heavily rewritten the big comment block about all the special %p
    formats.

M       sv.c

commit 3a61790cbaea2a88b3af920d7d86a602be4438f3
Author: David Mitchell <[email protected]>
Date:   Thu May 25 10:29:04 2017 +0100

    Perl_sv_vcatpvfn_flags: move UTF8f handling code
    
    The special UTF8f format (which is usually defined as something like
    "%d%lu%4p") is currently handled as a special case at the top of the main
    format-element loop.
    
    Instead move it into the "case 'd'" branch so that it doesn't slow down
    everything.

M       sv.c

commit 1d36d5114273ad0f312d36f5f3bf785dae18c5d8
Author: David Mitchell <[email protected]>
Date:   Wed May 24 16:29:16 2017 +0100

    Perl_sv_vcatpvfn_flags: add %n code comment
    
    point out that things like "%-4.5n" don't currently warn

M       sv.c

commit 17e233f3a521662d2b212c2706589791f258c2ec
Author: David Mitchell <[email protected]>
Date:   Wed May 24 16:09:25 2017 +0100

    Perl_sv_vcatpvfn_flags: make %n missing arg fatal
    
    Normally sprintf et al just warn if there aren't enough args; but since %n
    wants to write the current string length to the next arg, make it fatal.
    
    Formerly it would croak anyway, but with a spurious "Modification of a
    read-only value" error as it tried to set &PL_sv_no.

M       pod/perldiag.pod
M       sv.c
M       t/op/sprintf2.t

commit c3c7db3181b52b8c12f99436ac0920fef71a2b8b
Author: David Mitchell <[email protected]>
Date:   Wed May 24 15:58:06 2017 +0100

    Perl_sv_vcatpvfn_flags: comment %n deficiency
    
    This should be fixed sometime:
    
        /* XXX if sv was originally non-utf8 with a char in the
         * range 0x80-0xff, then if it got upgraded, we should
         * calculate char len rather than byte len here */

M       sv.c

commit 0ab1166529c34cbea5a15bcfeaaf0835bbf7ed8c
Author: David Mitchell <[email protected]>
Date:   Sat May 20 16:01:26 2017 +0100

    Perl_sv_vcatpvfn_flags: skip IN_LC(LC_NUMERIC)
    
    In a couple of places it does
    
        if (PL_numeric_radix_sv && IN_LC(LC_NUMERIC)) { ... }
    
    But PL_numeric_radix_sv is set to NULL unless we have a non-standard
    radix point (i.e. not "."), and this can only happen when we're in the
    scope of 'use locale'. So the IN_LC() should be a redundant (and
    expensive) test. Replace it with an assert.

M       sv.c

commit 610d7bd013327fea7b3cfc3267ef7a176ae69737
Author: David Mitchell <[email protected]>
Date:   Sat May 20 15:51:31 2017 +0100

    Perl_sv_vcatpvfn_flags: set locale at most once
    
    Calls to external snprintf-ish functions or that directly access
    PL_numeric_radix_sv are supposed to sandwich this access within
    
        STORE_LC_NUMERIC_SET_TO_NEEDED();
        ....
        RESTORE_LC_NUMERIC();
    
    The code in Perl_sv_vcatpvfn_flags() seems to have gotten a bit confused
    as to whether it's trying to only set STORE_LC_NUMERIC_SET_TO_NEEDED()
    once, then handle one or more %[aefh] format elements, then only
    restore on exit. There is code at the end of the function which says:
    
        RESTORE_LC_NUMERIC();   /* Done outside loop, so don't have to
                                   save/restore each iteration. */
    
    but in practice various places within this function (and its helper
    function S_format_hexfp()) inconsistently repeatedly do
    STORE_LC_NUMERIC_SET_TO_NEEDED(); and sometimes do RESTORE_LC_NUMERIC().
    
    This commit changes it so that STORE_LC_NUMERIC_SET_TO_NEEDED() is called
    at most once, the first time a % format involving a radix point is
    encountered, and does RESTORE_LC_NUMERIC(); exactly once at the end of the
    function.
    
    Note that while calling STORE_LC_NUMERIC_SET_TO_NEEDED() multiple times
    is harmless, it's quite expensive, as each time it has to check whether
    it's in the scope of 'use locale'. RESTORE_LC_NUMERIC() is cheap if
    STORE_LC_NUMERIC_SET_TO_NEEDED() earlier determined that there was nothing
    to do.
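
The "set at most once, restore exactly once" pattern described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code in sv.c: store_numeric_locale(), restore_numeric_locale() and format_elements() are hypothetical stand-ins for STORE_LC_NUMERIC_SET_TO_NEEDED(), RESTORE_LC_NUMERIC() and the main loop, and the set of format letters is chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the real locale macros; the counters only
 * exist so that the number of calls can be observed. */
static int store_calls   = 0;
static int restore_calls = 0;

static void store_numeric_locale(void)   { store_calls++; }
static void restore_numeric_locale(void) { restore_calls++; }

/* Walk a string of format letters. The (pretend) expensive store is done
 * lazily, the first time a letter that can involve a radix point is seen,
 * and the restore is done once at the end of the function. */
static void format_elements(const char *letters)
{
    bool lc_numeric_set = false;
    const char *p;

    assert(letters != NULL);
    for (p = letters; *p; p++) {
        if (strchr("aAeEfFgG", *p) && !lc_numeric_set) {
            store_numeric_locale();
            lc_numeric_set = true;
        }
        /* ... format the element here ... */
    }

    restore_numeric_locale();   /* assumed cheap if nothing was stored */
}
```

Calling format_elements("dfsge") performs one store and one restore, however many floating-point letters the string contains.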

M       sv.c

commit 832087296200ccfeb258a0975e4e28a1ffa7a6f9
Author: David Mitchell <[email protected]>
Date:   Sat May 20 13:01:02 2017 +0100

    Perl_sv_vcatpvfn_flags: remove redundant code
    
    At the start of the function, it marks the output as being utf8 if the
    first arg is utf8. But this should be taken care of when the individual
    args (including the first one) are processed. So it's redundant code.
    
    In fact it would sometimes cause the resultant string to be unnecessarily
    upgraded to utf8, e.g.:
    
        my $precis = "9";
        utf8::upgrade($precis);
        my $s = sprintf "%.*f\n", $precis, 1.1;
        # whoops, $s is now utf8

M       sv.c
M       t/op/sprintf2.t

commit 809bc9408c1504a732a1f181f82f4a6a76852a1e
Author: David Mitchell <[email protected]>
Date:   Sat May 20 12:07:23 2017 +0100

    Perl_sv_vcatpvfn_flags: remove "%.Ng" special-case
    
    This function has special-case handling for the formats "%.0f" and
    "%.NNg", to speed things up. This special-casing appears twice,
    once near the top of the function for where the format matches exactly
    "%.0f" or "%.Ng" (N is 1..99), and once again in the main loop of the
    function, where it handles those format elements embedded in the larger
    format: "....%.0f..." and "....%.Ng..." (N > 0).
    
    The problem with the "%.Ng" code is that it isn't as robust as the more
    general "....%.Ng..." code - in particular the latter checks for a
    locale-dependent radix-point when determining needed buffer size.
    
    This commit removes the "%.Ng" special-cased code but leaves the
    "....%.Ng..." special-cased code. It makes the former about 7% slower
    compared to the situation at the start of this branch. (Part of the effort
    in this branch has been to make the "....%.Ng..." code faster, so that
    there's less of an overall performance hit by removing "%.Ng").

M       sv.c

commit 2f6d1b58b37c395714a6b3f994f0732cbdebcc9b
Author: David Mitchell <[email protected]>
Date:   Fri May 19 16:15:31 2017 +0100

    Perl_sv_vcatpvfn_flags: handle %.NNNg case earlier
    
    In the main loop, we look for %.NNNg and handle it specially.
    Change it so that the special-case is only used when precis is small
    enough that it fits in the local ebuf[] rather than the malloced
    PL_efloatbuf. This allows the check for this special case to be done
    earlier with less redundant calculations.

M       sv.c

commit dd60cecf23cfc2ba4e697abd4901b74b108faf43
Author: David Mitchell <[email protected]>
Date:   Fri May 19 15:45:51 2017 +0100

    Perl_sv_vcatpvfn_flags: use quick concat for %.0f
    
    Most floating-point formats now use the quick concat path. But the
    "%.0f" shortcut was accidentally bypassing that path. This commit fixes
    that.

M       sv.c

commit 0aaf35b00a483e2407e967a8dd55979d195bc23f
Author: David Mitchell <[email protected]>
Date:   Thu May 18 12:47:51 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify concat of f/p str
    
    Since floating-point formats do their own formatting and padding, skip the
    block of code at the end of the main loop which handles appending eptr to
    sv, and do our own stripped-down version.

M       sv.c

commit 2571c54b00b991e16b6ee8111966cd4c6d3a78ef
Author: David Mitchell <[email protected]>
Date:   Thu May 18 11:44:17 2017 +0100

    Perl_sv_vcatpvfn_flags: s/gconverts/Gconvert's/
    
    Fix a comment so that a search for the word 'Gconvert' gets a match, and
    so that a later comment, 'See earlier comment about buggy Gconvert', makes
    sense.

M       sv.c

commit 07b1b50c1469a2f65f45a0c9092a2edc35d5eb3d
Author: David Mitchell <[email protected]>
Date:   Thu May 18 11:32:27 2017 +0100

    Perl_sv_vcatpvfn_flags: tighten hexfp var scope
    
    Only have the 'hexfp' var declared within the innermost scope it is
    actually needed for.

M       sv.c

commit 0935444cab80aa745052b0e4668207e1e590f930
Author: David Mitchell <[email protected]>
Date:   Thu May 18 11:17:32 2017 +0100

    Perl_sv_vcatpvfn_flags: rename 'is_simple' var
    
    the definition of 'simple' required the format to have a precision.

M       sv.c

commit 42904fad82ea5f9efca59571497f3508729f8088
Author: David Mitchell <[email protected]>
Date:   Thu May 18 11:03:28 2017 +0100

    Perl_sv_vcatpvfn_flags: move pod closer
    
    Several static functions etc had been added between the pod and the
    main function. Move the pod to be just above it.
    
    Also incorporate a comment into the pod about utf8ness of pattern and SV
    needing to match.

M       sv.c

commit cb32d08a0a03637e9167710f4cc4ff1231ca93d5
Author: David Mitchell <[email protected]>
Date:   Thu May 18 10:45:56 2017 +0100

    Perl_sv_vcatpvfn_flags: eliminate utf8buf[] var
    
    %c for a >255 char generates its utf8 byte representation and stores it in
    this temporary buffer:
    
        U8 utf8buf[UTF8_MAXBYTES+1]
    
    But we already have another temporary buffer, ebuf, for creating floating
    point strings, which is big enough. So use that instead.

M       sv.c

commit e238238a28d3c3709b7205b6fbafed88a1d5e048
Author: David Mitchell <[email protected]>
Date:   Thu May 18 10:37:42 2017 +0100

    Perl_sv_vcatpvfn_flags: reorganise loop vars
    
    There is a big chunk of local vars declared at the top of the main loop.
    Reorder the declarations to group similar vars together, and add a comment
    to each var explaining what it's for.
    
    No functional changes.

M       sv.c

commit a16480f0ee752baba286065cfcc03e0b5dbd6f60
Author: David Mitchell <[email protected]>
Date:   Thu May 18 09:49:08 2017 +0100

    Perl_sv_vcatpvfn_flags: move vars to inner scope
    
    Add a new scope around the floating-point code, then move some
    local var declarations into that scope.

M       sv.c

commit bc2ed51d5a440c78d179206a53692e9f4d96538d
Author: David Mitchell <[email protected]>
Date:   Thu May 18 09:41:15 2017 +0100

    Perl_sv_vcatpvfn_flags: extract hex f/p code
    
    There is a large block of code (nearly 300 lines) in
    Perl_sv_vcatpvfn_flags(), which handles the %a/%A hexadecimal
    floating-point format. Move it into a new static function,
    S_format_hexfp().
    
    No functional changes.

M       sv.c

commit e1b858915153fede944b84b78f018ceb20482726
Author: David Mitchell <[email protected]>
Date:   Thu May 18 09:03:20 2017 +0100

    Perl_sv_vcatpvfn_flags: move some macros earlier
    
    There are some macro definitions in the body of Perl_sv_vcatpvfn_flags()
    which handle some possible differences between double and long double.
    Move these to before the function as they will shortly need to be visible
    to a new helper function. At the same time, prefix their names with
    VCATPVFN_ to make clear what they're for.
    
    For the same reason I've also added a new typedef, vcatpvfn_long_double_t.
    
    I also eliminated the FV_ISFINITE macro definition as it's no longer used.

M       sv.c

commit 8c7735c327c62d9ed643e1bd8a60114fc05fd00a
Author: David Mitchell <[email protected]>
Date:   Wed May 17 13:36:27 2017 +0100

    remove HAS_LDBL_SPRINTF_BUG code
    
    This code was added in 2002 to work round an Irix 6 rounding bug in
    long double sprintfs.
    
    I strongly suspect that any such OS bug has long been fixed and/or such
    machines have been retired or are unlikely to have new perls installed on
    them.
    
    Part of the motivation for removing this code is that following the
    previous commit, that block of code's use of the float_need variable
    is likely to be wrong (since it now includes exponent etc), but I have no
    way of testing it.
    
    I've left the probe code in hints/irix_6.sh, so if anyone ever reports
    sprintf.t failures on an old Irix platform, perl -V should show if their
    system still has the bug. At that point someone brave could resurrect this
    block of code.

M       sv.c

commit a0196fce454db1e580272329b99af8ed21fd19af
Author: David Mitchell <[email protected]>
Date:   Wed May 17 12:27:18 2017 +0100

    Perl_sv_vcatpvfn_flags: better calc f/p buf size
    
    How it works out the needed buffer size for the various floating point
    formats is a bit opaque. This commit extensively documents and
    rationalises the process. In particular it will no longer allocate a very
    large buffer for %g printing a large number (%g switches to %e style
    format rather than %f in cases like this). Also it no longer relies on a
    +40 fudge factor to accommodate exponents - this is now factored in
    properly.
    
    It still includes a +20 safety fudge factor for production builds, but
    this is disabled under DEBUGGING so that ASAN and the like are likely to
    more quickly spot issues during development.

M       sv.c
M       t/op/sprintf2.t

commit 301909e3140bc5378d71148b8e560332351d5154
Author: David Mitchell <[email protected]>
Date:   Tue May 16 16:30:13 2017 +0100

    sprintf: handle sized int-ish formats with Inf/Nan
    
    The code path taken when int-ish formats saw an Inf/Nan was to jump to the
    floating-point handler, but then that would warn about (valid) size
    qualifiers. For example before:
    
        $ perl -we'printf "[%hi]\n", Inf'
        Invalid conversion in printf: "%hi" at -e line 1.
        Redundant argument in printf at -e line 1.
        [%hi]
        $
    
    After this commit:
    
        $ perl -we'printf "[%hi]\n", Inf'
        [Inf]
        $
    
    It also makes the code simpler.

M       sv.c
M       t/op/infnan.t

commit 488f879f46dfa42b3532a651b4e7825c48ce0dd9
Author: David Mitchell <[email protected]>
Date:   Tue May 16 08:53:19 2017 +0100

    Perl_sv_vcatpvfn_flags: handle Inf/Nan in 1 place
    
    At the start of the float section, check whether the value is Inf/Nan
    and handle directly. This stops later blocks of code having to test for it
    too. Also simplify the formatting of Inf/Nan - let the general code at the
    end of the block do any pre/post padding.

M       sv.c

commit 852d54b30c075fff3487e532aabf2f5bf5c2dfc3
Author: David Mitchell <[email protected]>
Date:   Mon May 15 18:59:54 2017 +0100

    Perl_sv_vcatpvfn_flags: sort PL_numeric_radix_sv
    
    Under locales the radix point may not be just a simple '.' but a Unicode
    string like "\N{ARABIC DECIMAL SEPARATOR}". Currently the hex f/p code
    explicitly takes account of the length of this string when calculating the
    buffer length, but the other branches don't - they just rely on the
    "add 40 fudge factor" to protect them.
    
    Instead, handle its length for all branches, and simplify utf8 handling.
    Currently it checks post-format whether the radix point was utf8, and if
    so marks the resulting buffer as utf8. Instead, check for utf8-ness at the
    same time we check for length.
    
    This new approach doesn't check whether the resulting string actually
    contains the radix point string, so in principle the string could be
    marked utf8 but not have any >127 chars. I think this is harmless.

M       sv.c

commit cfc390a235eeee124d0b19e681491fea545791bd
Author: David Mitchell <[email protected]>
Date:   Mon May 15 20:42:12 2017 +0100

    Perl_sv_vcatpvfn_flags() split %.0f and %.Ng
    
    The format elements "%.0f" and "%.NNNg" are handled specially in the main
    loop. Split the code block which handles them and process %.0f earlier. It
    doesn't need to allocate a variable-length buffer or worry about the
    length of the radix string.

M       sv.c

commit a947f756ba3fafc8f2e50c36a01f89f885c01408
Author: David Mitchell <[email protected]>
Date:   Mon May 15 14:49:50 2017 +0100

    S_F0convert(): remove Nan/Inf handling
    
    This function handles sprintf "%.0f". It also handles Inf/Nan, but neither
    of its callers will call it with such an nv. Its code for handling them is
    also broken - it returns the \0 following the "Inf" or "Nan" string.
    
    So just remove this unneeded and broken functionality.
    
    At the same time document what S_F0convert() does.
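
For readers unfamiliar with this kind of helper, here is a rough sketch of a "%.0f" fast path that writes digits backwards from the end of a buffer. It is an assumption-laden illustration, not the real S_F0convert(): the name f0_convert(), the use of uint64_t, the 2**64 limit and the round-half-to-even tie handling are all choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Format 'nv' roughly as "%.0f" would, writing the digits backwards so
 * that they end just before 'endbuf'. Returns a pointer to the first
 * character and stores the length in '*len'; the text is NOT
 * NUL-terminated. Returns NULL if the value is too large (or is
 * Inf/NaN), in which case the caller must use a slower general path. */
static char *f0_convert(double nv, char *endbuf, size_t *len)
{
    const int neg = nv < 0;
    char *p = endbuf;
    uint64_t uv;

    assert(endbuf != NULL && len != NULL);
    if (neg)
        nv = -nv;
    if (!(nv < 18446744073709551616.0))   /* 2**64; also rejects NaN */
        return NULL;

    nv += 0.5;
    uv = (uint64_t)nv;
    if ((uv & 1) && (double)uv == nv)
        uv--;                             /* exact tie: round to even */
    do {
        *--p = (char)('0' + uv % 10);
    } while (uv /= 10);
    if (neg)
        *--p = '-';

    *len = (size_t)(endbuf - p);
    return p;
}
```

Because the caller gets back a pointer and a length, nothing after the last digit is part of the result, which is the property the broken Inf/Nan branch described above violated.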

M       sv.c

commit 9451555371858033f04cb9289dd00caaddca2b22
Author: David Mitchell <[email protected]>
Date:   Mon May 15 13:54:17 2017 +0100

    Perl_sv_vcatpvfn_flags: fix arg to SNPRINTF_G()
    
    One of the callers of SNPRINTF_G() passes 'size' as its third arg - but
    there is no such variable. This code happens only to be used in the
    !USE_QUADMATH branch, and the SNPRINTF_G macro only uses that arg under
    USE_QUADMATH. So it doesn't matter. But replace 'size' with 'sizeof(ebuf)'
    in case that changes in future.

M       sv.c

commit 8adbd39c1037f3e5393bb958afa670dd525230dd
Author: David Mitchell <[email protected]>
Date:   Mon May 15 12:51:56 2017 +0100

    Perl_sv_vcatpvfn_flags: reduce scope of local var
    
    fix_ldbl_sprintf_bug is only used in one block of code so declare it in
    that block.
    Given that that block is only compiled under HAS_LDBL_SPRINTF_BUG,
    which seems only to be for some obscure Irix issues from 2002,
    I haven't actually tested this.

M       sv.c

commit ad6f3e14dacf14791cf4aaa417a8ca15466deab6
Author: David Mitchell <[email protected]>
Date:   Mon May 15 11:59:49 2017 +0100

    use SvCUR(PL_numeric_radix_sv) not SvLEN()
    
    When determining the length of buffer needed to output the decimal point
    in the current locale, use SvCUR(PL_numeric_radix_sv) rather than
    SvLEN(PL_numeric_radix_sv). I presume this was a thinko in the original
    commit. Using SvLEN currently seems harmless, since typically SvCUR <
    SvLEN, but one could conceive a future scenario where locale info is set
    using alien string buffers with SvLEN(sv) == 0.
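
The distinction between "bytes in use" and "bytes allocated" can be illustrated with a toy structure. This is a sketch only: toy_str and radix_bytes_needed() are hypothetical names, and the struct merely mirrors the roles of SvCUR() and SvLEN() rather than the real SV layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified string record: 'cur' is the number of
 * bytes in use, 'len' the number of bytes allocated (0 meaning the
 * buffer is not owned by this record). */
typedef struct {
    const char *buf;
    size_t      cur;
    size_t      len;
} toy_str;

/* Bytes needed to emit the radix string: 1 for the default ".",
 * otherwise the length in use, never the allocated size. */
static size_t radix_bytes_needed(const toy_str *radix)
{
    if (radix == NULL)
        return 1;
    assert(radix->len == 0 || radix->cur < radix->len);
    return radix->cur;
}
```

Sizing from 'len' would return 0 for the unowned-buffer case, which is exactly the future scenario the commit message worries about.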

M       sv.c

commit 394b5d7757c9abe088b4ae7be44975035fe7ca8e
Author: David Mitchell <[email protected]>
Date:   Thu May 11 09:06:05 2017 +0100

    Perl_sv_vcatpvfn_flags: reindent block
    
    whitespace only

M       sv.c

commit c9668165814fdc594cc30736ae47b218d3ce8c54
Author: David Mitchell <[email protected]>
Date:   Thu May 11 09:00:30 2017 +0100

    Perl_sv_vcatpvfn_flags: reduce scope of 'int i'
    
    Declare an 'i' var wherever needed for local use, rather than being in
    scope for 1600 lines.

M       sv.c

commit af32d0d3cce889be94f7f0536f50b02b405be402
Author: David Mitchell <[email protected]>
Date:   Wed May 10 17:23:51 2017 +0100

    Perl_sv_vcatpvfn_flags: get rid of an (int) cast
    
    harmless in this case, but there really shouldn't be (int) casts
    on string length and ptr diff calculations

M       sv.c

commit 492f362ea0feefc8bd2d77946b2c8b274c73ee9a
Author: David Mitchell <[email protected]>
Date:   Wed May 10 16:58:58 2017 +0100

    Perl_sv_vcatpvfn_flags: calc (width - elen) once
    
    There's a couple of blocks of code which repeat the expression
    (width - elen). Calculate this once at the top. This makes it slightly
    easier to audit the code for signed/unsigned wrap etc.
    
    Should be no functional change.

M       sv.c

commit c0f90dae9799e84f06e36661f032de01c11f24a5
Author: David Mitchell <[email protected]>
Date:   Wed May 10 16:17:18 2017 +0100

    Perl_sv_vcatpvfn_flags: avoid 1-byte buf overrun
    
    This only occurs on the "%a" (hex) format, and only happens when
    processing a denormalised value whose bit pattern is 0xf....f or similar,
    and when rounding up it needs to insert a '1' at the head of the number
    and shift the rest of the digits down one.
    
    In practice this never seems to happen - the top nybble of a denormalised
    float value always seems to be 0x1 (presumably because that's implicit) so
    there's never any carry to a higher digit. Maybe other platforms do it
    differently.
    
    Also VHEX_SIZE seems to be rounded up, so in practice there's no overrun.
    
    But better safe than sorry.

M       sv.c

commit bc864a27a25bd137cbae9a58d6b33b12b75807b0
Author: David Mitchell <[email protected]>
Date:   Wed May 10 15:27:49 2017 +0100

    Perl_sv_vcatpvfn_flags: avoid a potential wrap
    
    In the floating-point hex (%a) code, it checks whether the requested
    precision is smaller than the hex buf size. It does this by casting
    (precis + 1) to signed. Since precis can be any user-supplied value,
    this can wrap. Instead, cast the (buffer_length - 1) to unsigned, since
    this is bounded to a small constant value > 1.
    
    In practise this makes no difference currently, as a large precis will
    have caused a malloc panic earlier anyway. But that might change in
    future.
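
A minimal sketch of the idea, assuming a hypothetical fixed buffer size HEXBUF_SIZE and helper precis_fits(); the actual expression in sv.c may differ. The point is which side of the comparison gets converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEXBUF_SIZE 32   /* hypothetical small, fixed buffer size */

/* Wrap-safe test of whether 'precis' digits fit in the buffer. The
 * small, bounded constant is converted to the unsigned type; the
 * caller-supplied value is never converted to a signed type, and no
 * arithmetic is done on it, so nothing can wrap. */
static bool precis_fits(size_t precis)
{
    assert(HEXBUF_SIZE > 1);
    return precis < (size_t)(HEXBUF_SIZE - 1);
}
```

Even precis_fits((size_t)-1) gives the right answer (false), whereas adding 1 to that value and casting to a signed type would not.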

M       sv.c

commit 4e030392510854f69ca5b274cc35adcdd20e4c02
Author: David Mitchell <[email protected]>
Date:   Wed May 10 14:03:25 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify an expression
    
    In the hex floating-point code, (subnormal ? vfnz : vhex) is equivalent to
    v0, which we just set to the same value.
    
    So keep things simple.

M       sv.c

commit 706c62edbc8d9aaae36c44d95c5197c0415249db
Author: David Mitchell <[email protected]>
Date:   Wed May 10 11:19:38 2017 +0100

    sprintf(): handle mangled formats better with utf8
    
    Currently if sprintf() detects an error in the format while processing
    a %.... entry, it copies the bytes as-is from the % to the point the
    error was detected, then continues. If the output string and format string
    don't have the same utf8-ness, this can result in badly-formed utf8
    output.
    
    This commit changes the code so that it just appends a '%' then restarts
    processing from the character following the %. Most of the time this just
    again results in the characters following the % being output as-is,
    except this time the 'normal' character-copying code path is taken, which
    handles utf8 mismatches correctly.
    
    By doing this, it also removes a block of code which contained a "roll
    your own" string appender which used SvGROW() and Copy(). This was one
    further place which was potentially open to wrapping and block overrun
    bugs.
    
    This commit may cause occasional changes in behaviour, depending on
    whether there are any further '%' characters within the bad section of the
    format.  Now these will be reprocessed, possibly triggering further
    'Invalid conversion' type warnings.

M       sv.c
M       t/op/sprintf.t
M       t/op/sprintf2.t

commit 3fd8abc769a5f898dfe7cdc26af9ec9d347b4632
Author: David Mitchell <[email protected]>
Date:   Tue May 9 15:55:07 2017 +0100

    Perl_sv_vcatpvfn_flags: simplify wrap checking
    
    The main SvGROW() has a new-length arg roughly equivalent to
    
        (SvCUR(sv) + elen + zeros + esignlen + dotstrlen + 1);
    
    Rationalise the overflow/wrap checking by doing each individual addition
    separately with its own check. This is slightly redundant as some of the
    values are interdependent, but this way it's easier to see whether all
    possible overflows are being checked for.
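
The "one addition, one check" approach can be sketched like this. The helper names add_checked() and needed_len() are hypothetical, and the sketch returns false on overflow where the real function would presumably call croak_memory_wrap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Add 'b' to '*total', reporting failure instead of wrapping. */
static bool add_checked(size_t *total, size_t b)
{
    assert(total != NULL);
    if (b > SIZE_MAX - *total)
        return false;
    *total += b;
    return true;
}

/* Compute cur + elen + zeros + esignlen + dotstrlen + 1, one addition
 * at a time, each with its own check. Returns false on overflow, in
 * which case '*out' is left untouched. */
static bool needed_len(size_t cur, size_t elen, size_t zeros,
                       size_t esignlen, size_t dotstrlen, size_t *out)
{
    size_t need = cur;

    if (   !add_checked(&need, elen)
        || !add_checked(&need, zeros)
        || !add_checked(&need, esignlen)
        || !add_checked(&need, dotstrlen)
        || !add_checked(&need, 1))
        return false;

    *out = need;
    return true;
}
```

Some of these checks can never fire given how the inputs relate to each other, but listing them all makes it easy to audit that none is missing.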
    

M       sv.c

commit c43f346076ba6e1bd9f24e452988522129a0c917
Author: David Mitchell <[email protected]>
Date:   Tue May 9 15:32:49 2017 +0100

    Perl_sv_vcatpvfn_flags: reduce scope of 'gap' var
    
    shouldn't make any functional difference

M       sv.c

commit 24c611ff96387386536873173142a104f281a63e
Author: David Mitchell <[email protected]>
Date:   Tue May 9 15:29:25 2017 +0100

    Perl_sv_vcatpvfn_flags: reindent a block of code
    
    (whitespace-only change)
    
    indent a chunk of code ready for the next commit.

M       sv.c

commit b708d81ab1a0a6724dcdfe264f71f1a46fe3a55c
Author: David Mitchell <[email protected]>
Date:   Tue May 9 14:48:59 2017 +0100

    Perl_sv_vcatpvfn_flags: reduce scope of 'have' var
    
    Just declare this var in the small block where it's needed, rather than
    being in scope for 500+ lines.
    
    Should be no functional changes.

M       sv.c

commit e27e670158d5b7b1b757bf3ac65970069d316cdd
Author: David Mitchell <[email protected]>
Date:   Tue May 9 14:36:40 2017 +0100

    Perl_sv_vcatpvfn_flags: split the 'need' local var
    
    The 'need' local var has a wide scope (over 500 lines), and is used for
    two separate purposes. Split it into two separate vars. One remains wide
    scope, but is just used to calculate the new value of PL_efloatsize. Rename
    that one to 'float_need'.
    
    For the second use, introduce a new scope of just 6 lines with its own
    'need' variable.
    
    This should make no functional difference but makes the code slightly
    easier to understand and analyse.

M       sv.c

commit 173dd048e4d67d7a69d0a694acaa0e3834d5bb4b
Author: David Mitchell <[email protected]>
Date:   Tue May 9 14:29:11 2017 +0100

    sprintf(): add memory wrap tests
    
    In various places Perl_sv_vcatpvfn_flags() does croak_memory_wrap()
    (including a couple added by the previous commit to fix RT #131260),
    but there don't appear to be any tests for them.
    
    So this commit adds some tests.

M       t/op/sprintf2.t
-----------------------------------------------------------------------

--
Perl5 Master Repository
