>>> DESCRIPTION
>>> The humanize_number() function converts the signed integer
>>> <value> to a string and writes at most <buf_len> bytes of it
>>> to the output <buffer>. The format of the output string
>>> is described by the <fmt> string.
>>>
>>> Conversion Specification
>>> The <fmt> string determines how <value> will be formatted to
>>> the output in a fashion similar to sprintf(). Each conversion
>>> specifier within the string consists of the % character, after
>>> which the following appear in sequence:
>>>
>
> Why is there no way to request that a % character be included in the
> output?
>
>
Indeed, '%%' would allow a '%' character to be included in the output.
>>> o One or more of the following modifiers:
>>>
>
> Did you mean "Zero or more" rather than "One or more"? Most of the
> examples don't contain any modifiers.
>
>
Yes.
>>> o An optional decimal length. If the result is smaller,
>>> the output will be padded according to the modifiers
>>> if present, otherwise the result will be space-padded.
>>>
>
> If no length is specified, why is there any padding? Most of the
> examples don't show padding characters even though this says spaces
> will be added???
>
>
Yes, of course, if no length is specified, no padding is done.
>>> o An optional decimal precision, expressed as .# where
>>> # is an unsigned decimal integer. When not specified,
>>> three digits is the default precision.
>>>
>
> Am I correct in assuming that .0 also suppresses the radix character?
> Am I also correct in assuming that if the # modifier above causes all
> zero characters after the radix character to be suppressed the radix
> character will also be suppressed?
>
Yes
>>> o An optional 'b' character, which will express the result
>>> as radix-2 rather than radix-10 (radix 2 uses 1024^N
>>> whereas the default of radix 10 uses 1000^N to convert
>>> the number).
>>> o An optional unit character, from the following set:
>>> 'A' : the unit is chosen automatically for the best
>>> human-readable result. The integer part of the
>>> result will be between 0 and 999 when radix-10
>>> is used for the conversion, or 0 to 1023 when
>>> radix-2 is used for the conversion.
>>> 'N' : use no metric unit for conversion.
>>> 'K' : use the metric unit of kilo.
>>> 'M' : use the metric unit of mega.
>>> 'G' : use the metric unit of giga.
>>> 'T' : use the metric unit of tera.
>>> 'P' : use the metric unit of peta.
>>> 'E' : use the metric unit of exa.
>>>
>
> Shouldn't you also specify lowercase versions of these unit characters
> to produce lowercase characters in the output?
>
>
There have been many questions regarding the unit chars; I have no
answer yet.
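
Leaving the case question aside, the 'A' rule quoted above (integer part
between 0 and 999 for radix-10, or 0 and 1023 for radix-2) can be
sketched as below. This is only an illustration, not the proposed
implementation: the name auto_unit() is hypothetical, the value is taken
as a double for brevity, and the units are tried in the order K, M, G,
T, P, E.

```c
#include <stddef.h>

/*
 * Hypothetical helper: scale `value` down by the base until its
 * magnitude is below the base, and return the matching unit character
 * ('N' means no unit was needed). The scaled value goes to *scaled.
 */
static char auto_unit(double value, int radix2, double *scaled)
{
    static const char units[] = "NKMGTPE";
    const double base = radix2 ? 1024.0 : 1000.0;
    double mag = value < 0 ? -value : value;
    size_t i = 0;

    /* Stop at 'E' even if the magnitude is still >= base. */
    while (mag >= base && units[i + 1] != '\0') {
        mag /= base;
        i++;
    }
    *scaled = value < 0 ? -mag : mag;
    return units[i];
}
```

With 123456789 this picks 'M' for both radixes: the scaled value is
about 123.457 with 1000^N and about 117.738 with 1024^N.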
>>> RETURN VALUE
>>> The humanize_number() function returns the number of bytes
>>> written into <buffer> if less than or equal to <buf_len>,
>>> otherwise returns the number of bytes that would have been
>>> written to <buffer> if <buf_len> had been sufficiently large.
>>> The value of <buf_len> may be passed as zero, and/or <buffer>
>>> may be a null pointer; this will result in a return value of
>>> the minimal-length buffer required to contain a complete
>>> result (minus a terminating NUL character).
>>>
>
> Am I correct in assuming that you mean "not including the" rather than
> the mathematical operation of subtracting zero stated by "minus a"?
> (I.e., if I want to use the return value as an argument to malloc(), I
> need to add 1 to actually get a minimal-length buffer big enough to
> contain a complete result including the terminating '\0' character?)
>
Yes
> What goes into <buffer> if <buf_len> bytes aren't sufficient to hold
> the resulting string?
>
Whatever fits in the buffer, followed by a terminating null character.
For example, if the buffer is 2 bytes long and the result is 123, the
buffer contains the character '1' and the null character.
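
The return-value and truncation rules described here are the same as
those of the standard snprintf(), so they can be illustrated with it.
The sketch below does not call humanize_number() at all; format_alloc()
and truncated_demo() are hypothetical names used only to show the
sizing call followed by malloc(n + 1), and the 2-byte truncation case.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc'ed decimal string for `value`,
 * or NULL on failure. The caller frees the result.
 */
static char *format_alloc(long value)
{
    /* Sizing call: null buffer and zero length, result not stored. */
    int need = snprintf(NULL, 0, "%ld", value);
    if (need < 0)
        return NULL;

    /* The count excludes the terminating NUL, hence the + 1. */
    char *buf = malloc((size_t)need + 1);
    if (buf == NULL)
        return NULL;

    snprintf(buf, (size_t)need + 1, "%ld", value);
    return buf;
}

/* Truncation case: a 2-byte buffer and the result "123". */
static int truncated_demo(char out[2])
{
    /* Stores '1' and a NUL; returns 3, the untruncated length. */
    return snprintf(out, 2, "%d", 123);
}
```

A return value greater than or equal to the buffer length tells the
caller that the output was truncated.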
>
>>> EXAMPLES
>>> humanize_number(123456789, "%5M", buffer, len)
>>> will output "123.457M" into the buffer
>>>
>>> humanize_number(123456789, "%5.M", buffer, len)
>>>
>
> According to the spec above, this format is not legal (M is not an
> unsigned decimal number).
>
>
In this case, "%5.M" is equivalent to "%5.0M".
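
To make the precision rule concrete, here is a minimal sketch of how the
part of a conversion specifier after the '%' could be parsed. It is an
assumption-laden illustration, not the actual parser: struct hn_spec and
parse_spec() are hypothetical names, the modifiers are not handled at
all (so a leading '0' would be read as part of the length), and '%%' is
left to the caller.

```c
#include <ctype.h>
#include <string.h>

struct hn_spec {        /* hypothetical, for illustration only */
    int  width;         /* -1 when absent: no padding is done */
    int  precision;     /* 3 when absent, 0 for a bare '.' */
    int  radix2;        /* nonzero when 'b' is present */
    char unit;          /* one of A N K M G T P E, or 0 if absent */
};

/* Parse the text following '%'; return the number of chars consumed. */
static int parse_spec(const char *p, struct hn_spec *s)
{
    const char *start = p;

    s->width = -1;
    s->precision = 3;
    s->radix2 = 0;
    s->unit = 0;

    if (isdigit((unsigned char)*p)) {
        s->width = 0;
        while (isdigit((unsigned char)*p))
            s->width = s->width * 10 + (*p++ - '0');
    }
    if (*p == '.') {
        p++;
        s->precision = 0;   /* "." with no digits means ".0" */
        while (isdigit((unsigned char)*p))
            s->precision = s->precision * 10 + (*p++ - '0');
    }
    if (*p == 'b') {
        s->radix2 = 1;
        p++;
    }
    /* Guard against '\0': strchr() would match the terminator. */
    if (*p != '\0' && strchr("ANKMGTPE", *p) != NULL)
        s->unit = *p++;

    return (int)(p - start);
}
```

With this sketch, "5.M" and "5.0M" both give a length of 5 and a
precision of 0, while "5M" keeps the default precision of 3.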
>>> will output " 123M" into the buffer
>>>
>>> humanize_number(123456789, "%5.0M", buffer, len)
>>> will output " 123M" into the buffer
>>>
>
> Why is there a leading space in the example above, but no leading space
> in the examples below?
>
>
I've tried to keep printf's behaviour. On my Solaris box (snv34), this
simple C test
printf("foo='%5.f'\n",(double)5/3);
printf("foo='%5.0f'\n",(double)5/3);
printf("foo='%5.1f'\n",(double)5/3);
printf("foo='%5.2f'\n",(double)5/3);
printf("foo='%5.3f'\n",(double)5/3);
printf("foo='%5.4f'\n",(double)5/3);
will output
foo='    2'  [ 4 padding chars ]
foo='    2'  [ 4 padding chars ]
foo='  1.7'  [ 2 padding chars ]
foo=' 1.67'  [ 1 padding char ]
foo='1.667'  [ no padding char ]
foo='1.6667' [ no padding char ]
The "rule" I have observed is: when using a format like '%x.y', if the
resulting string is shorter than x, it is padded. This is what I've
applied, except that in this case the unit character ('M') is counted
in the string length.
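
A minimal sketch of that rule, assuming the value has already been
scaled to the requested unit: the number and the unit character are
formatted together first, and the padding is applied to the combined
string. pad_with_unit() is a hypothetical name, not part of the
proposal, and only space padding is shown.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format `scaled` with `prec` digits after the
 * radix character, append `unit`, then right-justify in `width`.
 * Returns what snprintf() returns, or -1 if the scratch buffer is
 * too small.
 */
static int pad_with_unit(char *out, size_t len, int width, int prec,
                         double scaled, char unit)
{
    char tmp[64];

    /* Number and unit together, so the unit counts toward the length. */
    int n = snprintf(tmp, sizeof tmp, "%.*f%c", prec, scaled, unit);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    /* Pads with spaces only when tmp is shorter than width. */
    return snprintf(out, len, "%*s", width, tmp);
}
```

For 123.456789 with unit 'M' and a length of 5, a precision of 0 gives
" 123M" (one padding space), and a precision of 3 gives "123.457M" with
no padding, since that string is already longer than 5.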
> Note that most of the issues raised above would already be covered if
> you added a flag or modifier character (or two) to the format strings
> of the *printf and *scanf families of functions. I'm not
> convinced by your argument so far that we couldn't find anything (like
> 'm' or 'M' for metric) to be used with signed and unsigned decimal,
> octal, and hexadecimal conversions. With a little bit of research, I'd
> be surprised if the project team couldn't make a proposal that would be
> accepted by "the standards community" with no conflicts with other
> existing practice.
>
I did not want to modify such a widely used function, nor did I want to
risk breaking compatibility. If we chose, for example, something like
%M for metric, what would happen to software that already uses %M in
its printf format strings? Right now, printf("%5M",(double)1/3) will
output "M". Of course, this syntax is weird since %M does not
correspond to anything, but we cannot be sure it's not used somewhere.
Yann