
Yann POUPET Tue, 17 Oct 2006 00:43:05 +0200

>>> DESCRIPTION
>>>     The humanize_number() function will convert the signed integer
>>>     <value> to a string, which will write at most <buf_len> bytes
>>>     to the output <buffer>. The format of the output string
>>>     is described by the <fmt> string.
>>>
>>>     Conversion Specification
>>>     The <fmt> string determines how <value> will be formatted to
>>>     the output in a fashion similar to sprintf(). Each conversion
>>>     specifier within the string consists of the % character, after
>>>     which the following appear in sequence:
>>>       
>
> Why is there no way to request that a % character be included in the
> output.
>
>


Indeed. Adding '%%' would allow a literal '%' character to be included
in the output.

>>>       o One or more of the following modifiers:
>>>       
>
> Did you mean "Zero or more" rather than "One or more"?  Most of the
> examples don't contain any modifiers.
>
>   

Yes.
>>>           o An optional decimal length. If the result is smaller,
>>>         the output will be padded according to the modifiers
>>>         if present, otherwise the result will be space-padded.
>>>       
>
> If no length is specified, why is there any padding?  Most of the
> examples don't show padding characters even though this says spaces
> will be added???
>
>   

Yes, of course, if no length is specified, no padding is done.

>>>       o An optional decimal precision, expressed as .# where
>>>         # is an unsigned decimal integer. When not specified,
>>>             three digits is the default precision.
>>>       
>
> Am I correct in assuming that .0 also suppresses the radix character?
> Am I also correct in assuming that if the # modifier above causes all
> zero characters after the radix character to be suppressed the radix
> character will also be suppressed?
>   

Yes

>>>       o An optional 'b' character, which will express the result
>>>         as radix-2 rather than radix-10 (radix 2 uses 1024^N
>>>         whereas the default of radix 10 uses 1000^N to convert
>>>         the number).
>>>       o An optional unit character, from the following set:
>>>             'A' : the unit is choosen automatically for the best
>>>                   human-readable result. The integer part of the
>>>                   result will be between 0 and 999 when radix-10
>>>                   is used for the conversion, or 0 to 1023 when
>>>                   radix-2 is used for the conversion.
>>>             'N' : use no metric unit for conversion.
>>>             'K' : use the metric unit of kilo.
>>>             'M' : use the metric unit of mega.
>>>             'G' : use the metric unit of giga.
>>>             'P' : use the metric unit of peta.
>>>             'T' : use the metric unit of tera.
>>>             'E' : use the metric unit of exa.
>>>       
>
> Shouldn't you also specify lowercase versions of these unit characters
> to produce lowercase characters in the output?
>
>   

There have been many questions about the unit characters; I have no
answer yet.
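
For the 'A' unit in the quoted spec, here is a minimal sketch of how the
automatic choice could work. It is not the proposed implementation:
auto_unit() and its parameters are hypothetical names, and it takes the
magnitude of the value as an unsigned integer for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the proposal): pick the unit that the
 * 'A' specifier would choose. base is 1000 for radix-10 and 1024 for
 * radix-2. Returns 'N' when no unit is needed and stores the divisor
 * to apply in *divisor. */
static char auto_unit(uint64_t value, uint64_t base, uint64_t *divisor)
{
    static const char units[] = { 'N', 'K', 'M', 'G', 'T', 'P', 'E' };
    uint64_t div = 1;
    int i = 0;

    assert(base == 1000 || base == 1024);
    /* Step up one unit while the integer part would still reach base.
     * base^6 fits in 64 bits for both radixes, so div cannot overflow. */
    while (i < 6 && value / div >= base) {
        div *= base;
        i++;
    }
    *divisor = div;
    return units[i];
}
```

With 123456789 and base 1000 this picks 'M' with a divisor of 1000000,
so the integer part is 123. Rounding at low precision is not handled
here: 999999 with precision 0 would still pick 'K'.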

>>> RETURN VALUE
>>>     The humanize_number() function returns the number of bytes
>>>     written into <buffer> if less than or equal to <buf_len>,
>>>     otherwise returns the number of bytes that would have been
>>>     written to <buffer> if <buf_len> had been sufficiently large.
>>>     The value of <buf_len> may be passed as zero, and/or <buffer>
>>>     may be a null pointer; this will result in a return value of
>>>     the minimal-length buffer required to contain a complete
>>>     result (minus a terminating NUL character).
>>>       
>
> Am I correct in assuming that you mean "not including the" rather than
> the mathematical operation of subtracting zero stated by "minus a"?
> (I.e., if I want to use the return value as an argument to malloc(), I
> need to add 1 to actually get a minimal-length buffer big enough to
> contain a complete result including the terminating '\0' character?)
>   

Yes
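
The intended usage is the same measure-then-allocate pattern as with C99
snprintf(). Since humanize_number() is only a proposal, the sketch below
uses snprintf() as a stand-in; format_alloc() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format an integer into a freshly allocated string. snprintf() stands in
 * for the proposed humanize_number(), whose return value follows the same
 * convention. Returns NULL on failure; the caller frees the result. */
static char *format_alloc(long value)
{
    /* A null buffer with size 0 only measures: the returned length does
     * not include the terminating NUL. */
    int needed = snprintf(NULL, 0, "%ld", value);
    if (needed < 0)
        return NULL;

    char *buf = malloc((size_t)needed + 1);   /* +1 for the NUL */
    if (buf == NULL)
        return NULL;

    /* The second pass must produce the length measured by the first. */
    int written = snprintf(buf, (size_t)needed + 1, "%ld", value);
    assert(written == needed);
    (void)written;
    return buf;
}
```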

> What goes into <buffer> if <buf_len> bytes aren't sufficient to hold
> the resulting string?
>   

Whatever fits in the buffer, followed by a terminating null character.
For example, if the buffer is 2 bytes long and the result is 123, the
buffer contains the character '1' followed by the null character.
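
A caller can detect truncation from the return value. The sketch below
again uses snprintf() as a stand-in for the proposed function, on the
assumption that both truncate the same way; format_truncated() is a
hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Write value into buf (of size len) and report whether the result was
 * truncated. snprintf() is a stand-in for the proposed humanize_number(). */
static int format_truncated(char *buf, size_t len, int value)
{
    assert(buf != NULL && len > 0);
    int needed = snprintf(buf, len, "%d", value);
    /* Truncation happened if the full result plus its NUL did not fit. */
    return needed < 0 || (size_t)needed >= len;
}
```

With a 2-byte buffer and the value 123, the buffer holds '1' and the
null character, and the function reports truncation.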
>   
>>> EXAMPLES
>>>     humanize_number(123456789, "%5M", buffer, len)
>>>     will output "123.457M" into the buffer
>>>
>>>     humanize_number(123456789, "%5.M", buffer, len)
>>>       
>
> According to the spec above, this format is not legal (M is not an
> unsigned decimal number).
>
>   

In this case "%5.M" is equivalent to "%5.0M"


>>>     will output " 123M" into the buffer
>>>
>>>     humanize_number(123456789, "%5.0M", buffer, len)
>>>     will output " 123M" into the buffer
>>>       
>
> Why is there a leading space in the example above, but no leading space
> in the examples below?
>
>   

I've tried to keep printf's behaviour. On my Solaris box (snv34), this
simple C test:
printf("foo='%5.f'\n",(double)5/3);
printf("foo='%5.0f'\n",(double)5/3);
printf("foo='%5.1f'\n",(double)5/3);
printf("foo='%5.2f'\n",(double)5/3);
printf("foo='%5.3f'\n",(double)5/3);
printf("foo='%5.4f'\n",(double)5/3);

will output
foo='    2'        [ 4 leading pad chars ]
foo='    2'        [ 4 leading pad chars ]
foo='  1.7'        [ 2 leading pad chars ]
foo=' 1.67'        [ 1 leading pad char ]
foo='1.667'        [ no padding ]
foo='1.6667'       [ no padding ]

The "rule" I have observed is: when using a format like '%x.y', if the
resulting string is shorter than x, it is padded on the left up to x
characters. This is what I've applied, except that in this case the
unit character ('M') is counted in the string length.
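
As a rough sketch of that rule (not the actual implementation), the
number and its unit can be formatted first and then padded together.
format_scaled() and its parameters are hypothetical, and the value is
assumed to have been divided by the unit already.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format an already-scaled value followed by its unit
 * character, right-aligned in a field of `width` characters. The unit is
 * counted in the length, so padding applies to number and unit together.
 * Returns what snprintf() returns, or -1 if the value is too long. */
static int format_scaled(char *buf, size_t len, int width, int prec,
                         double scaled, char unit)
{
    char tmp[64];

    assert(width >= 0 && prec >= 0);
    /* Number and unit first, without any padding. */
    int n = snprintf(tmp, sizeof tmp, "%.*f%c", prec, scaled, unit);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    /* Then pad the whole string with spaces on the left up to width. */
    return snprintf(buf, len, "%*s", width, tmp);
}
```

With width 5, the scaled value 123.456789 and unit 'M', precision 0
gives " 123M" (one pad character) and precision 3 gives "123.457M"
(already longer than 5, so no padding).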

> Note that most of the issues raised above would already be covered if
> you added a flag or modifier character (or two) to the *printf and
> *scanf function families of functions' format strings.  I'm not
> convinced by your argument so far that we couldn't find anything (like
> 'm' or 'M' for metric) to be used with signed and unsigned decimal,
> octal, and hexadecimal conversions.  With a little bit of research, I'd
> be surprised if the project team couldn't make a proposal that would be
> accepted by "the standards community" with no conflicts with other
> existing practice.
>   

I did not want to modify such a widely used function, nor take the risk
of breaking compatibility. If we chose, for example, something like %M
for metric, what would happen to software that already uses %M in its
printf format strings? Right now, printf("%5M",(double)1/3) will output
"M". Of course, this syntax is weird since %M does not correspond to
anything, but we cannot be sure it's not used somewhere.


Yann

Human-readable number library routine [PSARC-EXT/2006/573 Timeout: 10/17/2006]
