Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON

Florian Pflug Sun, 24 Jul 2011 17:04:30 -0700

On Jul25, 2011, at 00:48 , Joey Adams wrote:
> On Sun, Jul 24, 2011 at 2:19 PM, Florian Pflug <f...@phlo.org> wrote:
>> On Jul24, 2011, at 05:14 , Robert Haas wrote:
>>> On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams <joeyadams3.14...@gmail.com> wrote:
>>>> ... Fortunately, JSON's definition of a
>>>> "number" is its decimal syntax, so the algorithm is child's play:
>>>> 
>>>>  * Figure out the digits and exponent.
>>>>  * If the exponent is greater than 20 or less than -6 (arbitrary), use
>>>> exponential notation.
>> 
>> I agree. As for your proposed algorithm, I suggest instead using
>> exponential notation if it produces a shorter textual representation.
>> In other words, for values between -1 and 1, we'd switch to exponential
>> notation if there's more than 1 leading zero (to the right of the decimal
>> point, of course), and for values outside that range if there're more than
>> 2 trailing zeros and no decimal point. All after redundant zeros and
>> decimal points are removed. So we'd store
>> 
>> 0 as 0
>> 1 as 1
>> 0.1 as 0.1
>> 0.01 as 0.01
>> 0.001 as 1e-3
>> 10 as 10
>> 100 as 100
>> 1000 as 1e3
>> 1000.1 as 1000.1
>> 1001 as 1001
> 
> Interesting idea.  The reason I suggested using exponential notation
> only for extreme exponents (less than -6 or greater than +20) is
> partly for presentation value.  Users might be annoyed to see 1000000
> turned into 1e6.


I'm not concerned about that, but ...

> Moreover, applications working solely with integers
> that don't expect the floating point syntax may choke on the converted
> numbers.

now that you say it, that's definitely a concern.

> 32-bit integers can be losslessly encoded as IEEE
> double-precision floats (JavaScript's internal representation), and
> JavaScript's algorithm for converting a number to a string ([1],
> section 9.8.1) happens to preserve the integer syntax (I think).

Indeed. In fact, it seems to be designed to use the integer syntax
for all integral values with <= 66 binary digits. log10(2^66) ~ 19.87
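
The two numeric claims above are easy to check. The following is only a small Python sketch, not part of the proposed module; the helper name exact_in_double is made up for illustration.

```python
import math

def exact_in_double(n: int) -> bool:
    """True if the integer n survives a round trip through an IEEE double."""
    return int(float(n)) == n

# Every 32-bit integer fits, since a double carries a 53-bit significand.
assert exact_in_double(2**31 - 1)
assert exact_in_double(-(2**31))
# Beyond 2**53 not every integer is representable any more.
assert not exact_in_double(2**53 + 1)

# The estimate from the text: 66 binary digits is just under 20 decimal digits.
print(round(math.log10(2**66), 2))  # prints 19.87
```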

> Should we follow the JavaScript standard for rendering numbers (which
> my suggestion approximates)?  Or should we use the shortest encoding
> as Florian suggests?

In the light of the above, consider my suggestion withdrawn. I now think
we should just follow the JavaScript standard as closely as possible.
As you said, it's pretty much the same as your suggestion, just more precise
in the handling of some corner-cases like infinity, nan, +/-0, some
questions of leading and trailing zeros, ...
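
For reference, the layout rules of section 9.8.1 can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the module's code: the function name js_number_layout and its arguments are hypothetical, the caller is assumed to have already extracted the significant digits (no leading or trailing zeros) and the position n of the decimal point, so that the value is int(digits) * 10**(n - len(digits)). NaN and infinity are left out because JSON cannot represent them.

```python
def js_number_layout(digits: str, n: int, negative: bool = False) -> str:
    """Lay out a decimal number following the rules of ES5 section 9.8.1.

    digits: significant digits without leading/trailing zeros ("0" for zero).
    n:      decimal point position; value = int(digits) * 10**(n - len(digits)).
    """
    if digits == "0":
        return "0"  # both +0 and -0 are rendered as "0"
    k = len(digits)
    sign = "-" if negative else ""
    if k <= n <= 21:
        # Integral value with at most 21 digits: pad with zeros.
        body = digits + "0" * (n - k)
    elif 0 < n <= 21:
        # Decimal point falls inside the digit string.
        body = digits[:n] + "." + digits[n:]
    elif -6 < n <= 0:
        # Small fraction: "0." followed by -n zeros, then the digits.
        body = "0." + "0" * (-n) + digits
    else:
        # Everything else uses exponential notation with an explicit sign.
        e = n - 1
        mantissa = digits if k == 1 else digits[0] + "." + digits[1:]
        body = f"{mantissa}e{'+' if e >= 0 else '-'}{abs(e)}"
    return sign + body

print(js_number_layout("1", 7))       # 1000000
print(js_number_layout("1", 22))      # 1e+21
print(js_number_layout("1", -5))      # 0.000001
print(js_number_layout("1", -6))      # 1e-7
print(js_number_layout("10001", 4))   # 1000.1
```

Under these rules 1000000 keeps its integer syntax, which addresses the concern about applications that only expect integers.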

I wouldn't have made my suggestion had I realized earlier that the limit
of 20 for the exponent was carefully chosen to ensure that the full
range of a 64-bit integer value would be represented in non-exponential
notation. I assumed the bracketed "arbitrary" in your description applied
to both the upper (20) and the lower (-6) bound, when it really only
applies to the lower bound. Sorry for that.
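
A quick way to see that an upper bound of 20 covers 64-bit integers, as a Python sketch (decimal_exponent is an illustrative helper, not anything from the patch):

```python
def decimal_exponent(n: int) -> int:
    """Exponent e such that abs(n) = d.ddd... * 10**e, for a nonzero integer n."""
    return len(str(abs(n))) - 1

INT64_MAX = 2**63 - 1    # largest signed 64-bit value
UINT64_MAX = 2**64 - 1   # largest unsigned 64-bit value

print(decimal_exponent(INT64_MAX))   # 18
print(decimal_exponent(UINT64_MAX))  # 19
```

Both exponents stay below the limit of 20, so neither value would be switched to exponential notation.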

(I am now curious where the seemingly arbitrary lower bound of -6 comes
from, though. The only explanation I can come up with is that somebody
figured that 0.000001 is still easily distinguished visually from
0.00001, but not so much from 0.0000001.)

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
