Re: Float rounding (in JSON)

2022-12-30 Thread Sergey via Digitalmars-d-learn

On Thursday, 13 October 2022 at 19:00:30 UTC, Sergey wrote:
I'm not an IEEE 754 expert, but I just found this rounding 
behavior when comparing with other languages. I suppose it 
happens because in D floating-point numbers are parsed as double 
and keep the full length of a double while rounding. But this 
just doesn't match the behavior in other languages.


So there is no luck with std.json for me. But when the standard 
library is not the solution, third-party libraries can help. I've 
tried asdf. The library is more or less archived, but it works 
well, and its documentation is small and clear (mir-ion really 
needs to improve its documentation).


So with asdf we can just serialize the JSON, and it will 
automatically round floating-point numbers with the same 
**magic** logic that other languages use.


The only catch: some numbers that are usually doubles may be 
written in JSON as integers. asdf automatically converts them to 
double too. If you need to process them exactly as integers, you 
can use Variant!(int, double) as the data type and provide a 
custom serializer/deserializer, as proposed in the asdf 
documentation example.

http://asdf.libmir.org/asdf_serialization.html#.serializeToAsdf

PS Thanks to Steven for his suggestions in Discord.



Re: Float rounding (in JSON)

2022-10-14 Thread bauss via Digitalmars-d-learn
On Friday, 14 October 2022 at 09:00:11 UTC, Patrick Schluter 
wrote:
On Thursday, 13 October 2022 at 19:27:22 UTC, Steven 
Schveighoffer wrote:

On 10/13/22 3:00 PM, Sergey wrote:

[...]


It doesn't look really that far off. You can't expect floating 
point parsing to be exact, as floating point does not 
perfectly represent decimal numbers, especially when you get 
down to the least significant bits.


[...]
To me it looks like there is a conversion to `real` (80 bit 
floats) somewhere in the D code and that the other languages 
stay in `double` mode everywhere. Maybe forcing `double` by 
disabling x87 on the D side would yield the same results as the 
other languages?


Looking through the source code, for floating-point values we call 
`parse!double` when parsing the JSON.


I don't see `real` being used anywhere when parsing.

So if anything, it would have to be internal to `parse` or to 
dmd. I haven't checked either yet.


Re: Float rounding (in JSON)

2022-10-14 Thread Patrick Schluter via Digitalmars-d-learn
On Thursday, 13 October 2022 at 19:27:22 UTC, Steven 
Schveighoffer wrote:

On 10/13/22 3:00 PM, Sergey wrote:

[...]


It doesn't look really that far off. You can't expect floating 
point parsing to be exact, as floating point does not perfectly 
represent decimal numbers, especially when you get down to the 
least significant bits.


[...]
To me it looks like there is a conversion to `real` (80 bit 
floats) somewhere in the D code and that the other languages stay 
in `double` mode everywhere. Maybe forcing `double` by disabling 
x87 on the D side would yield the same results as the other 
languages?




Re: Float rounding (in JSON)

2022-10-13 Thread Sergey via Digitalmars-d-learn
On Thursday, 13 October 2022 at 19:27:22 UTC, Steven 
Schveighoffer wrote:


Thank you Steven, for your very detailed answer.

It doesn't look really that far off. You can't expect floating 
point parsing to be exact, as floating point does not perfectly 
represent decimal numbers, especially when you get down to the 
least significant bits.


This is sad, because an "exact" match is what I need in this toy 
example.



But I want to point out something you may have missed:


Actually I've meant those things too :)

But look also at f3, and how D is actually closer to the 
expected value than the other languages.


If you want exact representation of data, parse it as a string 
instead of a double.


Unfortunately that did not help me in this task (which is pretty 
awkward): it parses some GeoData from a JSON file, then builds a 
string representation of that data and takes a hash of that 
string. Because a hash is used, I need exactly the same string 
representation to match the answer.
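
To illustrate why the exact string matters here, a minimal Python sketch (Python being one of the reference languages in this thread). The use of MD5 and the helper name `digest` are assumptions for illustration only; the hash the benchmark actually uses is not stated in this thread.

```python
import hashlib


def digest(text: str) -> str:
    """Hex digest of the UTF-8 bytes of text (MD5 chosen only for illustration)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


a = "43.49971899987"   # digits as written in the input
b = "43.499718999847"  # digits as printed by the D program

print(digest(a))
print(digest(b))
print(digest(a) == digest(b))  # False: different text, different hash
```

A hash has no notion of "close enough", so any difference in the printed digits produces an unrelated digest.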


I'm assuming you are comparing for testing purposes? If you 
are, just realize you can never be accurate here. You just have 
to live with the difference. Typically when comparing floating 
point values, you use an epsilon to ensure that the floating 
point value is "close enough", you can't enforce exact 
representation.


-Steve


Actually it was my attempt to implement this benchmark problem: 
https://programming-language-benchmarks.vercel.app/problem/json-serde


As you can see, many languages have passed the tests, which I 
assume means they produce exactly the same representation of 
those floating-point numbers.
Maybe I am wrong and did not understand the code of the other 
implementations. But at least I tested Python and Crystal and 
found their results pretty confusing (what you wrote about f3 
being the more accurate example), and what surprised me even 
more: they have exactly the same confusing results for those 
floating-point numbers.
That's why I concluded that there may be some special, documented 
behavior or rule for this, and I just can't find how to replicate 
that "well known behavior" in D.





Re: Float rounding (in JSON)

2022-10-13 Thread Steven Schveighoffer via Digitalmars-d-learn

On 10/13/22 3:00 PM, Sergey wrote:
I'm not an IEEE 754 expert, but I just found this rounding behavior 
when comparing with other languages. I suppose it happens because in D 
floating-point numbers are parsed as double and keep the full length of 
a double while rounding. But this just doesn't match the behavior in 
other languages.


I'm not sure if this is somehow connected with the JSON implementations.


It doesn't look really that far off. You can't expect floating point 
parsing to be exact, as floating point does not perfectly represent 
decimal numbers, especially when you get down to the least significant bits.
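
As a small illustration of that point, here is a Python sketch. The value 0.1 is chosen only for illustration and does not come from the thread.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary double exactly, without rounding,
# so it reveals the value that is really held in memory.
stored = Decimal(0.1)
# Decimal(str) keeps the decimal digits exactly as they were written.
written = Decimal("0.1")

print(stored)             # the exact binary value, slightly above 0.1
print(stored == written)  # False: the double is only the nearest neighbour
```

Every decimal literal in a JSON file goes through the same kind of substitution when it is parsed into a double.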




Is it possible in D to get the same results as in the others? Explicit 
formatting is not the answer, since the number of digits after rounding 
can differ. That's why just specifying "%.XXf" will not resolve the 
issue: the last two numbers have 14 and 15 positions after the dot.


It seems like you are looking to output a certain number of digits. If 
you limit the digits, you can get the outcome you desire.


But I want to point out something you may have missed:



**Code Python**
```python
import json

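# Kept on one line: a single-quoted Python string cannot span lines.
# Named `text` to avoid shadowing the built-in `str`.
text = '{ "f1": 43.47637900065, "f2": 43.49971899987, "f3": 43.49971800087, "f4": 43.41805299986 }'

print(json.loads(text))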
```
**Result**
```
{'f1': 43.47637900065, 'f2': 43.49971899985, 'f3': 43.4997180009, 'f4': 43.41805299986}
```


Let's line these up so they are easier to read:

```
f1 in:  43.47637900065
f1 out: 43.47637900065
f2 in:  43.49971899987
f2 out: 43.49971899985
f3 in:  43.49971800087
f3 out: 43.4997180009
f4 in:  43.41805299986
f4 out: 43.41805299986
```

Note how the f2 output differs significantly from the input. This is 
an artifact of floating-point parsing and of the least significant 
digits.


Also note that the omission of the 7 in f3 doesn't seem to have to do 
with rounding, because the digits are less than the original. If that 
digit were anywhere close to significant, you would have expected the 
digit to appear.




**Code Crystal**
```crystal
require "json"

str = "{ \"f1\": 43.47637900065, \"f2\": 43.49971899987, \"f3\": 43.49971800087, \"f4\": 43.41805299986 }"


puts JSON.parse(str)
```
**Result**
```
{"f1" => 43.47637900065, "f2" => 43.49971899985, "f3" => 43.4997180009, "f4" => 43.41805299986}
```


Same here



**Code D**
```d
import std;

void main() {
     string s = `{ "f1": 43.47637900065, "f2": 43.49971899987, "f3": 43.49971800087, "f4": 43.41805299986 }`;

     JSONValue j = parseJSON(s);
     writeln(j);
}
```
**Result**
```
{"f1":43.476379000654,"f2":43.499718999847,"f3":43.499718000867,"f4":43.418052999862}
```


Let's look at D's representation:

```
f1 in:  43.47637900065
f1 out: 43.476379000654
f2 in:  43.49971899987
f2 out: 43.499718999847
f3 in:  43.49971800087
f3 out: 43.499718000867
f4 in:  43.41805299986
f4 out: 43.418052999862
```

Why does it print one more digit than the other languages? Because that 
must be the default for `writeln`. You can affect this by changing the 
number of digits printed, but probably not when printing an entire JSON 
structure.
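
One way to see how the number of printed digits depends on the formatting strategy rather than on the stored value is the Python sketch below. It contrasts a shortest round-trip representation with a fixed number of significant digits. The value 0.1 is illustrative only, and whether std.json uses a fixed precision internally is an assumption, not something confirmed in this thread.

```python
x = 0.1  # illustrative value, not taken from the thread

shortest = repr(x)         # fewest digits that still round-trip to x
fixed = format(x, ".17g")  # always 17 significant digits

print(shortest)  # 0.1
print(fixed)     # 0.10000000000000001

# Both strings parse back to exactly the same double.
print(float(shortest) == float(fixed) == x)  # True
```

The same double can therefore legitimately appear as several different strings, which matters only when the text itself is compared.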


But look also at f3, and how D is actually closer to the expected value 
than the other languages.


If you want exact representation of data, parse it as a string instead 
of a double.
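
A minimal sketch of that idea in Python, using the `parse_float` hook of the standard `json` module. The shortened two-field input is an assumption made to keep the example small.

```python
import json

text = '{ "f1": 43.47637900065, "f2": 43.49971899987 }'

# parse_float is called with the raw text of every JSON float;
# passing str keeps that text instead of converting it to a double.
data = json.loads(text, parse_float=str)

print(data["f2"])        # 43.49971899987
print(type(data["f2"]))  # <class 'str'>
```

Because the digits never pass through a double, they come back exactly as written, at the cost of having to convert explicitly when arithmetic is needed.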


I'm assuming you are comparing for testing purposes? If you are, just 
realize you can never be accurate here. You just have to live with the 
difference. Typically when comparing floating point values, you use an 
epsilon to ensure that the floating point value is "close enough", you 
can't enforce exact representation.
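
A short Python sketch of such a comparison, using the f2 input and the f2 value printed by the D program above. The tolerance `rel_tol=1e-9` is a hypothetical choice and should be picked to suit the data.

```python
import math

expected = 43.49971899987  # f2 as written in the input
parsed = 43.499718999847   # f2 as printed by the D program above

# rel_tol is a hypothetical tolerance; choose one that fits your data.
close = math.isclose(expected, parsed, rel_tol=1e-9)

print(close)               # True: equal within the tolerance
print(expected == parsed)  # False: not bit-for-bit identical
```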


-Steve


Float rounding (in JSON)

2022-10-13 Thread Sergey via Digitalmars-d-learn
I'm not an IEEE 754 expert, but I just found this rounding 
behavior when comparing with other languages. I suppose it 
happens because in D floating-point numbers are parsed as double 
and keep the full length of a double while rounding. But this 
just doesn't match the behavior in other languages.


I'm not sure if this is somehow connected with the JSON implementations.

Is it possible in D to get the same results as in the others? 
Explicit formatting is not the answer, since the number of digits 
after rounding can differ. That's why just specifying "%.XXf" 
will not resolve the issue: the last two numbers have 14 and 15 
positions after the dot.


**Code Python**
```python
import json

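# Kept on one line: a single-quoted Python string cannot span lines.
# Named `text` to avoid shadowing the built-in `str`.
text = '{ "f1": 43.47637900065, "f2": 43.49971899987, "f3": 43.49971800087, "f4": 43.41805299986 }'

print(json.loads(text))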
```
**Result**
```
{'f1': 43.47637900065, 'f2': 43.49971899985, 'f3': 43.4997180009, 'f4': 43.41805299986}
```

**Code Crystal**
```crystal
require "json"

str = "{ \"f1\": 43.47637900065, \"f2\": 43.49971899987, \"f3\": 43.49971800087, \"f4\": 43.41805299986 }"


puts JSON.parse(str)
```
**Result**
```
{"f1" => 43.47637900065, "f2" => 43.49971899985, "f3" => 43.4997180009, "f4" => 43.41805299986}
```

**Code D**
```d
import std;

void main() {
    string s = `{ "f1": 43.47637900065, "f2": 43.49971899987, "f3": 43.49971800087, "f4": 43.41805299986 }`;

    JSONValue j = parseJSON(s);
    writeln(j);
}
```
**Result**
```
{"f1":43.476379000654,"f2":43.499718999847,"f3":43.499718000867,"f4":43.418052999862}
```