Re: Precision loss casting float to numeric
Emre Hasegeliwrites: > We can also consider turning the current float to numeric casts to > explicit as they are causing data loss. I am not sure how much it > would impact backwards-compatibility. The counter argument is the > numeric to float casts being IMPLICIT. They are causing data loss for > sure, but I believe there are different reasons to keep them as > IMPLICIT. FWIW, the behavior of these casts is intended to model what the SQL standard says about the behavior of "exact numeric" vs "approximate numeric" types. (In our implementation, the integral types plus numeric are in the first category, float4 and float8 in the second.) The spec is perfectly clear that you can assign an approximate numeric to an exact numeric, or vice versa, without any explicit cast. It is also clear that arithmetic operations combining approximate and exact are allowed without a cast, and yield approximate results. These rules lead to our conclusions that exact -> approximate is an implicit cast (so that the parser will choose eg. float8 multiplication over numeric multiplication if you write numericvar * float8var) while approximate -> exact is an assignment cast (so that you can assign float8 to numeric without explicit casting). If the decisions had been driven purely by "what risks silent precision loss", no doubt we'd have done it differently ... but it's hard to do that and still meet the letter of the spec. regards, tom lane
Re: Precision loss casting float to numeric
Chapman Flackwrites: > I wonder whether even changing a formerly-implicit cast to explicit > would be too much of a behavior change for existing code that expects > the current behavior? We did exactly that during the 8.3 cycle, and got one heck of a lot of pushback about it, despite the existence of a lot of examples showing that it would be a good idea. I won't say we can't do it again, but it won't be an easy sell. regards, tom lane
Re: Precision loss casting float to numeric
On 03/09/18 12:05, Emre Hasegeli wrote: > In this case, I cannot see any other option than adding those as > separate cast functions. Should we mark this entry as "returned with > feedback"? > > We can also consider turning the current float to numeric casts to > explicit as they are causing data loss. I am not sure how much it > would impact backwards-compatibility. The counter argument is the > numeric to float casts being IMPLICIT. They are causing data loss for > sure, but I believe there are different reasons to keep them as > IMPLICIT. Thanks for the feedback. I will mark it RWF myself, as the backward- compatibility issues are kind of paralyzing, and I don't think I'll have time in this CF to give it enough further thought anyway. I wonder whether even changing a formerly-implicit cast to explicit would be too much of a behavior change for existing code that expects the current behavior? -Chap
Re: Precision loss casting float to numeric
>> I wonder if an alternative to making a cast that can't be immutable, >> because it looks at a GUC, could be to offer a choice of cast >> functions: if you need the other behavior, create a schema, do a >> CREATE CAST in it, and put it on your search path ahead of pg_catalog. > > Nope, because casts aren't schema-local, since they don't have names. > There can be only one cast between given source and target types. In this case, I cannot see any other option than adding those as separate cast functions. Should we mark this entry as "returned with feedback"? We can also consider turning the current float to numeric casts to explicit as they are causing data loss. I am not sure how much it would impact backwards-compatibility. The counter argument is the numeric to float casts being IMPLICIT. They are causing data loss for sure, but I believe there are different reasons to keep them as IMPLICIT.
Re: Precision loss casting float to numeric
On Mon, Feb 26, 2018 at 11:29 AM, Tom Lanewrote: > Chapman Flack writes: > > The 0002-*.patch is a proof-of-concept patching float4_numeric and > > float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and > > DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new > > regression test pass. (It will only work under a compiler that has > > __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used > > those internal versions to avoid mucking with build tooling to change > > the target C standard, which I assume wouldn't be welcome anyway. > > Nope. TBH, I'd think about just using "DBL_DIG + 3", given our existing > coding around extra_float_digits in places like pg_dump and postgres_fdw. > The knowledge that you need 2 or 3 extra digits is already well embedded. > > Conceivably you could do it like > > #ifndef DBL_DECIMAL_DIG > #ifdef __DBL_DECIMAL_DIG__ > #define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__ > #else > #define DBL_DECIMAL_DIG (DBL_DIG + 3) > #endif > #endif > > but I'm not exactly seeing how that buys us anything. > > The bigger question here is whether people actually want this behavioral > change. I think there's probably a bigger chance of complaints that > "casting 1.1::float8 to numeric now produces some weird, > incorrectly-rounded result" than that we make anyone happier. > > I have a vague idea that at some point in the past we discussed making > this conversion use extra_float_digits, which'd allow satisfying both > camps, at the nontrivial price that the conversion would have to be > considered stable not immutable. We didn't pull the trigger, if this > memory is real at all, presumably because of the mutability issue. > > Another idea would be to leave the cast alone and introduce a named > function that does the "exact" conversion. Possibly that makes nobody > happy, but at least both the cast and the function could be immutable. > It'd dodge backwards-compatibility objections, too. > > regards, tom lane > Working for a company that has enterprise customers this can't be overemphasized. Never require the user to do something so they keep getting the same results. It doesn't matter if it's "wrong". I would vote for a property. If you want the best effort to match the IEEE spec you need to execute 'set use_ieee_numbers' and you'll get the extra digits and rounding behavior. If not you'll get the existing behavior. Bear
Re: Precision loss casting float to numeric
Chapman Flackwrites: > I wonder if an alternative to making a cast that can't be immutable, > because it looks at a GUC, could be to offer a choice of cast > functions: if you need the other behavior, create a schema, do a > CREATE CAST in it, and put it on your search path ahead of pg_catalog. Nope, because casts aren't schema-local, since they don't have names. There can be only one cast between given source and target types. regards, tom lane
Re: Precision loss casting float to numeric
On 02/26/2018 01:29 PM, Tom Lane wrote: > I think there's probably a bigger chance of complaints that > "casting 1.1::float8 to numeric now produces some weird, > incorrectly-rounded result" than that we make anyone happier. Arguably, though, that's a moment that can be used to explain exactly what the correctly-rounded value of 1.1::float8 is and why, and then people both know something new and understand more precisely what's happening to their data, and can apply round() to it in exactly the places they want, if they want. In contrast, the current fact that 1.1::float8 looks like 1.1 when cast to numeric puts a superficial smile on people's faces, while they haven't really been asked how they feel about losing five, six, or thirty-eight bits of precision when casting one data type into another of supposedly greater precision and back. I think the typical assumption is that, sure, you may lose precision if you cast to a *less* precise type, but the other direction's assumed value-preserving. I can see the concern about changing behavior for code that may exist already. I would never have thought of making the behavior of a cast sensitive to extra_float_digits (in fact, I hadn't known about extra_float_digits; it's described in the "locale and formatting" section, which I never dreamed of consulting for a cast between internal value representations; am I weird in that?). I wonder if an alternative to making a cast that can't be immutable, because it looks at a GUC, could be to offer a choice of cast functions: if you need the other behavior, create a schema, do a CREATE CAST in it, and put it on your search path ahead of pg_catalog. Kind of ugly, but that can happen dealing with kind of ugly situations. -Chap
Re: Precision loss casting float to numeric
On Mon, Feb 26, 2018 at 1:29 PM, Tom Lanewrote: > The bigger question here is whether people actually want this behavioral > change. I think there's probably a bigger chance of complaints that > "casting 1.1::float8 to numeric now produces some weird, > incorrectly-rounded result" than that we make anyone happier. Yeah, I worry about that, too. Of course, as you know, I also have a deep and abiding skepticism about IEEE binary floats as a concept. Anomalies are unavoidable; we can choose exactly which set users experience, but we can eliminate them because the underlying storage format is poorly-adapted to the behavior people actually want. It's too bad that IEEE decimal floats weren't standardized until 2008. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Precision loss casting float to numeric
Here are two patches. The 0001-*.patch simply adds a regression test to numeric.sql that bits aren't lost casting from float[48] to numeric. Naturally, it fails at this stage. The 0002-*.patch is a proof-of-concept patching float4_numeric and float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new regression test pass. (It will only work under a compiler that has __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used those internal versions to avoid mucking with build tooling to change the target C standard, which I assume wouldn't be welcome anyway. This patch is just POC and not proposed as anything else.) It changes the output of an existing numeric test and a few numeric aggregate tests. I have adjusted the numeric test: its purpose is to check the direction of numeric rounding of ties, but the value to be rounded was computed in float8 and then cast to numeric, and because negative powers of ten aren't tidy in binary, it can turn out that the float8 computation produces a correctly-rounded-53-bit-result that is on the nearer-to-zero side of a tie, and now that that result is correctly cast, the resulting numeric doesn't round in the away-from-zero direction. I changed that test because I concluded it wasn't meant to test float8-to-numeric casting, but only the rounding of tied numerics, so I just made the inner expression be typed numeric, and the expected output is unchanged. The three aggregate tests with changed output are working from a table of float4 values, and my assumption is they are now simply producing the correct aggregate results given the full precision of the input values, but I haven't confirmed that yet, so this patch leaves those three failing for now. -Chap >From 5b3b05009830b29d1d3167bcd3321228535161dc Mon Sep 17 00:00:00 2001 From: Chapman FlackDate: Sun, 25 Feb 2018 22:52:22 -0500 Subject: [PATCH 1/2] Add regression test for cast float[48] to numeric. At this stage, as expected, it fails. --- src/test/regress/expected/numeric.out | 34 ++ src/test/regress/sql/numeric.sql | 31 +++ 2 files changed, 65 insertions(+) diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out index 17985e8..c186385 100644 --- a/src/test/regress/expected/numeric.out +++ b/src/test/regress/expected/numeric.out @@ -2063,3 +2063,37 @@ SELECT SUM((-)::numeric) FROM generate_series(1, 10); -0 (1 row) +-- +-- Tests for cast from float4, float8 +-- +-- check precision of mantissa not lost +WITH RECURSIVE + f4bit(place) AS ( +VALUES (1::float4) +UNION +SELECT place/2::float4 +FROM f4bit +WHERE 1::float4 + place/2::float4 > 1::float4 + ), + f4(epsilon) AS (select min(place) FROM f4bit), + f8bit(place) AS ( +VALUES (1::float8) +UNION +SELECT place/2::float8 +FROM f8bit +WHERE 1::float8 + place/2::float8 > 1::float8 + ), + f8(epsilon) AS (SELECT min(place) FROM f8bit), + testvalues(f4, f8) AS ( +SELECT 1::float4+f4.epsilon, 1::float8+f8.epsilon +FROM f4, f8 + ) +SELECT + f4::numeric > 1::numeric AS f4numericpreserved, + f8::numeric > 1::numeric AS f8numericpreserved +FROM testvalues; + f4numericpreserved | f8numericpreserved ++ + t | t +(1 row) + diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql index d77504e..dab9ae2 100644 --- a/src/test/regress/sql/numeric.sql +++ b/src/test/regress/sql/numeric.sql @@ -1036,3 +1036,34 @@ select scale(-13.000); -- cases that need carry propagation SELECT SUM(::numeric) FROM generate_series(1, 10); SELECT SUM((-)::numeric) FROM generate_series(1, 10); + +-- +-- Tests for cast from float4, float8 +-- + +-- check precision of mantissa not lost +WITH RECURSIVE + f4bit(place) AS ( +VALUES (1::float4) +UNION +SELECT place/2::float4 +FROM f4bit +WHERE 1::float4 + place/2::float4 > 1::float4 + ), + f4(epsilon) AS (select min(place) FROM f4bit), + f8bit(place) AS ( +VALUES (1::float8) +UNION +SELECT place/2::float8 +FROM f8bit +WHERE 1::float8 + place/2::float8 > 1::float8 + ), + f8(epsilon) AS (SELECT min(place) FROM f8bit), + testvalues(f4, f8) AS ( +SELECT 1::float4+f4.epsilon, 1::float8+f8.epsilon +FROM f4, f8 + ) +SELECT + f4::numeric > 1::numeric AS f4numericpreserved, + f8::numeric > 1::numeric AS f8numericpreserved +FROM testvalues; -- 2.7.3 >From 4d721ac45967f64415bafb82039a652f0a451e2c Mon Sep 17 00:00:00 2001 From: Chapman Flack Date: Mon, 26 Feb 2018 01:26:01 -0500 Subject: [PATCH 2/2] Proof of concept fix of float[48]_numeric. This patch simply uses the new FLT_DECIMAL_DIG/DBL_DECIMAL_DIG constants in place of the FLT_DIG/DBL_DIG, just to see what happens. It uses the