On Sun, Dec 15, 2013 at 2:27 PM, Josh Berkus <j...@agliodbs.com> wrote:
> On 12/14/2013 05:00 PM, Tom Lane wrote: > > This consideration also makes me question whether we should apply the > > method for NUMERIC. Although in principle numeric addition/subtraction > > is exact, such a sequence could leave us with a different dscale than > > is returned by the existing code. I'm not sure if changing the number of > > trailing zeroes is a big enough behavior change to draw complaints. > > If we're going to disqualify NUMERIC too, we might as well bounce the > feature. Without a fast FLOAT or NUMERIC, you've lost most of the > target audience. > > I think even the FLOAT case deserves some consideration. What's the > worst-case drift? In general, folks who do aggregate operations on > FLOATs aren't expecting an exact answer, or one which is consistent > beyond a certain number of significant digits. > > And Dave is right: how many bug reports would we get about "NUMERIC is > fast, but FLOAT is slow"? > > >From what I can tell adding an inverse transition function to support AVG for numeric does not affect the number of trailing zeros in the results, so I've attached a patch which now has inverse transition functions to support numeric types in AVG and all of the STDDEV* aggregates. The function numeric_avg_accum_inv is the inverse transition function for AVG. In the attached patch I've set this to be the inverse transition function for SUM too. I know it was mentioned that having extra trailing zeros here is a change of results and could be unwanted, but I've left it in for now and I've included a failing regression test to demonstrate exactly what has changed in the hope that it may seed a discussion on what the best solution is. Here's a quick example of what I'm talking about: With this query I'll include a call to MAX to force the executor to not use inverse transitions. SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING), MAX(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM (VALUES(1,1.00000000000001),(2,2),(3,3)) v(i,n); sum | max ------------------+----- 6.00000000000001 | 3 5 | 3 3 | 3 (3 rows) Notice lack of trailing zeros for the sum results in rows 2 and 3. Here an inverse transition will be performed... Notice the extra trailing zeros on rows 2 and 3. SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM (VALUES(1,1.00000000000001),(2,2),(3,3)) v(i,n); sum ------------------ 6.00000000000001 5.00000000000000 3.00000000000000 (3 rows) My ideas so far on what to do about this: 1. Don't allow inverse transition for SUM with numeric, but since numeric_avg_accum_inv is already written for AVG numeric support we could include a note in the docs to tell users how to create a SUM aggregate for numeric that supports inverse transitions. 2. Don't worry about the trailing zeros... After all SELECT AVG(n) FROM (VALUES(2::numeric)) v(n); adds a whole bunch of trailing zeros. Only the above 2 queries shows that this behaviour can be a bit weird. 3. Invent some way to remove trailing zeros, perhaps in numeric_out. Is there a reason they are there? Note that this would affect numeric globally. What I don't currently know is: Are there any rules about trailing zeros on numeric? I see that AVG with a single numeric produces many trailing zeros after the decimal point. Are these meant to be there or is it just a side effect of dividing on numerics? Please note that I'm not trying to push for any of the above points. I just want to get the information I currently have out there as perhaps someone can think of something better or show a good reason for or against any of the 3 points. Regards David Rowley -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com >
inverse_transition_functions_v1.7.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers