Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-08-10 Thread Bruce Momjian

Have we made any progress on this?

---

Tom Lane wrote:
 Michael Glaesemann [EMAIL PROTECTED] writes:
  ... I think this just confirms that there is some kind of rounding (or  
  lack of) in interval_div. Kind of frustrating that it's not visible  
  in the result.
 
 I think the fundamental problem is that the float8 results of division
 are inaccurate, and yet we're assuming that we can (for instance) coerce
 them to integer and get exactly the right answer.  For instance, in the
 '41 months'/10 example, I get month_remainder_days being computed as
 
 (gdb) p month_remainder
 $19 = 0.099645
 (gdb) s
 2575result-day += (int32) month_remainder_days;
 (gdb) p month_remainder_days
 $20 = 2.9893
 
 The only way we can really fix this is to be willing to round off
 the numbers, and I think the only principled way to do that is to
 settle on a specific target accuracy, probably 1 microsecond.
 Then the thing to do would be to scale up all the intermediate
 float results to microseconds and apply rint().  Something like
 (untested)
 
   month_remainder = rint(span-month * USECS_PER_MONTH / factor);
   day_remainder = rint(span-day * USECS_PER_DAY / factor);
   result-month = (int32) (month_remainder / USECS_PER_MONTH);
   result-day = (int32) (day_remainder / USECS_PER_DAY);
   month_remainder -= result-month * USECS_PER_MONTH;
   day_remainder -= result-day * USECS_PER_DAY;
 
   /*
* Handle any fractional parts the same way as in interval_mul.
*/
 
   /* fractional months full days into days */
   month_remainder_days = month_remainder * DAYS_PER_MONTH;
   extra_days = (int32) (month_remainder_days / USECS_PER_DAY);
   result-day += extra_days;
   /* fractional months partial days into time */
   day_remainder += month_remainder_days - extra_days * USECS_PER_DAY;
 
 #ifdef HAVE_INT64_TIMESTAMP
   result-time = rint(span-time / factor + day_remainder);
 #else
   result-time = rint(span-time * 1.0e6 / factor + day_remainder) / 
 1.0e6;
 #endif
 
 This might need a few more rint() calls --- I'm assuming that float ops
 with exact integral inputs will be OK, which is an assumption used
 pretty widely in the datetime code, but ...
 
 Per the comment, if we do this here we probably want to make
 interval_mul work similarly.
 
   regards, tom lane

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-08-10 Thread Michael Glaesemann


On Aug 11, 2006, at 13:48 , Bruce Momjian wrote:



Have we made any progress on this?


I made a bit of progress but am still having issues when --enable- 
integer-datetimes is not enabled. I need to spend some time with gdb  
and figure out what's going on. I probably won't be able to get time  
to look at it until Aug 18.


Michael Glaesemann
grzm seespotcode net




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-07-01 Thread Tom Lane
Michael Glaesemann [EMAIL PROTECTED] writes:
 ... I think this just confirms that there is some kind of rounding (or  
 lack of) in interval_div. Kind of frustrating that it's not visible  
 in the result.

I think the fundamental problem is that the float8 results of division
are inaccurate, and yet we're assuming that we can (for instance) coerce
them to integer and get exactly the right answer.  For instance, in the
'41 months'/10 example, I get month_remainder_days being computed as

(gdb) p month_remainder
$19 = 0.099645
(gdb) s
2575result-day += (int32) month_remainder_days;
(gdb) p month_remainder_days
$20 = 2.9893

The only way we can really fix this is to be willing to round off
the numbers, and I think the only principled way to do that is to
settle on a specific target accuracy, probably 1 microsecond.
Then the thing to do would be to scale up all the intermediate
float results to microseconds and apply rint().  Something like
(untested)

month_remainder = rint(span-month * USECS_PER_MONTH / factor);
day_remainder = rint(span-day * USECS_PER_DAY / factor);
result-month = (int32) (month_remainder / USECS_PER_MONTH);
result-day = (int32) (day_remainder / USECS_PER_DAY);
month_remainder -= result-month * USECS_PER_MONTH;
day_remainder -= result-day * USECS_PER_DAY;

/*
 * Handle any fractional parts the same way as in interval_mul.
 */

/* fractional months full days into days */
month_remainder_days = month_remainder * DAYS_PER_MONTH;
extra_days = (int32) (month_remainder_days / USECS_PER_DAY);
result-day += extra_days;
/* fractional months partial days into time */
day_remainder += month_remainder_days - extra_days * USECS_PER_DAY;

#ifdef HAVE_INT64_TIMESTAMP
result-time = rint(span-time / factor + day_remainder);
#else
result-time = rint(span-time * 1.0e6 / factor + day_remainder) / 
1.0e6;
#endif

This might need a few more rint() calls --- I'm assuming that float ops
with exact integral inputs will be OK, which is an assumption used
pretty widely in the datetime code, but ...

Per the comment, if we do this here we probably want to make
interval_mul work similarly.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-06-30 Thread Michael Glaesemann

On Jun 23, 2006, at 9:47 , Michael Glaesemann wrote:


# select '41 mon'::interval / 10;
?column?

4 mons 2 days 24:00:00


My understanding is that as month_remainder is a float (as is  
month_remainder_days), month_remainder_days may be equal to 24  
hours after rounding. As we're converting from months to days, and  
from days to time, rather than from months to time directly, we're  
assuming that we should only have time less than 24 hours remaining  
in the month_remainder_days when it's added to day_remainder.


This behavior is the same as applying justify_hours before adding the  
days and time components to the result. With this in mind, I rewrote  
interval_div to call interval_justify_hours. Good news is that -- 
enable-integer-datetimes works as expected. Bad news is that without  
--enable-integer-datetimes, still get the current behavior.


I also came across something I think is odd:

# select version();

version
 
--
PostgreSQL 8.1.4 on powerpc-apple-darwin8.7.0, compiled by GCC  
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc.  
build 5341)

(1 row)

select justify_hours(a/10) as divided_and_justified
, justify_hours(b) as justified
, a/10 = b as are_equal
from (select '41 mon'::interval,'4 mons 2 days 24:00:00'::interval)  
as s(a,b);


without --enable-integer-datetimes:

divided_and_justified  |   justified   | are_equal
+---+---
4 mons 2 days 24:00:00 | 4 mons 3 days | t
(1 row)

with --enable-integer-datetimes:

divided_and_justified |   justified   | are_equal
---+---+---
4 mons 3 days | 4 mons 3 days | t

I think this just confirms that there is some kind of rounding (or  
lack of) in interval_div. Kind of frustrating that it's not visible  
in the result.


Anyway, there's another data point.

Michael Glaesemann
grzm seespotcode net





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-06-22 Thread Michael Glaesemann


Tom Lane wrote:


I've also confirmed that the problem is in interval_div; you can
reproduce the failure with

select '41 years 1 mon 11 days'::interval / 10;

which should give '4 years 1 mon 9 days 26:24:00', but when
timestamp.o is compiled with -mcpu=pentium4 -march=pentium4,
you get '4 years 1 mon 10 days 02:24:00'.  --enable-integer-datetimes
is not relevant because the interesting part is all double/integer
arithmetic.

Looking at this, though, I wonder if the pentium4 answer isn't more
correct.  If I'm doing the math by hand correctly, what we end up
with is having to cascade 3/10ths of a month down into the days field,
and since the conversion factor is supposed to be 30 days/month, that
should be exactly 9 days.  Plus the one full day from the 11/10 days
gives 10 days.  I think what is happening on all the non-Pentium
platforms is that (3.0/10.0)*30.0 is producing something just a shade
less than 9.0, whereas the Pentium gives 9.0 or a shade over, possibly
due to rearrangement of the calculation.  I think we can still file  
this

as a compiler bug, because I'm pretty sure the C spec does not allow
rearrangement of floating-point calculations ... but we might want to
think about tweaking the code's roundoff behavior just a bit.


I took a naive look at this, but have been unable to come up with a  
satisfactory solution. Here's an even simpler example that exhibits  
the behavior:


# select '41 mon'::interval / 10;
?column?

4 mons 2 days 24:00:00

And one using a non-integer divisor:

# select '2 mon'::interval / 1.5;
   ?column?
---
1 mon 9 days 24:00:00

Here's the relevant code in interval_div (timestamp.c c2573).  
month_remainder holds the fractional part of the months field of the  
original interval (41) divided by the divisor (10) as a double.


/* fractional months full days into days */
month_remainder_days = month_remainder * DAYS_PER_MONTH;
result-day += (int32) month_remainder_days;
/* fractional months partial days into time */
day_remainder += month_remainder_days - (int32) month_remainder_days;

#ifdef HAVE_INT64_TIMESTAMP
	result-time = rint(span-time / factor + day_remainder *  
USECS_PER_DAY);

#else
result-time = span-time / factor + day_remainder * SECS_PER_DAY;
#endif

My understanding is that as month_remainder is a float (as is  
month_remainder_days), month_remainder_days may be equal to 24 hours  
after rounding. As we're converting from months to days, and from  
days to time, rather than from months to time directly, we're  
assuming that we should only have time less than 24 hours remaining  
in the month_remainder_days when it's added to day_remainder. If  
month_remainder_days is equal to 24 hours, we should increment result- 
day rather than carrying that down into result-time.


With that in mind, I produced this hack:

#ifdef HAVE_INT64_TIMESTAMP
month_remainder_seconds = month_remainder_day_fraction *  
USECS_PER_DAY;


if ( USECS_PER_DAY == rint(month_remainder_seconds) )
result-day++;
else if ( USECS_PER_DAY == - rint(month_remainder_seconds) )
result-day--;
else day_remainder += month_remainder_day_fraction;

	result-time = rint(span-time / factor + day_remainder *  
USECS_PER_DAY);

#else
month_remainder_seconds = month_remainder_day_fraction *  
SECS_PER_DAY;


if ( SECS_PER_DAY == (int32) month_remainder_seconds )
result-day++;
else if ( SECS_PER_DAY == - (int32) month_remainder_seconds)
result-day--;
else day_remainder += month_remainder_day_fraction;

result-time = span-time / factor + day_remainder * SECS_PER_DAY;
#endif


It works okay with --enable-integer-datetimes

postgres=# select '41 mon'::interval/10;
   ?column?
---
4 mons 3 days
(1 row)

postgres=# select '2 mon'::interval/1.5;
   ?column?
---
1 mon 10 days
(1 row)

It also changes the result of the aggregate test for intervals, but I  
think that's to be expected.


*** ./expected/interval.out Tue Mar  7 07:49:17 2006
--- ./results/interval.out  Thu Jun 22 22:52:41 2006
***
*** 218,224 
  select avg(f1) from interval_tbl;
 avg
  -
!  @ 4 years 1 mon 9 days 28 hours 18 mins 23 secs
  (1 row)

  -- test long interval input
--- 218,224 
  select avg(f1) from interval_tbl;
 avg
  -
!  @ 4 years 1 mon 10 days 4 hours 18 mins 23 secs
  (1 row)

  -- test long interval input


But doesn't work without --enable-integer-datetimes. It still gives  
the same answer as before. I don't have a firm grasp of floating  
point math so I'm fully aware this might be entirely the wrong way to  
go about this. It definitely feels kludgy (especially my sign- 
handling), but I submit this in the hope it moves the discussion  
forward a 

Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-06-22 Thread Michael Glaesemann


On Jun 23, 2006, at 9:47 , Michael Glaesemann wrote:

It also changes the result of the aggregate test for intervals, but  
I think that's to be expected.


My goodness. Of course it changes the aggregate test results. That  
was what brought this up in the first place. (*kicks self for not  
reading the subject line when responding*)


Michael Glaesemann
grzm seespotcode net




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Interval aggregate regression failure (expected seems

2006-06-14 Thread Bruce Momjian

I assume no more progress has been made on this?

---

Tom Lane wrote:
 I wrote:
  Michael Paesold [EMAIL PROTECTED] writes:
  I am definatly not going to use -march=pentium4 in any production 
  system. Should I open a bug report with RedHat (gcc vendor)?
 
  Yeah, but they'll probably want a smaller test case than Postgres fails
  its regression tests :-(
 
 I have just confirmed that the problem still exists in FC4's current
 compiler (gcc 4.0.1, gcc-4.0.1-4.fc4), which probably will boost up the
 priority of the complaint quite a long way in Red Hat's eyes.
 
 I've also confirmed that the problem is in interval_div; you can
 reproduce the failure with
 
   select '41 years 1 mon 11 days'::interval / 10;
 
 which should give '4 years 1 mon 9 days 26:24:00', but when
 timestamp.o is compiled with -mcpu=pentium4 -march=pentium4,
 you get '4 years 1 mon 10 days 02:24:00'.  --enable-integer-datetimes
 is not relevant because the interesting part is all double/integer
 arithmetic.
 
 Looking at this, though, I wonder if the pentium4 answer isn't more
 correct.  If I'm doing the math by hand correctly, what we end up
 with is having to cascade 3/10ths of a month down into the days field,
 and since the conversion factor is supposed to be 30 days/month, that
 should be exactly 9 days.  Plus the one full day from the 11/10 days
 gives 10 days.  I think what is happening on all the non-Pentium
 platforms is that (3.0/10.0)*30.0 is producing something just a shade
 less than 9.0, whereas the Pentium gives 9.0 or a shade over, possibly
 due to rearrangement of the calculation.  I think we can still file this
 as a compiler bug, because I'm pretty sure the C spec does not allow
 rearrangement of floating-point calculations ... but we might want to
 think about tweaking the code's roundoff behavior just a bit.
 
 An example that's a little easier to look at is
 
   select '41 years 1 mon'::interval / 10;
 
 I get '4 years 1 mon 9 days' with the pentium4 optimization, and
 '4 years 1 mon 8 days 24:00:00' without, and the former seems more
 correct...
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-14 Thread Bruce Momjian
Tom Lane wrote:
 Michael Paesold [EMAIL PROTECTED] writes:
  Would you mind reporting this to RedHat Bugzilla? I believe a bug report 
  from you would have more weight then mine, because you actually 
  understand what's going on here. :-)
 
 Actually, given the thought that this may be an artifact of keeping an
 intermediate value in a wider-than-normal register rather than genuinely
 rearranging the computation, I'm not certain it is a compiler bug.
 We'd have to study it a lot more closely before filing it as one, anyway.
 
 If you accept the idea that the pentium4 answer is the right one,
 then what we really need to do is focus on a better rounding rule than
 strict truncation.  I was toying with the notion of adding the
 equivalent of half a microsecond to the fractional-day value before
 truncating it to integer.  But I'm not certain that that wouldn't have
 some bad effects in other cases.

Looking at the code, do we need additional rint() calls in there, or
rint(x + 0.5)?  Frankly, I am confused why interval_div() has caused
such problems for us?  Are we going at this the right way?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-08 Thread Michael Paesold

Gregory Maxwell wrote:

On 07 Nov 2005 14:22:37 -0500, Greg Stark [EMAIL PROTECTED] wrote:


IIRC, floating point registers are actually longer than a double so if the
entire calculation is done in registers and then the result rounded off to
store in memory it may get the right answer. Whereas if it loses the extra
bits on the intermediate values (the infinite repeating fractions) that might
be where you get the imprecise results.



Hm. I thought -march=pentium4 -mcpu=pentium4 implies -mfpmath=sse. 
SSE is a much better choice on P4 for performance reasons, and never

has excess precision. I'm guessing from the above that I'm incorrect,
in which case we should always be compiled with -mfpmath=sse -msse2
when we are complied -march=pentium4, this should remove problems
caused by excess precision. The same behavior can be had on non sse
platforms with -ffloat-store.


Just for the record (and those interested): using 'CFLAGS=-O2 
-mcpu=pentium4 -march=pentium4 -mfpmath=sse -msse2' actually passes the 
regression tests.


Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-08 Thread Michael Paesold

Tom Lane wrote:

I wrote:


Michael Paesold [EMAIL PROTECTED] writes:

I am definatly not going to use -march=pentium4 in any production 
system. Should I open a bug report with RedHat (gcc vendor)?




Yeah, but they'll probably want a smaller test case than Postgres fails
its regression tests :-(



I have just confirmed that the problem still exists in FC4's current
compiler (gcc 4.0.1, gcc-4.0.1-4.fc4), which probably will boost up the
priority of the complaint quite a long way in Red Hat's eyes.

I've also confirmed that the problem is in interval_div; you can
reproduce the failure with

select '41 years 1 mon 11 days'::interval / 10;

[snip]

Would you mind reporting this to RedHat Bugzilla? I believe a bug report 
from you would have more weight then mine, because you actually 
understand what's going on here. :-)


Otherwise I am going to do do my best...

Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-08 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes:
 Would you mind reporting this to RedHat Bugzilla? I believe a bug report 
 from you would have more weight then mine, because you actually 
 understand what's going on here. :-)

Actually, given the thought that this may be an artifact of keeping an
intermediate value in a wider-than-normal register rather than genuinely
rearranging the computation, I'm not certain it is a compiler bug.
We'd have to study it a lot more closely before filing it as one, anyway.

If you accept the idea that the pentium4 answer is the right one,
then what we really need to do is focus on a better rounding rule than
strict truncation.  I was toying with the notion of adding the
equivalent of half a microsecond to the fractional-day value before
truncating it to integer.  But I'm not certain that that wouldn't have
some bad effects in other cases.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Interval aggregate regression failure (expected seems wrong)

2005-11-07 Thread Michael Glaesemann


On Nov 7, 2005, at 17:24 , Michael Paesold wrote:

Using both PostgreSQL 8.1.0 and CVS current of Nov 7, 9:00 am CET I  
get a regression failure in the interval tests. I am no export for  
the interval type, but the expected 9 days 28 hours seem wrong,  
don't they? The actual value seems to be the same.


Is it possible that this is broken on the platform where the  
expected results were generated?


What platform are you testing on? With or without integer-datetimes?

I just ran make check on for PostgreSQL 8.1.0 on Mac OS X 10.4.3

test=# select version();

version
 
--
PostgreSQL 8.1.0 on powerpc-apple-darwin8.3.0, compiled by GCC  
powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 (Apple Computer, Inc.  
build 5026)


I didn't have any regression failures. I'd also expect we'd see a lot  
more failures on the build farm if it were the case that it was  
broken just on the platform that the expected results were generated  
on. From a quick look at the current build farm failures on HEAD and  
REL8_1_STABLE, it doesn't look like any of the failures are failing  
here.


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Michael Paesold

Michael Glaesemann wrote:


On Nov 7, 2005, at 17:24 , Michael Paesold wrote:

Using both PostgreSQL 8.1.0 and CVS current of Nov 7, 9:00 am CET I  
get a regression failure in the interval tests. I am no export for  
the interval type, but the expected 9 days 28 hours seem wrong,  
don't they? The actual value seems to be the same.


Is it possible that this is broken on the platform where the  expected 
results were generated?


What platform are you testing on? With or without integer-datetimes?


Ok, forgot. This is *without* integer-datetimes, RHEL 3 (Linux 2.4.21, 
glibc 2.3.2, gcc 3.2.3 20030502) on i686 (Xeon without x86-64).



I just ran make check on for PostgreSQL 8.1.0 on Mac OS X 10.4.3

[snip]
I didn't have any regression failures. I'd also expect we'd see a lot  
more failures on the build farm if it were the case that it was  broken 
just on the platform that the expected results were generated  on. From 
a quick look at the current build farm failures on HEAD and  
REL8_1_STABLE, it doesn't look like any of the failures are failing  here.


I just started to wonder about buildfarm, too, but found that most build 
farm members have --enable-integer-datetimes. Could that be an 
explanation? Is it possible that the code is wrong with 
--enable-integer-datetimes?


So what do you have in results/interval.out?
@ 4 years 1 mon 9 days 28 hours 18 mins 23 secs seems wrong to me, no?

Tom wrote for that commit:
revision 1.14
date: 2005/10/25 17:13:07;  author: tgl;  state: Exp;  lines: +1 -1
Remove justify_hours call from interval_mul and interval_div, and make
some small stylistic improvements in these functions.  Also fix several
places where TMODULO() was being used with wrong-sized quotient argument,
creating a risk of overflow --- interval2tm was actually capable of going
into an infinite loop because of this.

Perhaps it is an intended behavior? If so, it still fails without 
integer-datetimes.


Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Michael Paesold



Michael Paesold wrote:

On Nov 7, 2005, at 17:24 , Michael Paesold wrote:

Using both PostgreSQL 8.1.0 and CVS current of Nov 7, 9:00 am CET I  
get a regression failure in the interval tests. I am no export for  
the interval type, but the expected 9 days 28 hours seem wrong,  
don't they? The actual value seems to be the same.


Is it possible that this is broken on the platform where the  
expected results were generated?


Perhaps it is an intended behavior? If so, it still fails without 
integer-datetimes.


Well, no, it also fails with integer-datetimes for me in the same way.
pg_config output below. And yes, I did cvs up; make distclean; 
./configure... ; make ; make install ; make check.


Could this be DST-related? I thought plain interval was not affected by 
DST changes.


BINDIR = /usr/local/postgresql-8cvs/bin
DOCDIR = /usr/local/postgresql-8cvs/doc
INCLUDEDIR = /usr/local/postgresql-8cvs/include
PKGINCLUDEDIR = /usr/local/postgresql-8cvs/include
INCLUDEDIR-SERVER = /usr/local/postgresql-8cvs/include/server
LIBDIR = /usr/local/postgresql-8cvs/lib
PKGLIBDIR = /usr/local/postgresql-8cvs/lib
LOCALEDIR =
MANDIR = /usr/local/postgresql-8cvs/man
SHAREDIR = /usr/local/postgresql-8cvs/share
SYSCONFDIR = /usr/local/postgresql-8cvs/etc
PGXS = /usr/local/postgresql-8cvs/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/usr/local/postgresql-8cvs' '--with-pgport=54321' 
'--with-perl' 'CFLAGS=-O2 -mcpu=pentium4 -march=pentium4' 
'--enable-casserts' '--enable-debug' '--enable-integer-datetimes'

CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -mcpu=pentium4 -march=pentium4 -Wall -Wmissing-prototypes 
-Wpointer-arith -Winline -Wdeclaration-after-statement 
-fno-strict-aliasing -g

CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,/usr/local/postgresql-8cvs/lib
LDFLAGS_SL =
LIBS = -lpgport -lz -lreadline -lncurses -lcrypt -lresolv -lnsl -ldl -lm 
-lbsd

VERSION = PostgreSQL 8.2devel

Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Interval aggregate regression failure (expected seems wrong)

2005-11-07 Thread Michael Glaesemann

On Nov 7, 2005, at 18:28 , Michael Paesold wrote:

Ok, forgot. This is *without* integer-datetimes, RHEL 3 (Linux  
2.4.21, glibc 2.3.2, gcc 3.2.3 20030502) on i686 (Xeon without  
x86-64).



I just ran make check on for PostgreSQL 8.1.0 on Mac OS X 10.4.3


Heh. I forgot, too ;) My test was also without integer-datetimes.


[snip]
I didn't have any regression failures. I'd also expect we'd see a  
lot  more failures on the build farm if it were the case that it  
was  broken just on the platform that the expected results were  
generated  on. From a quick look at the current build farm  
failures on HEAD and  REL8_1_STABLE, it doesn't look like any of  
the failures are failing  here.


I just started to wonder about buildfarm, too, but found that most  
build farm members have --enable-integer-datetimes. Could that be  
an explanation? Is it possible that the code is wrong with --enable- 
integer-datetimes?


So what do you have in results/interval.out?
@ 4 years 1 mon 9 days 28 hours 18 mins 23 secs seems wrong to me, no?



select avg(f1) from interval_tbl;
   avg
-
@ 4 years 1 mon 9 days 28 hours 18 mins 23 secs
(1 row)

The point of the change to the interval datatype in 8.1 is to keep  
track of months, days, and seconds (which in turn are represented as  
hours, minutes and seconds). Previous releases tracked only months  
and seconds. This has advantages for using intervals with dates and  
timestamps that involve daylight saving time changes. Admittedly, it  
looks odd at first, but it falls out of the change in behavior of the  
interval datatype. There are two new functions, justify_days and  
justify_hours, that you can use to put intervals into more  
traditional forms.


http://developer.postgresql.org/docs/postgres/functions-datetime.html

Doesn't explain why you're getting a regression failure though.

Michael Glaesemann
grzm myrealbox com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Michael Paesold

Michael Glaesemann wrote:

So what do you have in results/interval.out?
@ 4 years 1 mon 9 days 28 hours 18 mins 23 secs seems wrong to me, no?



select avg(f1) from interval_tbl;
   avg
-
@ 4 years 1 mon 9 days 28 hours 18 mins 23 secs
(1 row)

The point of the change to the interval datatype in 8.1 is to keep  
track of months, days, and seconds (which in turn are represented as  
hours, minutes and seconds). Previous releases tracked only months  and 
seconds. This has advantages for using intervals with dates and  
timestamps that involve daylight saving time changes. Admittedly, it  
looks odd at first, but it falls out of the change in behavior of the  
interval datatype. There are two new functions, justify_days and  
justify_hours, that you can use to put intervals into more  traditional 
forms.


http://developer.postgresql.org/docs/postgres/functions-datetime.html


Thank you very much for the insight.


Doesn't explain why you're getting a regression failure though.


Well, I have something now. It seems to be a compiler/optimization issue.

I wrote:
 CFLAGS = -O2 -mcpu=pentium4 -march=pentium4 -Wall -Wmissing-prototypes
 -Wpointer-arith -Winline -Wdeclaration-after-statement
 -fno-strict-aliasing -g

I had set CFLAGS to -O2 -mcpu=pentium4 -march=pentium4. I have been 
using these settings for testing PostgreSQL tip for some time now and 
never had any problems.


Removing the cpu and architecture optimization part changes the behavior 
of the interval aggrate, so the results/interval.out now also looks like 
the expected output.

select avg(f1) from interval_tbl;
   avg
-
 @ 4 years 1 mon 9 days 28 hours 18 mins 23 secs
(1 row)

Switching -mcpu=pentium4 -march=pentium4 back on, results in wrong 
output. This is 100% reproducable. Can somebody with more knowledge 
explain why the compiler should stumble over just this? Pure luck?


I have tested these combination of CFLAGS:
-O2 OK
-O2 -mcpu=i686 -march=i686  OK (good, RPMS are built with these)
-O2 -mcpu=pentium4 -march=i686  OK
-O2 -mcpu=pentium4 -march=pentium4  fails

I am definatly not going to use -march=pentium4 in any production 
system. Should I open a bug report with RedHat (gcc vendor)?


Best Regards,
Michael Paesold

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes:
 I have tested these combination of CFLAGS:
 -O2 OK
 -O2 -mcpu=i686 -march=i686  OK (good, RPMS are built with these)
 -O2 -mcpu=pentium4 -march=i686  OK
 -O2 -mcpu=pentium4 -march=pentium4  fails

 I am definatly not going to use -march=pentium4 in any production 
 system. Should I open a bug report with RedHat (gcc vendor)?

Yeah, but they'll probably want a smaller test case than Postgres fails
its regression tests :-(

My guess is that the problem is misoptimization of the interval_div()
function (look in utils/adt/timestamp.c), and that you could probably
construct a nice small test case for them out of that function.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Tom Lane
I wrote:
 Michael Paesold [EMAIL PROTECTED] writes:
 I am definatly not going to use -march=pentium4 in any production 
 system. Should I open a bug report with RedHat (gcc vendor)?

 Yeah, but they'll probably want a smaller test case than Postgres fails
 its regression tests :-(

I have just confirmed that the problem still exists in FC4's current
compiler (gcc 4.0.1, gcc-4.0.1-4.fc4), which probably will boost up the
priority of the complaint quite a long way in Red Hat's eyes.

I've also confirmed that the problem is in interval_div; you can
reproduce the failure with

select '41 years 1 mon 11 days'::interval / 10;

which should give '4 years 1 mon 9 days 26:24:00', but when
timestamp.o is compiled with -mcpu=pentium4 -march=pentium4,
you get '4 years 1 mon 10 days 02:24:00'.  --enable-integer-datetimes
is not relevant because the interesting part is all double/integer
arithmetic.

Looking at this, though, I wonder if the pentium4 answer isn't more
correct.  If I'm doing the math by hand correctly, what we end up
with is having to cascade 3/10ths of a month down into the days field,
and since the conversion factor is supposed to be 30 days/month, that
should be exactly 9 days.  Plus the one full day from the 11/10 days
gives 10 days.  I think what is happening on all the non-Pentium
platforms is that (3.0/10.0)*30.0 is producing something just a shade
less than 9.0, whereas the Pentium gives 9.0 or a shade over, possibly
due to rearrangement of the calculation.  I think we can still file this
as a compiler bug, because I'm pretty sure the C spec does not allow
rearrangement of floating-point calculations ... but we might want to
think about tweaking the code's roundoff behavior just a bit.

An example that's a little easier to look at is

select '41 years 1 mon'::interval / 10;

I get '4 years 1 mon 9 days' with the pentium4 optimization, and
'4 years 1 mon 8 days 24:00:00' without, and the former seems more
correct...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 I think we can still file this as a compiler bug, because I'm pretty sure
 the C spec does not allow rearrangement of floating-point calculations ...

It may have more to do with whether the floating point value can stay in a
floating point register long enough to complete the calculation. 

IIRC, floating point registers are actually longer than a double so if the
entire calculation is done in registers and then the result rounded off to
store in memory it may get the right answer. Whereas if it loses the extra
bits on the intermediate values (the infinite repeating fractions) that might
be where you get the imprecise results.

It makes some sense that -mcpu and -march give the compiler enough information
to keep the intermediate results in registers more effectively.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Greg Stark

c.f.:


`-ffloat-store'
 Do not store floating point variables in registers, and inhibit
 other options that might change whether a floating point value is
 taken from a register or memory.

 This option prevents undesirable excess precision on machines such
 as the 68000 where the floating registers (of the 68881) keep more
 precision than a `double' is supposed to have.  Similarly for the
 x86 architecture.  For most programs, the excess precision does
 only good, but a few programs rely on the precise definition of
 IEEE floating point.  Use `-ffloat-store' for such programs, after
 modifying them to store all pertinent intermediate computations
 into variables.

-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Gregory Maxwell
On 07 Nov 2005 14:22:37 -0500, Greg Stark [EMAIL PROTECTED] wrote:
 IIRC, floating point registers are actually longer than a double so if the
 entire calculation is done in registers and then the result rounded off to
 store in memory it may get the right answer. Whereas if it loses the extra
 bits on the intermediate values (the infinite repeating fractions) that might
 be where you get the imprecise results.

Hm. I thought -march=pentium4 -mcpu=pentium4 implies -mfpmath=sse. 
SSE is a much better choice on P4 for performance reasons, and never
has excess precision. I'm guessing from the above that I'm incorrect,
in which case we should always be compiled with -mfpmath=sse -msse2
when we are complied -march=pentium4, this should remove problems
caused by excess precision. The same behavior can be had on non sse
platforms with -ffloat-store.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match