Re: [HACKERS] planner fails on HEAD

2011-12-05 Thread Merlin Moncure
On Sun, Dec 4, 2011 at 4:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

 Is this x86?  I can't reproduce it on x86_64.

reading all the comments in the gcc bug report, this is because x86
targets the x87 fpu by default which is where the bug is -- it's a
hardware problem.  x86_64 targets sse which has stricter standards for
rounding.  most x86 processors support sse -- is there a reason why we
don't target sse?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] planner fails on HEAD

2011-12-05 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Sun, Dec 4, 2011 at 4:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert
 
 Is this x86?  I can't reproduce it on x86_64.

 reading all the comments in the gcc bug report, this is because x86
 targets the x87 fpu by default which is where the bug is -- it's a
 hardware problem.  x86_64 targets sse which has stricter standards for
 rounding.  most x86 processors support sse -- is there a reason why we
 don't target sse?

Well, older machines won't have sse, and in any case I think x86 is not
the only architecture with the issue, just the most popular one.
Floating-point registers that are wider than standard double are hardly
an unusual idea.
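
A portable way to check which case a given build falls into is C99's
FLT_EVAL_METHOD macro from <float.h>, which reports whether double
expressions may be evaluated in a wider format. The sketch below is only
an illustration; the function name is made up and is not part of
PostgreSQL. On 32-bit x86, GCC's -msse2 -mfpmath=sse options are the usual
way to request SSE arithmetic, at the cost of excluding pre-SSE2 hardware.

```c
#include <float.h>
#include <stdbool.h>

/*
 * Hypothetical helper: report whether this build may evaluate double
 * expressions with more range or precision than a stored double.
 *
 * FLT_EVAL_METHOD values (C99):
 *    0  evaluate each operation in its own type
 *    1  evaluate float and double as double
 *    2  evaluate everything as long double (typical for x87 code)
 *   -1  indeterminable
 */
static bool
doubles_may_carry_excess_precision(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD != 0 && FLT_EVAL_METHOD != 1;
#else
    return true;                /* pre-C99 compiler: assume the worst */
#endif
}
```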

regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-05 Thread Merlin Moncure
On Mon, Dec 5, 2011 at 12:17 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

 Is this x86?  I can't reproduce it on x86_64.


 yes, this is x86 platform

 uname -a
 Linux nemesis 2.6.35.14-106.fc14.i686.PAE #1 SMP Wed Nov 23 13:39:51
 UTC 2011 i686 i686 i386 GNU/Linux

 [pavel@nemesis ~]$ cat /proc/cpuinfo
 processor       : 0
 vendor_id       : GenuineIntel
 cpu family      : 6
 model           : 15
 model name      : Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz
 stepping        : 11
 cpu MHz         : 800.000
 cache size      : 4096 KB
 physical id     : 0
 siblings        : 2
 core id         : 0
 cpu cores       : 2
 apicid          : 0
 initial apicid  : 0
 fdiv_bug        : no
 hlt_bug         : no
 f00f_bug        : no
 coma_bug        : no
 fpu             : yes
 fpu_exception   : yes
 cpuid level     : 10
 wp              : yes
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
 pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm
 constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor
 ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida tpr_shadow vnmi
 flexpriority
 bogomips        : 4785.76
 clflush size    : 64
 cache_alignment : 64
 address sizes   : 36 bits physical, 48 bits virtual
 power management:

 processor       : 1
 vendor_id       : GenuineIntel
 cpu family      : 6
 model           : 15
 model name      : Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz
 stepping        : 11
 cpu MHz         : 800.000
 cache size      : 4096 KB
 physical id     : 0
 siblings        : 2
 core id         : 1
 cpu cores       : 2
 apicid          : 1
 initial apicid  : 1
 fdiv_bug        : no
 hlt_bug         : no
 f00f_bug        : no
 coma_bug        : no
 fpu             : yes
 fpu_exception   : yes
 cpuid level     : 10
 wp              : yes
 flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
 pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm
 constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor
 ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida tpr_shadow vnmi
 flexpriority
 bogomips        : 4786.60
 clflush size    : 64
 cache_alignment : 64
 address sizes   : 36 bits physical, 48 bits virtual
 power management:

 It is a Dell Latitude D830.

 It's fairly easy to get a set of values such that innerstartsel *should*
 equal innerendsel; but if one value has been rounded to memory precision
 and the other hasn't, the assert could certainly fail.

 Some digging around yields the information that the gcc hackers do not
 consider this a bug, or at least adamantly refuse to do anything about it:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
 Comment 47 is particularly relevant to our situation:

        To summarize, this defect effectively states that:
        assert( (x/y) == (x/y) )
        may cause an assertion if compiled with optimization.

 Also, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45691#c4
 indicates that an explicit cast to double should help.  Would
 you check if the problem goes away if the Asserts are changed to

        Assert((double) outerstartsel <= (double) outerendsel);
        Assert((double) innerstartsel <= (double) innerendsel);


 it doesn't help

                        regards, tom lane

 The assembler listing is attached.

how about:
 Assert((volatile double) outerstartsel <= (volatile double) outerendsel);
etc
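
A caveat on the cast form: in C a cast yields an rvalue, so a qualifier in
the cast's type name is ignored and (volatile double) x behaves the same as
(double) x. To force rounding to memory precision, the value has to pass
through an actual volatile object. A minimal sketch of that idea follows;
the helper names are invented for illustration and do not exist in
PostgreSQL.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: push a value through a volatile object so that any
 * excess register precision is discarded by the store.
 */
static double
round_to_double(double v)
{
    volatile double stored = v; /* the store may not be optimized away */

    return stored;
}

/* Compare two selectivities after rounding both sides the same way. */
static bool
sel_le(double startsel, double endsel)
{
    return round_to_double(startsel) <= round_to_double(endsel);
}
```

With helpers like these, the check would read Assert(sel_le(outerstartsel,
outerendsel)) rather than relying on a cast.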

merlin



Re: [HACKERS] planner fails on HEAD

2011-12-05 Thread Pavel Stehule
2011/12/5 Merlin Moncure mmonc...@gmail.com:
 On Mon, Dec 5, 2011 at 12:17 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

 Is this x86?  I can't reproduce it on x86_64.


 yes, this is x86 platform

 uname -a
 Linux nemesis 2.6.35.14-106.fc14.i686.PAE #1 SMP Wed Nov 23 13:39:51
 UTC 2011 i686 i686 i386 GNU/Linux

 It is a Dell Latitude D830.

 It's fairly easy to get a set of values such that innerstartsel *should*
 equal innerendsel; but if one value has been rounded to memory precision
 and the other hasn't, the assert could certainly fail.

 Some digging around yields the information that the gcc hackers do not
 consider this a bug, or at least adamantly refuse to do anything about it:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
 Comment 47 is particularly relevant to our situation:

        To summarize, this defect effectively states that:
        assert( (x/y) == (x/y) )
        may cause an assertion if compiled with optimization.

 Also, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45691#c4
 indicates that an explicit cast to double should help.  Would
 you check if the problem goes away if the Asserts are changed to

        Assert((double) outerstartsel <= (double) outerendsel);
        Assert((double) innerstartsel <= (double) innerendsel);


 it doesn't help

                        regards, tom lane

 The assembler listing is attached.

 how about:
  Assert((volatile double) outerstartsel <= (volatile double) outerendsel);

That doesn't help either.

Regards

Pavel

 etc

 merlin



Re: [HACKERS] planner fails on HEAD

2011-12-05 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 Is this x86?  I can't reproduce it on x86_64.

 yes, this is x86 platform
 uname -a
 Linux nemesis 2.6.35.14-106.fc14.i686.PAE #1 SMP Wed Nov 23 13:39:51
 UTC 2011 i686 i686 i386 GNU/Linux

I reproduced this with gcc 4.6.0 on Fedora 15 x86, too.

 Also, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45691#c4
 indicates that an explicit cast to double should help.  Would
 you check if the problem goes away if the Asserts are changed to
 
Assert((double) outerstartsel <= (double) outerendsel);
Assert((double) innerstartsel <= (double) innerendsel);

 it doesn't help

Hmm ... I'm inclined to think this actually *is* a bug, since Jakub is
on record as saying it should work.  Nonetheless, we need a workaround,
since gcc versions behaving this way are going to be widespread for a
long time even if we convince them to do something about it (which I
suspect they wouldn't given their imperviousness to complaints about the
main issue).

I'm now thinking the best solution is just to drop these two Asserts.
They're not adding anything very useful given the previous ones (which
should be safe since those involve quantities rounded to integers).
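
Why the integer-rounded checks should be immune can be sketched as follows.
Whole numbers of this size are exactly representable both as a 64-bit
double and in a wider register, so once both sides have been rounded to
integers the comparison itself no longer depends on where each value
lives. The helper names and the cast-based rounding below are assumptions
for illustration; the planner's actual rounding code is not shown in this
thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the planner's rounding of a row estimate. */
static double
round_rows(double rows)
{
    /* Assumes rows >= 0 and small enough to fit in a long long. */
    return (double) (long long) (rows + 0.5);
}

/*
 * Check that the number of skipped rows does not exceed the number of
 * rows, comparing only values that were rounded to whole numbers.
 */
static bool
skip_rows_within_rows(double path_rows, double startsel, double endsel)
{
    double      skip_rows = round_rows(path_rows * startsel);
    double      rows = round_rows(path_rows * endsel);

    return skip_rows <= rows;
}
```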

regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-04 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 #3  0x083a1dfe in ExceptionalCondition (conditionName=0x8505474
 "!(innerstartsel <= innerendsel)", errorType=0x83db178
 "FailedAssertion", fileName=0x8505140 "costsize.c", lineNumber=1937)
 at assert.c:57

[ scratches head ... ]  Given that it got past the previous assertions,
surely that ought to be impossible.  Could we see the values of
cost_mergejoin's local variables, please?

regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-04 Thread Pavel Stehule
2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 #3  0x083a1dfe in ExceptionalCondition (conditionName=0x8505474
 "!(innerstartsel <= innerendsel)", errorType=0x83db178
 "FailedAssertion", fileName=0x8505140 "costsize.c", lineNumber=1937)
 at assert.c:57

 [ scratches head ... ]  Given that it got past the previous assertions,
 surely that ought to be impossible.  Could we see the values of
 cost_mergejoin's local variables, please?

It is strange.

When I put an fprintf(stderr, const literal) immediately before or
somewhere after the assertion, the assertion passes. Without the
fprintf, the assertion fails again.

It looks like a gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1). It was
configured with just --enable-debug and --enable-cassert.

When I put an elog before the calculation of outerstartsel,
innerstartsel, outerendsel and innerendsel, it fails.

The output (the last elog result) is:

outer_skip_rows: 0.0
inner_skip_rows: 1.0
outer_rows: 208.000
inner_rows: 1.0

When I add an elog to show the selectivities, it works again. The
related selectivities are:

outerstartsel: 0.000
outerendsel: 1.000
innerstartsel: 0.17
innerendsel: 0.17

Regards

Pavel Stehule



                        regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-04 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 [ scratches head ... ]  Given that it got past the previous assertions,
 surely that ought to be impossible.  Could we see the values of
 cost_mergejoin's local variables, please?

 It is strange

 when I put a fprintf(stderr, const literal) to exactly before or
 somewhere after assertion, then assertion is ok. Without fprintf
 assertion fails again

 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

Hmm.  I'm betting that gcc has flushed one value to memory but the other
one is still in a register that's wider than memory, creating a roundoff
hazard.  Can you look at the generated assembly code?
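
The hazard can be sketched in a few lines of C. This is an illustration
only, not code from costsize.c, and the function name is invented. The
volatile store forces one copy of the quotient out to a 64-bit memory
slot, while the recomputed quotient on the right-hand side is allowed to
stay in a wider register on x87 builds.

```c
#include <stdbool.h>

/*
 * Hypothetical demonstration: compare a quotient that has been stored to
 * memory against the same quotient recomputed in place.
 *
 * With SSE arithmetic both sides are plain doubles and the result is true.
 * With x87 arithmetic the right-hand side may still carry extended
 * precision, so the result can be false whenever x / y is not exactly
 * representable as a double.
 */
static bool
quotient_matches_stored_copy(double x, double y)
{
    volatile double stored = x / y; /* rounded to memory precision */

    return stored == x / y;     /* may be compared at register precision */
}
```

Inputs such as 1.0 and 2.0 compare equal everywhere because the quotient
is exact; something like 1.0 and 6.0 is where an x87 build could report a
mismatch.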

regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-04 Thread Pavel Stehule
2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2011/12/4 Tom Lane t...@sss.pgh.pa.us:
 [ scratches head ... ]  Given that it got past the previous assertions,
 surely that ought to be impossible.  Could we see the values of
 cost_mergejoin's local variables, please?

 It is strange

 when I put a fprintf(stderr, const literal) to exactly before or
 somewhere after assertion, then assertion is ok. Without fprintf
 assertion fails again

 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

 Hmm.  I'm betting that gcc has flushed one value to memory but the other
 one is still in a register that's wider than memory, creating a roundoff
 hazard.  Can you look at the generated assembly code?

I can, but not until tomorrow evening.

I'll send the code then.

Regards

Pavel

                        regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-04 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 it looks like gcc bug - gcc 4.5.1 20100924 (Red Hat 4.5.1) It was
 configured just with --enable-debug and --enable-cassert

Is this x86?  I can't reproduce it on x86_64.

It's fairly easy to get a set of values such that innerstartsel *should*
equal innerendsel; but if one value has been rounded to memory precision
and the other hasn't, the assert could certainly fail.

Some digging around yields the information that the gcc hackers do not
consider this a bug, or at least adamantly refuse to do anything about it:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
Comment 47 is particularly relevant to our situation:

To summarize, this defect effectively states that:
assert( (x/y) == (x/y) )
may cause an assertion if compiled with optimization.

Also, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45691#c4
indicates that an explicit cast to double should help.  Would
you check if the problem goes away if the Asserts are changed to

Assert((double) outerstartsel <= (double) outerendsel);
Assert((double) innerstartsel <= (double) innerendsel);

regards, tom lane



Re: [HACKERS] planner fails on HEAD

2011-12-03 Thread Pavel Stehule
The plan for the modified query is:

ohs=# explain analyze SELECT object_id,
   inserted,
   'ASSIGN_RSLT',
   order_id,
   2,
   seqnum,
   rejected_flat_file_id,
   true
FROM (
   SELECT q.object_id,
  fe.inserted,
  q.order_id,
  q.seqnum,
  q.rejected_flat_file_id,
  q.rejected_result
  FROM queue q
   JOIN
   outgoing.cps_forms f
   ON f.id = q.object_id AND q.object_type = 'cp'
   JOIN
   flat_file_ex fe
   ON fe.id = q.rejected_flat_file_id
offset 0) x
 WHERE rejected_result = 'ACTV';

QUERY PLAN

 Subquery Scan on x  (cost=11.68..192.72 rows=1 width=24) (actual time=1.748..12.398 rows=139 loops=1)
   Filter: (x.rejected_result = 'ACTV'::bpchar)
   Rows Removed by Filter: 17
   ->  Limit  (cost=11.68..192.65 rows=6 width=29) (actual time=1.739..11.655 rows=156 loops=1)
         ->  Nested Loop  (cost=11.68..192.65 rows=6 width=29) (actual time=1.732..11.036 rows=156 loops=1)
               ->  Hash Join  (cost=11.68..138.77 rows=15 width=21) (actual time=1.459..6.987 rows=186 loops=1)
                     Hash Cond: (q.object_id = f.id)
                     ->  Seq Scan on queue q  (cost=0.00..126.24 rows=186 width=21) (actual time=0.032..4.658 rows=186 loops=1)
                           Filter: (object_type = 'cp'::bpchar)
                           Rows Removed by Filter: 4313
                     ->  Hash  (cost=9.08..9.08 rows=208 width=4) (actual time=1.402..1.402 rows=208 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 5kB
                           ->  Seq Scan on cps_forms f  (cost=0.00..9.08 rows=208 width=4) (actual time=0.008..0.576 rows=208 loops=1)
               ->  Index Scan using flat_file_ex_pkey on flat_file_ex fe  (cost=0.00..3.58 rows=1 width=12) (actual time=0.008..0.010 rows=1 loops=186)
                     Index Cond: (id = q.rejected_flat_file_id)
 Total runtime: 12.846 ms
(16 rows)
