Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Alexander Staubo

On Dec 15, 2006, at 04:09 , Ron wrote:


At 07:27 PM 12/14/2006, Alexander Staubo wrote:


Sorry, I neglected to include the pertinent graph:

  http://purefiction.net/paste/pgbench2.pdf
In fact, your graph suggests that using arch specific options in  
addition to -O3 actually =hurts= performance.


The difference is very slight. I'm going to run without -funroll-loops
and -pipe (which are not arch-related) to get better data.


Alexander.



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Steinar H. Gunderson
On Fri, Dec 15, 2006 at 10:53:25AM +0100, Alexander Staubo wrote:
 The difference is very slight. I'm going to run without -funroll-loops
 and -pipe (which are not arch-related) to get better data.

-pipe does not matter for the generated code; it only affects compiler speed.
(It simply means that the compiler runs cpp | cc | as1 instead of cpp > tmp;
cc < tmp > tmp2; as1 < tmp2.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Simon Riggs
On Thu, 2006-12-14 at 19:05 -0500, Tom Lane wrote:
 Kelly Burkhart [EMAIL PROTECTED] writes:
  I hope this isn't a crummy mainboard but I can't seem to affect
  things by changing clock source (kernel 2.6.16 SLES10).  I tried
  kernel command option clock=XXX where XXX in
  (cyclone,hpet,pmtmr,tsc,pit), no option was significantly better than
  the default.
 
 I believe that on machines where gettimeofday is really nice and fast,
 it doesn't require entry to the kernel at all; there's some hack that
 makes the clock readable from userspace.  (Obviously a switch to kernel
 mode would set you back a lot of the cycles involved here.)  So it's not
 so much the kernel that you need to teach as glibc.  How you do that is
 beyond my expertise, but maybe that will help you google for the right
 thing ...

Until we work out a better solution we can fix this in two ways:

1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...

2. enable_analyze_timer = off | on (default) (USERSET)

A performance drop of 4x-10x is simply unacceptable when trying to tune
queries where the current untuned time is already too high. Tying down
production servers for hours on end when we know for certain all they
are doing is calling gettimeofday millions of times is not good. This
quickly leads to the view from objective people that PostgreSQL doesn't
have a great optimizer, whatever we say in its defence. I don't want to
leave this alone, but I don't want to spend a month fixing it either.
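
For anyone who wants to see how much of this is raw gettimeofday() cost on
their own hardware, a minimal standalone C sketch (purely illustrative, not
PostgreSQL code) that times a tight loop of calls looks roughly like this:

/* gtod_bench.c -- rough measure of gettimeofday() overhead.
 * Build: cc -O2 gtod_bench.c -o gtod_bench
 * Hypothetical helper for illustration only, not part of PostgreSQL. */
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    const long n = 10 * 1000 * 1000;   /* number of calls to time */
    struct timeval start, stop, scratch;
    long i;
    double elapsed;

    gettimeofday(&start, NULL);
    for (i = 0; i < n; i++)
        gettimeofday(&scratch, NULL);  /* the call EXPLAIN ANALYZE makes around each row */
    gettimeofday(&stop, NULL);

    elapsed = (stop.tv_sec - start.tv_sec) +
              (stop.tv_usec - start.tv_usec) / 1e6;
    printf("%.0f ns per call (%ld calls in %.2f s)\n",
           elapsed / n * 1e9, n, elapsed);
    return 0;
}

On a platform with a fast userspace clock this reports tens of nanoseconds
per call; where a kernel entry is required, expect several times that.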

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Martijn van Oosterhout
On Fri, Dec 15, 2006 at 10:28:08AM +, Simon Riggs wrote:
 Until we work out a better solution we can fix this in two ways:
 
 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
 
 2. enable_analyze_timer = off | on (default) (USERSET)

What exactly would this do? Only count actual rows or something? I
wrote a patch that tried statistical sampling, but the figures were too
far off for people's liking.

 A performance drop of 4x-10x is simply unacceptable when trying to tune
 queries where the current untuned time is already too high. Tying down
 production servers for hours on end when we know for certain all they
 are doing is calling gettimeofday millions of times is not good. This
 quickly leads to the view from objective people that PostgreSQL doesn't
 have a great optimizer, whatever we say in its defence. I don't want to
 leave this alone, but I don't want to spend a month fixing it either.

I think the best option is setitimer(), but it's not POSIX so
platform support is going to be patchy.

BTW, doing gettimeofday() without kernel entry is not really possible.
You could use the cycle counter, but it has the problem that if you have
multiple CPUs you need to calibrate the result. If the CPU goes to
sleep, there is no way for the userspace process to know. Only the
kernel has all the relevant information about what time it is to get a
reasonable result.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Dimitri Fontaine
Hi list,

On Friday 15 December 2006 at 11:50, Martijn van Oosterhout wrote:
 BTW, doing gettimeofday() without kernel entry is not really possible.
 You could use the cycle counter, but it has the problem that if you have
 multiple CPUs you need to calibrate the result. If the CPU goes to
 sleep, there is no way for the userspace process to know. Only the
 kernel has all the relevant information about what time it is to get a
 reasonable result.

I remember having played with Intel's RDTSC (time stamp counter) for some
timing measurements, but I have just read from several sources (including
Linux kernel hackers considering its use for the gettimeofday()
implementation) that the TSC is not an accurate method of obtaining elapsed
time information.
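
For reference, a minimal sketch of reading the TSC directly (GCC inline
assembly, x86/x86-64 only, purely illustrative) looks like this; it also
shows why the kernel hackers are wary, since the raw cycle count still needs
per-CPU frequency calibration and is not stable across cores or frequency
changes:

/* rdtsc_demo.c -- read the time stamp counter directly (illustration only).
 * Raw TSC values need CPU-frequency calibration and can differ between
 * cores, which is exactly the problem discussed above. */
#include <stdint.h>
#include <stdio.h>

static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t) hi << 32) | lo;
}

int main(void)
{
    uint64_t t0, t1;
    volatile double x = 0;
    long i;

    t0 = read_tsc();
    for (i = 0; i < 1000000; i++)       /* some work to time */
        x += i;
    t1 = read_tsc();

    printf("elapsed: %llu cycles (convert to time only after calibration)\n",
           (unsigned long long) (t1 - t0));
    return 0;
}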

Maybe some methods other than gettimeofday() are available (Lamport
timestamps, say, as the PGDG may have to consider having a
distributed-processing-ready EA at some point in the future) that are
cheaper yet accurate?
After all, the discussion, as far as I understand it, is about having an
accurate measure of the duration of events; knowing when they occurred in
the day does not seem to be the point.

My 2¢, hoping this could be somehow helpful,
-- 
Dimitri Fontaine
http://www.dalibo.com/




Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Ron

At 04:54 AM 12/15/2006, Alexander Staubo wrote:

On Dec 15, 2006, at 04:09 , Ron wrote:


At 07:27 PM 12/14/2006, Alexander Staubo wrote:


Sorry, I neglected to include the pertinent graph:

  http://purefiction.net/paste/pgbench2.pdf
In fact, your graph suggests that using arch specific options in 
addition to -O3 actually =hurts= performance.


According to the tech staff, this is a Sun X4100 with a two-drive 
RAID 1 volume. No idea about the make of the hard drives.


Alexander.

http://www.sun.com/servers/entry/x4100/features.xml

So we are dealing with a 1U 1-4S (which means 1-8C) AMD Kx box with 
up to 32GB of ECC RAM (DDR2?) and 2 Seagate 2.5" SAS HDs.


http://www.seagate.com/cda/products/discsales/index/1,,,00.html?interface=SAS

My bet is the X4100 contains one of the 3 models of Cheetah 
15K.4's.  A simple du, dkinfo, whatever, will tell you which.


I'm looking more closely into exactly what the various gcc -O 
optimizations do on Kx's as well.
64b vs 32b gets x86 compatible code access to ~ 2x as many registers; 
and MMX or SSE instructions get you access to not only more 
registers, but wider ones as well.


As one wit has noted, all optimization is an exercise in caching. 
(Terje Mathisen- one of the better assembler coders on the planet.)


It seems unusual that code generation options which give access to 
more registers would ever result in slower code...

Ron Peacetree


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Simon Riggs
On Fri, 2006-12-15 at 11:50 +0100, Martijn van Oosterhout wrote:
 On Fri, Dec 15, 2006 at 10:28:08AM +, Simon Riggs wrote:
  Until we work out a better solution we can fix this in two ways:
  
  1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
  
  2. enable_analyze_timer = off | on (default) (USERSET)
 
 What exactly would this do? Only count actual rows or something? 

Yes. It's better to have this than nothing at all.

 I
 wrote a patch that tried statistical sampling, but the figures were too
 far off for people's liking.

Well, I like your ideas, so if you have any more...

Maybe sampling every 10 rows will bring things down to an acceptable
level (after the first N). You tried less than 10 didn't you?

Maybe we can count how many real I/Os were required to process each
particular row, so we can adjust the time per row based upon I/Os. ISTM
that sampling at too low a rate means we can't spot the effects of cache
and I/O which can often be low frequency but high impact.

 I think the best option is setitimer(), but it's not POSIX so
 platform support is going to be patchy.

Don't understand that. I thought that was to do with alarms and signals.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Martijn van Oosterhout
On Fri, Dec 15, 2006 at 12:15:59PM +, Gregory Stark wrote:
 There are various attempts at providing better timing infrastructure at low
 overhead but I'm not sure what's out there currently. I expect that to do this
 what we'll have to do is invent a pg_* abstraction that has various
 implementations on different architectures. On Solaris it can use DTrace
 internally, on Linux it might have something else (or more likely several
 different options depending on the age and config options of the kernel). 

I think we need to move to a sampling approach. setitimer is good,
except it doesn't tell you if signals have been lost. Given they are
most likely to be lost during high disk I/O, they're actually
significant. I'm trying to think of a way around that. Then you don't
need a cheap gettimeofday at all...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Martijn van Oosterhout
On Fri, Dec 15, 2006 at 12:20:46PM +, Simon Riggs wrote:
  I
  wrote a patch that tried statistical sampling, but the figures were too
  far off for people's liking.
 
 Well, I like your ideas, so if you have any more...
 
 Maybe sampling every 10 rows will bring things down to an acceptable
 level (after the first N). You tried less than 10 didn't you?

Yeah, it reduced the number of calls as the count got larger. It broke
somewhere, though I don't quite remember why.

 Maybe we can count how many real I/Os were required to perform each
 particular row, so we can adjust the time per row based upon I/Os. ISTM
 that sampling at too low a rate means we can't spot the effects of cache
 and I/O which can often be low frequency but high impact.

One idea is to sample at a fixed interval. Where the I/O cost is high,
there'll be a lot of sampling points. The issue is that the final
results are no longer totally accurate.

  I think the best option is setitimer(), but it's not POSIX so
  platform support is going to be patchy.
 
 Don't understand that. I thought that was to do with alarms and signals.

On my system the manpage says setitimer() conforms to SVr4 and 4.4BSD. I'm
unsure how many of the supported platforms fall under that
categorisation.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Florian Weimer
* Simon Riggs:

 I think the best option is setitimer(), but it's not POSIX so
 platform support is going to be patchy.

 Don't understand that. I thought that was to do with alarms and
 signals.

You could use it for sampling.  Every few milliseconds, you record
which code is executing (possibly using a global variable which is set
and reset accordingly).  But this might lead to tons of interrupted
system calls, and not all systems mask them, so this might not be an
option for the PostgreSQL code base.
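
To make that concrete, a rough standalone sketch of the sampling idea might
look like the following. The names (current_node, sample_count) are invented
for illustration; this is not how the PostgreSQL executor is instrumented:

/* sample_timer.c -- sketch of setitimer()-based sampling (illustrative only).
 * The variable names are hypothetical; PostgreSQL has no such globals. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

#define MAX_NODES 64

static volatile sig_atomic_t current_node = 0;   /* which node is executing */
static volatile long sample_count[MAX_NODES];

static void on_tick(int signo)
{
    (void) signo;
    sample_count[current_node]++;    /* credit this tick to the active node */
}

int main(void)
{
    struct sigaction sa;
    struct itimerval it;
    volatile double x = 0;
    long i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_tick;
    sa.sa_flags = SA_RESTART;        /* mitigate the interrupted-syscall issue noted above */
    sigaction(SIGPROF, &sa, NULL);

    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 10000;  /* tick every ~10 ms of CPU time */
    it.it_value = it.it_interval;
    setitimer(ITIMER_PROF, &it, NULL);

    current_node = 1;                /* pretend node 1 is running */
    for (i = 0; i < 300000000L; i++)
        x += i;
    current_node = 0;

    printf("node 1 received %ld samples\n", sample_count[1]);
    return 0;
}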

On the other hand, if performance in more recent Linux releases (from
kernel.org) is acceptable, you should assume that the problem will
eventually fix itself.  FWIW, I see the 9x overhead on something that
is close to 2.6.17 (on AMD64/Opteron), so this could be wishful
thinking. 8-(

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Merlin Moncure

On 12/15/06, Ron [EMAIL PROTECTED] wrote:

I'm looking more closely into exactly what the various gcc -O
optimizations do on Kx's as well.
64b vs 32b gets x86 compatible code access to ~ 2x as many registers;
and MMX or SSE instructions get you access to not only more
registers, but wider ones as well.

As one wit has noted, all optimization is an exercise in caching.
(Terje Mathisen- one of the better assembler coders on the planet.)

It seems unusual that code generation options which give access to
more registers would ever result in slower code...


The slowdown is probably due to the unroll-loops switch, which can
actually hurt code due to the larger footprint (less cache coherency).

The extra registers are not all that important because of pipelining
and other hardware tricks.  Pretty much all the old assembly
strategies such as forcing local variables to registers are basically
obsolete...especially with regards to integer math.  As I said before,
modern CPUs are essentially RISC engines with a CISC preprocessing
engine laid on top.

Things are much more complicated than they were in the old days where
you could count instructions for the assembly optimization process.  I
suspect that there is little or no difference between -march=686
and the various specific architectures.  Did anybody think to look at
the binaries and check how large the differences are?  I bet code
compiled for -march=opteron will run just fine on a Pentium 2 if compiled
for 32 bit.

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Greg Smith

On Fri, 15 Dec 2006, Merlin Moncure wrote:


The slowdown is probably due to the unroll-loops switch, which can
actually hurt code due to the larger footprint (less cache coherency).


The cache issues are so important with current processors that I'd suggest 
throwing -Os (optimize for size) into the mix people test.  That one may 
stack usefully with -O2, but probably not with -O3 (3 includes 
optimizations that increase code size).


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Fri, Dec 15, 2006 at 12:20:46PM +, Simon Riggs wrote:
 Maybe sampling every 10 rows will bring things down to an acceptable
 level (after the first N). You tried less than 10 didn't you?

 Yeah, it reduced the number of calls as the count got larger. It broke
 somewhere, though I don't quite remember why.

The fundamental problem with it was the assumption that different
executions of a plan node will have the same timing.  That's not true,
in fact not even approximately true.  IIRC the patch did realize
that first-time-through is not a predictor for the rest, but some of
our plan nodes have enormous variance even after the first time.
I think the worst case is batched hash joins.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Ron

At 09:23 AM 12/15/2006, Merlin Moncure wrote:

On 12/15/06, Ron [EMAIL PROTECTED] wrote:


It seems unusual that code generation options which give access to
more registers would ever result in slower code...


The slowdown is probably due to the unroll-loops switch, which can 
actually hurt code due to the larger footprint (less cache coherency).


I have seen that effect as well occasionally in the last few decades 
;-)  OTOH, suspicion is not _proof_; and I've seen other 
optimizations turn out to be pessimizations over the years as well.



The extra registers are not all that important because of pipelining 
and other hardware tricks.


No.  Whoever told you this or gave you such an impression was 
mistaken.  There are many instances of x86 compatible code that get 
30-40% speedups just because they get access to 16 rather than 8 GPRs 
when recompiled for x86-64.



Pretty much all the old assembly strategies such as forcing local 
variables to registers are basically obsolete...especially with 
regards to integer math.


Again, not true.  OTOH, humans are unlikely at this point to be able 
to duplicate the accuracy of the compiler's register coloring 
algorithms.  Especially on x86 compatibles.  (The flip side is that 
_expert_ humans can often put the quirky register set and instruction 
pipelines of x86 compatibles to more effective use for a specific 
chunk of code than even the best compilers can.)



As I said before, modern CPUs are essentially RISC engines with a 
CISC preprocessing engine laid in top.


I'm sure you meant modern =x86 compatible= CPUs are essentially RISC 
engines with a CISC engine on top.  Just as all the world's not a 
VAX, all CPUs are not x86 compatibles.  Forgetting this has 
occasionally cost folks I know...



Things are much more complicated than they were in the old days 
where you could count instructions for the assembly optimization process.


Those were the =very= old days in computer time...


I suspect that there is little or no differnece between the 
-march=686 and the various specifc archicectures.


There should be.  The FSF compiler folks (and the rest of the 
industry compiler folks for that matter) are far from stupid.  They 
are not just adding compiler switches because they randomly feel like it.


Evidence suggests that the most recent CPUs are in need of =more= 
arch specific TLC compared to their ancestors, and that this trend is 
not only going to continue, it is going to accelerate.



Did anybody think to look at the binaries and check how large the 
differences are?  I bet code compiled for -march=opteron will run just 
fine on a Pentium 2 if compiled for 32 bit.

Sucker bet, given that the whole point of a 32b x86 compatible is to 
be able to run code on any IA32 ISA CPU.
OTOH, I bet that code optimized for best performance on a P2 is not 
getting best performance on a P4.  Or vice versa. ;-)


The big arch specific differences in Kx's are in 64b mode.  Not 32b


Ron Peacetree. 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 There are various attempts at providing better timing infrastructure at low
 overhead but I'm not sure what's out there currently. I expect that to do this
 what we'll have to do is invent a pg_* abstraction that has various
 implementations on different architectures.

You've got to be kidding.  Surely it's glibc's responsibility, not ours,
to implement gettimeofday correctly for the hardware.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Martijn van Oosterhout
On Fri, Dec 15, 2006 at 09:56:57AM -0500, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  On Fri, Dec 15, 2006 at 12:20:46PM +, Simon Riggs wrote:
  Maybe sampling every 10 rows will bring things down to an acceptable
  level (after the first N). You tried less than 10 didn't you?
 
  Yeah, it reduced the number of calls as the count got larger. It broke
  somewhere, though I don't quite remember why.
 
 The fundamental problem with it was the assumption that different
 executions of a plan node will have the same timing.  That's not true,
 in fact not even approximately true.  IIRC the patch did realize
 that first-time-through is not a predictor for the rest, but some of
 our plan nodes have enormous variance even after the first time.
 I think the worst case is batched hash joins.

It didn't assume that, because that's obviously bogus. It assumed the
durations would be spread as a normal distribution, which meant that
over time the average of the measured iterations would approach the
actual average. It tried to take enough measurements to keep the
expected error small, but it's statistics: you can only say this will
give the right answer 95% of the time.

You are correct though, the error was caused by unexpectedly large
variations, or more likely, an unexpected distribution curve.
Statistically, we took enough samples to not be affected significantly
by large variations. Even if it looked more like a gamma distribution
it should not have been as far off as it was.

Looking at alternative approaches, like sampling with a timer, you end
up with the same problem: sometimes the calculations will fail and
produce something strange. The simplest example being that a 100Hz
timer is not going to produce any useful information for queries in the
millisecond range. A higher frequency timer than that is not going to
be available portably.

You could probably throw more effort into refining the statistics behind
it, but at some point we're going to have to draw a line and say: it's
going to be wrong X% of the time, deal with it. If we're not willing to
say that, then there's no point going ahead with any statistical
approach.
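
As a toy illustration of the approach under discussion (not the actual
patch), the shape of the idea is: time the first few iterations and then only
every k-th one, and scale the measured mean up to the full row count. The
extrapolation step is exactly where a skewed distribution bites:

/* sample_extrapolate.c -- toy model of "time some iterations, extrapolate
 * the rest"; just the shape of the idea, not the real patch. */
#include <stdio.h>
#include <sys/time.h>

#define SAMPLE_FIRST 50   /* always time the first N iterations */
#define SAMPLE_EVERY 10   /* afterwards, time every k-th iteration */

static double now_sec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static void do_row(long i)
{
    volatile double x = 0;
    int k;
    for (k = 0; k < 500; k++)          /* stand-in for a plan node's per-row work */
        x += i * k;
}

int main(void)
{
    const long rows = 200000;
    long i, sampled = 0;
    double sampled_time = 0, t0 = 0;

    for (i = 0; i < rows; i++)
    {
        int timed = (i < SAMPLE_FIRST) || (i % SAMPLE_EVERY == 0);
        if (timed)
            t0 = now_sec();
        do_row(i);
        if (timed)
        {
            sampled_time += now_sec() - t0;
            sampled++;
        }
    }

    /* The extrapolation assumes the sampled iterations are representative;
     * with a heavily skewed distribution (e.g. batched hash joins) this is
     * where the estimate goes wrong. */
    printf("estimated total: %.3f s from %ld of %ld rows sampled\n",
           sampled_time / sampled * rows, sampled, rows);
    return 0;
}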

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Merlin Moncure

On 12/15/06, Ron [EMAIL PROTECTED] wrote:

At 09:23 AM 12/15/2006, Merlin Moncure wrote:
On 12/15/06, Ron [EMAIL PROTECTED] wrote:

It seems unusual that code generation options which give access to
more registers would ever result in slower code...

The slowdown is probably due to the unroll-loops switch, which can
actually hurt code due to the larger footprint (less cache coherency).

I have seen that effect as well occasionally in the last few decades
;-)  OTOH, suspicion is not _proof_; and I've seen other
optimizations turn out to be pessimizations over the years as well.

The extra registers are not all that important because of pipelining
and other hardware tricks.

No.  Whoever told you this or gave you such an impression was
mistaken.  There are many instances of x86 compatible code that get
30-40% speedups just because they get access to 16 rather than 8 GPRs
when recompiled for x86-64.


I'm not debating that this is true in specific cases.  Encryption and
video en/decoding have been shown to be faster in 64 bit mode on the same
architecture (a cursory search on Google will confirm this).  However,
32 vs. 64 bit is not the same argument, since there are a lot of other
variables besides more registers.  64 bit mode is often slower on many
programs because of the extra code size from 64 bit pointers.  We
benchmarked PostgreSQL internally here and found it to be fastest in
32 bit mode running on a 64 bit platform -- this was on a quad Opteron
870 running our specific software stack; your results might be
different of course.


Pretty much all the old assembly strategies such as forcing local
variables to registers are basically obsolete...especially with
regards to integer math.

Again, not true.  OTOH, humans are unlikely at this point to be able
to duplicate the accuracy of the compiler's register coloring
algorithms.  Especially on x86 compatibles.  (The flip side is that
_expert_ humans can often put the quirky register set and instruction
pipelines of x86 compatibles to more effective use for a specific
chunk of code than even the best compilers can.)


As I said before, modern CPUs are essentially RISC engines with a
CISC preprocessing engine laid in top.

I'm sure you meant modern =x86 compatible= CPUs are essentially RISC
engines with a CISC engine on top.  Just as all the world's not a
VAX, all CPUs are not x86 compatibles.  Forgetting this has
occasionally cost folks I know...


Yes, in fact I made this point earlier.


Things are much more complicated than they were in the old days
where you could count instructions for the assembly optimization process.

Those were the =very= old days in computer time...


I suspect that there is little or no differnece between the
-march=686 and the various specifc archicectures.

There should be.  The FSF compiler folks (and the rest of the
industry compiler folks for that matter) are far from stupid.  They
are not just adding compiler switches because they randomly feel like it.

Evidence suggests that the most recent CPUs are in need of =more=
arch specific TLC compared to their ancestors, and that this trend is
not only going to continue, it is going to accelerate.


Did anybody think to look at the binaries and check how large the
differences are?  I bet code compiled for -march=opteron will run just
fine on a Pentium 2 if compiled for 32 bit.
Sucker bet, given that the whole point of a 32b x86 compatible is to
be able to run code on any IA32 ISA CPU.
OTOH, I bet that code optimized for best performance on a P2 is not
getting best performance on a P4.  Or vice versa. ;-)

The big arch specific differences in Kx's are in 64b mode.  Not 32b


I don't think so.  IMO all the processor specific instruction sets were
hacks of 32 bit mode to optimize specific tasks.  Except for certain
things, these instructions are rarely, if ever, used in 64 bit mode,
especially in integer math (i.e. database binaries).  Since the Intel and
AMD 64 bit instruction sets are virtually identical, I submit that -march is
not really important anymore except for very, very specific (but
important) cases like spinlocks.  This thread is about how much
architecture dependent binaries can beat standard ones.  I say they
don't very much at all, and with the specific exception of Daniel's
benchmarking, the results posted to this list bear that out.

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 On Friday, 15 December 2006 at 11:28, Simon Riggs wrote:
 Until we work out a better solution we can fix this in two ways:
 
 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
 2. enable_analyze_timer = off | on (default) (USERSET)

 The second one is enough in my mind.

I don't see any point in either one.  If you're not going to collect
timing data then the only useful info EXPLAIN ANALYZE could provide is
knowledge of which rowcount estimates are badly off ... and to get that,
you have to wait for the query to finish, which may well be impractical
even without the gettimeofday overhead.  We had discussed upthread the
idea of having an option to issue a NOTICE as soon as any actual
rowcount exceeds the estimate by some-configurable-percentage, and that
seems to me to be a much more useful response to the problem of
E.A. takes too long than removing gettimeofday.

One thing that's not too clear to me though is how the NOTICE would
identify the node at which the rowcount was exceeded...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
  There are various attempts at providing better timing infrastructure at low
  overhead but I'm not sure what's out there currently. I expect that to do this
  what we'll have to do is invent a pg_* abstraction that has various
  implementations on different architectures.
 
 You've got to be kidding.  Surely it's glibc's responsibility, not ours,
 to implement gettimeofday correctly for the hardware.

Except for two things:

a) We don't really need gettimeofday. That means we don't need something
sensitive to adjustments made by ntp etc. In fact that would be actively bad.
Currently if the user runs date to reset his clock back a few days I bet
interesting things happen to a large explain analyze that's running.

In fact we don't need something that represents any absolute time, only time
elapsed since some other point we choose. That might be easier to implement
than what glibc has to do to implement gettimeofday fully.
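
A minimal sketch of such an elapsed-time-only reading on Linux, assuming
clock_gettime() is available (link with -lrt on older glibc), could look like
this; CLOCK_MONOTONIC is unaffected by date(1) or NTP stepping the clock:

/* elapsed.c -- elapsed-time measurement that ignores wall-clock adjustments.
 * Assumes POSIX clock_gettime(); on older glibc, link with -lrt. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;
    volatile double x = 0;
    long i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < 10000000; i++)     /* the work being timed */
        x += i;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("elapsed: %.6f s\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return 0;
}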

b) glibc may not want to endure an overhead on every syscall and context
switch to make gettimeofday faster on the assumption that gettimeofday is a
rare call and it should pay the price rather than imposing an overhead on
everything else.

Postgres knows when it's running an explain analyze and a 1% overhead would be
entirely tolerable, especially if it affected the process pretty much evenly
unlike the per-gettimeofday-overhead which can get up as high as 100% on some
types of subplans and is negligible on others. And more to the point Postgres
wouldn't have to endure this overhead at all when it's not needed whereas
glibc has no idea when you're going to need gettimeofday next.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Steven Flatt

Here's the output of VACUUM VERBOSE pg_class.  I think it looks fine.  I
even did it three times in a row, each about 10 minutes apart, just to see
what was changing:


INFO:  vacuuming pg_catalog.pg_class
INFO:  index pg_class_oid_index now contains 3263 row versions in 175 pages
DETAIL:  5680 index row versions were removed.
150 index pages have been deleted, 136 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index pg_class_relname_nsp_index now contains 3263 row versions in 1301 pages
DETAIL:  5680 index row versions were removed.
822 index pages have been deleted, 734 are currently reusable.
CPU 0.01s/0.01u sec elapsed 0.03 sec.
INFO:  pg_class: removed 5680 row versions in 109 pages
DETAIL:  CPU 0.00s/0.01u sec elapsed 0.04 sec.
INFO:  pg_class: found 5680 removable, 3263 nonremovable row versions in 625 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 23925 unused item pointers.
0 pages are entirely empty.
CPU 0.02s/0.04u sec elapsed 0.10 sec.
VACUUM

INFO:  vacuuming pg_catalog.pg_class
INFO:  index pg_class_oid_index now contains 3263 row versions in 175 pages
DETAIL:  24 index row versions were removed.
150 index pages have been deleted, 150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index pg_class_relname_nsp_index now contains 3263 row versions in 1301 pages
DETAIL:  24 index row versions were removed.
822 index pages have been deleted, 822 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_class: removed 24 row versions in 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_class: found 24 removable, 3263 nonremovable row versions in 625 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 29581 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

INFO:  vacuuming pg_catalog.pg_class
INFO:  index pg_class_oid_index now contains 3263 row versions in 175 pages
DETAIL:  150 index pages have been deleted, 150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index pg_class_relname_nsp_index now contains 3263 row versions in 1301 pages
DETAIL:  822 index pages have been deleted, 822 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_class: found 0 removable, 3263 nonremovable row versions in 625 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 29605 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM


The one thing that seems to be steadily increasing is the number of unused
item pointers.  Not sure if that's normal.  I should also point out that
SELECT statements are not experiencing the same degradation as the INSERTs
to the temp table.  SELECTs are performing just as well now (24 hours since
restarting the connection) as they did immediately after restarting the
connection.  INSERTs to the temp table are 5 times slower now than they were
24 hours ago.

I wonder if the problem has to do with a long running ODBC connection.

Steve


On 12/14/06, Tom Lane [EMAIL PROTECTED] wrote:


Steven Flatt [EMAIL PROTECTED] writes:
 Regarding your other email -- interesting -- but we are vacuuming pg_class
 every hour.  So I don't think the answer lies there...

That's good, but is the vacuum actually accomplishing anything?  I'm
wondering if there's also a long-running transaction in the mix.
Try a manual VACUUM VERBOSE pg_class; after the thing has slowed down,
and see what it says about removable and nonremovable rows.

   regards, tom lane



Re: [PERFORM] File Systems Compared

2006-12-15 Thread Bruno Wolff III
The reply wasn't directly copied to the performance list, but I will
copy this one back.

On Thu, Dec 14, 2006 at 13:21:11 -0800,
  Ron Mayer [EMAIL PROTECTED] wrote:
 Bruno Wolff III wrote:
  On Thu, Dec 14, 2006 at 01:39:00 -0500,
Jim Nasby [EMAIL PROTECTED] wrote:
  On Dec 11, 2006, at 12:54 PM, Bruno Wolff III wrote:
  This appears to be changing under Linux. Recent kernels have write  
  barriers implemented using cache flush commands (which 
  some drives ignore,  so you need to be careful).
 
 Is it true that some drives ignore this; or is it mostly
 an urban legend that was started by testers that didn't
 have kernels with write barrier support.   I'd be especially
 interested in knowing if there are any currently available
 drives which ignore those commands.
 
  In very recent kernels, software raid using raid 1 will also
  handle write barriers. To get this feature, you are supposed to
  mount ext3 file systems with the barrier=1 option. For other file  
  systems, the parameter may need to be different.
 
 With XFS the default is apparently to enable write barrier
 support unless you explicitly disable it with the nobarrier mount option.
 It also will warn you in the system log if the underlying device
 doesn't have write barrier support.
 
 SGI recommends that you use the nobarrier mount option if you do
 have a persistent (battery backed) write cache on your raid device.
 
   http://oss.sgi.com/projects/xfs/faq.html#wcache
 
 
  But would that actually provide a meaningful benefit? When you  
  COMMIT, the WAL data must hit non-volatile storage of some kind,  
  which without a BBU or something similar, means hitting the platter.  
  So I don't see how enabling the disk cache will help, unless of  
  course it's ignoring fsync.
 
 With write barriers, fsync() waits for the physical disk; but I believe
 the background writes from write() done by pdflush don't have to; so
 it's kinda like only disabling the cache for WAL files and the filesystem's
 journal, but having it enabled for the rest of your write activity (the
 tables except at checkpoints?  the log file?).
 
  Note the use case for this is more for hobbyists or development boxes. You
  can only use it on software raid (md) 1, which rules out most real systems.
  
 
 Ugh.  Looking for where that's documented; and hoping it is or will soon
 work on software 1+0 as well.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] File Systems Compared

2006-12-15 Thread Bruno Wolff III
On Thu, Dec 14, 2006 at 13:21:11 -0800,
  Ron Mayer [EMAIL PROTECTED] wrote:
 Bruno Wolff III wrote:
  On Thu, Dec 14, 2006 at 01:39:00 -0500,
Jim Nasby [EMAIL PROTECTED] wrote:
  On Dec 11, 2006, at 12:54 PM, Bruno Wolff III wrote:
  This appears to be changing under Linux. Recent kernels have write  
  barriers implemented using cache flush commands (which 
  some drives ignore,  so you need to be careful).
 
 Is it true that some drives ignore this; or is it mostly
 an urban legend that was started by testers that didn't
 have kernels with write barrier support.   I'd be especially
 interested in knowing if there are any currently available
 drives which ignore those commands.

I saw posts claiming this, but no specific drives mentioned. I did see one
post that claimed that the cache flush command was mandated (not optional)
by the spec.

  In very recent kernels, software raid using raid 1 will also
  handle write barriers. To get this feature, you are supposed to
  mount ext3 file systems with the barrier=1 option. For other file  
  systems, the parameter may need to be different.
 
 With XFS the default is apparently to enable write barrier
 support unless you explicitly disable it with the nobarrier mount option.
 It also will warn you in the system log if the underlying device
 doesn't have write barrier support.

I think there might be a similar patch for ext3 going into 2.6.19. I haven't
checked a 2.6.19 kernel to make sure though.

 
 SGI recommends that you use the nobarrier mount option if you do
 have a persistent (battery backed) write cache on your raid device.
 
   http://oss.sgi.com/projects/xfs/faq.html#wcache
 
 
  But would that actually provide a meaningful benefit? When you  
  COMMIT, the WAL data must hit non-volatile storage of some kind,  
  which without a BBU or something similar, means hitting the platter.  
  So I don't see how enabling the disk cache will help, unless of  
  course it's ignoring fsync.
 
 With write barriers, fsync() waits for the physical disk; but I believe
 the background writes from write() done by pdflush don't have to; so
 it's kinda like only disabling the cache for WAL files and the filesystem's
 journal, but having it enabled for the rest of your write activity (the
 tables except at checkpoints?  the log file?).

Not exactly. Whenever you commit the file system log or fsync the wal file,
all previously written blocks will be flushed to the disk platter, before
any new write requests are honored. So journalling semantics will work
properly.
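
In code terms, the guarantee being described is that the write()/fsync()
pair used at commit time does not return until the data is on stable
storage. A minimal sketch (the file name is made up):

/* wal_fsync_demo.c -- sketch of the commit-time durability step described
 * above.  The file name is illustrative, not a real WAL segment. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char record[] = "commit record\n";
    int fd = open("wal-segment.demo", O_WRONLY | O_CREAT | O_APPEND, 0600);

    if (fd < 0)
    {
        perror("open");
        return 1;
    }
    if (write(fd, record, strlen(record)) != (ssize_t) strlen(record))
        perror("write");

    /* With write barriers (or the drive cache disabled / battery backed),
     * fsync() blocks until this and all earlier writes reach stable storage. */
    if (fsync(fd) != 0)
        perror("fsync");

    close(fd);
    return 0;
}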

  Note the use case for this is more for hobbyists or development boxes. You
  can only use it on software raid (md) 1, which rules out most real systems.
  
 
 Ugh.  Looking for where that's documented; and hoping it is or will soon
 work on software 1+0 as well.

I saw a comment somewhere that raid 0 posed some problems and the suggestion
was to handle the barrier at a different level (though I don't know how you
could). So I don't believe 1+0 or 5 are currently supported or will be in the
near term.

The other feature I would like is to be able to use write barriers with
encrypted file systems. I haven't found anything on whether or not there
are near term plans by anyone to support that.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Ron

At 09:50 AM 12/15/2006, Greg Smith wrote:

On Fri, 15 Dec 2006, Merlin Moncure wrote:

The slowdown is probably due to the unroll-loops switch, which can 
actually hurt code due to the larger footprint (less cache coherency).


The cache issues are so important with current processors that I'd 
suggest throwing -Os (optimize for size) into the mix people 
test.  That one may stack usefully with -O2, but probably not with 
-O3 (3 includes optimizations that increase code size).


-Os
Optimize for size. -Os enables all -O2 optimizations that do not 
typically increase code size. It also performs further optimizations 
designed to reduce code size.


-Os disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops -falign-labels
-freorder-blocks -freorder-blocks-and-partition
-fprefetch-loop-arrays
-ftree-vect-loop-version

Hmmm.  That list of disabled flags bears thought.

-falign-functions -falign-jumps -falign-loops -falign-labels

1= Most RISC CPUs' performance is very sensitive to misalignment 
issues.  Not recommended to turn these off.


-freorder-blocks
Reorder basic blocks in the compiled function in order to reduce 
number of taken branches and improve code locality.


Enabled at levels -O2, -O3.
-freorder-blocks-and-partition
In addition to reordering basic blocks in the compiled function, in 
order to reduce number of taken branches, partitions hot and cold 
basic blocks into separate sections of the assembly and .o files, to 
improve paging and cache locality performance.


This optimization is automatically turned off in the presence of 
exception handling, for link once sections, for functions with a 
user-defined section attribute and on any architecture that does not 
support named sections.


2= Most RISC CPUs are cranky about branchy code and (lack of) cache 
locality.  Wouldn't suggest punting these either.


-fprefetch-loop-arrays
If supported by the target machine, generate instructions to prefetch 
memory to improve the performance of loops that access large arrays.


This option may generate better or worse code; results are highly 
dependent on the structure of loops within the source code.


3= OTOH, This one looks worth experimenting with turning off.

-ftree-vect-loop-version
Perform loop versioning when doing loop vectorization on trees. When 
a loop appears to be vectorizable except that data alignment or data 
dependence cannot be determined at compile time then vectorized and 
non-vectorized versions of the loop are generated along with runtime 
checks for alignment or dependence to control which version is 
executed. This option is enabled by default except at level -Os where 
it is disabled.


4= ...and this one looks like a 50/50 shot.

Ron Peacetree




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] EXPLAIN ANALYZE on 8.2

2006-12-15 Thread Ron

At 10:45 AM 12/15/2006, Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
 There are various attempts at providing better timing infrastructure at low
 overhead but I'm not sure what's out there currently. I expect that to do this
 what we'll have to do is invent a pg_* abstraction that has various
 implementations on different architectures.

You've got to be kidding.  Surely it's glibc's responsibility, not ours,
to implement gettimeofday correctly for the hardware.

regards, tom lane


I agree with Tom on this.  Perhaps the best compromise is for the pg 
community to make thoughtful suggestions to the glibc community?


Ron Peacetree 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Tom Lane
Steven Flatt [EMAIL PROTECTED] writes:
 Here's the output of VACUUM VERBOSE pg_class.  I think it looks fine.  I
 even did it three times in a row, each about 10 minutes apart, just to see
 what was changing:

Hm, look at the numbers of rows removed:

 INFO:  pg_class: found 5680 removable, 3263 nonremovable row versions in
 625 pages
 DETAIL:  0 dead row versions cannot be removed yet.

 INFO:  pg_class: found 24 removable, 3263 nonremovable row versions in 625
 pages
 DETAIL:  0 dead row versions cannot be removed yet.

 INFO:  pg_class: found 0 removable, 3263 nonremovable row versions in 625
 pages
 DETAIL:  0 dead row versions cannot be removed yet.

The lack of unremovable dead rows is good, but why were there so many
dead rows the first time?  You didn't say what the cycle time is on your
truncate-and-refill process, but the last two suggest that the average
rate of accumulation of dead pg_class rows is only a couple per minute,
in which case it's been a lot longer than an hour since the previous
VACUUM of pg_class.  I'm back to suspecting that you don't vacuum
pg_class regularly.  You mentioned having an hourly cron job to fire off
vacuums ... are you sure it's run as a database superuser?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Ron

At 10:55 AM 12/15/2006, Merlin Moncure wrote:

On 12/15/06, Ron [EMAIL PROTECTED] wrote:


There are many instances of x86 compatible code that get
30-40% speedups just because they get access to 16 rather than 8 GPRs
when recompiled for x86-64.


...We benchmarked PostgreSQL internally here and found it to be 
fastest in 32 bit mode running on a 64 bit platform -- this was on a 
quad opteron 870 runnning our specific software stack, your results 
might be differnt of course.


On AMD Kx's, you probably will get best performance in 64b mode (so 
you get all those extra registers and other stuff) while using 32b 
pointers (to keep code size and memory footprint down).


On Intel C2's, things are more complicated since Intel's x86-64 
implementation and memory IO architecture are both different enough 
from AMD's to have caused some consternation on occasion when Intel's 
64b performance did not match AMD's.





The big arch specific differences in Kx's are in 64b mode.  Not 32b


I dont think so.  IMO all the processor specific instruction sets were
hacks of 32 bit mode to optimize specific tasks.  Except for certain
things these instructions are rarely, if ever used in 64 bit mode,
especially in integer math (i.e. database binaries).  Since Intel and
AMD64 64 bit are virtually indentical I submit that -march is not
really important anymore except for very, very specific (but
important) cases like spinlocks.


Take a good look at the processor specific manuals and the x86-64 
benches around the net.  The evidence outside the DBMS domain is 
pretty solidly in contrast to your statement and 
position.  Admittedly, DBMS are not web servers or FPS games or ... 
That's why we need to do our own rigorous study of the subject.



This thread is about how much architecture dependent binaries can beat 
standard ones.  I say they don't very much at all, and with the 
specific exception of Daniel's benchmarking the results posted to this 
list bear that out.

...and IMHO the issue is still very much undecided given that we 
don't have enough properly obtained and documented evidence.


ATM, the most we can say is that in a number of systems with modest 
physical IO subsystems that are not running Gentoo Linux we have not 
been able to duplicate the results.  (We've also gotten some 
interesting results like yours suggesting the arch specific 
optimizations are bad for pg performance in your environment.)


In the process, questions have been asked and issues raised regarding 
both the tools involved and the proper way to use them.


We really do need to have someone other than Daniel duplicate his 
Gentoo environment and independently try to duplicate his results.



...and let us bear in mind that this is not just intellectual 
curiosity.  The less pg is mysterious, the better the odds pg will be 
adopted in any specific case.

Ron Peacetree
  



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Steven Flatt

Our application is such that there is a great deal of activity at the
beginning of the hour and minimal activity near the end of the hour.  Those
3 vacuums were done at (approximately) 30 minutes past, 40 minutes past, and
50 minutes past the hour, during low activity.  Vacuums of pg_class look
like they're being done on the hour.  So it's not surprising that the first
vacuum found a lot of dead rows while the latter two found very few.

In fact, I just did another vacuum (about 30 minutes past the hour again)
and got:

INFO:  pg_class: found 5490 removable, 3263 nonremovable row versions in
171 pages
DETAIL:  0 dead row versions cannot be removed yet.

... and clearly a vacuum was done under an hour ago.

The truncate and re-fill process is done once per hour, at the end of the
high-load cycle, so I doubt that's even a big contributor to the number of
removable rows in pg_class.

For this particular setup, we expect high load for 10-15 minutes at the
beginning of the hour, which is the case when a new connection is
initialized.  After a day or so (as is happening right now), the high-load
period spills into the second half of the hour.  Within 3-4 days, we start
spilling into the next hour and, as you can imagine, everything gets behind
and we spiral down from there.  For now, our workaround is to manually kill
the connection every few days, but I would like a better solution than
setting up a cron job to do this!

Thanks again,
Steve


On 12/15/06, Tom Lane [EMAIL PROTECTED] wrote:


Hm, look at the numbers of rows removed:

 INFO:  pg_class: found 5680 removable, 3263 nonremovable row versions
in
 625 pages
 DETAIL:  0 dead row versions cannot be removed yet.

 INFO:  pg_class: found 24 removable, 3263 nonremovable row versions in
625
 pages
 DETAIL:  0 dead row versions cannot be removed yet.

 INFO:  pg_class: found 0 removable, 3263 nonremovable row versions in
625
 pages
 DETAIL:  0 dead row versions cannot be removed yet.

The lack of unremovable dead rows is good, but why were there so many
dead rows the first time?  You didn't say what the cycle time is on your
truncate-and-refill process, but the last two suggest that the average
rate of accumulation of dead pg_class rows is only a couple per minute,
in which case it's been a lot longer than an hour since the previous
VACUUM of pg_class.  I'm back to suspecting that you don't vacuum
pg_class regularly.  You mentioned having an hourly cron job to fire off
vacuums ... are you sure it's run as a database superuser?

   regards, tom lane



Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Tom Lane
Steven Flatt [EMAIL PROTECTED] writes:
 Our application is such that there is a great deal of activity at the
 beginning of the hour and minimal activity near the end of the hour.

OK ...

 The truncate and re-fill process is done once per hour, at the end of the
 high-load cycle, so I doubt that's even a big contributor to the number of
 removable rows in pg_class.

Oh, then where *are* the removable rows coming from?  At this point I
think that the truncate/refill thing is not the culprit, or at any rate
is only one part of a problematic usage pattern that we don't see all of
yet.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Steven Flatt

Good question, and I agree with your point.

Are the removable rows in pg_class even an issue?  So what if 5000-6000 dead
tuples are generated every hour then vacuumed?  Performance continues to
steadily decline over a few days time.  Memory usage does not appear to be
bloating.  Open file handles remain fairly fixed.  Is there anything else I
can monitor (perhaps something to do with the odbc connection) that I could
potentially correlate with the degrading performance?

Steve


On 12/15/06, Tom Lane [EMAIL PROTECTED] wrote:


Oh, then where *are* the removable rows coming from?  At this point I
think that the truncate/refill thing is not the culprit, or at any rate
is only one part of a problematic usage pattern that we don't see all of
yet.

   regards, tom lane



Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Tom Lane
Steven Flatt [EMAIL PROTECTED] writes:
 Are the removable rows in pg_class even an issue?  So what if 5000-6000 dead
 tuples are generated every hour then vacuumed?  Performance continues to
 steadily decline over a few days time.  Memory usage does not appear to be
 bloating.  Open file handles remain fairly fixed.  Is there anything else I
 can monitor (perhaps something to do with the odbc connection) that I could
 potentially correlate with the degrading performance?

At this point I think the most productive thing for you to do is to try
to set up a self-contained test case that reproduces the slowdown.  That
would allow you to poke at it without disturbing your production system,
and would let other people look at it too.  From what you've said, I'd
try a simple little program that inserts some data into a temp table,
truncates the table, and repeats, as fast as it can, using the same SQL
commands as your real code and similar but dummy data.  It shouldn't
take long to observe the slowdown if it occurs.  If you can't reproduce
it in isolation then we'll know that some other part of your application
environment is contributing to the problem; if you can, I'd be happy to
look at the test case with gprof or oprofile and find out exactly what's
going on.
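
For what it's worth, a bare-bones libpq version of the loop Tom describes
might look like the following. The connection string, table definition and
row count are placeholders to be adjusted to match the real workload:

/* temp_table_loop.c -- self-contained reproduction attempt as suggested above.
 * Build: cc temp_table_loop.c -lpq -o temp_table_loop
 * The connection string, table shape and row count are placeholders. */
#include <stdio.h>
#include <sys/time.h>
#include <libpq-fe.h>

static double now_sec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static void run(PGconn *conn, const char *sql)
{
    PGresult *res = PQexec(conn, sql);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "%s failed: %s", sql, PQerrorMessage(conn));
    PQclear(res);
}

int main(void)
{
    int cycle;
    PGconn *conn = PQconnectdb("dbname=test");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    run(conn, "CREATE TEMP TABLE t (id int, payload text)");
    for (cycle = 0; cycle < 1000; cycle++)
    {
        double t0 = now_sec();
        run(conn, "INSERT INTO t "
                  "SELECT g, 'dummy data' FROM generate_series(1, 10000) g");
        run(conn, "TRUNCATE t");
        printf("cycle %d: %.3f s\n", cycle, now_sec() - t0);
    }

    PQfinish(conn);
    return 0;
}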

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Steven Flatt

I've been trying to reproduce the problem for days now :).  I've done pretty
much exactly what you describe below, but I can't reproduce the problem on
any of our lab machines.  Something is indeed special in this environment.

Thanks for all your help,

Steve


On 12/15/06, Tom Lane [EMAIL PROTECTED] wrote:


Steven Flatt [EMAIL PROTECTED] writes:
 Are the removable rows in pg_class even an issue?  So what if 5000-6000
dead
 tuples are generated every hour then vacuumed?  Performance continues to
 steadily decline over a few days time.  Memory usage does not appear to
be
 bloating.  Open file handles remain fairly fixed.  Is there anything
else I
 can monitor (perhaps something to do with the odbc connection) that I
could
 potentially correlate with the degrading performance?

At this point I think the most productive thing for you to do is to try
to set up a self-contained test case that reproduces the slowdown.  That
would allow you to poke at it without disturbing your production system,
and would let other people look at it too.  From what you've said, I'd
try a simple little program that inserts some data into a temp table,
truncates the table, and repeats, as fast as it can, using the same SQL
commands as your real code and similar but dummy data.  It shouldn't
take long to observe the slowdown if it occurs.  If you can't reproduce
it in isolation then we'll know that some other part of your application
environment is contributing to the problem; if you can, I'd be happy to
look at the test case with gprof or oprofile and find out exactly what's
going on.

   regards, tom lane



Re: [PERFORM] Insertion to temp table deteriorating over time

2006-12-15 Thread Tom Lane
Steven Flatt [EMAIL PROTECTED] writes:
 I've been trying to reproduce the problem for days now :).  I've done pretty
 much exactly what you describe below, but I can't reproduce the problem on
 any of our lab machines.  Something is indeed special in this environment.

Yuck.  You could try strace'ing the problem backend and see if anything
is visibly different between fast and slow operation.  I don't suppose
you have oprofile on that machine, but if you did it'd be even better.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] File Systems Compared

2006-12-15 Thread Bruno Wolff III
On Fri, Dec 15, 2006 at 10:34:15 -0600,
  Bruno Wolff III [EMAIL PROTECTED] wrote:
  The reply wasn't directly copied to the performance list, but I will
  copy this one back.

Sorry about this one, I meant to intersperse my replies and hit the 'y'
key at the wrong time. (And there ended up being a copy on performance
anyway from the news gateway.)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Alexander Staubo

On Dec 15, 2006, at 17:53 , Ron wrote:


At 09:50 AM 12/15/2006, Greg Smith wrote:

On Fri, 15 Dec 2006, Merlin Moncure wrote:

The slower is probably due to the unroll loops switch which can  
actually hurt code due to the larger footprint (less cache  
coherency).


The cache issues are so important with current processors that I'd  
suggest throwing -Os (optimize for size) into the mix people  
test.  That one may stack usefully with -O2, but probably not with  
-O3 (3 includes optimizations that increase code size).


-Os
Optimize for size. -Os enables all -O2 optimizations that do not  
typically increase code size. It also performs further  
optimizations designed to reduce code size.


So far I have been compiling PostgreSQL and running my pgbench script  
manually, but this makes me want to modify my script to run pgbench  
automatically using all possible permutations of a set of compiler  
flags.


Last I tried GCC to produce 32-bit code on this Opteron system,  
though, it complained about the lack of a compiler; can I persuade it  
to generate 32-bit code (or 32-bit pointers for that matter) without  
going the cross-compilation route?


Alexander.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Michael Stone

On Fri, Dec 15, 2006 at 12:24:46PM -0500, Ron wrote:
ATM, the most we can say is that in a number of systems with modest 
physical IO subsystems 



So I reran it on a 3.2GHz Xeon with 6G RAM off a ramdisk; I/O ain't the 
bottleneck on that one. Results didn't show any significant 
gains regardless of compilation options (results hovered around 12k 
tps). If people want to continue this, I will point out that they should 
make sure they're linked against the optimized libpq rather than an 
existing one elsewhere in the library path. Beyond that, I'm done with 
this thread. Maybe there are some gains to be found somewhere, but the 
testing done thus far (while limited) is sufficient, IMO, to demonstrate 
that compiler options aren't going to provide a blow-your-socks-off 
dramatic performance improvement.


Mike Stone

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] opportunity to benchmark a quad core Xeon

2006-12-15 Thread Jeff Frost
I have the opportunity to benchmark a system based on a Supermicro 6015B-8V. 
It has 2x Quad Xeon E5320 1.86GHZ, 4GB DDR2 533, 1x 73GB 10k SCSI.


http://www.supermicro.com/products/system/1U/6015/SYS-6015B-8V.cfm

We can add more RAM and drives for testing purposes.  Can someone suggest what 
benchmarks with what settings would be desirable to see how this system 
performs?  I don't believe I've seen any postgres benchmarks done on a quad 
Xeon yet.


--
Jeff Frost, Owner   [EMAIL PROTECTED]
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---(end of broadcast)---
TIP 6: explain analyze is your friend