Jens Schipkowski wrote:
Thanks a lot to all for your tips.
Of course, I am doing all the INSERTs using a transaction. So the cost
per INSERT dropped from 30 ms to 3 ms.
The improvement factor matches the hint by Brian Hurt.
Sorry, I forgot to mention we are using PostgreSQL 8.1.4.
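The effect described above (one commit per INSERT versus one commit for the whole batch) can be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module rather than PostgreSQL so that it runs without a server, and the table name t, the column v, and the helper function names are all hypothetical. Absolute timings will not match the 30 ms and 3 ms figures reported above.

```python
import sqlite3
import time


def make_db():
    """Create an in-memory database with a hypothetical one-column table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    return conn


def insert_commit_per_row(conn, rows):
    """Commit after every INSERT, i.e. one transaction per row."""
    for r in rows:
        conn.execute("INSERT INTO t (v) VALUES (?)", (r,))
        conn.commit()


def insert_single_transaction(conn, rows):
    """Run all INSERTs inside one transaction and commit once."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany("INSERT INTO t (v) VALUES (?)", [(r,) for r in rows])


def row_count(conn):
    """Return the number of rows currently in the table."""
    return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]


if __name__ == "__main__":
    for fn in (insert_commit_per_row, insert_single_transaction):
        conn = make_db()
        start = time.perf_counter()
        fn(conn, range(10000))
        elapsed = time.perf_counter() - start
        print(fn.__name__, row_count(conn), "rows in", round(elapsed, 4), "s")
        conn.close()
```

On a disk-backed database with synchronous commits, the per-row variant pays one flush per INSERT, which is where a difference of this kind usually comes from.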
Mike,
are you using -mtune/-mcpu or -march with GCC?
Which GCC version? Are you working with a 32-bit OS or a 64-bit one?
Daniel
On 12/11/06, Michael Stone [EMAIL PROTECTED] wrote:
Can anyone else reproduce these results? I'm on similar hardware (2.5GHz
P4, 1.5G RAM) and my test results are more
On 12/12/06, Tom Lane [EMAIL PROTECTED] wrote:
Axel Waggershauser [EMAIL PROTECTED] writes:
I tested different sizes on linux some time ago and found that 64KB
was optimal. But playing with different sizes again revealed that my
windows-linux problem seems to be solved if I use _any_ other
On 12.12.2006, at 02:37, Michael Stone wrote:
Can anyone else reproduce these results? I'm on similar hardware
(2.5GHz P4, 1.5G RAM) and my test results are more like this:
I'm on totally different hardware / software (MacBook Pro 2.33GHz
C2D) and I can't reproduce the tests.
I have
On Dec 11, 2006, at 23:22 , Daniel van Ham Colchete wrote:
I ran this test on a Gentoo test machine I have here. It's a Pentium 4
3.0GHz (I don't know which P4)
Try cat /proc/cpuinfo.
TEST RESULTS
==
On a dual-core Opteron 280 with 4G RAM with an LSI PCI-X Fusion-MPT
SAS
On Tue, Dec 12, 2006 at 01:35:04AM -0500, Greg Smith wrote:
These changes could easily explain the magnitude of difference in results
you're seeing, especially when combined with a 20% greater raw CPU clock.
I'm not interested in comparing the numbers between the systems (which
is obviously
On Tue, Dec 12, 2006 at 07:10:34AM -0200, Daniel van Ham Colchete wrote:
are you using -mtune/-mcpu or -march with GCC?
I used exactly the options you said you used.
Which GCC version? Are you working with a 32-bit OS or a 64-bit one?
3.3.5; 32
Mike Stone
On Tue, Dec 12, 2006 at 12:29:29PM +0100, Alexander Staubo wrote:
I suspect the hardware's real maximum performance of the system is
~150 tps, but that the LSI's write cache is buffering the writes. I
would love to validate this hypothesis, but I'm not sure how.
With fsync off? The write
Luke Lonergan wrote:
Can you try this with just -O3 versus -O2?
Thanks to Daniel for doing these tests.
I happen to have done the same tests about 3/4 years ago,
and concluded that gcc flags did *not* influence performance.
Moved by curiosity, I revamped those tests now on a test
machine
On Tue, Dec 12, 2006 at 01:42:06PM +0100, Cosimo Streppone wrote:
-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps
For the record, -O3 = -O6
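To put the figures quoted above in proportion, the gain of each optimization level over -O0 can be computed directly. This is a small sketch using only the tps values from Cosimo's message; the dictionary and function names are hypothetical.

```python
# tps figures as quoted in the message above (all but -O0 also used
# -mcpu=pentium4 -mtune=pentium4).
TPS = {
    "-O0": 957,
    "-O1": 1186,
    "-O2": 1229,
    "-O3": 1257,
    "-O6": 1254,
}


def gain_over_baseline(tps, baseline="-O0"):
    """Return the percentage gain of each setting relative to the baseline."""
    base = tps[baseline]
    return {
        flag: round((value - base) / base * 100, 1)
        for flag, value in tps.items()
        if flag != baseline
    }


if __name__ == "__main__":
    for flag, pct in gain_over_baseline(TPS).items():
        print(f"{flag}: +{pct}% over -O0")
```

Most of the gain comes from going from -O0 to -O1. The -O3 and -O6 figures differ by 3 tps, which fits the remark that the two levels are the same.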
On Tue, Dec 12, 2006 at 01:42:06PM +0100, Cosimo Streppone wrote:
-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps
I'm curious now to get the same
On Tue, Dec 12, 2006 at 07:48:06AM -0500, Michael Stone wrote:
I'd be curious to see -O2 with and without the arch-specific flags,
since that's mostly what the discussion is about.
That came across more harshly than I intended; I apologize for that.
It's certainly a useful data point to
On Mon, 2006-12-11 at 20:22 -0200, Daniel van Ham Colchete wrote:
I'm thinking about writing a script to make all the tests (more than 3
times each), get the data and plot some graphs.
I don't have the time right now to do it, maybe next week I'll have.
Check out the OSDL test suite stuff.
On Dec 12, 2006, at 13:32 , Michael Stone wrote:
On Tue, Dec 12, 2006 at 12:29:29PM +0100, Alexander Staubo wrote:
I suspect the hardware's real maximum performance of the system
is ~150 tps, but that the LSI's write cache is buffering the
writes. I would love to validate this hypothesis,
1= In all these results I'm seeing, no one has yet reported what
their physical IO subsystem is... ...when we are benching a DB.
2= So far we've got ~ a factor of 4 performance difference between
Michael Stone's 1S 1C Netburst era 2.5GHz P4 PC and Guido Neitzer's
1S 2C MacBook Pro 2.33GHz
* Cosimo Streppone:
-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps
-mcpu and -mtune are synonymous. You really should use -march here (but
the
In response to Ron [EMAIL PROTECTED]:
3= Daniel van Ham Colchete is running Gentoo. That means every SW
component on his box has been compiled to be optimized for the HW it
is running on.
There may be a combination of effects going on for him that others
not running a system optimized
Alexander Staubo [EMAIL PROTECTED] writes:
No, fsync=on. The tps values are similarly unstable with fsync=off,
though -- I'm seeing bursts of high tps values followed by low-tps
valleys, a kind of staccato flow indicative of a write cache being
filled up and flushed.
It's notoriously
On 12/12/06, Florian Weimer [EMAIL PROTECTED] wrote:
* Cosimo Streppone:
-O0 ~ 957 tps
-O1 -mcpu=pentium4 -mtune=pentium4 ~ 1186 tps
-O2 -mcpu=pentium4 -mtune=pentium4 ~ 1229 tps
-O3 -mcpu=pentium4 -mtune=pentium4 ~ 1257 tps
-O6 -mcpu=pentium4 -mtune=pentium4 ~ 1254 tps
-mcpu and -mtune
Alexander Staubo wrote:
No, fsync=on. The tps values are similarly unstable with fsync=off,
though -- I'm seeing bursts of high tps values followed by low-tps
valleys, a kind of staccato flow indicative of a write cache being
filled up and flushed.
Databases with checkpointing
On Tue, 12 Dec 2006, Tom Lane wrote:
Um, you entirely missed the point: the hardware speedups you mention are
quite independent of any compiler options. The numbers we are looking
at are the relative speeds of two different compiles on the same
hardware, not whether hardware A is faster than
Tom Lane wrote:
Alexander Staubo [EMAIL PROTECTED] writes:
No, fsync=on. The tps values are similarly unstable with fsync=off,
though -- I'm seeing bursts of high tps values followed by low-tps
valleys, a kind of staccato flow indicative of a write cache being
filled up and
Axel Waggershauser [EMAIL PROTECTED] writes:
On 12/12/06, Tom Lane [EMAIL PROTECTED] wrote:
I think this almost certainly indicates a Nagle/delayed-ACK
interaction. I googled and found a nice description of the issue:
http://www.stuartcheshire.org/papers/NagleDelayedAck/
In case I was
On Tue, 12 Dec 2006, Alvaro Herrera wrote:
While skimming over the pgbench source it has looked to me like it's
necessary to pass the -s switch (scale factor) to both the
initialization (-i) and the subsequent (non -i) runs.
For non-custom runs, it's computed based on the number of branches.
Tom Lane wrote:
In case I was mistaken, this explanation makes perfect sense to me.
But then again it would indicate a 'bug' in libpq, in the sense that
it (apparently) sets TCP_NODELAY on linux but not on windows.
No, it would mean a bug in Windows in that it fails to honor
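For readers unfamiliar with the option under discussion, here is a minimal sketch of setting TCP_NODELAY on a socket and reading it back. It uses Python's standard socket module purely for illustration; it is not libpq's code, and the helper names are hypothetical. Reading the option back only shows what the local stack reports, not whether the platform honors it on the wire.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock):
    """Report whether the OS says TCP_NODELAY is set on this socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_nodelay_socket()
    print("TCP_NODELAY reported as set:", nodelay_enabled(s))
    s.close()
```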
Mike,
I'm running some other tests here on other hardware (also Gentoo). I
found out that PostgreSQL stops for a while if I change the -t
parameter on pgbench from 600 to 1000 and I have ~150 tps instead of
~950tps.
I don't know why PostgreSQL stopped, but it was longer than 5 seconds
and my
Alvaro Herrera [EMAIL PROTECTED] writes:
While skimming over the pgbench source it has looked to me like it's
necessary to pass the -s switch (scale factor) to both the
initialization (-i) and the subsequent (non -i) runs.
No, it's not supposed to be, and I've never found it needed in
Daniel van Ham Colchete [EMAIL PROTECTED] writes:
I'm running some other tests here on other hardware (also Gentoo). I
found out that PostgreSQL stops for a while if I change the -t
parameter on pgbench from 600 to 1000 and I have ~150 tps instead of
~950tps.
I don't know why PostgreSQL
At 10:47 AM 12/12/2006, Tom Lane wrote:
It's notoriously hard to get repeatable numbers out of pgbench :-(
That's not a good characteristic in benchmarking software...
Does the OSDL stuff have an easier time getting reproducible results?
A couple of tips:
* don't put any faith in short
I just made another test with a second Gentoo machine:
Pentium 4 3.0GHz Prescott
GCC 4.1.1
Glibc 2.4
PostgreSQL 8.1.5
Kernel 2.6.17
Same postgresql.conf as yesterday's.
First test
==
GLIBC: -O2 -march=i686
PostgreSQL: -O2 -march=i686
Results: 974.638731 975.602142 975.882051
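With only three runs per configuration, it helps to look at the mean and the run-to-run spread before comparing two builds. The sketch below does that for the three figures quoted above; the function name and the returned keys are hypothetical, and it is not part of the original test procedure.

```python
import statistics

# The three tps results quoted above for the -O2 -march=i686 build.
RUNS_I686 = [974.638731, 975.602142, 975.882051]


def summarize(runs):
    """Return mean, sample standard deviation and relative spread of the runs."""
    mean = statistics.mean(runs)
    stdev = statistics.stdev(runs)  # sample standard deviation, needs >= 2 runs
    spread_pct = (max(runs) - min(runs)) / mean * 100
    return {"mean": mean, "stdev": stdev, "spread_pct": spread_pct}


if __name__ == "__main__":
    s = summarize(RUNS_I686)
    print(f"mean tps:   {s['mean']:.3f}")
    print(f"stdev tps:  {s['stdev']:.3f}")
    print(f"max-min as % of mean: {s['spread_pct']:.2f}%")
```

A difference between two builds that is smaller than this spread would be hard to distinguish from noise with so few runs.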
At 01:35 PM 12/12/2006, Daniel van Ham Colchete wrote:
I just made another test with a second Gentoo machine:
snip
The results showed no significant change. The conclusion of today's
test would be that there is no improvement in PostgreSQL when using
-march=prescott.
I only see 3
I just made another test with a second Gentoo machine:
Pentium 4 3.0GHz Prescott
GCC 4.1.1
Glibc 2.4
PostgreSQL 8.1.5
Kernel 2.6.17
Same postgresql.conf as yesterday's.
First test
==
GLIBC: -O2 -march=i686
PostgreSQL: -O2 -march=i686
Results: 974.638731 975.602142
On Tue, 12 Dec 2006, Daniel van Ham Colchete wrote:
I'm running some other tests here on other hardware (also Gentoo). I
found out that PostgreSQL stops for a while if I change the -t
parameter on pgbench from 600 to 1000 and I have ~150 tps instead of
~950tps.
Sure sounds like a checkpoint
[EMAIL PROTECTED] (Alexander Staubo) wrote:
On Dec 12, 2006, at 13:32 , Michael Stone wrote:
On Tue, Dec 12, 2006 at 12:29:29PM +0100, Alexander Staubo wrote:
I suspect the hardware's real maximum performance of the system is
~150 tps, but that the LSI's write cache is buffering the
writes.