Re: [PERFORM] Postgres Benchmark Results

2007-06-02 Thread Josh Berkus

PFC,

Thanks for doing those graphs.  They've been used by Simon, Heikki, and 
now me, to show our main issue with PostgreSQL performance: consistency. 
That is, our median response time beats MySQL and even Oracle, but our 
bottom 10% does not, and is in fact intolerably bad.


If you want us to credit you by your real name, let us know what it is.

Thanks!

--Josh Berkus



Re: [PERFORM] Postgres Benchmark Results

2007-05-28 Thread PFC

On Mon, 28 May 2007 05:53:16 +0200, Chris [EMAIL PROTECTED] wrote:



I am re-running it with other tuning, notably cost-based vacuum  
delay and less frequent checkpoints, and it is a *lot* smoother.
These take a full night to run, so I'll post more results when I  
have useful stuff to show.

This has proven to be a very interesting trip to benchmarkland...


[ rather late in my reply but I had to ]

Are you tuning MySQL in a similar fashion?


	Well, the tuning knobs are different (there are no checkpoints or  
vacuum), but yes, I tried to tune MySQL too. The hardest part was  
simply making it work without deadlocking continuously.







Re: [PERFORM] Postgres Benchmark Results

2007-05-27 Thread Chris


I am re-running it with other tuning, notably cost-based vacuum 
delay and less frequent checkpoints, and it is a *lot* smoother.
These take a full night to run, so I'll post more results when I 
have useful stuff to show.

This has proven to be a very interesting trip to benchmarkland...


[ rather late in my reply but I had to ]

Are you tuning MySQL in a similar fashion?

--
PostgreSQL & PHP tutorials
http://www.designmagick.com/



Re: [PERFORM] Postgres Benchmark Results

2007-05-23 Thread Gregory Stark
Greg Smith [EMAIL PROTECTED] writes:

 On Tue, 22 May 2007, Gregory Stark wrote:

 However as mentioned a while back, in practice it doesn't work quite right
 and you should expect to get 1/2 the expected performance. So even with 10
 clients you should expect to see 5*120 tps on a 7200 RPM drive and 5*250 tps
 on a 15k RPM drive.

 I would agree that's the approximate size of the upper bound.  There are so
 many factors that go into the effectiveness of commit_delay that I wouldn't
 word it so strongly as to say you can expect that much benefit.  The exact
 delay amount (which can be hard to set if your client load varies greatly),
 size of the transactions, balance of seek-bound reads vs. memory-based ones
 in the transactions, serialization in the transaction stream, and many other
 things can reduce the effective benefit.

This is without commit_delay set at all. Just the regular WAL sync behaviour.

 Also, there are generally other performance issues in the types of systems
 you would think would get the most benefit from this parameter that end up
 slowing things down anyway.  I've been seeing a best case of closer to
 2*single tps rather than 5* on my single-drive systems with no write caching,
 but I'll admit I haven't done an exhaustive look at it yet (too busy with the
 real systems that have good controllers).  One of these days...

Certainly there can be other bottlenecks you reach before WAL fsyncs become
your limiting factor. If your transactions are reading significant amounts of
data you'll be limited by i/o from your data drives. If your data is on the
same drive as your WAL your seek times will be higher than the rotational
latency too.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Guido Neitzer

On 21.05.2007 at 23:51, Greg Smith wrote:

The standard pgbench transaction includes a select, an insert, and  
three updates.


I see. Didn't know that, but it makes sense.

Unless you went out of your way to turn it off, your drive is  
caching writes; every Seagate SATA drive I've ever seen does by  
default.  1062 tps with 3-4 clients just isn't possible with your  
hardware otherwise.


Btw: it wasn't my hardware in this test!

cug



Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Zoltan Boszormenyi

Jim C. Nasby wrote:

On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
  

I also went into benchmarking mode last night for my own
amusement when I read on the linux-kernel ML that
NCQ support for nForce5 chips was released.
I tried current PostgreSQL 8.3devel CVS.
pgbench over local TCP connection with
25 clients and 3000 transacts/client gave me
around 445 tps before applying NCQ support.
680 tps after.

It went over 840 tps after adding HOT v7 patch,
still with 25 clients. It topped at 1062 tps with 3-4 clients.
I used a single Seagate 320GB SATA2 drive
for the test, which only has less than 40GB free.
So it's already at the end of the disk, giving smaller
transfer rates than at the beginning. Filesystem is ext3.
Dual core Athlon64 X2 4200 in 64-bit mode.
I have never seen such a performance before
on a desktop machine.



I'd be willing to bet money that the drive is lying about commits/fsync.
  


It could well be the case.


Each transaction committed essentially requires one revolution of the
drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.
  


By revolution, you mean one full 360-degree turn of the platter, yes?
On the other hand, if you have multiple clients, isn't the 250 COMMITs/sec
limit true only per client? Of course, that assumes the disk subsystem
has more TCQ/NCQ threads than the actual number of DB clients.


BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
data=writeback. Note that doing that probably has a negative impact on
data recovery after a crash for non-database files.
  


I haven't touched the FS options.
I can even use ext2 if I want non-recoverability. :-)

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/




Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Zoltan Boszormenyi

Greg Smith wrote:

On Mon, 21 May 2007, Guido Neitzer wrote:

Yes, that's right, but if a lot of the transactions are selects, there 
is no entry in the xlog for them and most of the stuff can come from 
the cache - read from memory, which is blazing fast compared to any 
disk ... And this was a pgbench test - I don't know what the 
benchmark really does, but if I remember correctly it is mostly reading.


The standard pgbench transaction includes a select, an insert, and 
three updates.  All five finished equals one transaction; the fact 
that the SELECT statement in there could be executed much faster were 
it to happen on its own doesn't matter.


Because it does the most work on the biggest table, the entire 
combination is usually driven mostly by how long the UPDATE to the 
accounts table takes.  The TPS numbers can certainly be no larger than 
the rate at which you can execute that.


As has been pointed out, every time you commit a transaction the disk 
has to actually write that out before it's considered complete.  
Unless you have a good caching disk controller (which your nForce5 is 
not) you're limited to 120 TPS with a 7200 RPM drive and 250 with a 
15000 RPM one. While it's possible to improve slightly on this using 
the commit_delay feature, I haven't been able to replicate even a 100% 
improvement that way when running pgbench, and to get even close to 
that level of improvement would require a large number of clients.


Unless you went out of your way to turn it off, your drive is caching 
writes; every Seagate SATA drive I've ever seen does by default.  
1062 tps with 3-4 clients just isn't possible with your hardware 
otherwise. If you turn that feature off with:


hdparm -W0 /dev/hda (might be /dev/sda with the current driver)

that will disable the disk caching and you'll be reporting accurate 
numbers--which will be far lower than you're seeing now.


And AFAIR, according to a comment on LKML some time ago,
it greatly decreases your disk's MTBF as well.
But thanks for the great insights, anyway.
I already knew that nForce5 is not a caching controller. :-)
I meant it's a good desktop performer.
And having a good UPS and a somewhat oversized Enermax PSU
helps avoid crashes on a sometimes erratic power line.

While your results are an interesting commentary on how fast the 
system can run when it has a write cache available, and the increase 
with recent code is interesting, your actual figures here are a 
fantasy.  The database isn't working properly, and a real system using 
this hardware would be expected to become corrupted if run for long 
enough.  I have a paper at 
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm you 
might want to read that goes into more detail than you probably want 
to know on this subject if you'd like to read more about it--and you 
really, really should if you intend to put important data into a 
PostgreSQL database.


Thanks, I will read it.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD





--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/




Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Peter Schuller
 - Deferred Transactions: since adding a comment to a blog post
 doesn't need the same guarantees as submitting a paid order, it makes
 sense that the application could tell postgres which transactions we
 care about if power is lost. This will massively boost performance for
 websites, I believe.

This would be massively useful. Very often all I care about is that the
transaction is semantically committed; that is, that other transactions
starting from that moment will see the modifications done. As opposed to
actually persisting data to disk.

In particular I have a situation where I attempt to utilize available
hardware by using concurrency. The problem is that I have to either
hugely complicate my client code or COMMIT more often than I would like
in order to satisfy dependencies between different transactions. If a
deferred/delayed commit were possible I could get all the performance
benefit without the code complexity, and with no penalty (because in
this case persistence is not important).

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org






Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 Scott Marlowe wrote:
 
 I thought you were limited to 250 or so COMMITs to disk per second, and 
 since more than one client can be committed at once, you could do greater 
 than 250 tps, as long as you had more than one client providing input.  Or
 was I wrong?

 My impression is that you are correct in theory -- this is the commit
 delay feature.  But it seems that the feature does not work as well as
 one would like; and furthermore, it is disabled by default.

Even without commit delay a client will commit any pending WAL records when it
syncs the WAL. The clients waiting to commit their records will find it
already synced when they get woken up.

However as mentioned a while back, in practice it doesn't work quite right and
you should expect to get 1/2 the expected performance. So even with 10 clients
you should expect to see 5*120 tps on a 7200 RPM drive and 5*250 tps on a
15k RPM drive.
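
As a back-of-the-envelope sketch of that model (one commit group per platter
revolution, with the group in practice only about half the waiting clients;
numbers are illustrative only):

  # Rough WAL-commit ceiling for one drive with no write cache.
  def max_tps(rpm, clients):
      revs_per_sec = rpm / 60.0               # one commit group per revolution
      group_size = max(1.0, clients / 2.0)    # ~1/2 of the waiting clients
      return revs_per_sec * group_size

  for rpm in (7200, 15000):
      print("%5d RPM, 10 clients: ~%d tps" % (rpm, max_tps(rpm, 10)))
  # -> 7200 RPM: ~600 tps; 15000 RPM: ~1250 tps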

Heikki posted a patch that experimented with fixing this. Hopefully it'll be
fixed for 8.4.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Gregory Stark

What's interesting here is that on a couple metrics the green curve is
actually *better* until it takes that nosedive at 500 MB. Obviously it's not
better on average hits/s, the most obvious metric. But on deviation and
worst-case hits/s it's actually doing better.

Note that while the average hits/s between 100 and 500 is over 600 tps for
Postgres there is a consistent smattering of plot points spread all the way
down to 200 tps, well below the 400-500 tps that MySQL is getting.

Some of those are undoubtedly caused by things like checkpoints and vacuum
runs. Hopefully the improvements that are already in the pipeline will reduce
them.

I mention this only to try to move some of the focus from the average
performance to trying to remove the pitfalls that affect 1-10% of transactions
and screw the worst-case performance. In practical terms it's the worst case
that governs perceptions, not the average case.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread PFC
Note that while the average hits/s between 100 and 500 is over 600 tps for
Postgres, there is a consistent smattering of plot points spread all the way
down to 200 tps, well below the 400-500 tps that MySQL is getting.


	Yes, these are mostly due to checkpointing.
	Also, note that a real forum would not insert 100 posts/s, so it would  
not feel this effect. But in order to finish the benchmark in a reasonable  
amount of time, we have to push on the inserts.


Some of those are undoubtedly caused by things like checkpoints and vacuum
runs. Hopefully the improvements that are already in the pipeline will
reduce them.


	I am re-running it with other tuning, notably cost-based vacuum delay and  
less frequent checkpoints, and it is a *lot* smoother.
	These take a full night to run, so I'll post more results when I have  
useful stuff to show.
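
	For the curious, the knobs involved are the cost-based vacuum delay
settings and the checkpoint settings. A sketch of the kind of values being
tried, written out as a Python dict for reference (the values are
illustrative placeholders, not recommendations):

  # 8.2-era GUCs varied between runs; values are examples, not advice.
  run_settings = {
      "vacuum_cost_delay":   "20ms",   # throttle vacuum I/O (0 = disabled)
      "vacuum_cost_limit":   "200",
      "checkpoint_segments": "64",     # fewer, larger checkpoints
      "checkpoint_timeout":  "15min",
  }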

This has proven to be a very interesting trip to benchmarkland...



Re: [PERFORM] Postgres Benchmark Results

2007-05-22 Thread Greg Smith

On Tue, 22 May 2007, Gregory Stark wrote:


However as mentioned a while back, in practice it doesn't work quite right and
you should expect to get 1/2 the expected performance. So even with 10 clients
you should expect to see 5*120 tps on a 7200 RPM drive and 5*250 tps on a
15k RPM drive.


I would agree that's the approximate size of the upper bound.  There are 
so many factors that go into the effectiveness of commit_delay that I 
wouldn't word it so strongly as to say you can expect that much benefit. 
The exact delay amount (which can be hard to set if your client load 
varies greatly), size of the transactions, balance of seek-bound reads vs. 
memory-based ones in the transactions, serialization in the transaction 
stream, and many other things can reduce the effective benefit.


Also, there are generally other performance issues in the types of systems 
you would think would get the most benefit from this parameter that end up 
slowing things down anyway.  I've been seeing a best case of closer to 
2*single tps rather than 5* on my single-drive systems with no write 
caching, but I'll admit I haven't done an exhaustive look at it yet (too 
busy with the real systems that have good controllers).  One of these 
days...


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Jim C. Nasby
On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
 I also went into benchmarking mode last night for my own
 amusement when I read on the linux-kernel ML that
 NCQ support for nForce5 chips was released.
 I tried current PostgreSQL 8.3devel CVS.
 pgbench over local TCP connection with
 25 clients and 3000 transacts/client gave me
 around 445 tps before applying NCQ support.
 680 tps after.
 
 It went over 840 tps after adding HOT v7 patch,
 still with 25 clients. It topped at 1062 tps with 3-4 clients.
 I used a single Seagate 320GB SATA2 drive
 for the test, which only has less than 40GB free.
 So it's already at the end of the disk, giving smaller
 transfer rates than at the beginning. Filesystem is ext3.
 Dual core Athlon64 X2 4200 in 64-bit mode.
 I have never seen such a performance before
 on a desktop machine.

I'd be willing to bet money that the drive is lying about commits/fsync.
Each transaction committed essentially requires one revolution of the
drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.

BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
data=writeback. Note that doing that probably has a negative impact on
data recovery after a crash for non-database files.
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread PFC


Well that matches up well with my experience; better yet, file a  
performance bug with the commercial support and you'll get an explanation  
of why your schema (or your hardware, well, anything but the database  
software used) is the guilty factor.


	Yeah, I filed a bug last week since REPEATABLE READ isn't repeatable: it  
works for SELECT, but INSERT INTO ... SELECT switches to READ COMMITTED and  
thus does not insert the same rows that the same SELECT would have  
returned.
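
	The test case boils down to something like this (a rough sketch from
memory using the MySQLdb driver; table names and connection details are
made up):

  import MySQLdb

  # Connection 'a' holds a REPEATABLE READ snapshot; 'b' writes concurrently.
  a = MySQLdb.connect(db="test")
  b = MySQLdb.connect(db="test")
  ca, cb = a.cursor(), b.cursor()

  ca.execute("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ")
  ca.execute("SELECT COUNT(*) FROM src")        # snapshot starts here
  before = ca.fetchone()[0]

  cb.execute("INSERT INTO src VALUES (42)")     # committed concurrent write
  b.commit()

  ca.execute("SELECT COUNT(*) FROM src")        # repeatable: same count
  assert ca.fetchone()[0] == before
  ca.execute("INSERT INTO dst SELECT * FROM src")   # ...but this DOES see
  a.commit()                                        # the new row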


but you know these IT manager journals consider mysql to be the relevant  
open-source database. Guess it matches their expectations better than PG  
or, say, MaxDB (the artist formerly known as SAP DB).


Never tried MaxDB.

	So far, my MyISAM benchmarks show that, while in the CPU-limited case  
Postgres is faster (even on small simple selects), when the dataset grows  
larger MyISAM keeps going much better than Postgres. That was to be  
expected, since its tables are more compact, it can read indexes without  
hitting the tables, and of course it doesn't have transaction overhead.


	However, these good results are slightly mitigated by the massive data  
corruption and complete mayhem that ensues, either from transactions  
aborting mid-way (which obviously can't be rolled back, leaving stuff with  
broken relations) or from plain simple engine bugs which replace your data  
with crap. After about half an hour of hitting the tables hard, they start  
to corrupt and you get cryptic error messages. Fortunately, REPAIR TABLE  
provides good consolation by telling you how much corrupt data it had to  
erase from your table... really reassuring!


	I believe the following current or future Postgres features will provide  
an interesting answer to MyISAM:


- The fact that it doesn't corrupt your data, duh.
- HOT
- the new non-logged tables
	- Deferred Transactions: since adding a comment to a blog post doesn't  
need the same guarantees as submitting a paid order, it makes sense that  
the application could tell postgres which transactions we care about if  
power is lost. This will massively boost performance for websites, I  
believe. (See the sketch after this list.)

- the patch that keeps tables in approximate cluster order
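
	From the application, the switch I have in mind would look something like
this (a sketch using psycopg2 against a hypothetical GUC; nothing like it
exists in 8.2):

  import psycopg2

  conn = psycopg2.connect("dbname=forum")   # made-up connection string
  cur = conn.cursor()
  # Hypothetical per-transaction knob: don't wait for the WAL fsync.
  cur.execute("SET LOCAL synchronous_commit = off")
  cur.execute("INSERT INTO comments (post_id, body) VALUES (%s, %s)",
              (1, "nice post!"))
  conn.commit()   # returns once the commit is queued, not once it is on disk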

	By the way, about the ALTER TABLE SET PERSISTENCE ... for non-logged  
tables: will we get an ON RECOVER trigger?
	For instance, I have count tables that are often updated by triggers. On  
recovery, I could simply re-create the counts from the actual data, and so  
I could use the extra speed of non-crash-proof tables.




Andreas









Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread PFC
On Mon, 21 May 2007 23:05:22 +0200, Jim C. Nasby [EMAIL PROTECTED]  
wrote:



On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:


I felt the world needed a new benchmark ;)
So: forum-style benchmark with simulation of many users posting and
viewing forums and topics on a PHP website.

http://home.peufeu.com/ftsbench/forum1.png


Any chance of publishing your benchmark code so others can do testing?
It sounds like a useful, well-thought-out benchmark (even if it is
rather specialized).


	Yes, that was the intent from the start.
	It is specialized because forums are one of the famous server killers.  
This is mostly due to bad database design, bad PHP skills, and the  
horrendous MySQL FULLTEXT.
	I'll have to clean up the code and document it for public consumption,  
though.
	However, the Python client is too slow. It saturates at about 1000 hits/s  
on an Athlon 64 3000+, so you can forget about benchmarking anything meaner  
than a Core 2 Duo.



Also, I think it's important for you to track how long it takes to
respond to requests, both average and maximum. In a web application no
one's going to care if you're doing 1000 TPS if it means that every time
you click on something it takes 15 seconds to get the next page back.
With network round-trip times and what-not considered, I'd say you don't
want it to take any more than 200-500ms between when a request hits a
webserver and when the last bit of data has gone back to the client.


Yeah, I will do that too.
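
	Probably with something like this bolted onto the client (a sketch;
fetch_page stands in for the real HTTP call and is an assumption, not
actual client code):

  import time

  latencies = []

  def timed_request(url):
      t0 = time.time()
      page = fetch_page(url)          # placeholder for the client's HTTP call
      latencies.append(time.time() - t0)
      return page

  def report():
      xs = sorted(latencies)
      avg = sum(xs) / len(xs)
      p95 = xs[int(0.95 * (len(xs) - 1))]
      print("avg %.0f ms  p95 %.0f ms  max %.0f ms"
            % (avg * 1000, p95 * 1000, xs[-1] * 1000))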


I'm guessing that there's about 600MB of memory available for disk
caching? (Well, 600MB minus whatever shared_buffers is set to).


It's about that. The machine has 1 GB of RAM.





Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Guido Neitzer

On 21.05.2007 at 15:01, Jim C. Nasby wrote:

I'd be willing to bet money that the drive is lying about commits/fsync.

Each transaction committed essentially requires one revolution of the
drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.


Yes, that's right, but if a lot of the transactions are selects, there  
is no entry in the xlog for them and most of the stuff can come from  
the cache - read from memory, which is blazing fast compared to any  
disk ... And this was a pgbench test - I don't know what the  
benchmark really does, but if I remember correctly it is mostly reading.


cug





Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Rich

I assume red is PostgreSQL. As you add connections, MySQL always dies.

On 5/20/07, PFC [EMAIL PROTECTED] wrote:



I felt the world needed a new benchmark ;)
So: forum-style benchmark with simulation of many users posting and
viewing forums and topics on a PHP website.

http://home.peufeu.com/ftsbench/forum1.png

One of those curves is a very popular open-source database which claims
to offer unparalleled speed.
The other one is of course Postgres 8.2.3, which by popular belief is
full-featured but slow.

What is your guess?




Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Scott Marlowe

Jim C. Nasby wrote:

On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
  

I also went into benchmarking mode last night for my own
amusement when I read on the linux-kernel ML that
NCQ support for nForce5 chips was released.
I tried current PostgreSQL 8.3devel CVS.
pgbench over local TCP connection with
25 clients and 3000 transacts/client gave me
around 445 tps before applying NCQ support.
680 tps after.

It went over 840 tps after adding HOT v7 patch,
still with 25 clients. It topped at 1062 tps with 3-4 clients.
I used a single Seagate 320GB SATA2 drive
for the test, which only has less than 40GB free.
So it's already at the end of the disk, giving smaller
transfer rates than at the beginning. Filesystem is ext3.
Dual core Athlon64 X2 4200 in 64-bit mode.
I have never seen such a performance before
on a desktop machine.



I'd be willing to bet money that the drive is lying about commits/fsync.
Each transaction committed essentially requires one revolution of the
drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.

BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
data=writeback. Note that doing that probably has a negative impact on
data recovery after a crash for non-database files.
  


I thought you were limited to 250 or so COMMITs to disk per second, and 
since more than one client can be committed at once, you could do greater 
than 250 tps, as long as you had more than one client providing input.  Or 
was I wrong?




Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Alvaro Herrera
Scott Marlowe wrote:
 Jim C. Nasby wrote:
 On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
   
 I also went into benchmarking mode last night for my own
 amusement when I read on the linux-kernel ML that
 NCQ support for nForce5 chips was released.
 I tried current PostgreSQL 8.3devel CVS.
 pgbench over local TCP connection with
 25 clients and 3000 transacts/client gave me
 around 445 tps before applying NCQ support.
 680 tps after.
 
 It went over 840 tps after adding HOT v7 patch,
 still with 25 clients. It topped at 1062 tps with 3-4 clients.
 I used a single Seagate 320GB SATA2 drive
 for the test, which only has less than 40GB free.
 So it's already at the end of the disk, giving smaller
 transfer rates than at the beginning. Filesystem is ext3.
 Dual core Athlon64 X2 4200 in 64-bit mode.
 I have never seen such a performance before
 on a desktop machine.
 
 
 I'd be willing to bet money that the drive is lying about commits/fsync.
 Each transaction committed essentially requires one revolution of the
 drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
 
 BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
 data=writeback. Note that doing that probably has a negative impact on
 data recovery after a crash for non-database files.
   
 
 I thought you were limited to 250 or so COMMITs to disk per second, and 
 since more than one client can be committed at once, you could do greater 
 than 250 tps, as long as you had more than one client providing input.  Or 
 was I wrong?

My impression is that you are correct in theory -- this is the commit
delay feature.  But it seems that the feature does not work as well as
one would like; and furthermore, it is disabled by default.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Postgres Benchmark Results

2007-05-21 Thread Greg Smith

On Mon, 21 May 2007, Guido Neitzer wrote:

Yes, that's right, but if a lot of the transactions are selects, there is no 
entry in the xlog for them and most of the stuff can come from the cache - 
read from memory, which is blazing fast compared to any disk ... And this was 
a pgbench test - I don't know what the benchmark really does, but if I 
remember correctly it is mostly reading.


The standard pgbench transaction includes a select, an insert, and three 
updates.  All five finished equals one transaction; the fact that the 
SELECT statement in there could be executed much faster were it to happen 
on its own doesn't matter.
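
For reference, the statements in question, from memory (pre-8.4 table
names; pgbench itself substitutes the :variables), written as a Python
list of SQL strings:

  # The standard TPC-B-style pgbench transaction, from memory.
  PGBENCH_TRANSACTION = [
      "BEGIN",
      "UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid",
      "SELECT abalance FROM accounts WHERE aid = :aid",
      "UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid",
      "UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid",
      "INSERT INTO history (tid, bid, aid, delta, mtime) "
      "VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP)",
      "END",
  ]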


Because it does the most work on the biggest table, the entire combination 
is usually driven mostly by how long the UPDATE to the accounts table 
takes.  The TPS numbers can certainly be no larger than the rate at which 
you can execute that.


As has been pointed out, every time you commit a transaction the disk has 
to actually write that out before it's considered complete.  Unless you 
have a good caching disk controller (which your nForce5 is not) you're 
limited to 120 TPS with a 7200 RPM drive and 250 with a 15000 RPM one. 
While it's possible to improve slightly on this using the commit_delay 
feature, I haven't been able to replicate even a 100% improvement that way 
when running pgbench, and to get even close to that level of improvement 
would require a large number of clients.


Unless you went out of your way to turn it off, your drive is caching 
writes; every Seagate SATA drive I've ever seen does by default.  1062 
tps with 3-4 clients just isn't possible with your hardware otherwise. 
If you turn that feature off with:


hdparm -W0 /dev/hda (might be /dev/sda with the current driver)

that will disable the disk caching and you'll be reporting accurate 
numbers--which will be far lower than you're seeing now.


While your results are an interesting commentary on how fast the system 
can run when it has a write cache available, and the increase with recent 
code is interesting, your actual figures here are a fantasy.  The database 
isn't working properly, and a real system using this hardware would be 
expected to become corrupted if run for long enough.  I have a paper at 
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm you 
might want to read that goes into more detail than you probably want to 
know on this subject if you'd like to read more about it--and you really, 
really should if you intend to put important data into a PostgreSQL 
database.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



[PERFORM] Postgres Benchmark Results

2007-05-20 Thread PFC


I felt the world needed a new benchmark ;)
	So: forum-style benchmark with simulation of many users posting and  
viewing forums and topics on a PHP website.


http://home.peufeu.com/ftsbench/forum1.png

	One of those curves is a very popular open-source database which claims  
to offer unparalleled speed.
	The other one is of course Postgres 8.2.3, which by popular belief is  
full-featured but slow.


What is your guess?



Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Arjen van der Meijden
I assume red is PostgreSQL and green is MySQL. That reflects my own 
benchmarks with those two.


But I don't fully understand what the graph displays. Does it reflect 
the ability of the underlying database to support a certain number of 
users per second given a certain database size? Or is the growth of the 
database part of the benchmark?


Btw, did you consider that older topics are normally read much less and 
almost never get new postings? I think the size of the active data set 
is more dependent on the number of active members than on the actual 
amount of data available.
That can reduce the impact of the size of the database greatly, although 
we saw very nice gains in performance on our forum (over 22GB of 
messages) when replacing the database server with one with twice the 
memory, CPUs and I/O.


Best regards,

Arjen

On 20-5-2007 16:58 PFC wrote:


I felt the world needed a new benchmark ;)
So: forum-style benchmark with simulation of many users posting and 
viewing forums and topics on a PHP website.


http://home.peufeu.com/ftsbench/forum1.png

One of those curves is a very popular open-source database which 
claims to offer unparalleled speed.
The other one is of course Postgres 8.2.3, which by popular belief is 
full-featured but slow.


What is your guess?






Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread PFC


I assume red is PostgreSQL and green is MySQL. That reflects my own  
benchmarks with those two.


Well, since you answered first, and right, you win XD

The little curve that dives into the ground is MySQL with InnoDB.
The Energizer bunny that keeps going is Postgres.

But I don't fully understand what the graph displays. Does it reflect  
the ability of the underlying database to support a certain number of  
users per second given a certain database size? Or is the growth of the  
database part of the benchmark?


	Basically I have a test client which simulates a certain number of  
concurrent users browsing a forum, and posting (posting rate is  
artificially high in order to fill the tables quicker than the months it  
would take in real life).


	Since the fake users pick which topics to view and post in by browsing  
the pages, like people would do, they tend to pick the topics in the first  
few pages of the forum, those with the most recent posts. So, like in real  
life, some topics fall through the first pages and go down to rot at the  
bottom, while others grow much more.
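
	The picking logic is roughly this (a simplified sketch, not the actual
client code; the 30-topics-per-page figure is an assumption):

  import random

  # A simulated user choosing a topic: exponential skew toward the first
  # pages of the topic list (most recently active topics first).
  def pick_topic(topics_by_recent_activity, topics_per_page=30):
      page = int(min(random.expovariate(1.0), 9.0))   # usually page 0 or 1
      start = page * topics_per_page
      candidates = topics_by_recent_activity[start:start + topics_per_page]
      return random.choice(candidates) if candidates else None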


	So, as the database grows (X axis), the total number of web pages served  
per second (viewings + postings) is on the Y axis, representing the user's  
experience (fast / slow / dead server).


	The number of concurrent HTTP or Postgres connections is not plotted; it  
doesn't really matter anyway for benchmarking purposes. You need to have  
enough to keep the server busy, but not too many or you're just wasting  
RAM. For a LAN that's about 30 HTTP connections and about 8 PHP processes,  
each with a database connection.
	Since I use lighttpd, I don't really care about the number of actual slow  
clients (ie. real concurrent HTTP connections). Everything is funneled  
through those 8 PHP processes, so postgres never sees huge concurrency.

About 2/3 of the CPU is used by PHP anyway, only 1/3 by Postgres ;)

Btw, did you consider that older topics are normally read much less and  
almost never get new postings? I think the size of the active data set  
is more dependent on the number of active members than on the actual  
amount of data available.


Yes, see above.
	The posts table is clustered on (topic_id, post_id) and this is key to  
performance.
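
	In 8.2 syntax that is roughly the following (a sketch; the index name and
connection string are made up):

  import psycopg2

  conn = psycopg2.connect("dbname=forum")
  cur = conn.cursor()
  cur.execute("CREATE INDEX posts_topic_post ON posts (topic_id, post_id)")
  # 8.2-era form: CLUSTER indexname ON tablename. Needs re-running
  # periodically, since CLUSTER is a one-time physical reordering.
  cur.execute("CLUSTER posts_topic_post ON posts")
  conn.commit()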


That can reduce the impact of the size of the database greatly, although  
we saw very nice gains in performance on our forum (over 22GB of  
messages) when replacing the database server with one with twice the  
memory, CPUs and I/O.


Well, you can see on the curve where it hits IO-bound behaviour.

	I'm writing a full report, but I'm having a lot of problems with MySQL;  
I'd like to give it a fair chance, but it shows real obstinacy about NOT  
working.





Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Arjen van der Meijden

On 20-5-2007 19:09 PFC wrote:
Since I use lighttpd, I don't really care about the number of actual 
slow clients (ie. real concurrent HTTP connections). Everything is 
funneled through those 8 PHP processes, so postgres never sees huge 
concurrency.


Well, that would only be in favour of postgres anyway: it scales better in 
our benchmarks to multiple CPUs and multiple clients, and apparently in 
yours to larger datasets. MySQL seems to be faster up until a certain 
number of concurrent clients (close to the number of CPUs available) 
and beyond that can collapse dramatically.


I'm writing a full report, but I'm having a lot of problems with 
MySQL; I'd like to give it a fair chance, but it shows real obstinacy 
about NOT working.


Yeah, it displayed very odd behaviour when doing benchmarks here too. If 
you haven't already, you can try the newest 5.0 version (5.0.41?), 
which eliminates several scaling issues in InnoDB, but afaik not all of 
them. Besides that, it can just be pretty painful to get a certain query 
fast, although we've not very often seen it fail completely in the 
last few years.


Best regards,

Arjen van der Meijden



Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Tom Lane
PFC [EMAIL PROTECTED] writes:
   The little curve that dives into the ground is MySQL with InnoDB.
   The Energizer bunny that keeps going is Postgres.

Just for comparison's sake it would be interesting to see a curve for
mysql/myisam.  Mysql's claim to speed is mostly based on measurements
taken with myisam tables, but I think that doesn't hold up very well
under concurrent load.

regards, tom lane



Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Zoltan Boszormenyi

PFC wrote:


I felt the world needed a new benchmark ;)
So: forum-style benchmark with simulation of many users posting 
and viewing forums and topics on a PHP website.


http://home.peufeu.com/ftsbench/forum1.png

One of those curves is a very popular open-source database which 
claims to offer unparalleled speed.
The other one is of course Postgres 8.2.3, which by popular belief 
is full-featured but slow.


What is your guess?


Red is PostgreSQL.

The advertised unparalleled speed must surely mean
benchmarking only single-client access on the noname DB. ;-)

I also went into benchmarking mode last night for my own
amusement when I read on the linux-kernel ML that
NCQ support for nForce5 chips was released.
I tried current PostgreSQL 8.3devel CVS.
pgbench over local TCP connection with
25 clients and 3000 transacts/client gave me
around 445 tps before applying NCQ support.
680 tps after.

It went over 840 tps after adding HOT v7 patch,
still with 25 clients. It topped at 1062 tps with 3-4 clients.
I used a single Seagate 320GB SATA2 drive
for the test, which only has less than 40GB free.
So it's already at the end of the disk, giving smaller
transfer rates than at the beginning. Filesystem is ext3.
Dual core Athlon64 X2 4200 in 64-bit mode.
I have never seen such a performance before
on a desktop machine.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/




Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread PFC

On Sun, 20 May 2007 19:26:38 +0200, Tom Lane [EMAIL PROTECTED] wrote:


PFC [EMAIL PROTECTED] writes:

The little curve that dives into the ground is MySQL with InnoDB.
The Energizer bunny that keeps going is Postgres.


Just for comparison's sake it would be interesting to see a curve for
mysql/myisam.  Mysql's claim to speed is mostly based on measurements
taken with myisam tables, but I think that doesn't hold up very well
under concurrent load.

regards, tom lane



I'm doing that now. Here is what I wrote in the report:

	Using prepared statements (important), Postgres beats MyISAM on simple  
selects, as they say, as well as on complex selects, even with 1 thread.


	MyISAM caused massive data corruption: posts and topics disappear,  
storage engine errors pop up, random thrashed rows appear in the forums  
table (therefore screwing up everything), etc. In short: it doesn't work.  
But since no one in their right mind would use MyISAM for critical data, I  
include this result anyway, as a curiosity.


	I had to write a repair SQL script to fix the corruption in order to see  
how MySQL will fare when it gets bigger than RAM...





Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Andreas Kostyrka

 I'm writing a full report, but I'm having a lot of problems with MySQL;
 I'd like to give it a fair chance, but it shows real obstinacy about NOT
 working.

Well that matches up well with my experience; better yet, file a 
performance bug with the commercial support and you'll get an explanation of 
why your schema (or your hardware, well, anything but the database software 
used) is the guilty factor.

but you know these IT manager journals consider mysql to be the relevant 
open-source database. Guess it matches their expectations better than PG or, 
say, MaxDB (the artist formerly known as SAP DB).

Andreas



