Re: [PERFORM] Rewriting DISTINCT and losing performance
Chuck,

> explain analyze SELECT country_id, country_name FROM geo.country
> WHERE country_id IN (select country_id FROM geo.city);
> -- won't complete in a reasonable amount of time.

Can we see the plan?

--Josh
Re: [PERFORM] Rewriting DISTINCT and losing performance
Chuck D. wrote:
>            Table "geo.city"
>    Column    |          Type          | Modifiers
>  ------------+------------------------+-----------
>  city_id     | integer                | not null
>  state_id    | smallint               |
>  country_id  | smallint               |
>  latitude    | numeric(9,7)           |
>  longitude   | numeric(10,7)          |
>  city_name   | character varying(100) |
> Indexes:
>     "city_pk" PRIMARY KEY, btree (city_id)
>     "idx_city_country_id" btree (country_id) CLUSTER
> Foreign-key constraints:
>     "city_state_id_fk" FOREIGN KEY (state_id) REFERENCES geo.state(state_id)
>         ON UPDATE CASCADE ON DELETE CASCADE

Any good reason why country_id is NULLable?

--
Richard Huxton
Archonet Ltd
[PERFORM] Increasing Shared_buffers = slow commits?
Hi everyone,

I am testing my shared_buffers pool and am running into a problem with slow inserts and commits. I have read in several places that on the 8.x PostgreSQL engines you should set shared_buffers closer to 25% of the system's memory. On my development system I have done that: we have 9GB of memory on the machine and I set shared_buffers = 292188 (~25% of total memory).

When my users logged in today, they noticed the system is much slower. Tracing my log files, I am seeing that most of the commits are taking over 1 second, in a range of 1-5 seconds per commit. What is the correlation here between shared_buffers and the disk activity? This is not something I would have expected at all. I wanted to test for improved performance so I can have a good basis for making changes in my production systems.

My postgresql.conf is pasted below.

Thanks for any comments/clarifications,
chris

PG 8.1.3
RH 4 AS

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
listen_addresses = '*'                  # what IP address(es) to listen on
port = 50001
max_connections = 1024
superuser_reserved_connections = 10
shared_buffers = 292188                 # setting to 25% of memory
max_prepared_transactions = 256         # can be 0 or more
work_mem = 16384                        # min 64, size in KB
maintenance_work_mem = 1048576          # min 1024, size in KB
max_fsm_pages = 800                     # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 2                   # min 100, ~70 bytes each
vacuum_cost_delay = 0                   # 0-1000 milliseconds
vacuum_cost_page_hit = 0
vacuum_cost_page_miss = 0
vacuum_cost_page_dirty = 0
vacuum_cost_limit = 1
wal_buffers = 64                        # min 4, 8KB each
checkpoint_segments = 256               # in logfile segments, min 1, 16MB each
checkpoint_timeout = 300                # range 30-3600, in seconds
archive_command = '/home/postgres/bin/archive_pg_xlog.sh %p %f 50001'
effective_cache_size = 383490           # typically 8KB each
random_page_cost = 2                    # units are one sequential page fetch
default_statistics_target = 100         # range 1-1000
constraint_exclusion = on
redirect_stderr = on                    # Enable capturing of stderr into log
log_directory = 'pg_log'                # Directory where log files are written
log_truncate_on_rotation = on
log_rotation_age = 1440
log_rotation_size = 1048576
log_min_messages = debug2
log_min_duration_statement = 0          # -1 is disabled, 0 logs all statements
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%d,%p,%u,%m,%c,%l,%s,%x,%i,'
log_statement = 'all'                   # none, mod, ddl, all
stats_start_collector = on
stats_command_string = on
stats_block_level = on
stats_row_level = on
stats_reset_on_server_start = on
autovacuum = on                         # enable autovacuum subprocess?
autovacuum_naptime = 60                 # time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 1000      # min # of tuple updates before vacuum
autovacuum_analyze_threshold = 500      # min # of tuple updates before analyze
autovacuum_vacuum_scale_factor = 0.001  # fraction of rel size before vacuum
autovacuum_analyze_scale_factor = 0.0005  # fraction of rel size before analyze
autovacuum_vacuum_cost_delay = -1       # default vacuum cost delay for autovacuum
autovacuum_vacuum_cost_limit = -1       # default vacuum cost limit for autovacuum
statement_timeout = 0                   # 0 is disabled, in milliseconds
lc_messages = 'C'                       # locale for system error messages
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
add_missing_from = on
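For reference, shared_buffers is counted in 8 kB pages on a stock build, so the value above works out to roughly a quarter of the machine's 9 GB. A quick sanity check, purely illustrative:

SELECT 292188 * 8192.0 / (1024 * 1024 * 1024) AS shared_buffers_gb;
-- roughly 2.23 GB, i.e. about 25% of 9 GB of RAM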
Re: [PERFORM] Rewriting DISTINCT and losing performance
On Monday 21 May 2007 03:14, Josh Berkus wrote:
> Chuck,
>
> Can we see the plan?
>
> --Josh

Sorry Josh, I guess I could have just used EXPLAIN instead of EXPLAIN ANALYZE.

# explain SELECT country_id, country_name FROM geo.country
  WHERE country_id IN (select country_id FROM geo.city);

                               QUERY PLAN
 -------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..1252.60 rows=155 width=15)
   Join Filter: (country.country_id = city.country_id)
   ->  Seq Scan on country  (cost=0.00..6.44 rows=244 width=15)
   ->  Seq Scan on city  (cost=0.00..43409.12 rows=2122712 width=2)
(4 rows)

Versus the same query using the older, larger world_city table:

# explain SELECT country_id, country_name FROM geo.country
  WHERE country_id IN (select country_id FROM geo.world_city);

                               QUERY PLAN
 -------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..23.16 rows=155 width=15)
   ->  Seq Scan on country  (cost=0.00..6.44 rows=244 width=15)
   ->  Index Scan using idx_world_city_country_id on world_city
         (cost=0.00..706.24 rows=12602 width=2)
       Index Cond: (country.country_id = world_city.country_id)
(4 rows)
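For what it's worth, a correlated EXISTS is one more way to phrase the same DISTINCT-style query; a sketch against the same tables, with no promise that this version's planner treats it any better than the IN() form:

-- Same semantics as the IN() query above; the planner may or may not
-- choose a different join strategy for it.
EXPLAIN
SELECT c.country_id, c.country_name
FROM geo.country c
WHERE EXISTS (SELECT 1
              FROM geo.city ci
              WHERE ci.country_id = c.country_id);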
Re: [PERFORM] Rewriting DISTINCT and losing performance
On Monday 21 May 2007 05:40, Richard Huxton wrote:
> Chuck D. wrote:
>
> Any good reason why country_id is NULLable?

It has been a while since I imported the data, so it took some time to examine it, but here is what I found. In the original data, some cities do not have countries. Strange, huh? Most were in the Gaza Strip, No Man's Land, or disputed territory where several countries claimed ownership. This is according to the USGS and the board of names. Recognizing that this did me no good in my application, I decided to repair that data so that country_id could have a NOT NULL modifier.
Re: [PERFORM] pg_stats how-to?
On Fri, May 18, 2007 at 04:26:05PM -0700, Y Sidhu wrote:
> >To answer your original question, a way to take a look at how bloated
> >your tables are would be to ANALYZE, divide reltuples by relpages from
> >pg_class (gives how many rows per page you have) and compare that to 8k
> >/ average row size. The average row size for table rows would be the sum
> >of avg_width from pg_stats for the table + 24 bytes overhead. For
> >indexes, it would be the sum of avg_width for all fields in the index
> >plus some overhead (8 bytes, I think).
> >
> >An even simpler alternative would be to install contrib/pgstattuple and
> >use the pgstattuple function, though IIRC that does read the entire
> >relation from disk.
> >--
> >Jim Nasby [EMAIL PROTECTED]
> >EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>
> Here are my results:
>
> a. SELECT sum(reltuples)/sum(relpages) as rows_per_page FROM pg_class;
>    I get 66
>
> b. SELECT (8000/(sum(avg_width)+24)) as table_stat FROM pg_stats;
>    I get 1

And those results will be completely meaningless because they're covering the entire database (catalog tables included). You need to compare the two numbers on a table-by-table basis, and you'd also have to ignore any small tables (say smaller than 1000 pages). Also, a page is 8192 bytes in size (though granted there's a page header that's something like 16 bytes).
--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
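A minimal sketch of the per-table comparison described above, assuming the standard pg_class/pg_stats catalogs and treating the 24-byte per-row overhead as an approximation (run ANALYZE first; indexes are not covered):

-- Compare actual rows per page against the ideal for each sizeable table.
-- A large gap between the two columns suggests bloat.
SELECT c.relname,
       round((c.reltuples / c.relpages)::numeric, 1)      AS actual_rows_per_page,
       round(8192.0 / (s.total_width + 24), 1)            AS expected_rows_per_page
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN (SELECT schemaname, tablename, sum(avg_width) AS total_width
      FROM pg_stats
      GROUP BY schemaname, tablename) s
  ON s.schemaname = n.nspname AND s.tablename = c.relname
WHERE c.relkind = 'r'
  AND c.relpages > 1000        -- ignore small tables, per the advice above
ORDER BY 2;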
Re: [PERFORM] Rewriting DISTINCT and losing performance
Chuck D. wrote:
> On Monday 21 May 2007 03:14, Josh Berkus wrote:
>> Chuck,
>>
>> Can we see the plan?
>>
>> --Josh
>
> Sorry Josh, I guess I could have just used EXPLAIN instead of EXPLAIN ANALYZE.
>
> # explain SELECT country_id, country_name FROM geo.country
>   WHERE country_id IN (select country_id FROM geo.city);
>
>  Nested Loop IN Join  (cost=0.00..1252.60 rows=155 width=15)
>    Join Filter: (country.country_id = city.country_id)
>    ->  Seq Scan on country  (cost=0.00..6.44 rows=244 width=15)
>    ->  Seq Scan on city  (cost=0.00..43409.12 rows=2122712 width=2)

The only thing I can think of is that the CLUSTERing on city.country_id makes the system think it'll be cheaper to seq-scan the whole table.

I take it you have got 2 million rows in "city"?

--
Richard Huxton
Archonet Ltd
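One quick way to test that hypothesis is to price the index plan explicitly; a diagnostic sketch (enable_seqscan is only meant for experimentation here, not as a production setting):

-- Temporarily discourage sequential scans to see what the planner thinks
-- the index-based plan would cost, then restore the default.
SET enable_seqscan = off;
EXPLAIN SELECT country_id, country_name
FROM geo.country
WHERE country_id IN (SELECT country_id FROM geo.city);
RESET enable_seqscan;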
Re: [PERFORM] Rewriting DISTINCT and losing performance
On Monday 21 May 2007 11:34, Richard Huxton wrote:
> Chuck D. wrote:
>
> The only thing I can think of is that the CLUSTERing on city.country_id
> makes the system think it'll be cheaper to seq-scan the whole table.
>
> I take it you have got 2 million rows in "city"?

Well here is where it gets strange. The CLUSTER was just one thing I tried to do to enhance the performance. I had the same result prior to cluster. However, after updating that country_id column to NOT NULL and eliminating NULL values it will use the country_id index and perform quickly.

Oddly enough, the original table, world_city, still has NULL values in the country_id column and it has always used the country_id index. Doesn't that seem a bit strange? Does it have to do with the smaller size of the new table maybe?
Re: [PERFORM] Increasing Shared_buffers = slow commits?
On 5/21/07, Chris Hoover <[EMAIL PROTECTED]> wrote:
> I am testing my shared_buffers pool and am running into a problem with
> slow inserts and commits. [...] What is the correlation here between the
> shared_buffers and the disk activity? This is not something I would have
> expected at all.

Have you overcommitted your memory? Maybe you are thrashing a bit... long commit times are usually a symptom of high iowait. Can you pop up top and monitor iowait for a bit? Can you lower shared_buffers again and confirm that performance increases? How about doing some iostat/vmstat runs and looking for values that are significantly different depending on the shared_buffers setting.

merlin
Re: [PERFORM] 500 requests per second
On 5/12/07, Tarhon-Onu Victor <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> I'm looking for a database+hardware solution which should be able to
> handle up to 500 requests per second. The requests will consist in:
> - single row updates in indexed tables (the WHERE clauses will use the
>   index(es), the updated column(s) will not be indexed);
> - inserts in the same kind of tables;
> - selects with approximately the same WHERE clause as the update
>   statements will use.
> So nothing very special about these requests, only about the throughput.
>
> Can anyone give me an idea about the hardware requirements, type of
> clustering (at postgres level or OS level), and eventually about the OS
> (ideally should be Linux) which I could use to get something like this
> in place?

I work on a system about like you describe... 400 tps constant, 24/7. Major challenges are routine maintenance and locking. Autovacuum is your friend, but you will need to schedule a full vacuum once in a while because of transaction ID wraparound. If you allow AV to do this, none of your other tables get vacuumed until it completes... heh! If you lock the wrong table, transactions will accumulate rapidly and the system will grind to a halt rather quickly (this can be mitigated somewhat by smart code on the client).

Other general advice:
* reserve plenty of space for WAL and keep the volume separate from the data volume... during a long-running transaction WAL files will accumulate rapidly and panic the server if it runs out of space.
* set a reasonable statement timeout (see the sketch after this message).
* backup with PITR. pg_dump is a headache on extremely busy servers.
* get a good I/O system for your box. Start with a 6-disk RAID 10 and go from there.
* spend some time reading about bgwriter settings, commit_delay, etc.
* keep an eye out for PostgreSQL HOT (hopefully coming with 8.3) and make allowances for it in your design if possible.
* normalize your database and think of vacuum as a dangerous enemy.

good luck! :-)

merlin
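A minimal sketch of the statement-timeout suggestion above; the database name and the 30-second value are purely illustrative:

-- Cap runaway queries for every session connecting to this database.
-- statement_timeout is in milliseconds; 0 disables it.
ALTER DATABASE busydb SET statement_timeout = 30000;

-- Sessions that legitimately need long-running queries can override it:
SET statement_timeout = 0;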
Re: [PERFORM] pg_stats how-to?
Thanks again! I'll make the change and get those numbers.

Yudhvir

On 5/21/07, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> And those results will be completely meaningless because they're covering
> the entire database (catalog tables included). You need to compare the two
> numbers on a table-by-table basis, and you'd also have to ignore any small
> tables (say smaller than 1000 pages). Also, a page is 8192 bytes in size
> (though granted there's a page header that's something like 16 bytes).
> [...]

--
Yudhvir Singh Sidhu
408 375 3134 cell
Re: [PERFORM] 500 requests per second
> I'm looking for a database+hardware solution which should be able to
> handle up to 500 requests per second.

What proportion of reads and writes in those 500 tps? (If you have 450 selects and 50 insert/update transactions, your hardware requirements will be different than those for the reverse proportion.)

What is the estimated size of your data and hot working set?
Re: [PERFORM] 500 requests per second
On Mon, May 21, 2007 at 03:50:27PM -0400, Merlin Moncure wrote:
> I work on a system about like you describe... 400 tps constant, 24/7.
> Major challenges are routine maintenance and locking. Autovacuum is
> your friend, but you will need to schedule a full vacuum once in a
> while because of transaction ID wraparound. If you allow AV to do this,
> none of your other tables get vacuumed until it completes... heh!

BTW, that's changed in either 8.2 or 8.3; the change is that freeze information is now tracked on a per-table basis instead of per-database. So autovacuum isn't forced into freezing everything in the database at once.
--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
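For anyone who wants to watch how close a cluster is getting to that forced freeze, a small sketch using the standard age()/datfrozenxid columns (in releases that track freezing per table, pg_class.relfrozenxid gives the same picture table by table):

-- The larger the age, the closer that database is to needing a
-- wraparound-preventing (freeze) vacuum.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;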
Re: [PERFORM] Postgres Benchmark Results
On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
> I also went into benchmarking mode last night for my own amusement when I
> read on the linux-kernel ML that NCQ support for nForce5 chips was released.
> I tried current PostgreSQL 8.3devel CVS. pgbench over local TCP connection
> with 25 clients and 3000 transacts/client gave me around 445 tps before
> applying NCQ support. 680 tps after.
>
> It went over 840 tps after adding the HOT v7 patch, still with 25 clients.
> It topped at 1062 tps with 3-4 clients. I used a single Seagate 320GB SATA2
> drive for the test, which only has less than 40GB free. So it's already at
> the end of the disk, giving smaller transfer rates than at the beginning.
> Filesystem is ext3. Dual core Athlon64 X2 4200 in 64-bit mode. I have never
> seen such a performance before on a desktop machine.

I'd be willing to bet money that the drive is lying about commits/fsync. Each transaction committed essentially requires one revolution of the drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.

BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option data=writeback. Note that doing that probably has a negative impact on data recovery after a crash for non-database files.
--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
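That 250 figure falls straight out of the rotation rate; a trivial back-of-the-envelope check, assuming one commit per platter revolution, a single client, and no write cache:

SELECT 7200  / 60.0 AS max_commits_per_sec_7200rpm,   -- about 120
       15000 / 60.0 AS max_commits_per_sec_15krpm;    -- 250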
Re: [PERFORM] Postgres Benchmark Results
On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:
> I felt the world needed a new benchmark ;)
> So: Forum style benchmark with simulation of many users posting and
> viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png

Any chance of publishing your benchmark code so others can do testing? It sounds like a useful, well-thought-out benchmark (even if it is rather specialized).

Also, I think it's important for you to track how long it takes to respond to requests, both average and maximum. In a web application no one's going to care if you're doing 1000TPS if it means that every time you click on something it takes 15 seconds to get the next page back. With network round-trip times and what-not considered I'd say you don't want it to take any more than 200-500ms between when a request hits a webserver and when the last bit of data has gone back to the client.

I'm guessing that there's about 600MB of memory available for disk caching? (Well, 600MB minus whatever shared_buffers is set to.)
--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [PERFORM] 121+ million record table perf problems
On May 18, 2007, at 2:30 PM, Andrew Sullivan wrote:
> Note also that your approach of updating all 121 million records in one
> statement is approximately the worst way to do this in Postgres, because
> it creates 121 million dead tuples on your table. (You've created some
> number of those by killing the query as well.) All of that said, 17 hours
> seems kinda long.

I don't think that is too long. Growing the table one page at a time takes a long time when you add a lot of pages to a table that big. Add in the single disk and you're flying the disk head all over the place, so it will just be slow. No way around it.

And just for good measure, I ran a count on one of my big tables, which consists of two integers and a varchar(7):

db=> select count(*) from mytable;
   count
-----------
 311994721
(1 row)

Time: 157689.057 ms

So I'm going to bet $1 that you're I/O starved. Also, for memory usage, postgres won't use more than you tell it to...
Re: [PERFORM] Rewriting DISTINCT and losing performance
"Chuck D." <[EMAIL PROTECTED]> writes: > Doesn't that seem a bit strange? Does it have to do with the smaller size of > the new table maybe? No, it seems to be a planner bug: http://archives.postgresql.org/pgsql-hackers/2007-05/msg00920.php I imagine that your table statistics are close to the critical point where a bitmap scan looks cheaper or more expensive than a plain index scan, and so the chosen plan varies depending on more-or-less chance factors. Certainly getting rid of NULLs shouldn't have had any direct impact on this choice. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Postgres Benchmark Results
> Well that matches up well with my experience, better even yet, file a
> performance bug to the commercial support and you'll get an explanation
> why your schema (or your hardware, well anything but the database
> software used) is the guilty factor.

Yeah, I filed a bug last week since REPEATABLE READ isn't repeatable: it works for SELECT but INSERT INTO ... SELECT switches to READ COMMITTED and thus does not insert the same rows that the same SELECT would have returned.

> but you know these IT manager journals consider mysql as the relevant
> opensource database. Guess it matches better with their expectation than
> PG or say MaxDB (the artist known formerly as SAP DB).

Never tried MaxDB.

So far, my MyISAM benchmarks show that, while in the CPU-limited case Postgres is faster (even on small simple selects), when the dataset grows larger, MyISAM keeps going much better than Postgres. That was to be expected since the tables are more compact, it can read indexes without hitting the tables, and of course it doesn't have transaction overhead.

However, these good results are slightly mitigated by the massive data corruption and complete mayhem that ensues, either from "transactions" aborting mid-way, which can't be rolled back obviously, leaving stuff with broken relations, or plain simple engine bugs which replace your data with crap. After about 1/2 hour of hitting the tables hard, they start to corrupt and you get cryptic error messages. Fortunately "REPAIR TABLE" provides good consolation in telling you how much corrupt data it had to erase from your table... really reassuring!

I believe the following current or future Postgres features will provide an interesting answer to MyISAM:

- The fact that it doesn't corrupt your data, duh.
- HOT
- the new non-logged tables
- Deferred transactions: since adding a comment to a blog post doesn't need the same guarantees as submitting a paid order, it makes sense that the application could tell Postgres which transactions we care about if power is lost. This will massively boost performance for websites, I believe.
- the patch that keeps tables in approximate cluster order

By the way, about the ALTER TABLE SET PERSISTENCE ... for non-logged tables, will we get an ON RECOVER trigger? For instance, I have counts tables that are often updated by triggers. On recovery, I could simply re-create the counts from the actual data. So I could use the extra speed of non-crash-proof tables.

Andreas
Re: [PERFORM] Postgres Benchmark Results
On Mon, 21 May 2007 23:05:22 +0200, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:
>> I felt the world needed a new benchmark ;)
>> So: Forum style benchmark with simulation of many users posting and
>> viewing forums and topics on a PHP website.
>>
>> http://home.peufeu.com/ftsbench/forum1.png
>
> Any chance of publishing your benchmark code so others can do testing?
> It sounds like a useful, well-thought-out benchmark (even if it is
> rather specialized).

Yes, that was the intent from the start.

It is specialized, because forums are one of the famous server killers. This is mostly due to bad database design, bad PHP skills, and the horrendous MySQL FULLTEXT. I'll have to clean up the code and document it for public consumption, though. However, the Python client is too slow. It saturates at about 1000 hits/s on an Athlon 64 3000+, so you can forget about benchmarking anything meaner than a Core 2 Duo.

> Also, I think it's important for you to track how long it takes to
> respond to requests, both average and maximum. In a web application no
> one's going to care if you're doing 1000TPS if it means that every time
> you click on something it takes 15 seconds to get the next page back.
> With network round-trip times and what-not considered I'd say you don't
> want it to take any more than 200-500ms between when a request hits a
> webserver and when the last bit of data has gone back to the client.

Yeah, I will do that too.

> I'm guessing that there's about 600MB of memory available for disk
> caching? (Well, 600MB minus whatever shared_buffers is set to.)

It's about that. The machine has 1 GB of RAM.
Re: [PERFORM] Postgres Benchmark Results
On 21.05.2007 at 15:01, Jim C. Nasby wrote:
> I'd be willing to bet money that the drive is lying about commits/fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.

Yes, that's right, but if a lot of the transactions are selects, there is no entry in the xlog for them and most of the stuff can come from the cache - read from memory, which is blazing fast compared to any disk. And this was a pgbench test - I don't know what the benchmark really does, but if I remember correctly it is mostly reading.

cug
Re: [PERFORM] Postgres Benchmark Results
I assume red is the PostgreSQL. As you add connections, MySQL always dies.

On 5/20/07, PFC <[EMAIL PROTECTED]> wrote:
> I felt the world needed a new benchmark ;)
> So: Forum style benchmark with simulation of many users posting and
> viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png
>
> One of those curves is "a very popular open-source database which claims
> to offer unparallelled speed". The other one is of course Postgres 8.2.3,
> which by popular belief is "full-featured but slow".
>
> What is your guess?
Re: [PERFORM] Postgres Benchmark Results
Jim C. Nasby wrote:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>> I also went into benchmarking mode last night for my own amusement when
>> I read on the linux-kernel ML that NCQ support for nForce5 chips was
>> released. [...] It topped at 1062 tps with 3-4 clients. I used a single
>> Seagate 320GB SATA2 drive for the test. [...]
>
> I'd be willing to bet money that the drive is lying about commits/fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
>
> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.

I thought you were limited to 250 or so COMMITs to disk per second, and since >1 client can be committed at once, you could do greater than 250 tps, as long as you had >1 client providing input. Or was I wrong?
Re: [PERFORM] Postgres Benchmark Results
Scott Marlowe wrote:
> Jim C. Nasby wrote:
>> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>>> [...]
>>
>> I'd be willing to bet money that the drive is lying about commits/fsync.
>> Each transaction committed essentially requires one revolution of the
>> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
>> [...]
>
> I thought you were limited to 250 or so COMMITs to disk per second, and
> since >1 client can be committed at once, you could do greater than 250
> tps, as long as you had >1 client providing input. Or was I wrong?

My impression is that you are correct in theory -- this is the "commit delay" feature. But it seems that the feature does not work as well as one would like; and furthermore, it is disabled by default.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] 500 requests per second
> * set reasonable statement timeout
> * backup with pitr. pg_dump is a headache on extremely busy servers.

Where do you put your PITR WAL logs so that they don't take up extra I/O?

> * get good i/o system for your box. start with 6 disk raid 10 and go
>   from there.
> [...]
>
> good luck! :-)
>
> merlin
Re: [PERFORM] Postgres Benchmark Results
On Mon, 21 May 2007, Guido Neitzer wrote:
> Yes, that's right, but if a lot of the transactions are selects, there is
> no entry in the xlog for them and most of the stuff can come from the
> cache - read from memory, which is blazing fast compared to any disk.
> And this was a pgbench test - I don't know what the benchmark really
> does, but if I remember correctly it is mostly reading.

The standard pgbench transaction includes a select, an insert, and three updates. All five finished equals one transaction; the fact that the SELECT statement in there could be executed much faster were it to happen on its own doesn't matter. Because it does the most work on the biggest table, the entire combination is usually driven mostly by how long the UPDATE to the accounts table takes. The TPS numbers can certainly be no larger than the rate at which you can execute that.

As has been pointed out, every time you commit a transaction the disk has to actually write that out before it's considered complete. Unless you have a good caching disk controller (which your nForce5 is not) you're limited to 120 TPS with a 7200RPM drive and 250 with a 15000 RPM one. While it's possible to improve slightly on this using the commit_delay feature, I haven't been able to replicate even a 100% improvement that way when running pgbench, and to get even close to that level of improvement would require a large number of clients.

Unless you went out of your way to turn it off, your drive is caching writes; every Seagate SATA drive I've ever seen does by default. "1062 tps with 3-4 clients" just isn't possible with your hardware otherwise. If you turn that feature off with:

hdparm -W0 /dev/hda (might be /dev/sda with the current driver)

that will disable the disk caching and you'll be reporting accurate numbers--which will be far lower than you're seeing now.

While your results are an interesting commentary on how fast the system can run when it has a write cache available, and the increase with recent code is interesting, your actual figures here are a fantasy. The database isn't working properly, and a real system using this hardware would be expected to become corrupted if run for long enough. I have a paper at http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm you might want to read that goes into more detail than you probably want to know on this subject, if you'd like to read more about it--and you really, really should if you intend to put important data into a PostgreSQL database.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Postgres Benchmark Results
On 21.05.2007 at 23:51, Greg Smith wrote:
> The standard pgbench transaction includes a select, an insert, and three
> updates.

I see. Didn't know that, but it makes sense.

> Unless you went out of your way to turn it off, your drive is caching
> writes; every Seagate SATA drive I've ever seen does by default. "1062
> tps with 3-4 clients" just isn't possible with your hardware otherwise.

Btw: it wasn't my hardware in this test!

cug
Re: [PERFORM] Postgres Benchmark Results
Jim C. Nasby wrote:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>> [...]
>
> I'd be willing to bet money that the drive is lying about commits/fsync.

It could well be the case.

> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.

By "revolution", you mean one 360-degree turn of the platter, yes? On the other hand, if you have multiple clients, isn't the 250 COMMITs/sec limit true only per client? Of course assuming that the disk subsystem has more TCQ/NCQ threads than the actual number of DB clients.

> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.

I haven't touched the FS options. I can even use ext2 if I want non-recoverability. :-)

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [PERFORM] Postgres Benchmark Results
Greg Smith wrote:
> On Mon, 21 May 2007, Guido Neitzer wrote:
> [...]
> Unless you went out of your way to turn it off, your drive is caching
> writes; every Seagate SATA drive I've ever seen does by default. "1062
> tps with 3-4 clients" just isn't possible with your hardware otherwise.
> If you turn that feature off with:
>
> hdparm -W0 /dev/hda (might be /dev/sda with the current driver)
>
> that will disable the disk caching and you'll be reporting accurate
> numbers--which will be far lower than you're seeing now.

And AFAIR, according to a comment on LKML some time ago, it greatly decreases your disk's MTBF as well. But thanks for the great insights, anyway. I already knew that nForce5 is not a caching controller. :-) I meant it's a good desktop performer. And having a good UPS and a bit oversized Enermax PSU helps avoid crashes with the sometimes erratic power line.

> While your results are an interesting commentary on how fast the system
> can run when it has a write cache available [...] I have a paper at
> http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm you
> might want to read [...]

Thanks, I will read it.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [PERFORM] Postgres Benchmark Results
> - Deferred Transactions, since adding a comment to a blog post
> doesn't need the same guarantees than submitting a paid order, it makes
> sense that the application could tell postgres which transactions we
> care about if power is lost. This will massively boost performance for
> websites I believe.

This would be massively useful. Very often all I care about is that the transaction is semantically committed; that is, that other transactions starting from that moment will see the modifications done. As opposed to actually persisting data to disk.

In particular I have a situation where I attempt to utilize available hardware by using concurrency. The problem is that I have to either hugely complicate my client code or COMMIT more often than I would like in order to satisfy dependencies between different transactions. If a deferred/delayed commit were possible I could get all the performance benefit without the code complexity, and with no penalty (because in this case persistence is not important).

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
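For what it's worth, this is essentially the asynchronous-commit behaviour that later appeared as the synchronous_commit setting. A sketch of per-transaction use, assuming a server new enough to have it (the blog_comments table is hypothetical):

BEGIN;
SET LOCAL synchronous_commit TO off;   -- only this transaction skips waiting for the WAL flush
INSERT INTO blog_comments (post_id, body) VALUES (42, 'nice post');   -- hypothetical table
COMMIT;   -- returns immediately; the commit record is flushed to disk shortly afterwards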