Re: [PERFORM] shared_buffers advice
Greg Smith <g...@2ndquadrant.com> writes:

> I'm not sure how to make progress on similar ideas about tuning closer
> to the filesystem level without having something automated that takes
> over the actual benchmark running and data recording steps; it's just
> way too time consuming to do those right now with every tool that's
> available for PostgreSQL so far. That's the problem I work on, there
> are easily a half dozen good ideas for improvements here floating
> around where coding time is dwarfed by required performance validation
> time.

I still think the best tool around currently for this kind of testing is
tsung, but I've yet to have the time to put money where my mouth is, as
they say. Still, I'd be happy to take some time and help you decide
whether it's the tool you want to base your performance testing suite on
or not.

  http://tsung.erlang-projects.org/

Regards,
-- 
dim
Re: [PERFORM] pg_dump far too slow
On Sun, 14 Mar 2010, David Newall wrote:
> nohup time pg_dump -f database.dmp -Z9 database
>
> I presumed pg_dump was CPU-bound because of gzip compression, but a test
> I ran makes that seem unlikely...

There was some discussion about this a few months ago at
http://archives.postgresql.org/pgsql-performance/2009-07/msg00348.php

It seems that getting pg_dump to do the compression is a fair amount
slower than piping the plain format dump straight through gzip. You get a
bit more parallelism that way too.

Matthew

-- 
I'm always interested when [cold callers] try to flog conservatories.
Anyone who can actually attach a conservatory to a fourth floor flat
stands a marginally better than average chance of winning my custom.
(Seen on Usenet)
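The pipe variant discussed there looks roughly like this, reusing the
file name from the original command:

    # Compress outside pg_dump, so the dump and gzip run in parallel
    # as separate processes, instead of pg_dump's built-in -Z9:
    pg_dump database | gzip > database.dmp.gz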
[PERFORM] mysql to postgresql, performance questions
Hi all,

I'm running quite a large social community website (250k users, 16gb
database). We are currently preparing a complete relaunch and thinking
about switching from mysql 5.1.37 innodb to postgresql 8.4.2. The
database server is a dual dualcore opteron 2216 with 12gb ram running on
debian amd64.

For a first impression I ran a simple query on our users table (snapshot
with only ~45.000 records). The table has an index on birthday_age
[integer]. The test executes the same query 10 times and simply discards
the results. I ran the tests using a php and a ruby script, the results
are almost the same.

Unluckily mysql seems to be around 3x as fast as postgresql for this
simple query. There's no swapping or disc reading involved... everything
is in ram.

query:

    select * from users where birthday_age between 12 and 13
        or birthday_age between 20 and 22 limit 1000

mysql:

    {select_type=SIMPLE, key_len=1, id=1, table=users, type=range,
     possible_keys=birthday_age, rows=7572, Extra=Using where, ref=nil,
     key=birthday_age}

    15.104055404663
    14.209032058716
    18.857002258301
    15.714883804321
    14.73593711853
    15.048027038574
    14.589071273804
    14.847040176392
    15.192985534668
    15.115976333618

postgresql:

    Limit (cost=125.97..899.11 rows=1000 width=448) (actual time=0.927..4.990 rows=1000 loops=1)
      ->  Bitmap Heap Scan on users (cost=125.97..3118.00 rows=3870 width=448) (actual time=0.925..3.420 rows=1000 loops=1)
            Recheck Cond: (((birthday_age >= 12) AND (birthday_age <= 13)) OR ((birthday_age >= 20) AND (birthday_age <= 22)))
            ->  BitmapOr (cost=125.97..125.97 rows=3952 width=0) (actual time=0.634..0.634 rows=0 loops=1)
                  ->  Bitmap Index Scan on birthday_age (cost=0.00..41.67 rows=1341 width=0) (actual time=0.260..0.260 rows=1327 loops=1)
                        Index Cond: ((birthday_age >= 12) AND (birthday_age <= 13))
                  ->  Bitmap Index Scan on birthday_age (cost=0.00..82.37 rows=2611 width=0) (actual time=0.370..0.370 rows=2628 loops=1)
                        Index Cond: ((birthday_age >= 20) AND (birthday_age <= 22))
    Total runtime: 5.847 ms

    44.173002243042
    41.156768798828
    39.988040924072
    40.470123291016
    40.035963058472
    40.077924728394
    40.94386100769
    40.183067321777
    39.83211517334
    40.256977081299

I also wonder why the reported runtime of 5.847 ms is so different from
the runtime reported by my scripts (both php and ruby are almost the
same). What's the best tool to time queries in postgresql? Can this be
done from pgadmin?

Thanks,
Corin
Re: [PERFORM] mysql to postgresql, performance questions
I guess we need some more details about the test. Is the
connection/disconnection part of each test iteration? And how are the
databases connected (using a socket / localhost / different host)?

Anyway, measuring such simple queries will tell you almost nothing about
the general app performance - use the queries that are used in the
application.

> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both php and ruby are almost the
> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

I doubt there's a 'best tool' to time queries, but I'd vote for logging
from the application itself, as it measures the performance from the end
user's view-point (and that's what you're interested in). Just put some
simple logging into the database access layer.

regards
Tomas
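As a server-side complement to the application-level logging suggested
above, PostgreSQL can also log statement durations itself; a minimal
postgresql.conf sketch:

    # Log every completed statement together with its duration;
    # raise the threshold (in milliseconds) to log only slow statements.
    log_min_duration_statement = 0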
Re: [PERFORM] mysql to postgresql, performance questions
If you expect this DB to be memory resident, you should update the
cpu/disk cost parameters in postgresql.conf. There was a post earlier
today with some more reasonable starting values. Certainly your test DB
will be memory resident.

Ken

On Thu, Mar 18, 2010 at 03:31:18PM +0100, Corin wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2. [...]
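The kind of postgresql.conf starting point presumably meant here, for a
database that is fully cached in RAM (the concrete numbers are only
illustrative; compare the settings quoted later in this thread):

    seq_page_cost = 1.0         # "disk" pages are really cache hits
    random_page_cost = 1.1      # ...so random access is barely pricier
    effective_cache_size = 8GB  # roughly the OS cache available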
Re: [PERFORM] mysql to postgresql, performance questions
On 18 March 2010 14:31, Corin <wakath...@gmail.com> wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2. [...]
>
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both php and ruby are almost the
> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

pgAdmin will return the query time in the status bar of a query window.
Similarly, you can use psql and activate query times by using \timing.

Regards

Thom
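A minimal psql session along those lines (the timing shown is the figure
from earlier in the thread, purely illustrative):

    db=> \timing
    Timing is on.
    db=> select * from users where birthday_age between 12 and 13
    db->     or birthday_age between 20 and 22 limit 1000;
    ...
    Time: 5.847 ms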
Re: [PERFORM] mysql to postgresql, performance questions
The time that psql or pgAdmin shows is purely the PostgreSQL time. The
question here was about the actual application's time. Sometimes the data
transmission, fetch and processing on the app's side can take longer than
the 'postgresql' time.
Re: [PERFORM] mysql to postgresql, performance questions
Corin,

* Corin (wakath...@gmail.com) wrote:
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2. The
> database server is a dual dualcore opteron 2216 with 12gb ram running on
> debian amd64.

Can you provide at least your postgresql.conf? That could be useful,
though this does seem like a really simple query.

> For a first impression I ran a simple query on our users table (snapshot
> with only ~45.000 records). The table has an index on birthday_age
> [integer]. The test executes the same query 10 times and simply discards
> the results. I ran the tests using a php and a ruby script, the results
> are almost the same.

I wouldn't expect it to matter a whole lot, but have you considered using
prepared queries?

> Unluckily mysql seems to be around 3x as fast as postgresql for this
> simple query. There's no swapping or disc reading involved... everything
> is in ram.
>
> select * from users where birthday_age between 12 and 13
>     or birthday_age between 20 and 22 limit 1000

Do you use every column from users, and do you really want 1000 records
back?

> Total runtime: 5.847 ms

This runtime is the amount of time it took for the backend to run the
query.

> 44.173002243042

These times include all the time required to get the data back to the
client. If you don't use cursors, all data from the query is returned all
at once. Can you post the script you're using along with the table schema
and maybe some sample or example data? Also, are you doing this all
inside a single transaction, or are you creating a new transaction for
every query? I trust you're not reconnecting to the database for every
query...

> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both php and ruby are almost the
> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

As was mentioned elsewhere, certainly the best tool to test with is your
actual application, if that's possible... or at least the language your
application is in.

Thanks,

Stephen
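Sketched in plain SQL against the users table from the original post, the
prepared-query suggestion looks something like this (the statement name
is illustrative):

    -- Plan once, then execute repeatedly with different parameters.
    PREPARE by_age (int, int, int, int) AS
        SELECT * FROM users
         WHERE birthday_age BETWEEN $1 AND $2
            OR birthday_age BETWEEN $3 AND $4
         LIMIT 1000;

    EXECUTE by_age(12, 13, 20, 22);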
Re: [PERFORM] shared_buffers advice
Dimitri Fontaine wrote:
> I still think the best tool around currently for this kind of testing
> is tsung

I am happy to say that, for now, pgbench is the only actual testing tool
supported. Done; now I don't need tsung. However, that doesn't actually
solve any of the problems I was talking about, which is why I'm not even
talking about that part. We need the glue to pull out software releases,
run whatever testing tool is appropriate, and then save the run artifacts
in some standardized form, so they can be referenced with associated
build/configuration information to track down a regression when it does
show up. Building those boring bits is the real issue here; load testing
tools are easy to find because those are fun to work on.

And as a general commentary on the vision here, tsung will never fit into
this anyway, because something that can run on the buildfarm machines
with the software they already have installed is the primary target. I
don't see anything about tsung so interesting that it trumps that
priority, even though it is an interesting tool.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] mysql to postgresql, performance questions
On Thu, Mar 18, 2010 at 16:09, Stephen Frost <sfr...@snowman.net> wrote:
> These times include all the time required to get the data back to the
> client. [...] I trust you're not reconnecting to the database for every
> query...

Just as a note here, since the OP is using Debian: if you are connecting
over TCP, Debian will default to SSL on your connection, which obviously
adds a *lot* of overhead. If you're not actively using it (in which case
you would control this from pg_hba.conf), just edit postgresql.conf and
disable SSL, then restart the server.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
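Concretely, the server-side change is one line in postgresql.conf
(restart required); alternatively, a libpq client can refuse SSL per
connection with sslmode (host and database names below are placeholders):

    # postgresql.conf
    ssl = off

    # or, per connection from a libpq client:
    psql "host=dbhost dbname=mydb sslmode=disable"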
Re: [PERFORM] mysql to postgresql, performance questions
On Thu, Mar 18, 2010 at 8:31 AM, Corin <wakath...@gmail.com> wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2. [...]
>
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both php and ruby are almost the
> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

It's different because it only takes pgsql 5 milliseconds to run the
query, and 40 seconds to transfer the data across to your application,
which THEN promptly throws it away. If you run it in MySQL, its client
lib doesn't transfer over the whole thing. This is more about how each db
interface is implemented in those languages.
Re: [PERFORM] mysql to postgresql, performance questions
On 18-3-2010 16:50 Scott Marlowe wrote:
> It's different because it only takes pgsql 5 milliseconds to run the
> query, and 40 seconds to transfer the data across to your application,
> which THEN promptly throws it away. If you run it in MySQL, its client
> lib doesn't transfer over the whole thing. This is more about how each
> db interface is implemented in those languages.

It's the default behavior of both PostgreSQL and MySQL to transfer the
whole resultset over to the client. Or is that different for Ruby's
MySQL-driver? At least in PHP the behavior is similar for both.

And I certainly do hope it's 40ms rather than 40s, otherwise it would be
a really bad performing network in either case (15s for mysql) or very
large records (which I doubt).

I'm wondering if a new connection is made between each query. PostgreSQL
is (afaik it still is, but I haven't compared that recently) a bit slower
in that department than MySQL.

Best regards,

Arjen
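One way to put a number on the per-connection overhead mentioned above is
pgbench's -C switch, which opens a fresh connection for every transaction
(the database name is a placeholder):

    pgbench -c 1 -T 10 testdb      # reuse one connection for all transactions
    pgbench -C -c 1 -T 10 testdb   # reconnect before every transaction
    # The drop in reported tps between the two runs approximates the
    # connection setup cost.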
Re: [PERFORM] Building multiple indexes concurrently
On Wed, 2010-03-17 at 16:49 -0400, Greg Smith wrote:
> Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > I find it way much easier to believe such issues exist on tables in
> > > contrast to indexes. The likelihood to get sequential accesses on an
> > > index is small enough on a big table to make it unlikely to matter
> > > much.
> >
> > Vacuum walks indexes sequentially, for one.
>
> That and index-based range scans were the main two use-cases I was
> concerned would be degraded by interleaving index builds, compared with
> doing them in succession.

I guess that tweaking file systems to allocate in bigger chunks helps
here? I know that xfs can be tuned in that regard, but how about other
common file systems like ext3?

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training
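For xfs, the tunable presumably meant here is the allocsize mount option,
which enlarges speculative preallocation and thus encourages bigger
contiguous extents; a sketch (device, mount point and value are only
examples):

    # /etc/fstab
    /dev/sdb1  /var/lib/postgresql  xfs  noatime,allocsize=16m  0  2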
Re: [PERFORM] Forcing index scan on query produces 16x faster
I've also observed the same behaviour on a very large table (200GB data,
170GB for 2 indexes).

I have a table which has 6 small columns, let's call them (a, b, c, d, e,
f) and about 1 billion rows. There is an index on (a, b, c, d) - not my
idea; Hibernate requires primary keys for every table.

If I do the following query:

    select max(c) from tbl where a=[constant literal] and b=[other constant literal];

then with maxed-out analysis histograms, and no changes to any of the
page_cost type stuff, it still desperately wants to do a full table scan,
which is ... kinda slow.

Of course, a billion row table is also rather suboptimal (our app
collects a lot more data than it used to) and so I'm bypassing Hibernate
and sharding it all by time, so that the tables and indexes will be a
manageable size, and will also be vacuum-free, as my aging-out process is
now DROP TABLE :-)

Cheers
Dave

On Wed, Mar 17, 2010 at 8:01 PM, Eger, Patrick <pe...@automotive.com> wrote:
> I'm running 8.4.2 and have noticed a similar heavy preference for
> sequential scans and hash joins over index scans and nested loops. Our
> database can basically fit in cache 100%, so this may not be applicable
> to your situation, but the following params seemed to help us:
>
> seq_page_cost = 1.0
> random_page_cost = 1.01
> cpu_tuple_cost = 0.0001
> cpu_index_tuple_cost = 0.5
> cpu_operator_cost = 0.25
> effective_cache_size = 1000MB
> shared_buffers = 1000MB
>
> Might I suggest the Postgres developers reconsider these defaults for
> the 9.0 release, or perhaps provide a few sets of tuning params for
> different workloads in the default install/docs? The cpu_*_cost in
> particular seem to be way off afaict. I may be dead wrong though, fwiw =)
>
> -----Original Message-----
> From: pgsql-performance-ow...@postgresql.org
> [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Christian Brink
> Sent: Wednesday, March 17, 2010 2:26 PM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Forcing index scan on query produces 16x faster
>
> I am running into a problem with a particular query. The execution plan
> cost shows that the Seq Scan is a better bet (cost=54020.49..54020.55)
> over the forced index 'enable_seqscan = false'
> (cost=1589703.87..1589703.93). But when I run the query both ways I get
> a vastly different result (below). It appears not to want to bracket the
> salesitems off of the 'id' foreign key unless I force it.
>
> Is there a way to rewrite or hint the planner to get me the better plan
> without resorting to 'enable_seqscan' manipulation (or am I missing
> something)?
> postream=> select version();
>                                  version
> ------------------------------------------------------------------------
>  PostgreSQL 8.0.3 on i386-redhat-linux-gnu, compiled by GCC
>  i386-redhat-linux-gcc (GCC) 4.0.0 20050505 (Red Hat 4.0.0-4)
>
> postream=> SET enable_seqscan = false;
> SET
> postream=> EXPLAIN ANALYZE
> postream-> SELECT si.group1_id as name, sum(si.qty) as count, sum(si.amt) as amt
> postream->   FROM salesitems si, sales s, sysstrings
> postream->  WHERE si.id = s.id
> postream->    AND si.group1_id != ''
> postream->    AND si.group1_id IS NOT NULL
> postream->    AND NOT si.void
> postream->    AND NOT s.void
> postream->    AND NOT s.suspended
> postream->    AND s.tranzdate >= (cast('2010-02-15' as date) + cast(sysstrings.data as time))
> postream->    AND s.tranzdate < ((cast('2010-02-15' as date) + 1) + cast(sysstrings.data as time))
> postream->    AND sysstrings.id='net/Console/Employee/Day End Time'
> postream->  GROUP BY name;
>                                  QUERY PLAN
> ------------------------------------------------------------------------
>  HashAggregate (cost=1589703.87..1589703.93 rows=13 width=35) (actual time=33.414..33.442 rows=12 loops=1)
>    ->  Nested Loop (cost=0.01..1588978.22 rows=96753 width=35) (actual time=0.284..22.115 rows=894 loops=1)
>          ->  Nested Loop (cost=0.01..2394.31 rows=22530 width=4) (actual time=0.207..4.671 rows=225 loops=1)
>                ->  Index Scan using sysstrings_pkey on sysstrings (cost=0.00..5.78 rows=1 width=175) (actual time=0.073..0.078 rows=1 loops=1)
>                      Index Cond: (id = 'net/Console/Employee/Day End Time'::text)
>                ->  Index Scan using sales_tranzdate_index on sales s (cost=0.01..1825.27 rows=22530 width=12) (actual time=0.072..3.464 rows=225 loops=1)
>                      Index Cond: ((s.tranzdate >= ('2010-02-15'::date + ("outer".data)::time without time zone)) AND (s.tranzdate < ('2010-02-16'::date + ("outer".data)::time without time zone)))
>                      Filter: ((NOT void) AND (NOT suspended))
>          ->  Index Scan using salesitems_pkey on salesitems si (cost=0.00..70.05 rows=30 width=39) (actual time=0.026..0.052 rows=4 loops=225)
>                Index
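As a side note on the max(c) problem above: a common workaround, assuming
the (a, b, c, d) index, is to rewrite the aggregate as an ordered LIMIT 1
so the index can be walked backwards; whether the planner actually takes
it still depends on the statistics:

    -- Equivalent to SELECT max(c) FROM tbl WHERE a = ... AND b = ...
    SELECT c
      FROM tbl
     WHERE a = [constant literal]
       AND b = [other constant literal]
       AND c IS NOT NULL  -- max() ignores NULLs; DESC ordering puts them first
     ORDER BY a DESC, b DESC, c DESC
     LIMIT 1;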
Re: [PERFORM] mysql to postgresql, performance questions
Corin wrote:
> Hi all,
>
> I'm running quite a large social community website (250k users, 16gb
> database). We are currently preparing a complete relaunch and thinking
> about switching from mysql 5.1.37 innodb to postgresql 8.4.2.

The relaunch looks like you are nearing the end (the launch) of the
project - if so, you should know that switching databases near the
project deadline is almost always a suicidal act. Even if the big
differences are easily fixable, the small differences will kill you.

> The database server is a dual dualcore opteron 2216 with 12gb ram
> running on debian amd64.
>
> For a first impression I ran a simple query on our users table (snapshot
> with only ~45.000 records). The table has an index on birthday_age
> [integer]. The test executes the same query 10 times and simply discards
> the results. I ran the tests using a php and a ruby script, the results
> are almost the same.

Your table will probably fit in RAM but the whole database obviously
won't. Not that it matters here.

Did you configure anything at all in postgresql.conf? The defaults assume
a very small database.

> Unluckily mysql seems to be around 3x as fast as postgresql for this
> simple query. There's no swapping or disc reading involved... everything
> is in ram.

It depends...

> 15.115976333618

So this is 15 ms?

> postgresql:
> [...]
> Total runtime: 5.847 ms
>
> 44.173002243042
>
> I also wonder why the reported runtime of 5.847 ms is so different from
> the runtime reported by my scripts (both php and ruby are almost the

It looks like you are spending ~38 ms in delivering the data to your
application. Whatever you are using, stop using it :)

> same). What's the best tool to time queries in postgresql? Can this be
> done from pgadmin?

The only rational way is to measure at the database itself, and not
include other factors like the network, scripting language libraries,
etc. To do this, log in at your db server with a shell and use psql.
Start it as "psql databasename username" and issue a statement like
"EXPLAIN ANALYZE SELECT ...your_query...". Unless magic happens, this
will open a local unix socket connection to the database for the query,
which has the least overhead.

You can of course also do this for MySQL, though I don't know if it has
an equivalent of EXPLAIN ANALYZE.

But even after you have found where the problem is, and even if you see
that Pg is faster than MySQL, you will still need realistic loads to test
the real-life performance difference.
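Spelled out with placeholder database and user names, such a session
would look like:

    $ psql communitydb corin
    communitydb=> EXPLAIN ANALYZE
    communitydb-> select * from users
    communitydb->  where birthday_age between 12 and 13
    communitydb->     or birthday_age between 20 and 22
    communitydb->  limit 1000;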