Re: [PERFORM] log_statement at postgres.conf
On Mon, 21 Jul 2008, System/IJS - Joko wrote:
> Thanks a lot, Nicolas — I finally succeeded in logging query statements thanks to your simple explanation. I have some other questions:
> 1. Is there a possibility to automatically log those statements to a table?

I don't know, never tried that.

> 2. All of those statements come from every database on my server. Can I tell which database a statement came from, or at least filter the log to only database X?

You can modify 'log_line_prefix' to add the database name: use '%d %t %p %r ' instead of the default '%t %p %r ', for example.

> 3. If I need to log only changes made to my database, should the value of 'log_statement' be 'mod'?

Yes.

Nicolas
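For the filtering question, a per-database setting is also an option — a minimal sketch (the database name 'db_x' is hypothetical; this must be run by a superuser and takes effect on new connections):

ALTER DATABASE db_x SET log_statement = 'mod';

-- verify the per-database override (pg_database.datconfig holds it on 8.2/8.3):
SELECT datname, datconfig FROM pg_database WHERE datname = 'db_x';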
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
Luke Lonergan wrote:
> pgbench is unrelated to the workload you are concerned with if ETL/ELT and decision support / data warehousing queries are your target. Also - placing the xlog on dedicated disks is mostly irrelevant to data warehouse / decision support work or ELT. If you need to maximize loading speed while concurrent queries are running, it may be necessary, but I think you'll be limited in load speed by CPU related to data formatting anyway.

Indeed. pgbench was mostly done as 'informative' and is not really relevant to the future workload of this db (given the queries it's doing, I'm not sure it's relevant for anything but connection speed, though it's interesting for me as a reference for tx-like workloads). I was more interested in the raw disk performance.

> The primary performance driver for ELT / DW is sequential transfer rate, thus the dd test at 2X memory. With six data disks of this type, you should expect a maximum of around 6 x 80 = 480 MB/s. With RAID10, depending on the raid adapter, you may need to have two or more IO streams to use all platters, otherwise your max speed for one query would be 1/2 that, or 240 MB/s.

OK, which seems to be on par with what I'm getting (the 240, that is).

> I'd suggest RAID5, or even better, configure all eight disks as a JBOD in the RAID adapter and run ZFS RAIDZ. You would then expect to get about 7 x 80 = 560 MB/s on your single query.

Do you have a particular controller and disk hardware configuration in mind when you're suggesting RAID5? My understanding was that it is more difficult to find the right hardware to get performance on RAID5 compared to RAID10.

> That said, your single cpu on one query will only be able to scan that data at about 300 MB/s (try running a SELECT COUNT(*) against a table that is 2X memory size).

Not quite 2X memory size, but ~26GB (accounts with scaling factor 2000):

$ time psql -c "select count(*) from accounts" pgbench
   count
-----------
 200000000
(1 row)

real    1m52.050s
user    0m0.020s
sys     0m0.020s

NB: For the sake of completeness, I reran pgbench taking the average of 10 runs for each scaling factor (same configuration as in the initial mail; columns are scaling factor, db size in MB, average tps):

   1     20   23451
 100   1565   21898
 200   3127   20474
 300   4688   20003
 400   6249   20637
 500   7810   16434
 600   9372   15114
 700  11000   14595
 800  12000   16090
 900  14000   14894
1000  15000    3071
1200  18000    3382
1400  21000    1888
1600  24000    1515
1800  27000    1435
2000  30000    1354

-- stephane
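The "dd test at 2X memory" Luke refers to is, in rough sketch (16 GB of RAM and the scratch path are assumptions — scale count to twice your actual memory):

# write ~32 GB sequentially, then read it back; MB/s = bytes / elapsed time
time sh -c 'dd if=/dev/zero of=/data/ddfile bs=8k count=4000000 && sync'
time dd if=/data/ddfile of=/dev/null bs=8k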
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
Hi Stephane,

On 7/21/08 1:53 AM, "Stephane Bailliez" <[EMAIL PROTECTED]> wrote:
>> I'd suggest RAID5, or even better, configure all eight disks as a JBOD
>> in the RAID adapter and run ZFS RAIDZ. You would then expect to get
>> about 7 x 80 = 560 MB/s on your single query.
>
> Do you have a particular controller and disk hardware configuration in
> mind when you're suggesting RAID5?
> My understanding was it was more difficult to find the right hardware to
> get performance on RAID5 compared to RAID10.

If you're running RAIDZ on ZFS, the controller you have should be fine. Just configure the HW RAID controller to treat the disks as JBOD (eight individual disks), then make a single RAIDZ zpool of the eight disks. This will run them in a robust SW RAID within Solaris. The fault management is superior to what you would otherwise have in your HW RAID and the performance should be much better.

- Luke
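A minimal sketch of that JBOD-plus-RAIDZ setup, assuming Solaris with the eight disks visible under hypothetical device names c1t0d0 ... c1t7d0:

zpool create data raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0
zpool status data        # confirm the raidz vdev assembled
zfs create data/pgdata   # filesystem to hold PGDATA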
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
Greg Smith wrote:
> Note that I've had some issues with the desktop Ubuntu giving slower results in tests like this than the same kernel release using the stock kernel parameters. Haven't had a chance yet to see how the server Ubuntu kernel fits into that or exactly what the desktop one is doing wrong yet. Could be worse--if you were running any 8.04 I expect your pgbench results would be downright awful.

Ah, interesting. Isn't it a scheduler problem? I thought CFQ was the default for desktop. I double-checked the 7.10 server on this box and it's really the deadline scheduler that is used:

cat /sys/block/sdb/queue/scheduler
noop anticipatory [deadline] cfq

Do you have some more pointers on the 8.04 issues you mentioned? (that's deemed to be the next upgrade from ops)

>> postgresql 8.2.9 with data and xlog as mentioned above
> There are so many known performance issues in 8.2 that are improved in 8.3 that I'd suggest you really should be considering it for a new install at this point.

Yes, I'd definitely prefer to go 8.3 as well, but there are a couple of reasons why for now I have to suck it up:
- 8.2 is the one in the 7.10 repository.
- I need plr as well, and an 8.3-plr debian package does not exist yet.
(I know in both cases we could recompile and install it from there, but ...)

> In general, you'll want to use a couple of clients per CPU core for pgbench tests to get a true look at the scalability. Unfortunately, the way the pgbench client runs means that it tends to top out at 20 or 30 thousand TPS on read-only tests no matter how many cores you have around. But you may find operations where peak throughput comes at closer to 32 clients here rather than just 8.

OK, makes sense.

> As far as the rest of your results go, Luke's comment that you may need more than one process to truly see the upper limit of your disk performance is right on target. More useful commentary on that issue I'd recommend is near the end of http://www.commandprompt.com/blogs/joshua_drake/2008/04/is_that_performance_i_smell_ext2_vs_ext3_on_50_spindles_testing_for_postgresql/

Yeah, I was looking at that url as well. Very useful. Thanks for all the info, Greg.

-- stephane
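For reference, the I/O elevator shown above can also be switched at runtime — a sketch, with sdb standing in for the data disk (the device name is an assumption; the change lasts only until reboot):

echo deadline > /sys/block/sdb/queue/scheduler
cat /sys/block/sdb/queue/scheduler   # the active scheduler shows in brackets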
[PERFORM] Less rows -> better performance?
Dear PostgreSQL community,

first some info about our application:

- Online course directory for a University
- Amount of data: complete dump is 27 MB
- Semester is part of primary key in each table
- Data for approx. 10 semesters stored in the DB
- Read-only access from web application (JDBC)

Our client has asked us if the performance of the application could be improved by moving the data from previous years to a separate "archive" application. This would reduce the overall amount of data in the main application by about 80% at the moment. Actually I doubt that this will have the desired effect, since the semester is part of the primary key in virtually all tables (apart from some small tables containing string constants etc.), and therefore indexed. Some tests with EXPLAIN ANALYZE and some web tests (JMeter) seem to confirm this; the queries showed the same performance with 2 and 10 semesters. But since I'm not sure yet, I would very much appreciate any answers to the following questions:

- Do you think the approach (reducing the data) is effective?
- Are there any particular tests which I should do?

Thanks a lot in advance!

-- Andreas
-- Andreas Hartmann, CTO BeCompany GmbH http://www.becompany.ch Tel.: +41 (0) 43 818 57 01
Re: [PERFORM] Less rows -> better performance?
Andreas Hartmann wrote:
> Dear PostgreSQL community,
> first some info about our application:
> - Online course directory for a University
> - Amount of data: complete dump is 27 MB
> - Semester is part of primary key in each table
> - Data for approx. 10 semesters stored in the DB
> - Read-only access from web application (JDBC)
> Our client has asked us if the performance of the application could be improved by moving the data from previous years to a separate "archive" application.

If you had 27GB of data maybe, but you've only got 27MB - that's presumably all sitting in memory. What in particular is slow?

-- Richard Huxton, Archonet Ltd
[PERFORM] Perl/DBI vs Native
Hi,

I have run quite a few tests comparing how long a query takes to execute from Perl/DBI as compared to psql/libpq. No matter how many times I run the test, the results are always the same. I run a SELECT all on a fairly big table and enabled the log_min_duration_statement option. With psql, postgres consistently logs half a second, while the exact same query executed with Perl/DBI consistently takes 2 seconds.

If I were timing the applications themselves, I wouldn't have been too surprised by these results — obviously, processing with Perl would be slower than a native application. But it's the postmaster that reports these durations. Could it be because the DBI module is slower at assimilating the data?

Any light on the subject would be greatly appreciated.

Regards, Val
Re: [PERFORM] Less rows -> better performance?
Richard, thanks for your reply!

Richard Huxton schrieb:
> If you had 27GB of data maybe, but you've only got 27MB - that's presumably all sitting in memory.

Here's some info about the actual amount of data:

SELECT pg_database.datname,
       pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database WHERE pg_database.datname = 'vvz_live_1';

  datname   |  size
------------+---------
 vvz_live_1 | 2565 MB

I wonder why the actual size is so much bigger than the data-only dump - is this because of index data etc.?

> What in particular is slow?

There's no particular bottleneck (at least that we're aware of). During the first couple of days after the beginning of the semester, the application's request processing tends to slow down due to the high load (many students assemble their schedule). The customer upgraded the hardware (which already helped a lot), but they asked us to find further approaches to performance optimization.

-- Andreas
-- Andreas Hartmann, CTO BeCompany GmbH http://www.becompany.ch Tel.: +41 (0) 43 818 57 01
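To answer the index question directly, a sketch of a per-relation breakdown (standard catalog functions, available since 8.1) that separates heap space from index+TOAST space:

SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS heap_size,
       pg_size_pretty(pg_total_relation_size(oid) - pg_relation_size(oid)) AS index_and_toast_size
  FROM pg_class
 WHERE relkind = 'r'
 ORDER BY pg_total_relation_size(oid) DESC
 LIMIT 20;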
Re: [PERFORM] Less rows -> better performance?
On Mon, Jul 21, 2008 at 1:25 PM, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
> SELECT pg_database.datname,
>        pg_size_pretty(pg_database_size(pg_database.datname)) AS size
> FROM pg_database WHERE pg_database.datname = 'vvz_live_1';
>
>   datname   |  size
> ------------+---------
>  vvz_live_1 | 2565 MB
>
> I wonder why the actual size is so much bigger than the data-only dump - is
> this because of index data etc.?

More probably because the database is totally bloated. Do you run VACUUM regularly or did you set up autovacuum?

-- Guillaume
Re: [PERFORM] Less rows -> better performance?
Andreas Hartmann wrote:
> Here's some info about the actual amount of data:
> [... 'vvz_live_1' is 2565 MB ...]
> I wonder why the actual size is so much bigger than the data-only dump - is this because of index data etc.?

I suspect Guillaume is right and you've not been vacuuming. That or you've got a *LOT* of indexes. If the database is only 27MB dumped, I'd just dump/restore it.

Since the database is read-only it might be worth running CLUSTER on the main tables if there's a sensible ordering for them (see the sketch after this message).

> There's no particular bottleneck (at least that we're aware of). During the first couple of days after the beginning of the semester the application request processing tends to slow down due to the high load (many students assemble their schedule). The customer upgraded the hardware (which already helped a lot), but they asked us to find further approaches to performance optimization.

1. Cache sensibly at the application (I should have thought there's plenty of opportunity here).
2. Make sure you're using a connection pool and have sized it reasonably (try 4, 8, 16 and see what loads you can support).
3. Use prepared statements where it makes sense. Not sure how you'll manage the interplay between this and connection pooling in JDBC. Not a Java man, I'm afraid.

If you're happy with the query plans, you're looking to reduce overheads as much as possible during peak times.

4. Offload more of the processing to clients with some fancy ajax-ed interface.
5. Throw in a spare machine as an app server for the first week of term. Presumably your load is 100 times average at this time.

-- Richard Huxton, Archonet Ltd
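A sketch of the CLUSTER suggestion, with a hypothetical table 'course' and its primary-key index 'course_pkey' (pre-8.3 syntax shown; 8.3 spells it CLUSTER course USING course_pkey):

CLUSTER course_pkey ON course;
ANALYZE course;   -- refresh planner statistics after the physical reorder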
Re: [PERFORM] Perl/DBI vs Native
Valentin Bogdanov wrote:
> I have run quite a few tests comparing how long a query takes to execute from Perl/DBI as compared to psql/libpq. [...] Could it be because the DBI module is slower at assimilating the data?

Random guess: Perl's DBI is using parameterized prepared statements, preventing the optimizer from using its knowledge about common values in the table to decide whether or not index use is appropriate. When you're writing the query in psql, you're not using prepared statements so the optimizer can be cleverer.

Try comparing:

SELECT statement

to:

PREPARE test(params) AS statement;
EXECUTE test(params);

e.g.:

SELECT x + 44 FROM t;

vs:

PREPARE test(int) AS SELECT x + $1 FROM t;
EXECUTE test(44);

Use EXPLAIN ANALYZE to better understand the changes in the query plan.

-- Craig Ringer
Re: [PERFORM] Less rows -> better performance?
Guillaume Smet wrote:
> On Mon, Jul 21, 2008 at 1:25 PM, Andreas Hartmann <[EMAIL PROTECTED]> wrote:
>> [... database is 2565 MB on disk vs. a 27 MB dump ...]
> More probably because the database is totally bloated. Do you run VACUUM regularly or did you set up autovacuum?

You might also want to REINDEX and see if that improves things. My understanding is that if vacuum isn't run regularly, the indexes may end up a bit of a mess as well as the tables.

-- Craig Ringer
Re: [PERFORM] Perl/DBI vs Native
On Jul 21, 2008, at 5:19 AM, Valentin Bogdanov wrote:
> [... psql logs half a second, Perl/DBI 2 seconds for the same SELECT ...]
> Could it be because the DBI module is slower at assimilating the data?

Hi Val,

Yes, DBI can be slower than the native C interface. The speed depends on how the data is being returned inside of Perl. Returning hashes is a slower method than returning arrays, from what I've found, due to the overhead of creating the objects in Perl. So:

my $r = $dbh->selectall_arrayref("select * from table", { Slice => {} });

is slower than:

my $r = $dbh->selectall_arrayref("select * from table", undef);

Secondarily, if you're returning a lot of rows you may want to look into using a cursor, so that you can fetch the rows 1000 at a time in a tight loop and discard them once you are done with them. This will hopefully prevent the system from having to continually allocate memory for all of your rows. For each field in each row Perl allocates memory to store the value from Postgres, so if you have many fields in your table this can be a large number of allocations, depending on the number of rows. Any userland profiling tool should help you debug what's going on here.

Cheers, Rusty
-- Rusty Conover, InfoGears Inc. http://www.infogears.com - http://www.gearbuyer.com - http://www.footwearbuyer.com
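A rough sketch of that cursor approach (connection parameters and table name are hypothetical; note that a cursor must live inside a transaction, hence AutoCommit => 0):

use strict;
use warnings;
use DBI;

# hypothetical DSN and credentials
my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "user", "password",
                       { AutoCommit => 0, RaiseError => 1 });

$dbh->do("DECLARE big_cur CURSOR FOR SELECT * FROM big_table");
while (1) {
    # pull 1000 rows per round trip instead of the whole result set
    my $rows = $dbh->selectall_arrayref("FETCH 1000 FROM big_cur");
    last unless @$rows;
    for my $row (@$rows) {
        # process @$row here; memory is reusable between batches
    }
}
$dbh->do("CLOSE big_cur");
$dbh->commit;
$dbh->disconnect;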
Re: [PERFORM] Less rows -> better performance?
Hi,

Reducing the amount of data will only have an effect on table scans or index scans. If your queries are selective and optimized, it will have no effect. Before looking for solutions, the first thing to do is to understand what's happening. If you already know the queries, then EXPLAIN them. Otherwise, you must log durations with the log_statement and log_min_duration_statement parameters in postgresql.conf. Before this, you must at least run VACUUM ANALYZE on the database to collect actual statistics and have current explain plans.

Best regards.

Christian

2008/7/21 Richard Huxton <[EMAIL PROTECTED]>:
> [... vacuuming, CLUSTER, caching, connection-pooling and prepared-statement suggestions ...]
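A sketch of the logging settings mentioned above, as postgresql.conf entries (the 250 ms threshold is only an illustrative value):

log_min_duration_statement = 250   # log statements that run longer than 250 ms
log_statement = 'none'             # or 'ddl' / 'mod' / 'all' to log by statement type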
Re: [PERFORM] Less rows -> better performance?
Guillaume Smet schrieb:
> More probably because the database is totally bloated. Do you run VACUUM regularly or did you set up autovacuum?

Thanks for the hint! I just verified that the autovacuum property is enabled. I did the following to prepare the tests:

- setup two test databases, let's call them db_all and db_current
- import the dump from the live DB into both test DBs
- delete the old semester data from db_current, leaving only the current data

Both test DBs were 600 MB large after this. I did a VACUUM FULL ANALYZE on both of them now. db_all didn't shrink significantly (only 1 MB), db_current shrunk to 440 MB. We're using quite a lot of indexes; I guess that's why that much data is allocated.

-- Andreas
-- Andreas Hartmann, CTO BeCompany GmbH http://www.becompany.ch Tel.: +41 (0) 43 818 57 01
Re: [PERFORM] Less rows -> better performance?
Andreas,

> I just verified that the autovacuum property is enabled. I did the following to prepare the tests:

"autovacuum property is enabled" — did you also check the logs, whether autovacuum is actually working?

> - setup two test databases, let's call them db_all and db_current
> - import the dump from the live DB into both test DBs
> - delete the old semester data from db_current, leaving only the current data
>
> Both test DBs were 600 MB large after this. I did a VACUUM FULL ANALYZE on both of them now. db_all didn't shrink significantly (only 1 MB), db_current shrunk to 440 MB.

Your test is not testing if vacuum is done on your production database! With pg_dump + pg_restore you removed next to all database bloat (theoretically all). After loading a fresh dump, vacuuming ideally has nearly nothing to do; after deleting some data, VACUUM reclaims the space of the deleted rows — that's the shrinking you see after delete + vacuum.

The bloat in your production system may be the result of updates and deletes in that system; dumping and restoring removes that bloat. If your live DB is ~2.5 GB, and your dumped / restored DB is only 600 MB, that 2500 MB minus 600 MB is bloat from not vacuuming, or bloated indexes. So, before the start of the next semester, at least do a vacuum (maybe also reindex), as sketched below.

Best wishes, Harald

-- GHUM Harald Massa, persuadere et programmare. Harald Armin Massa, Spielberger Straße 49, 70435 Stuttgart, 0173/9409607 - no fx, no carrier pidgeon - EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!
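In SQL, that pre-semester maintenance would look roughly like this (database name taken from earlier in the thread; VACUUM FULL and REINDEX both take heavy locks, so run them off-hours):

VACUUM FULL ANALYZE;           -- reclaim bloat and refresh statistics
REINDEX DATABASE vvz_live_1;   -- rebuild possibly bloated indexes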
Re: [PERFORM] Less rows -> better performance?
Mario Weilguni schrieb:
> Andreas Hartmann schrieb:
>> […] I just verified that the autovacuum property is enabled. […]
> Did you have:
> stats_start_collector = on
> stats_block_level = on
> stats_row_level = on
> Otherwise autovacuum won't run, IMO.

Thanks for the hint! The section looks like this:

stats_start_collector = on
#stats_command_string = off
#stats_block_level = off
stats_row_level = on
#stats_reset_on_server_start = off

I'll check the logs to see if the vacuum really runs - as soon as I find them :)

-- Andreas
-- Andreas Hartmann, CTO BeCompany GmbH http://www.becompany.ch Tel.: +41 (0) 43 818 57 01
Re: [PERFORM] Less rows -> better performance?
Andreas Hartmann schrieb:
> [... stats settings look fine ...] I'll check the logs to see if the vacuum really runs - as soon as I find them :)

You might want to use these entries in your config:

redirect_stderr = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d

Fit those to your needs; then you will find log entries in $PGDATA/pg_log/.

And BTW, I was wrong: you just need to have stats_row_level = on; stats_block_level doesn't matter. But in fact it's simple — if you don't have 24x7 requirements, type:

VACUUM FULL ANALYZE;

and check if your DB becomes smaller. I really doubt you can have so many indexes that a 27 MB dump would use 2.3 GB on-disk.

You can check this too:

select relname, relpages, reltuples, relkind
from pg_class
where relkind in ('r', 'i')
order by relpages desc limit 20;

This will give you the top-20 relations and their sizes; 1 page is typically 8KB, so you can cross-check whether relpages/reltuples is completely off — a good indicator of table/index bloat.

Regards, Mario
Re: [PERFORM] Perl/DBI vs Native
Valentin Bogdanov wrote:
> I have run quite a few tests comparing how long a query takes to execute from Perl/DBI as compared to psql/libpq. [...] With psql postgres consistently logs half a second while the exact same query executed with Perl/DBI takes, again consistently, 2 seconds.

The problem may be that your two tests are not equivalent. When Perl executes a statement, it copies the *entire* result set back to the client before it returns the first row. The following program might appear to just be fetching the first row:

$sth = $dbh->prepare("select item from mytable");
$sth->execute();
$item = $sth->fetchrow_array();

But in fact, before Perl returns from the $sth->execute() statement, it has already run the query and copied all of the rows into a hidden, client-side cache. Each $sth->fetchrow_array() merely copies the data from the hidden cache into your local variable.

By contrast, psql executes the query and starts returning the data a page at a time. So it may appear to be much faster.

This also means that Perl has trouble with very large tables. If the "mytable" in the above example is very large, say a hundred billion rows, you simply can't execute this statement in Perl. It will try to copy 100 billion rows into memory before returning the first answer.

The reason for Perl's behind-the-scenes caching is that it allows multiple connections to a single database, and multiple statements on each database handle. By executing each statement completely, it gives the appearance that multiple concurrent queries are supported. The downside is that it can be a huge memory hog.

Craig
[PERFORM] [BACKUPS]Little backups
Hi Guys,

I am developing a project with PostgreSQL, and one guy from the project is familiar with Oracle and asked me a question I could not answer; if someone could help, it would be good. =)

The question is:

- In Oracle he makes a full backup two times a day. In between, Oracle makes a lot of mini-backups, but these backups contain just the data that has changed in that time. If the system fails, he can reconstruct the database by combining the last "big backup" with the "mini-backups". Can Postgres do this?

Regards,
Leví - Brazil
Re: [PERFORM] [BACKUPS]Little backups
>>> "Leví Teodoro da Silva" <[EMAIL PROTECTED]> wrote: > - In oracle he makes a full backup two times in a day. In this range of > time, Oracle make a lot of mini-backups, but this backups is about just the > data whose have changed in this time. If the system fails, he could > reconstruct the database adding the last "big backup" with "mini-backups". > Can Postgres do this ? * The equivalent capability in PostgreSQL is the Point-In-Time Recovery backup strategy: http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html Twice daily seems rather extreme -- we generally go with monthly. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] [BACKUPS]Little backups
On Monday 21 July 2008, Leví Teodoro da Silva wrote:
> Hi Guys,
>
> I am developing a project with PostgreSQL and one guy from the project is
> familiar with Oracle and asked a question I could not answer. [...]
> - In oracle he makes a full backup two times in a day. In between, Oracle
> makes a lot of mini-backups containing just the data that has changed.
> If the system fails, he can reconstruct the database by combining the
> last "big backup" with the "mini-backups". Can Postgres do this?

Yes, it can. If you need detailed information, you can take a look at http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html
Re: [PERFORM] [BACKUPS]Little backups
On Mon, 21.07.2008 at 15:20:27 -0300, Leví Teodoro da Silva wrote:
> - In oracle he makes a full backup two times in a day. In between, Oracle
> makes a lot of mini-backups containing just the data that has changed. If the
> system fails, he can reconstruct the database by combining the last "big
> backup" with the "mini-backups". Can Postgres do this?

Sure, with the WAL files:

http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html

Andreas
-- Andreas Kretschmer, Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header), GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
On Mon, 21 Jul 2008, Stephane Bailliez wrote:
> Isn't it a scheduler problem, I thought CFQ was the default for desktop?

CFQ/Deadline/AS are I/O scheduler choices. What changed completely in 2.6.23 is the kernel process scheduler. http://people.redhat.com/mingo/cfs-scheduler/sched-design-CFS.txt gives some info about the new one.

While the switch to CFS has shown great improvements in terms of desktop and many server workloads, what I discovered is that the pgbench test program itself is really incompatible with it. There's a kernel patch that seems to fix the problem at http://lkml.org/lkml/2008/5/27/58 but I don't think it's made it into a release yet. This is not to say the kernel itself is unsuitable for running PostgreSQL, but if you're using pgbench as the program to confirm that, I expect you'll be disappointed with results under the Ubuntu 8.04 kernel. It tops out at around 10,000 TPS running the select-only test for me, while older kernels did 3X that much.

> Yes I'd definitely prefer to go 8.3 as well but there are a couple reasons for now I have to suck it up:
> - 8.2 is the one in the 7.10 repository.
> - I need plr as well and 8.3-plr debian package does not exist yet.
> (I know in both cases we could recompile and install it from there, but ...)

Stop and think about this for a minute. You're going into production with an older version having a set of known, impossible-to-work-around issues; if you hit them, the response will be "upgrade to 8.3 to fix that", which will require the major disruption to your application of a database dump and reload if that fix becomes critical. And you can't just do that now because of some packaging issues? I hope you can impress upon the other people involved how incredibly short-sighted that is.

Unfortunately, it's harder than everyone would like to upgrade an existing PostgreSQL installation. That really argues for going out of your way if necessary to deploy the latest stable release when you're building something new, if there's not some legacy bit seriously holding you back.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
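For scale, the dump-and-reload upgrade itself is essentially one pipeline — a sketch assuming the old 8.2 server on port 5432 and a freshly initdb'ed 8.3 server already running on port 5433 (ports are assumptions):

pg_dumpall -p 5432 | psql -p 5433 postgres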
Re: [PERFORM] Perl/DBI vs Native
Craig James <[EMAIL PROTECTED]> writes:
> Valentin Bogdanov wrote:
>> I have run quite a few tests comparing how long a query takes to
>> execute from Perl/DBI as compared to psql/libpq. [...]
>> With psql postgres consistently logs half a second while the exact
>> same query executed with Perl/DBI takes again consistently 2 seconds.
> The problem may be that your two tests are not equivalent. When Perl
> executes a statement, it copies the *entire* result set back to the
> client before it returns the first row.

Sure, but so does psql (unless you've turned on the magic FETCH_COUNT setting). I think the theories about prepared versus literal statements were more promising; but I don't know DBI well enough to know exactly what it was sending to the server.

regards, tom lane
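For reference, the psql setting Tom mentions makes psql fetch through a cursor in batches instead of buffering the whole result — e.g. (the table name is a placeholder):

\set FETCH_COUNT 100
SELECT * FROM big_table;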
Re: [PERFORM] [BACKUPS]Little backups
On Mon, Jul 21, 2008 at 03:20:27PM -0300, Leví Teodoro da Silva wrote:
> - In oracle he makes a full backup two times in a day. In between, Oracle
> makes a lot of mini-backups containing just the data that has changed. If the
> system fails, he can reconstruct the database by combining the last "big
> backup" with the "mini-backups". Can Postgres do this?

Take a look at Point-In-Time-Recovery, PITR:

http://www.postgresql.org/docs/current/static/continuous-archiving.html

-Berge
-- Berge Schwebs Bjørlo, Alegría!
Re: [PERFORM] [BACKUPS]Little backups
Thank you guys for the fast answer.

This same guy asked me about support for PostgreSQL. When he sees the community behind PostgreSQL, he will never be worried about support. =)

Thanks a lot,
Leví

A. Kretschmer escreveu:
> Sure, with the WAL files.
> http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html
Re: [PERFORM] Perl/DBI vs Native
Tom Lane wrote:
> Sure, but so does psql (unless you've turned on the magic FETCH_COUNT
> setting). I think the theories about prepared versus literal statements
> were more promising; but I don't know DBI well enough to know exactly
> what it was sending to the server.

Almost certainly a prepared statement, unless no placeholders were being used at all. Another way to test (from the DBI side) is to set $sth->{pg_server_prepare} = 0, which will send the SQL directly to the backend, just as if you'd typed it in at a command prompt. You can also use the tracing mechanism of DBI to see what's going on behind the scenes. For example:

$dbh->trace('SQL');
$dbh->do("SELECT 1234 FROM pg_class WHERE relname = 'bob'");
$dbh->do("SELECT 1234 FROM pg_class WHERE relname = ?", undef, 'mallory');
$sth = $dbh->prepare("SELECT 4567 FROM pg_class WHERE relname = ?");
$sth->execute('alice');
$sth->{pg_server_prepare} = 0;
$sth->execute('eve1');
$sth->{pg_server_prepare} = 1;
$sth->execute('eve2');
$dbh->commit;

Outputs:

===
begin;
SELECT 1234 FROM pg_class WHERE relname = 'bob';
EXECUTE SELECT 1234 FROM pg_class WHERE relname = $1 ( $1: mallory );
PREPARE dbdpg_p22988_1 AS SELECT 4567 FROM pg_class WHERE relname = $1;
EXECUTE dbdpg_p22988_1 ( $1: alice );
SELECT 4567 FROM pg_class WHERE relname = 'eve1';
EXECUTE dbdpg_p22988_1 ( $1: eve2 );
commit;
DEALLOCATE dbdpg_p22988_1;
===

You can even view exactly which libpq calls are being used at each point with:

$dbh->trace('SQL,libpq');

To get back to the original poster's complaint, you may want to figure out why the difference is so great for a prepared plan. It may be that you need to cast the placeholder(s) to a specific type, for example.

-- Greg Sabino Mullane [EMAIL PROTECTED] End Point Corporation PGP Key: 0x14964AC8 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [PERFORM] A guide/tutorial to performance monitoring and tuning
On 2:59 pm 06/29/08 Greg Smith <[EMAIL PROTECTED]> wrote:
> Right now I'm working with a few other people to put together a more
> straightforward single intro guide that should address some of the
> vagueness you point out here,

Was that ever completed?
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
> [...] Yes I'd definitely prefer to go 8.3 as well but there are a couple
> reasons for now I have to suck it up:
> - 8.2 is the one in the 7.10 repository.
> - I need plr as well and 8.3-plr debian package does not exist yet.
> (I know in both cases we could recompile and install it from there, but ...)

At least on debian it was quite easy to "backport" 8.3.3 from sid to etch using apt-get's source and build-dep functions. That way you get a normal installable package. I'm not sure, but given the similarity I would guess it won't be much harder on ubuntu.

// Emil
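A sketch of that backport procedure (the repository line and package name are assumptions and may need adjusting for your release):

echo "deb-src http://ftp.debian.org/debian sid main" >> /etc/apt/sources.list
apt-get update
apt-get build-dep postgresql-8.3
apt-get source -b postgresql-8.3   # leaves installable .debs in the current directory
dpkg -i postgresql-8.3_*.deb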
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
On Tuesday, July 22, 2008 01.20.52 +0200, Emil Pedersen <[EMAIL PROTECTED]> wrote:
> At least on debian it was quite easy to "backport" 8.3.3 from sid to etch
> using apt-get's source and build-dep functions. That way you get a normal
> installable package.

I should have said that I was talking about postgresql itself; I missed the plr part. I apologize for the noise.

// Emil
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
Emil Pedersen <[EMAIL PROTECTED]> writes:
>> At least on debian it was quite easy to "backport" 8.3.3 from sid
>> to etch using apt-get's source and build-dep functions. That way
>> you get a normal installable package.
> I should have said that I was talking about postgresql itself; I
> missed the plr part. I apologize for the noise.

Still, there's not normally that much difference between the packaging for one version and for the next. I can't imagine that it would take much time to throw together a package for 8.3 plr based on what you're using for 8.2. All modern package-based distros make this pretty easy. The only reason not to do it would be if you're buying support from a vendor who will only support specific package versions...

regards, tom lane
Re: [PERFORM] A guide/tutorial to performance monitoring and tuning
On Mon, 21 Jul 2008, Francisco Reyes wrote:
> On 2:59 pm 06/29/08 Greg Smith <[EMAIL PROTECTED]> wrote:
>> Right now I'm working with a few other people to put together a more
>> straightforward single intro guide that should address some of the
>> vagueness you point out here,
> Was that ever completed?

Not done yet; we're planning to have a first rev done in another couple of weeks. The work in progress is at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server and I'm due to work out another set of improvements to that this week during OSCON.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)
Greg Smith wrote:
> CFQ/Deadline/AS are I/O scheduler choices. What changed completely in 2.6.23 is the kernel process scheduler. [...] It tops out at around 10,000 TPS running the select-only test for me, while older kernels did 3X that much.

ok, thanks for all the details. good to know.

> Stop and think about this for a minute. You're going into production with an older version having a set of known, impossible-to-work-around issues [...] I hope you can impress upon the other people involved how incredibly short-sighted that is.

I understand what you're saying. However, to play devil's advocate: the existing system that I'm 'migrating' (read: entirely changing schemas, 'migrating' data) is coming out of an 8.1.11 install. It is not a critical system. The source data is always available from another system, and the postgresql system would be a 'client'. So if 8.2.x is so abysmal that it should not even be considered for install compared to 8.1.x, and only 8.3.x is viable, then ok, that makes sense and I have to go the extra mile.

But message received loud and clear. Conveniently, 8.3.3 is also available on backports, so it does not cost much, and it is pinned right now. (I don't think there will be any problem with plr, even though the original seems to be patched a bit, but that will be for later, when all is ready.)

For the sake of completeness (even though irrelevant), here's the run with 32 clients on 8.3, same config as before (except max_fsm_pages at 204800); columns are scaling factor, db size in MB, average tps:

   1     19   36292
 100   1499   32127
 200   2994   30679
 300   4489   29673
 400   5985   18627
 500   7480   19714
 600   8975   19437
 700  10000   20271
 800  12000   18038
 900  13000    9842
1000  15000    5996
1200  18000    5404
1400  20000    3701
1600  23000    2877
1800  26000    2657
2000  29000    2612

cheers,
-- stephane