Re: [PERFORM] cost-based vacuum
On Mon, 2005-07-11 at 15:51 +0100, Simon Riggs wrote:
> On Mon, 2005-07-11 at 09:07 -0400, Ian Westmacott wrote:
> > On Mon, 2005-07-11 at 07:31, Simon Riggs wrote:
> > > The ANALYZE commands hold read locks on the tables you wish to write
> > > to. If you slow them down, you merely slow down your write
> > > transactions also, and then the read transactions that wait behind
> > > them. Every time the ANALYZE sleeps it wakes up the other
> > > transactions, which then realise they can't move because of locks and
> > > then wake up the ANALYZEs for another shot. The end result is that
> > > you introduce more context-switching, without any chance of doing
> > > more useful work while the ANALYZEs sleep.
> >
> > Let me make sure I understand. ANALYZE acquires a read lock on the
> > table, which it holds until the operation is complete (including any
> > sleeps). That read lock blocks the extension of that table via COPY.
> > Is that right? According to the 8.0 docs, ANALYZE acquires an ACCESS
> > SHARE lock on the table, and that conflicts only with ACCESS
> > EXCLUSIVE. That's why I didn't think I had a lock issue, since I think
> > COPY only needs ROW EXCLUSIVE. Or perhaps the transaction needs
> > something more?
>
> The docs are correct, but don't show catalog and buffer locks.

...but on further reading of the code there are no catalog locks or buffer locks held across the sleep points. So my explanation doesn't work as an explanation for the sleep/no-sleep difference you have observed.

I've been through all the code now and can't find any resource that is held across a delay point, nor any reason to believe that the vacuum cost accounting would slow anything down.

Since vacuum_cost_delay is a userset parameter, you should be able to SET this solely for the analyze_thread. That way we will know with more certainty that it is the analyze_thread that is interfering.

What is your default_statistics_target? Do you have other stats targets set? How long does ANALYZE take to run, with/without the vacuum_cost_delay?
Thanks,

Best Regards, Simon Riggs

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
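Simon's suggestion of SETting vacuum_cost_delay only for the session that runs ANALYZE might look like the following in psql. The table name is a placeholder, and the particular values are only examples:

```sql
-- In the session that runs the periodic ANALYZE only; other sessions
-- keep the server-wide defaults from postgresql.conf.
SET vacuum_cost_delay = 100;   -- milliseconds to sleep at each delay point
SET vacuum_cost_limit = 200;   -- accumulated cost that triggers a sleep
ANALYZE some_table;            -- hypothetical table name
RESET vacuum_cost_delay;
RESET vacuum_cost_limit;
```

Because the parameter is userset, this throttles only the ANALYZE session while ordinary reads and writes proceed at full speed.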
Re: [PERFORM] join and query planner
On Wed, Jul 06, 2005 at 18:54:02 -0300, Dario Pudlo [EMAIL PROTECTED] wrote:
> (first of all, sorry for my English)
> Hi.
> - Does left join restrict the order in which the planner must join
> tables? I've read about join, but I'm not sure about left join...

The left join operator is not associative, so in general the planner doesn't have much flexibility to reorder left (or right) joins.
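The non-associativity mentioned above can be seen with a toy simulation. Mixing a LEFT JOIN with an inner join is the classic case where reordering changes the result; this is a sketch of the semantics, not PostgreSQL's actual planner logic:

```python
def inner_join(left, right, pred):
    """Keep only combined rows satisfying the predicate."""
    return [l + r for l in left for r in right if pred(l, r)]

def left_join(left, right, pred, right_width):
    """Keep every left row; pad with None when no right row matches."""
    out = []
    for l in left:
        matches = [l + r for r in right if pred(l, r)]
        out.extend(matches if matches else [l + (None,) * right_width])
    return out

# A has one row; B is empty; C has one row.
A = [(1,)]
B = []          # no rows in B
C = [(10,)]

# A LEFT JOIN (B JOIN C): the inner join is empty, so A's row survives, padded.
r1 = left_join(A, inner_join(B, C, lambda b, c: True),
               lambda a, bc: a[0] == bc[0], 2)

# (A LEFT JOIN B) JOIN C: B's columns come out NULL, and the NULL fails
# the inner-join condition, so the row is dropped.
ab = left_join(A, B, lambda a, b: a[0] == b[0], 1)
r2 = inner_join(ab, C, lambda ab_, c: ab_[1] == c[0])

print(r1)  # [(1, None, None)]  -- one row
print(r2)  # []                 -- no rows: a different answer
```

Because the two orderings can return different row sets, the planner cannot freely commute an outer join past other joins the way it can with plain inner joins.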
[PERFORM] Projecting currentdb to more users
Hi,

We have a couple of databases that are identical (one for each customer). They are all relatively small, ranging from 100k records to 1m records. There's only one main table with some smaller tables, a lot of indexes and some functions.

I would like to make an estimate of the performance, the disk space and other related things when we have databases of, for instance, 10 million records or 100 million records. Is there any math to be done on that?

Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements
Mail: [EMAIL PROTECTED] - Mobile: +32 (478) 80 82 91
Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76
Web: http://www.implements.be

"First they ignore you. Then they laugh at you. Then they fight you. Then you win." -- Mahatma Gandhi
Re: [PERFORM] cost-based vacuum
On Tue, 2005-07-12 at 03:45, Simon Riggs wrote:
> Since vacuum_cost_delay is a userset parameter, you should be able to
> SET this solely for the analyze_thread. That way we will know with more
> certainty that it is the analyze_thread that is interfering.

That is what I have been doing. In fact, I have eliminated the reader_thread and analyze_thread. I just have the writer_thread running, and a psql connection with which I perform ANALYZE, for various vacuum_cost_* parameters. (I'm trying to extract a reproducible experiment.)

It appears not to matter whether it is one of the tables being written to that is ANALYZEd. I can ANALYZE an old, quiescent table, or a system table, and see this effect.

> What is your default_statistics_target?

All other configs are default; default_statistics_target = 10.

> Do you have other stats targets set?

No. The only thing slightly out of the ordinary with the tables is that they are created WITHOUT OIDS. Some indexes, but no primary keys. All columns NOT NULL.

> How long does ANALYZE take to run, with/without the vacuum_cost_delay?

Well, on one table with about 50K rows, it takes about 1/4 s to ANALYZE with vacuum_cost_delay=0, and about 15 s with vacuum_cost_delay=1000.

Other things of note:

- VACUUM has the same effect. If I VACUUM or ANALYZE the whole DB, the CPU spikes reset between tables.
- vmstat reports that blocks written drop as the CPU rises. Don't know if it is cause or effect yet. On a small test system, I'm writing about 1.5 MB/s. After about 20 s of cost-based ANALYZE, this drops under 0.5 MB/s.
- This is a dual Xeon. I have tried both with and without hyperthreading. I haven't tried to reproduce it elsewhere yet, but will.
- oprofile reports for 10-minute runs of a database-wide VACUUM with vacuum_cost_delay=0 and 1000 show the latter spending a lot of time in LWLockAcquire and LWLockRelease (20% each vs. 2%).
Thanks,
--Ian
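The cost-based delay mechanism being tuned in this thread can be sketched as a simple accounting loop. The page costs and limit below are the 8.0 defaults; the sketch is simplified in that the real code scales the sleep by how far the balance overshoots the limit, and of course actually calls pg_usleep rather than just counting:

```python
# Sketch of PostgreSQL's cost-based vacuum accounting: each page touched
# adds to a cost balance, and when the balance reaches vacuum_cost_limit
# the backend sleeps for roughly vacuum_cost_delay milliseconds.

VACUUM_COST_PAGE_HIT = 1      # page found in shared buffers (default)
VACUUM_COST_PAGE_MISS = 10    # page read from disk (default)
VACUUM_COST_LIMIT = 200       # default accumulated-cost threshold

def run_scan(page_costs, cost_limit, cost_delay_ms):
    """Return (number_of_sleeps, total_sleep_ms) for a sequence of page costs."""
    balance, sleeps = 0, 0
    for cost in page_costs:
        if cost_delay_ms == 0:
            continue                 # accounting is inactive when disabled
        balance += cost
        if balance >= cost_limit:
            sleeps += 1              # the real code sleeps here
            balance = 0
    return sleeps, sleeps * cost_delay_ms

# 500 buffer misses at cost 10 each: a sleep every 20 pages.
pages = [VACUUM_COST_PAGE_MISS] * 500
print(run_scan(pages, VACUUM_COST_LIMIT, 1000))  # (25, 25000)
print(run_scan(pages, VACUUM_COST_LIMIT, 0))     # (0, 0)
```

With vacuum_cost_delay=1000 as in Ian's test, 25 seconds of pure sleeping on a scan of only 500 cold pages is consistent with the 1/4 s vs. 15 s ANALYZE times reported above.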
Re: [PERFORM] Projecting currentdb to more users
On 7/12/05, Yves Vindevogel [EMAIL PROTECTED] wrote:
> We have a couple of databases that are identical (one for each
> customer). They are all relatively small, ranging from 100k records to
> 1m records. [...] I would like to make an estimate of the performance,
> the disk space and other related things for databases of, for instance,
> 10 million records or 100 million records. Is there any math to be done
> on that?

It's pretty easy to make a database run fast with only a few thousand records, or even a million records; however, things start to slow down non-linearly when the database grows too big to fit in RAM. I'm not a guru, and my attempts to do this have not been very accurate.

Maybe (just maybe) you could get an idea by disabling the OS cache on the file system(s) holding the database and then somehow fragmenting the drive severely (maybe by putting each table in its own disk partition?) and measuring performance.

On the positive side, there are a lot of wise people on this list with plenty of experience optimizing slow queries on big databases. So queries that run in 20 ms now but slow down to 7 seconds when your tables grow will likely benefit from optimizing.

--
Matthew Nuzum
www.bearfruit.org
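For the disk-space half of the question, a first-order estimate really is simple arithmetic: measure the per-row footprint (table plus indexes) on an existing customer database and extrapolate linearly. The measured numbers below are made-up placeholders:

```python
def projected_size_bytes(rows_now, size_now_bytes, rows_target):
    """Linear extrapolation of on-disk size from a measured database."""
    bytes_per_row = size_now_bytes / rows_now
    return bytes_per_row * rows_target

# Hypothetical measurement: 1M rows occupy 300 MB (table + indexes).
now_rows, now_size = 1_000_000, 300 * 1024**2
for target in (10_000_000, 100_000_000):
    est = projected_size_bytes(now_rows, now_size, target)
    print(f"{target:>11,} rows -> ~{est / 1024**3:.1f} GiB")
```

Disk space scales close to linearly; performance does not, which is the non-linear knee Matthew describes: once the hot part of the data and indexes no longer fits in RAM, lookups start costing disk seeks instead of cache hits.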
Re: [PERFORM] Projecting currentdb to more users
From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues.

"...125. Intel has designed its compiler purposely to degrade performance when a program is run on an AMD platform. To achieve this, Intel designed the compiler to compile code along several alternate code paths. Some paths are executed when the program runs on an Intel platform and others are executed when the program is operated on a computer with an AMD microprocessor. (The choice of code path is determined when the program is started, using a feature known as CPUID which identifies the computer's microprocessor.) By design, the code paths were not created equally. If the program detects a Genuine Intel microprocessor, it executes a fully optimized code path and operates with the maximum efficiency. However, if the program detects an Authentic AMD microprocessor, it executes a different code path that will degrade the program's performance or cause it to crash..."
[PERFORM] General DB Tuning
Help! After recently migrating to Postgres 8, I've discovered to my horror that I can't determine which queries are performing poorly anymore, because the logging has drastically changed and no longer shows durations for anything done through JDBC. So I'm desperately trying to do performance tuning on my servers and have no way to sort out which statements are the slowest.

Does anyone have any suggestions? How do you determine which queries are behaving badly when you can't get durations out of the logs?

I have a perl script that analyzes the output from Postgres 7 logs and it works great! But it relies on the duration being there. I did some searches on the postgresql.org mailing lists and have seen a few people discussing this problem, but no one seems to be too worried about it. Is there a simple work-around?

Sincerely,
Brent
Re: [PERFORM] General DB Tuning
I have this in my postgresql.conf file and it works fine (set the min to whatever you want to log):

log_min_duration_statement = 3000 # -1 is disabled, in milliseconds.

Another setting that might get what you want:

#log_duration = false

Uncomment and change to true. From the docs (http://www.postgresql.org/docs/8.0/interactive/runtime-config.html):

"Causes the duration of every completed statement which satisfies log_statement to be logged. When using this option, if you are not using syslog, it is recommended that you log the PID or session ID using log_line_prefix so that you can link the statement to the duration using the process ID or session ID. The default is off. Only superusers can change this setting."

Brent Henry wrote:
> Help! After recently migrating to Postgres 8, I've discovered to my
> horror that I can't determine which queries are poorly performing
> anymore because the logging has drastically changed and no longer shows
> durations for anything done through JDBC. [...] Is there a simple
> work-around?
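Putting Tom's two suggestions together with the log_line_prefix advice from the quoted docs, a postgresql.conf fragment might look like this (the 3000 ms threshold is just an example; %p and %t are standard 8.0 prefix escapes for process ID and timestamp):

```
# Log any statement that runs for 3 seconds or longer, with its duration:
log_min_duration_statement = 3000   # in milliseconds; -1 disables

# Or log the duration of *every* completed statement...
#log_duration = true
# ...in which case tag each line so statement and duration lines
# can be matched up afterwards:
log_line_prefix = '%p %t '
```

Changing these requires superuser privileges (or editing the file and reloading), per the documentation quoted above.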
Re: [PERFORM] General DB Tuning
Yes, that is exactly what I want to use! Unfortunately, it doesn't work if you access postgres through a JDBC connection. I don't know why. I found a posting from back in February which talks about this a little:

http://archives.postgresql.org/pgsql-admin/2005-02/msg00055.php

But I can't find anywhere where someone has fixed it. Am I the only one accessing postgres through JDBC?

-Brent

--- Tom Arthurs [EMAIL PROTECTED] wrote:
> I have this in my postgresql.conf file and it works fine (set the min
> to whatever you want to log):
> log_min_duration_statement = 3000 # -1 is disabled, in milliseconds. [...]
Re: [PERFORM] General DB Tuning
We are using jdbc -- the log_min_duration_statement = 3000 statement works fine for me. Looks like there's no other work-around for the bug(?). Not sure, since I have no interest in logging a million statements a day; I only want to see the poorly performing hits.

Brent Henry wrote:
> Yes, that is exactly what I want to use! Unfortunately, it doesn't work
> if you access postgres through a JDBC connection. I don't know why. [...]
> Am I the only one accessing postgres through JDBC?
Re: [PERFORM] General DB Tuning
Tom Arthurs wrote:
> We are using jdbc -- the log_min_duration_statement = 3000 statement
> works fine for me. [...]

Doesn't it depend on what jdbc driver you are using?

Dennis
Re: [PERFORM] General DB Tuning
>> We are using jdbc -- the log_min_duration_statement = 3000 statement
>> works fine for me. [...]
>
> Doesn't it depend on what jdbc driver you are using?

It depends on whether he's using new-protocol prepared queries, which don't get logged properly. Wasn't that fixed for 8.1 or something?

Chris
Re: [PERFORM] General DB Tuning
Hmm, yeah, maybe -- we are using the 7.4 driver with an 8.0.x db.

Dennis wrote:
> Doesn't it depend on what jdbc driver you are using?
Re: [PERFORM] General DB Tuning
We are running Postgres 8.0.2 with the 8.0.2 jdbc driver, and yes, we are using prepared statements. I've spent hours trying to get the 'log_min_duration_statement' and 'log_duration' options to work, with no luck. I never get any duration from the statement. I also never see 'begin' or 'commit' in the log, so I can't tell how long my batch commands are taking to commit to the DB.

Is there a different kind of 'prepared' statement that we should be using in the driver to get logging to work properly? What is the 'new' protocol?

Tom, what version are you using? Are you using prepared statements in JDBC?

-Brent

--- Christopher Kings-Lynne [EMAIL PROTECTED] wrote:
> It depends on whether he's using new-protocol prepared queries, which
> don't get logged properly. Wasn't that fixed for 8.1 or something?
Re: [PERFORM] General DB Tuning
> Is there a different kind of 'prepared' statement that we should be
> using in the driver to get logging to work properly? What is the 'new'
> protocol?

The 8.0.2 jdbc driver uses real prepared statements instead of faked ones. The problem is that the new protocol (which the 8.0.2 driver uses) has a bug where protocol-prepared queries don't get logged properly. I don't know if it's been fixed...

Chris
Re: [PERFORM] General DB Tuning
Here's the answer for you from the jdbc list:

Alvin Hung wrote:
>> Currently, 8.0.2 / JDBC 8.0-310, log_min_duration_statement does not
>> work with JDBC. Nothing will get logged. This makes it very difficult
>> to tune a java application. Can you tell me when this will be fixed?
>> Thanks.
>
> This is a server limitation: it does not handle logging of the V3
> extended query protocol very well. There's gradual progress being made
> on it; you might want to search the pgsql-hackers and pgsql-patches
> archives for details.

==

We are using prepared statements, but we are using the 7.4 driver with the 8.0.3 server. I think it comes down to locally (on the client) prepared statements vs. server-side prepared statements. I never got past this issue (changing the code is on our todo list, but pretty far down it), so I never noticed the logging issues.

I had a problem with prepared statements with the 8.x drivers -- here's what I got from the jdbc list when I asked the question:

1. What changed between the driver versions that generates this error?

> The driver started to use server-side prepared statements for
> parameterization of queries (i.e. the driver translates ? to $n in the
> main query string, and sends the actual parameter values out-of-band
> from the query itself). One side effect of this is that parameters are
> more strongly typed than in the 7.4.x versions, where the driver would
> do literal parameter substitution into the query string before sending
> it to the backend. Also, you can use parameters in fewer places (they
> must fit the backend's idea of where parameterizable expressions are
> allowed) -- e.g. see the recent thread about ORDER BY ? changing
> behaviour with the newer driver.

2. What is the downside of continuing to use the 7.x version of the driver -- or are there better alternatives (patch, new version, etc.)? I am using build 311 of the driver.

> Most active development happens on the 8.0 version; 7.4.x is maintained
> for bugfixes but that's about it, and you won't get the benefit of any
> performance improvements or added features that go into 8.0. Also, the
> 7.4.x driver won't necessarily work with servers >= 8.0. In the longer
> term, the 7.4.x version will eventually become unmaintained.

So for the short term, you could downgrade your driver.

Brent Henry wrote:
> We are running Postgres 8.0.2 with the 8.0.2 jdbc driver. And yes, we
> are using prepared statements. [...]
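The "? to $n" translation described in the quoted jdbc-list answer can be sketched as a small string transform. This is an illustration of the idea only; a real driver must also skip ? characters inside string literals, quoted identifiers, and comments:

```python
def jdbc_to_pg_params(sql):
    """Rewrite JDBC-style ? placeholders as PostgreSQL $1, $2, ... markers.

    Simplified sketch: real drivers handle quoting and escaping rules
    so that ? inside literals is left alone.
    """
    out, n = [], 0
    for ch in sql:
        if ch == "?":
            n += 1
            out.append(f"${n}")
        else:
            out.append(ch)
    return "".join(out)

print(jdbc_to_pg_params("SELECT * FROM t WHERE a = ? AND b > ?"))
# SELECT * FROM t WHERE a = $1 AND b > $2
```

The rewritten string is what gets prepared server-side, with the parameter values sent out-of-band afterwards, which is why these statements are invisible to statement-text logging on 8.0-era servers.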
Re: [PERFORM] Projecting currentdb to more users
2005/7/12, Mohan, Ross [EMAIL PROTECTED]:
> From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues.

Postgres is compiled with the gnu compiler, isn't it? I don't know how much Postgres can benefit from an optimized Intel compiler.

--
Jean-Max Reymond
CKR Solutions Open Source
Nice, France
http://www.ckr-solutions.com