Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
On Tue, 19 Jul 2005 12:54:22 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > Robert Creager <[EMAIL PROTECTED]> writes: > > Hmm, I hadn't thought about the possible impact of multiple concurrent > vacuums. Is the problem caused by that, or has performance already gone > into the tank by the time the cron-driven vacuums are taking long enough > to overlap? All statements over 5 seconds are logged. Vacuums are running on the 5 minute mark. Log file shows the first query starts going bad a 9:32:15 (7 seconds), although the second query start before the first . The first vacuum statement logged shows 1148 seconds completing at 9:54:09, so starting at 9:35. Looks like the vacuum is an innocent bystander of the problem. The first problem queries are below. Additionally, I've attached 5 minutes (bzipped) of logs starting at the first event below. Jul 19 09:32:15 annette postgres[17029]: [2-1] LOG: duration: 7146.168 ms statement: Jul 19 09:32:15 annette postgres[17029]: [2-2] ^I SELECT location_id, location_type.name AS type, library, rail Jul 19 09:32:15 annette postgres[17029]: [2-3] ^I FROM location_lock JOIN location USING( location_id ) Jul 19 09:32:15 annette postgres[17029]: [2-4] ^I JOIN location_type USING( location_type_id ) Jul 19 09:32:15 annette postgres[17029]: [2-5] ^I WHERE test_session_id = '5264' Jul 19 09:32:20 annette postgres[17092]: [2-1] LOG: duration: 13389.730 ms statement: Jul 19 09:32:20 annette postgres[17092]: [2-2] ^I SELECT location_type.name AS location_type_name, Jul 19 09:32:20 annette postgres[17092]: [2-3] ^I library, rail, col, side, row, location_id, Jul 19 09:32:20 annette postgres[17092]: [2-4] ^I hli_lsm, hli_panel, hli_row, hli_col Jul 19 09:32:20 annette postgres[17092]: [2-5] ^I FROM location JOIN location_type USING( location_type_id ) Jul 19 09:32:20 annette postgres[17092]: [2-6] ^IJOIN complex USING( library_id ) Jul 19 09:32:20 annette postgres[17092]: [2-7] ^ILEFT OUTER JOIN hli_location USING( location_id ) Jul 19 09:32:20 annette postgres[17092]: [2-8] ^ILEFT OUTER JOIN application USING( application_id ) Jul 19 09:32:20 annette postgres[17092]: [2-9] ^I WHERE complex.complex_id = '13' Jul 19 09:32:20 annette postgres[17092]: [2-10] ^I AND location_id NOT IN Jul 19 09:32:20 annette postgres[17092]: [2-11] ^I(SELECT location_id Jul 19 09:32:20 annette postgres[17092]: [2-12] ^I FROM location_lock) Jul 19 09:32:20 annette postgres[17092]: [2-13] ^I AND location_id NOT IN Jul 19 09:32:20 annette postgres[17092]: [2-14] ^I(SELECT location_id Jul 19 09:32:20 annette postgres[17092]: [2-15] ^I FROM cartridge) Jul 19 09:32:20 annette postgres[17092]: [2-16] ^IAND (location_type.name ~ 'cell' AND application.name ~ 'hli' AND hli_lsm = 1 AND col BETWEEN -2 AND 2) Jul 19 09:32:20 annette postgres[17092]: [2-17] ^I Jul 19 09:32:20 annette postgres[17092]: [2-18] ^I ORDER BY location.usage_count, location.rand LIMIT 1 Jul 19 09:32:20 annette postgres[17092]: [2-19] ^I FOR UPDATE OF location Cheers, Rob pg.log.bz2 Description: BZip2 compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
On Tue, 19 Jul 2005 12:54:22 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > Hmm, I hadn't thought about the possible impact of multiple concurrent > vacuums. Is the problem caused by that, or has performance already gone > into the tank by the time the cron-driven vacuums are taking long enough > to overlap? Don't know just yet. When I run the vacuums manually on a healthy system on 741, they take less than 30 seconds. I've stopped the cron vacuum and canceled all the outstanding vacuum processes, but the 803 is still struggling (1/2 hour later). I'll re-start the database, vacuum full analyze and restart the runs without the cron vacuum running. Cheers, Rob ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
Robert Creager <[EMAIL PROTECTED]> writes: > Alright. Restarted the 803 database. Cron based vacuum analyze is > running every 5 minutes. vacuum_cost_delay is 0. The problem showed > up after about 1/2 hour of running. I've got vacuum jobs stacked from > the last 35 minutes, with 2 vacuums running at the same time. CS is > around 73k. Hmm, I hadn't thought about the possible impact of multiple concurrent vacuums. Is the problem caused by that, or has performance already gone into the tank by the time the cron-driven vacuums are taking long enough to overlap? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
On Mon, 18 Jul 2005 13:52:53 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > > Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what > the active setting is. > Alright. Restarted the 803 database. Cron based vacuum analyze is running every 5 minutes. vacuum_cost_delay is 0. The problem showed up after about 1/2 hour of running. I've got vacuum jobs stacked from the last 35 minutes, with 2 vacuums running at the same time. CS is around 73k. What do I do now? I can bring the db back to normal and not run any cron based vacuum to see if it still happens, but I suspect nothing will happen without the vacuum. I'll leave it in it's current semi-catatonic state as long as possible in case there is something to look at? Cheers, Rob ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
In regards to http://archives.postgresql.org/pgsql-performance/2005-07/msg00261.php Tom Says: > ... as indeed it does according to Robert's recent reports. Still > awaiting the definitive test, but I'm starting to think this is another > case of the strange behavior Ian Westmacott exhibited. Ok. This morning at around 7:30am I started tests against a freshly VACUUM FULL ANALYZE 803 database with the vacuum delay on and cron running vacuum analyze every 5 minutes. Around 8:15 I was starting to receive hits of a few seconds of high CS hits, higher than the previous 7 hour run on 741. I changed the vacuum delay to 0 and HUP'ed the server (how can I see the value vacuum_cost_delay run time?). By 10:30, I had vacuum jobs backed up since 9:20 and the queries were over 75 seconds. I'm currently running on 741 as I need to get work done today ;-) I'll restart the 803 db, vacuum full analyze again and next opportunity (maybe tonight), start runs again with cron vacuum and a vacuum_cost_delay of 0, unless I should try something else? Cheers, Rob -- Robert Creager Advisory Software Engineer Phone 303.673.2365 Pager 888.912.4458 Fax 303.661.5379 StorageTek ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
Robert Creager <[EMAIL PROTECTED]> writes: > Around 8:15 I was starting to receive hits of a few seconds of high CS hits, > higher than the previous 7 hour run on 741. I changed the vacuum delay to 0 > and > HUP'ed the server (how can I see the value vacuum_cost_delay run > time?). Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what the active setting is. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
On Mon, 18 Jul 2005 13:52:53 -0400 Tom Lane <[EMAIL PROTECTED]> wrote: > Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what > the active setting is. Thanks. It does show 0 for 803 in a session that was up since I thought I had HUPed the server with the new value. This is leading me to believe that 803 doesn't do very well with VACUUM ANALYZE running often, at least in my particular application... I will provide a more definitive statement to that affect, hopefully tonight. Cheers, Rob ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
Robert Creager <[EMAIL PROTECTED]> writes: > I'm guessing this is the CS problem that reared it's head last year? The context swap problem was no worse in 8.0 than in prior versions, so that hardly seems like a good explanation. Have you tried reverting to the cron-based vacuuming method you used in 7.4? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org