Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-19 Thread Robert Creager
On Tue, 19 Jul 2005 12:54:22 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> Robert Creager <[EMAIL PROTECTED]> writes:
> 
> Hmm, I hadn't thought about the possible impact of multiple concurrent
> vacuums.  Is the problem caused by that, or has performance already gone
> into the tank by the time the cron-driven vacuums are taking long enough
> to overlap?

All statements over 5 seconds are logged.  Vacuums are running on the 5 minute 
mark.

Log file shows the first query starts going bad a 9:32:15 (7 seconds), although 
the second query start before the first .  The first vacuum statement logged 
shows 1148 seconds completing at 9:54:09, so starting at 9:35.  Looks like the 
vacuum is an innocent bystander of the problem.

The first problem queries are below.  Additionally, I've attached 5 minutes 
(bzipped) of logs starting at the first event below.

Jul 19 09:32:15 annette postgres[17029]: [2-1] LOG:  duration: 7146.168 ms  
statement:
Jul 19 09:32:15 annette postgres[17029]: [2-2] ^I  SELECT location_id, 
location_type.name AS type, library, rail
Jul 19 09:32:15 annette postgres[17029]: [2-3] ^I  FROM location_lock JOIN 
location USING( location_id )
Jul 19 09:32:15 annette postgres[17029]: [2-4] ^I JOIN 
location_type USING( location_type_id )
Jul 19 09:32:15 annette postgres[17029]: [2-5] ^I  WHERE test_session_id = 
'5264'
Jul 19 09:32:20 annette postgres[17092]: [2-1] LOG:  duration: 13389.730 ms  
statement:
Jul 19 09:32:20 annette postgres[17092]: [2-2] ^I  SELECT 
location_type.name AS location_type_name,
Jul 19 09:32:20 annette postgres[17092]: [2-3] ^I library, rail, 
col, side, row, location_id,
Jul 19 09:32:20 annette postgres[17092]: [2-4] ^I hli_lsm, 
hli_panel, hli_row, hli_col
Jul 19 09:32:20 annette postgres[17092]: [2-5] ^I  FROM location JOIN 
location_type USING( location_type_id )
Jul 19 09:32:20 annette postgres[17092]: [2-6] ^IJOIN 
complex USING( library_id )
Jul 19 09:32:20 annette postgres[17092]: [2-7] ^ILEFT OUTER 
JOIN hli_location USING( location_id )
Jul 19 09:32:20 annette postgres[17092]: [2-8] ^ILEFT OUTER 
JOIN application USING( application_id )
Jul 19 09:32:20 annette postgres[17092]: [2-9] ^I  WHERE  
complex.complex_id = '13'
Jul 19 09:32:20 annette postgres[17092]: [2-10] ^I AND location_id NOT 
IN
Jul 19 09:32:20 annette postgres[17092]: [2-11] ^I(SELECT 
location_id
Jul 19 09:32:20 annette postgres[17092]: [2-12] ^I FROM 
location_lock)
Jul 19 09:32:20 annette postgres[17092]: [2-13] ^I AND location_id NOT 
IN
Jul 19 09:32:20 annette postgres[17092]: [2-14] ^I(SELECT 
location_id
Jul 19 09:32:20 annette postgres[17092]: [2-15] ^I FROM 
cartridge)
Jul 19 09:32:20 annette postgres[17092]: [2-16] ^IAND (location_type.name ~ 
'cell' AND application.name ~ 'hli'  AND hli_lsm = 1 AND col BETWEEN -2 AND 2)
Jul 19 09:32:20 annette postgres[17092]: [2-17] ^I
Jul 19 09:32:20 annette postgres[17092]: [2-18] ^I ORDER BY 
location.usage_count, location.rand LIMIT 1
Jul 19 09:32:20 annette postgres[17092]: [2-19] ^I FOR UPDATE OF 
location

Cheers,
Rob



pg.log.bz2
Description: BZip2 compressed data

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-19 Thread Robert Creager
On Tue, 19 Jul 2005 12:54:22 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> Hmm, I hadn't thought about the possible impact of multiple concurrent
> vacuums.  Is the problem caused by that, or has performance already gone
> into the tank by the time the cron-driven vacuums are taking long enough
> to overlap?

Don't know just yet.  When I run the vacuums manually on a healthy system on
741, they take less than 30 seconds.  I've stopped the cron vacuum and canceled
all the outstanding vacuum processes, but the 803 is still struggling (1/2 hour
later).

I'll re-start the database, vacuum full analyze and restart the runs without the
cron vacuum running.

Cheers,
Rob

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-19 Thread Tom Lane
Robert Creager <[EMAIL PROTECTED]> writes:
> Alright.  Restarted the 803 database.  Cron based vacuum analyze is
> running every 5 minutes.  vacuum_cost_delay is 0.  The problem showed
> up after about 1/2 hour of running.  I've got vacuum jobs stacked from
> the last 35 minutes, with 2 vacuums running at the same time.  CS is
> around 73k.

Hmm, I hadn't thought about the possible impact of multiple concurrent
vacuums.  Is the problem caused by that, or has performance already gone
into the tank by the time the cron-driven vacuums are taking long enough
to overlap?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-19 Thread Robert Creager
On Mon, 18 Jul 2005 13:52:53 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> 
> Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what
> the active setting is.
> 

Alright.  Restarted the 803 database.  Cron based vacuum analyze is running
every 5 minutes.  vacuum_cost_delay is 0.  The problem showed up after about 1/2
hour of running.  I've got vacuum jobs stacked from the last 35 minutes, with 2
vacuums running at the same time.  CS is around 73k.

What do I do now?  I can bring the db back to normal and not run any cron based
vacuum to see if it still happens, but I suspect nothing will happen without the
vacuum.  I'll leave it in it's current semi-catatonic state as long as possible
in case there is something to look at?

Cheers,
Rob

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-18 Thread Robert Creager

In regards to
http://archives.postgresql.org/pgsql-performance/2005-07/msg00261.php

Tom Says:
> ... as indeed it does according to Robert's recent reports.  Still
> awaiting the definitive test, but I'm starting to think this is another
> case of the strange behavior Ian Westmacott exhibited.

Ok.  This morning at around 7:30am I started tests against a freshly VACUUM FULL
ANALYZE 803 database with the vacuum delay on and cron running vacuum analyze
every 5 minutes.

Around 8:15 I was starting to receive hits of a few seconds of high CS hits,
higher than the previous 7 hour run on 741.  I changed the vacuum delay to 0 and
HUP'ed the server (how can I see the value vacuum_cost_delay run time?).  By
10:30, I had vacuum jobs backed up since 9:20 and the queries were over 75
seconds.

I'm currently running on 741 as I need to get work done today ;-)  I'll restart
the 803 db, vacuum full analyze again and next opportunity (maybe tonight),
start runs again with cron vacuum and a vacuum_cost_delay of 0, unless I should
try something else?

Cheers,
Rob

-- 

Robert Creager
Advisory Software Engineer
Phone 303.673.2365
Pager 888.912.4458
Fax   303.661.5379
StorageTek

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-18 Thread Tom Lane
Robert Creager <[EMAIL PROTECTED]> writes:
> Around 8:15 I was starting to receive hits of a few seconds of high CS hits,
> higher than the previous 7 hour run on 741.  I changed the vacuum delay to 0 
> and
> HUP'ed the server (how can I see the value vacuum_cost_delay run
> time?).

Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what
the active setting is.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-18 Thread Robert Creager
On Mon, 18 Jul 2005 13:52:53 -0400
Tom Lane <[EMAIL PROTECTED]> wrote:

> Start a fresh psql session and "SHOW vacuum_cost_delay" to verify what
> the active setting is.

Thanks.  It does show 0 for 803 in a session that was up since I thought I had
HUPed the server with the new value.

This is leading me to believe that 803 doesn't do very well with VACUUM ANALYZE
running often, at least in my particular application...  I will provide a more
definitive statement to that affect, hopefully tonight.

Cheers,
Rob

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?

2005-07-17 Thread Tom Lane
Robert Creager <[EMAIL PROTECTED]> writes:
> I'm guessing this is the CS problem that reared it's head last year?

The context swap problem was no worse in 8.0 than in prior versions,
so that hardly seems like a good explanation.  Have you tried reverting
to the cron-based vacuuming method you used in 7.4?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org