Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Tobias Brox
On 15 September 2010 12:05, Tobias Brox tobi...@gmail.com wrote:
 Recently we've frequently encountered issues where some simple selects
 (meaning, selects doing an index lookup and fetching one row) have
 become stuck for several minutes.  Apparently all requests on one
 exact table gets stuck, all requests not related to said table are
 going through without any problems.

Now I've set up all kind of logging regarding locks, so it seems like
we're having issues that aren't lock-related.  I just did a bit of
research into one situation today.

All while having this problem, there was one heavy query running in
parallell ... not sure if that's relevant.

Then comes one query that requires a seq scan on the problem table
(that won't happen again - I just added a new index).  Four seconds
later comes another query requiring a simple index lookup.  Still more
queries comes in, most of them simple index lookups, but on different
indexes.  After one minute there are 25 queries in the
pg_stat_activity view towards this table.  It's not a particularly
huge table.  Moments later all 25 queries have been executed.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Kevin Grittner
Tobias Brox tobi...@gmail.com wrote:
 
 All while having this problem, there was one heavy query running
 in parallell ... not sure if that's relevant.
 
Have you turned on checkpoint logging?  You might want to see if
these are happening at some particular point in the checkpoint
processing.  If so, look through the archives for posts from Greg
Smith on how to tune that -- he's worked out a nice methodology to
iteratively improve your configuration in this regard.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Tobias Brox
On 23 September 2010 22:55, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Have you turned on checkpoint logging?

Yes ... it seems so:

13:19:13.840 - LOG: checkpoint complete: wrote 3849 buffers (0.2%); 0
transaction log file(s) added, 0 removed, 5 recycled; write=269.551 s,
sync=0.103 s, total=269.953 s
13:19:13.841 - LOG:  checkpoint starting: xlog
13:19:33 - the seq scan query towards the affected table started
13:20:31.454 - one of the index lookup queries towards the affected
table was finished
13:20:43.176 - LOG:  checkpoint complete: wrote 108199 buffers (6.9%);
0 transaction log file(s) added, 0 removed, 16 recycled; write=11.521
s, sync=77.533 s, total=89.335 s

 You might want to see if
 these are happening at some particular point in the checkpoint
 processing.  If so, look through the archives for posts from Greg
 Smith on how to tune that -- he's worked out a nice methodology to
 iteratively improve your configuration in this regard.

Thank you, I will ... hmm ... I found this blog post:

http://blog.2ndquadrant.com/en/2010/01/measuring-postgresql-checkpoin.html

Of course I'm doing it my own way:

select *,now() as snapshot into tmp_pg_stat_bgwriter from pg_stat_bgwriter ;

create view tmp_delta_pg_stat_bgwriter as
 select a.checkpoints_timed-b.checkpoints_timed as
checkpoints_timed,a.checkpoints_req-b.checkpoints_req as
checkpoints_req,a.buffers_checkpoint-b.buffers_checkpoint as
buffers_checkpoint,a.buffers_clean-b.buffers_clean as
buffers_clean,a.maxwritten_clean-b.maxwritten_clean as
maxwritten_clean,a.buffers_backend-b.buffers_backend as
buffers_backend,a.buffers_alloc-b.buffers_alloc as buffers_alloc,
now()-b.snapshot as interval
 from pg_stat_bgwriter a ,
 (select * from tmp_pg_stat_bgwriter order by snapshot desc limit 1) as b;

Checkpoint timeout is set to 5 minutes.  Right now we're having
relatively low activity.  I'm not sure how to read the stats below,
but they look OK to me:

select * from tmp_delta_pg_stat_bgwriter ;
 checkpoints_timed | checkpoints_req | buffers_checkpoint |
buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |
 interval
---+-++---+--+-+---+-
 3 |   0 |   8277 |
15 |0 | 185 | 18691 |
00:12:02.988842
(1 row)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-23 Thread Kevin Grittner
Tobias Brox tobi...@gmail.com wrote:
 
 13:19:13.840 - LOG: checkpoint complete
 
 13:19:13.841 - LOG:  checkpoint starting
 
 13:20:43.176 - LOG:  checkpoint complete
 
There wasn't a lot of time between the completion of one checkpoint
and the start of the next.  And the two checkpoints finished a
minute and a half apart.  Perhaps you need to boost your
checkpoint_segments setting?  What is it now?
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] locking issue on simple selects?

2010-09-15 Thread Tobias Brox
We have a production database server ... it's quite busy but usually
working completely fine, simple queries taking a fraction of a
millisecond to run.

Recently we've frequently encountered issues where some simple selects
(meaning, selects doing an index lookup and fetching one row) have
become stuck for several minutes.  Apparently all requests on one
exact table gets stuck, all requests not related to said table are
going through without any problems.  According to the pg_stat_activity
view, all queries getting stuck was read-queries (selects), no updates
or anything like that (some of the transactions were doing updates
and/or inserts though).

The obvious thought seems to be that this is a locking issue ... but
it doesn't seem so.  For one thing, AFAIK locking shouldn't affect
selects, only updates?  I've also looked through tons of logs without
finding any obvious locking issues.  In one of the instances, I could
find that there were some open transactions doing updates on one row
in the table and then later becoming stuck (as all other transactions)
when doing a select on another row in the said table.

My second thought was that the database is on the edge of being
overloaded and that the memory situation is also just on the edge ...
important indexes that used to be in memory now has to be accessed
from the disk.  Still, it doesn't make sense, we're not seeing any
serious impact on the CPU iowait status, and it seems improbable that
it should take minutes to load an index?

There aren't any long-lasting transactions going on when the jam
occurs.  I haven't checked much up, usually the jam seems to resolve
itself pretty instantly, but I think that at some point it took half a
minute from the first query was finished until the pg_stat_activity
view got back to normal (meaning typically 0-5 simultaneously
processed queries).

FWIW, we're running pg 8.3.11, transaction isolation level
serializable.  The machine is quad-core hyperthreaded (16 CPUs visible
to the OS), a SAN is used for storage, and different RAIDs are used
for the root partition, pg data and pg wals.

Any ideas?  I'm aware that some configuration (i.e. checkpoint
interval etc) may cause significant delay on write-queries ... but
this is only read-queries.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Tom Lane
Tobias Brox tobi...@gmail.com writes:
 Recently we've frequently encountered issues where some simple selects
 (meaning, selects doing an index lookup and fetching one row) have
 become stuck for several minutes.  Apparently all requests on one
 exact table gets stuck, all requests not related to said table are
 going through without any problems.  According to the pg_stat_activity
 view, all queries getting stuck was read-queries (selects), no updates
 or anything like that (some of the transactions were doing updates
 and/or inserts though).

 The obvious thought seems to be that this is a locking issue ... but
 it doesn't seem so.  For one thing, AFAIK locking shouldn't affect
 selects, only updates?

An exclusive lock will block selects too.  Have you looked into pg_locks
for ungranted lock requests?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Tobias Brox
On 15 September 2010 15:39, Tom Lane t...@sss.pgh.pa.us wrote:
 An exclusive lock will block selects too.  Have you looked into pg_locks
 for ungranted lock requests?

Well - I thought so, we have a logging script that logs the content of
the pg_locks table, it didn't log anything interesting but it may be a
problem with the script itself.  It does an inner join on
pg_locks.relation = pg_class.oid but when I check now this join seems
to remove most of the rows in the pg_locks table.  Does it make sense
at all to join pg_class with pg_locks?  I will ask the sysadm to
change to an outer join as for now.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Tom Lane
Tobias Brox tobi...@gmail.com writes:
 Well - I thought so, we have a logging script that logs the content of
 the pg_locks table, it didn't log anything interesting but it may be a
 problem with the script itself.  It does an inner join on
 pg_locks.relation = pg_class.oid but when I check now this join seems
 to remove most of the rows in the pg_locks table.  Does it make sense
 at all to join pg_class with pg_locks?  I will ask the sysadm to
 change to an outer join as for now.

There are lots of locks that are not tied to a specific table, so an
inner join is definitely bad.  You could use an outer join to annotate
rows that do correspond to table locks, though.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Greg Smith

Tobias Brox wrote:

I thought so, we have a logging script that logs the content of
the pg_locks table, it didn't log anything interesting but it may be a
problem with the script itself.  It does an inner join on
pg_locks.relation = pg_class.oid but when I check now this join seems
to remove most of the rows in the pg_locks table.


Some of the waits you might be running into will be things waiting on 
another transaction holding a lock to finish, which are probably wiped 
out by this approach.  There are some useful examples of lock views on 
the wiki:


http://wiki.postgresql.org/wiki/Lock_Monitoring
http://wiki.postgresql.org/wiki/Lock_dependency_information
http://wiki.postgresql.org/wiki/Find_Locks

And the idea you have of coverting the pg_class one to an outer join 
will help.


The other thing you should try is toggling on log_lock_waits and 
possibly reducing deadlock_timeout.  This will put a lot of the 
information you're trying to collect right into the logs.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, PostgreSQL 9.0 High PerformancePre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Brad Nicholson

 On 10-09-15 03:07 PM, Tobias Brox wrote:

On 15 September 2010 15:39, Tom Lanet...@sss.pgh.pa.us  wrote:

An exclusive lock will block selects too.  Have you looked into pg_locks
for ungranted lock requests?

Well - I thought so, we have a logging script that logs the content of
the pg_locks table, it didn't log anything interesting but it may be a
problem with the script itself.  It does an inner join on
pg_locks.relation = pg_class.oid but when I check now this join seems
to remove most of the rows in the pg_locks table.  Does it make sense
at all to join pg_class with pg_locks?  I will ask the sysadm to
change to an outer join as for now.



You can also enable log_lock_waits and the lock waits will appear in 
your Postgres logs.


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Tobias Brox
On 15 September 2010 21:28, Greg Smith g...@2ndquadrant.com wrote:
 There are some useful examples of lock views on the wiki:

 http://wiki.postgresql.org/wiki/Lock_Monitoring
 http://wiki.postgresql.org/wiki/Lock_dependency_information
 http://wiki.postgresql.org/wiki/Find_Locks

Thanks.  I think those pages probably should be merged ... hmm ... if
I manage to solve my locking issues I should probably try and
contribute to the wiki.

Reading the wiki pages, for me it boils down to three things:

1) the current query we're logging seems good enough except that we
should do an outer join except for inner join towards pg_class, so
I've asked our sysadm to fix it.

2) the middle query on http://wiki.postgresql.org/wiki/Lock_Monitoring
seems very useful, and I've asked our sysadm to set up logging of this
one as well.

3) That log_lock_waits config option that you and Brad points to seems
very useful, so I've asked our sysadm to enable it.

I also discovered that there is an attribute pg_stat_activity.waiting
- I suppose it is 't' if a query is waiting for a lock?  It seems
quite useful ...

 reducing deadlock_timeout.

It's set to one second, and some of the jams we have been experiencing
has lasted for several minutes.  I also think it should say in the pg
log if there is a deadlock situation?  I grepped for deadlock in the
logs without finding anything.

Well, we'll improve the logging, and wait for the next jam to occur
... and I'll make sure to post an update if/when I figure out
something.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] locking issue on simple selects?

2010-09-15 Thread Greg Smith

Tobias Brox wrote:

I think those pages probably should be merged ... hmm ... if
I manage to solve my locking issues I should probably try and
contribute to the wiki.
  


Certainly the 3rd one could be merged with one of the other two, and 
maybe all merged into one.  I haven't cleaned up that whole area in a 
whole, it's due for a round of it soon. You're welcome to take a shot at 
it.  We can always revert any mistakes made.



reducing deadlock_timeout.



It's set to one second, and some of the jams we have been experiencing
has lasted for several minutes.  I also think it should say in the pg
log if there is a deadlock situation?  


deadlock_timeout is how long a process trying to acquire a lock waits 
before it a) looks for a deadlock, and b) prints a report that it's 
waiting into the logs when log_lock_waits is on.  So if you're looking 
for slow lock acquisition that's in the sub-second range, it can be 
useful to reduce regardless of whether deadlock ever happens.  That 
doesn't sound like your situation though.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, PostgreSQL 9.0 High PerformancePre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book