[PERFORM] increasing query time after analyze

2014-02-05 Thread Katharina Koobs
Hi,
 
We have a PostgreSQL DB, version 9.3, on a SUSE Linux system.
We upgraded from PostgreSQL 8.4 to 9.3.
After importing the database, the run time of one SQL query is about 30
seconds.
After running ANALYZE on the DB, the run time of this query is about 45
minutes.
We can see that after the ANALYZE the indexes are no longer used.
 
Does anyone have an idea why ANALYZE causes this problem?
 
Thanks a lot for your help!
 
Katharina
 
 


Re: [PERFORM] increasing query time after analyze

2014-02-05 Thread Pavel Stehule
Hello


2014-02-05 Katharina Koobs katharina.ko...@uni-konstanz.de:

 Hi,



 We have a PostgreSQL DB, version 9.3, on a SUSE Linux system.

 We upgraded from PostgreSQL 8.4 to 9.3.

 After importing the database, the run time of one SQL query is about 30
 seconds.

 After running ANALYZE on the DB, the run time of this query is about 45
 minutes.

 We can see that after the ANALYZE the indexes are no longer used.



 Does anyone have an idea why ANALYZE causes this problem?


Yes, that is possible. Sometimes, for various reasons (an unusual data
distribution or a correlation between columns), the statistics estimates are
completely off, and stale statistics can happen to produce better plans than
fresh ones.

Please send the EXPLAIN ANALYZE output for both the fast and the slow query.
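
For example (a sketch; the query itself and the table/column names below are
placeholders for your own):

EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- capture this for each query

-- if one column's row estimates turn out badly off, a larger sample may help:
ALTER TABLE some_table ALTER COLUMN some_column SET STATISTICS 1000;
ANALYZE some_table;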

Regards

Pavel Stehule




 Thanks a lot for your help!



 Katharina







[PERFORM] Postgres Query Plan Live Lock

2014-02-05 Thread Jeff Janes
On Monday, February 3, 2014, Pweaver (Paul Weaver) pwea...@panjiva.com
wrote:

 We have been running into a (live lock?) issue on our production Postgres
 instance causing queries referencing a particular table to become extremely
 slow and our application to lock up.

  This tends to occur on a particular table that gets a lot of queries
 against it after a large number of deletes. When this happens, the
 following symptoms occur when queries referencing that table are run (even
 if we stop the deleting):


What do you mean by stop the deleting?  Are you pausing the delete but
without either committing or rolling back the transaction, but just holding
it open?  Are you stopping it cleanly, between transactions?

Also, how many queries are happening concurrently?  Perhaps you need a
connection pooler.
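
Something like this (assuming PostgreSQL 9.2 or later, where pg_stat_activity
has a state column) gives a rough count of concurrent backends while the
problem is happening:

SELECT state, count(*) FROM pg_stat_activity GROUP BY state;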

Is the CPU time user time or system time?  What kernel version do you have?


 SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to complete
 EXPLAIN SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to
 complete the explain query; the query plan looks reasonable


This sounds like the problem we heard quite a bit about recently, where
processes spend a lot of time fighting over the proclock while they try to
check the commit status of open transactions.  But I don't see how
deletes could trigger that behavior.  If the delete has not committed, the
tuples are still visible and the LIMIT 10 is quickly satisfied.  If the
delete has committed, the tuples quickly get hinted, and so the next query
along should be faster.

I also don't see why the explain would be slow.  A similar problem was
tracked down to digging through in-doubt tuples while trying to use an
index to find the true min or max while estimating the cost of a merge
join.  But I don't think a simple table query should lead to that, unless
table_name is a view.  And I don't see how deletes, rather than uncommitted
inserts, could trigger it either.


 max_connections  | 600  | configuration file


That is quite extreme.  If a temporary load spike (like from the deletes
and the hinting needed after them) slows down the select queries and you
start more and more of them, soon you could tip the system over into kernel
scheduler insanity with high system time.  Once in this mode, it will stay
there until the incoming stream of queries stops and the existing ones
clear out.  But, if that is what is occurring, I don't know why queries on
other tables would still be fast.

Cheers,

Jeff




Re: [PERFORM] increasing query time after analyze

2014-02-05 Thread Claudio Freire
On Wed, Feb 5, 2014 at 8:50 AM, Katharina Koobs
katharina.ko...@uni-konstanz.de wrote:
 Hi,



 We have a PostgreSQL DB, version 9.3, on a SUSE Linux system.

 We upgraded from PostgreSQL 8.4 to 9.3.

 After importing the database, the run time of one SQL query is about 30
 seconds.

 After running ANALYZE on the DB, the run time of this query is about 45
 minutes.

 We can see that after the ANALYZE the indexes are no longer used.



 Does anyone have an idea why ANALYZE causes this problem?


If you just restored a dump, you should do a VACUUM ANALYZE.
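
For example, database-wide after the restore (VERBOSE is optional; it just
reports per-table activity):

VACUUM (ANALYZE, VERBOSE);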




Re: [PERFORM] Postgres Query Plan Live Lock

2014-02-05 Thread bricklen
On Mon, Feb 3, 2014 at 1:35 PM, Pweaver (Paul Weaver)
pwea...@panjiva.com wrote:


 table_name stats:
 ~ 400,000,000 rows
 We are deleting tens of millions of rows in 100,000-row increments over a
 few days' time prior to and during this slowdown.


If you issue VACUUM or VACUUM ANALYZE after each DELETE, do the SELECTs
become more responsive?
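
Something along these lines (a sketch; id and some_condition are placeholders
for your actual batch key and predicate):

DELETE FROM table_name WHERE id IN (
    SELECT id FROM table_name WHERE some_condition LIMIT 100000);
VACUUM ANALYZE table_name;  -- reclaim dead tuples, set hint bits before the next batch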


Re: [PERFORM] Postgres Query Plan Live Lock

2014-02-05 Thread Pweaver (Paul Weaver)
On Tue, Feb 4, 2014 at 9:03 PM, Peter Geoghegan peter.geoghega...@gmail.com
 wrote:

 On Mon, Feb 3, 2014 at 1:35 PM, Pweaver (Paul Weaver)
 pwea...@panjiva.com wrote:
  We have been running into a (live lock?) issue on our production Postgres
  instance causing queries referencing a particular table to become
  extremely slow and our application to lock up.

 Livelock? Really? That would imply that the query would never finish.
 A livelock is morally equivalent to an undetected deadlock.

Livelock is a bad term for it.


  This tends to occur on a particular table that gets a lot of queries
  against it after a large number of deletes. When this happens, the
  following symptoms occur when queries referencing that table are run
  (even if we stop the deleting):
 
  SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to complete
  EXPLAIN SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to
  complete the explain query; the query plan looks reasonable
  EXPLAIN ANALYZE SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds
  to complete the explain analyze query; the query plan looks reasonable,
  and the timing stats say the query took sub-millisecond time to complete

 Why should explain analyze say that? You'd need to catch the problem
 as it is run.

  SELECT * FROM another_table LIMIT 10; -- takes sub-millisecond time
  EXPLAIN SELECT * FROM another_table LIMIT 10; -- takes sub-millisecond
  time, query plan looks reasonable
 
  This behavior only stops and the queries go back to taking
  sub-millisecond time if we take the application issuing the SELECTs
  offline and wait for the active queries to finish (or terminate them).
 
  There is not a particularly large load on the database machine at the
  time, nor is there a particularly large number of WAL logs being written
  (although there is a burst of WAL log writes immediately after the queue
  is cleared).

 Are you aware of hint bits?

 https://wiki.postgresql.org/wiki/Hint_Bits
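
 For example (a sketch, assuming table_name is the affected table), one
 throwaway full scan after the deletes commit sets the hint bits once, so
 later queries don't each pay that cost:

 SELECT count(*) FROM table_name;  -- visits (and hints) every remaining tuple
 -- or simply: VACUUM table_name;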

No, but why would this cause the EXPLAIN queries to be slow?



 --
 Regards,
 Peter Geoghegan




-- 
Thank You,
Pweaver (pwea...@panjiva.com)


Re: [PERFORM] Postgres Query Plan Live Lock

2014-02-05 Thread Pweaver (Paul Weaver)
On Wed, Feb 5, 2014 at 9:52 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Monday, February 3, 2014, Pweaver (Paul Weaver) pwea...@panjiva.com
 wrote:

 We have been running into a (live lock?) issue on our production Postgres
 instance causing queries referencing a particular table to become extremely
 slow and our application to lock up.

  This tends to occur on a particular table that gets a lot of queries
 against it after a large number of deletes. When this happens, the
 following symptoms occur when queries referencing that table are run (even
 if we stop the deleting):


 What do you mean by stop the deleting?  Are you pausing the delete but
 without either committing or rolling back the transaction, but just holding
 it open?  Are you stopping it cleanly, between transactions?


We are repeatedly running delete commands in their own transactions. We
stop issuing new deletes and let them finish cleanly.


 Also, how many queries are happening concurrently?  Perhaps you need a
 connection pooler.

Usually between 1 and 20; when it gets locked up, closer to 100-200.
We should add a connection pooler. Would the number of active queries on
the table be causing the issue?


 Is the CPU time user time or system time?  What kernel version do you have?

Real time. Kernel version 3.2.0-26.



 SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to complete
 EXPLAIN SELECT * FROM table_name LIMIT 10;  -- takes ~45 seconds to
 complete the explain query; the query plan looks reasonable


 This sounds like the problem we heard quite a bit about recently, where
 processes spend a lot of time fighting over the proclock while they try to
 check the commit status of open transactions.  But I don't see how
 deletes could trigger that behavior.  If the delete has not committed, the
 tuples are still visible and the LIMIT 10 is quickly satisfied.  If the
 delete has committed, the tuples quickly get hinted, and so the next query
 along should be faster.

 I also don't see why the explain would be slow.  A similar problem was
 tracked down to digging through in-doubt tuples while trying to use an
 index to find the true min or max while estimating the cost of a merge
 join.  But I don't think a simple table query should lead to that, unless
 table_name is a view.  And I don't see how deletes, rather than uncommitted
 inserts, could trigger it either.


  max_connections  | 600  | configuration file


 That is quite extreme.  If a temporary load spike (like from the deletes
 and the hinting needed after them) slows down the select queries and you
 start more and more of them, soon you could tip the system over into kernel
 scheduler insanity with high system time.  Once in this mode, it will stay
 there until the incoming stream of queries stops and the existing ones
 clear out.  But, if that is what is occurring, I don't know why queries on
 other tables would still be fast.

We probably want a connection pooler anyway, but in this particular case,
the load average is fairly low on the machine running Postgres.


 Cheers,

 Jeff




-- 
Thank You,
Pweaver (pwea...@panjiva.com)


Re: [PERFORM] Postgres Query Plan Live Lock

2014-02-05 Thread Claudio Freire
On Wed, Feb 5, 2014 at 4:47 PM, Pweaver (Paul Weaver)
pwea...@panjiva.com wrote:
 That is quite extreme.  If a temporary load spike (like from the deletes
 and the hinting needed after them) slows down the select queries and you
 start more and more of them, soon you could tip the system over into kernel
 scheduler insanity with high system time.  Once in this mode, it will stay
 there until the incoming stream of queries stops and the existing ones clear
 out.  But, if that is what is occurring, I don't know why queries on other
 tables would still be fast.

 We probably want a connection pooler anyway, but in this particular case,
 the load average is fairly low on the machine running Postgres.


Indeed, if a lack of connection pooling were the cause, I'd expect a huge
load average (around 100).

Can you post the output of vmstat 6 10 and iostat -x -m -d 6 10
while the server is overloaded? (Try to run them at the same time so the
results can be correlated.)

Also, some details on the hardware wouldn't hurt, like amount of RAM,
number of processors, kind of processor, whether it's a virtual
machine or a bare metal one, number of disks and disk configuration,
etc...

