[PERFORM] LIMIT confuses the planner (again)

2009-09-28 Thread Kouber Saparev
Hello, I am using PostgreSQL 8.3.7 and I am experiencing an issue similar to the one I've already described some time ago: http://archives.postgresql.org/pgsql-performance/2009-02/msg00261.php Again, adding a LIMIT clause to a query, which is normally executing very fast thanks to an index,

Re: [PERFORM] LIMIT confuses the planner (again)

2009-09-28 Thread Robert Haas
On Mon, Sep 28, 2009 at 4:43 AM, Kouber Saparev kou...@saparev.com wrote: Hello, I am using PostgreSQL 8.3.7 and I am experiencing an issue similar to the one I've already described some time ago: http://archives.postgresql.org/pgsql-performance/2009-02/msg00261.php Again, adding a LIMIT

Re: [PERFORM] LIMIT confuses the planner

2009-03-22 Thread marcin mank
So the bottom line here is just that the estimated n_distinct is too low.  We've seen before that the equation we use tends to do that more often than not.  I doubt that consistently erring on the high side would be better though :-(.  Estimating n_distinct from a limited sample of the

Re: [PERFORM] LIMIT confuses the planner

2009-03-22 Thread marcin mank
I hit an interestinhg paper on n_distinct calculation: http://www.pittsburgh.intel-research.net/people/gibbons/papers/distinct-values-chapter.pdf the PCSA algorithm described there requires O(1) calculation per value. Page 22 describes what to do with updates streams. This I think

Re: [PERFORM] LIMIT confuses the planner

2009-03-22 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes: I hit an interestinhg paper on n_distinct calculation: http://www.pittsburgh.intel-research.net/people/gibbons/papers/distinct-values-chapter.pdf I don't think we're quite ready to make ANALYZE read every row of a table in order to estimate n_distinct.

Re: [PERFORM] LIMIT confuses the planner

2009-02-24 Thread Kouber Saparev
Tom Lane wrote: Kouber Saparev kou...@saparev.com writes: Now the planner believes there're 910 rows, which is a bit closer to the real data: swing=# select avg(length) from (select username, count(*) as length from login_attempt group by username) as freq; avg

[PERFORM] LIMIT confuses the planner

2009-02-23 Thread Kouber Saparev
Hello, I'm experiencing a strange issue. I have a table with around 11 million records (11471762 to be exact), storing login attempts to a web site. Thanks to the index I have created on username, looking into that table by username is very fast: db=# EXPLAIN ANALYZE SELECT * FROM

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Richard Huxton
Kouber Saparev wrote: db=# EXPLAIN ANALYZE SELECT * FROM login_attempt WHERE username='kouber' ORDER BY login_attempt_sid DESC; QUERY PLAN

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Robert Haas
On Mon, Feb 23, 2009 at 7:26 AM, Kouber Saparev kou...@saparev.com wrote: Now, recently I have altered some of the default parameters in order to get as much as possible out of the hardware - 12 GB of RAM, 8 processors. So, I guess I have done something wrong, thus the planner is taking that

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: If you left seq_page_cost (which isn't mentioned here) at the default value but reduced random_page_cost to 0.1, then you have random_page_cost seq_page_cost. That's probably Bad. ... well, it's certainly going to push the planner to believe

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Kouber Saparev
Richard Huxton wrote: Since it's expecting 7914 rows for kouber it thinks it will find the 20 rows you want fairly quickly by just looking backward through the login_attempt_pkey index. Try increasing the stats on the username column. ALTER TABLE login_attempt ALTER COLUMN username SET

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Tom Lane
Kouber Saparev kou...@saparev.com writes: Now the planner believes there're 910 rows, which is a bit closer to the real data: swing=# select avg(length) from (select username, count(*) as length from login_attempt group by username) as freq; avg --

Re: [PERFORM] LIMIT confuses the planner

2009-02-23 Thread Kouber Saparev
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: If you left seq_page_cost (which isn't mentioned here) at the default value but reduced random_page_cost to 0.1, then you have random_page_cost seq_page_cost. That's probably Bad. ... well, it's certainly going to push the planner to