Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-12 Thread Merlin Moncure
KC wrote:
 
 So I guess it all comes back to the basic question:
 
 For the query select distinct on (PlayerID) * from Player a where
 PlayerID='0' order by PlayerId Desc, AtDate Desc;
 can the optimizer recognise the fact the query is selecting by the
primary
 key (PlayerID,AtDate), so it can skip the remaining rows for that
 PlayerID,
 as if LIMIT 1 is implied?
 
 Best regards, KC.

Hi KC, have you tried:
select * from player where playerid = '0' and atdate > 99
order by playerid desc, atdate desc limit 1;

??
Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-12 Thread K C Lau

Dear Merlin and all,

That direct SQL returns in 0 ms. The problem only appears when a view is used.

What we've done to work around this problem is to modify the table to add a 
field DataStatus, which is set to 1 for the latest record for each player 
and reset to 0 when it is superseded.


A partial index is then created as:
CREATE INDEX IDX_CurPlayer on Player (PlayerID) where DataStatus = 1;

The VCurPlayer view is changed to:
CREATE or REPLACE VIEW VCurPlayer as select * from Player where DataStatus = 1;
and it now returns in 0 ms.

This is not the best solution, but until (if ever) the original problem is 
fixed, we have not found an alternative workaround.
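For anyone wanting to try the same workaround, here is a minimal sketch (the
trigger and function names are made up, and it assumes every newly inserted
row carries the latest AtDate for its player) of how DataStatus could be
maintained automatically:

create or replace function player_set_datastatus() returns trigger as $$
begin
    -- demote whatever was the current record for this player
    update Player set DataStatus = 0
     where PlayerID = NEW.PlayerID and DataStatus = 1;
    -- the incoming row becomes the current one
    NEW.DataStatus := 1;
    return NEW;
end;
$$ language plpgsql;

create trigger trg_player_datastatus before insert on Player
    for each row execute procedure player_set_datastatus();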


The good news is that even with the additional overhead of maintaining an 
extra index and the problem of vacuuming, pg 8.0.3 still performs 
significantly faster on Windows than MS Sql 2000 in our OLTP application 
testing so far.


Thanks to all for your help.

Best regards,
KC.

At 20:14 05/10/12, you wrote:

KC wrote:

 So I guess it all comes back to the basic question:

 For the query select distinct on (PlayerID) * from Player a where
 PlayerID='0' order by PlayerId Desc, AtDate Desc;
 can the optimizer recognise the fact the query is selecting by the
primary
 key (PlayerID,AtDate), so it can skip the remaining rows for that
 PlayerID,
 as if LIMIT 1 is implied?

 Best regards, KC.

Hi KC, have you tried:
select * from player where playerid = '0' and atdate > 99
order by playerid desc, atdate desc limit 1;

??
Merlin



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-05 Thread Merlin Moncure
[to K C:] sorry, was out on vacation all last week.  I was visualizing
the problem incorrectly anyway...

Jim wrote:
 That function is not immutable, it should be defined as stable.

That is 100% correct; however, now and then I declare stable functions as
immutable because in some cases the planner treats them differently, with no
harmful side effects...this is a hack of course...see my earlier suggestion
to try both immutable and stable versions.  I can give a pretty good
example of when this can make a big difference.
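As an illustration only (using the player_max_atdate function from earlier in
the thread; remember that the immutable declaration is a lie, since the
function reads the player table, so the folded value can be stale):

-- really STABLE (it reads player), declared IMMUTABLE as the planner hack
create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
  select atdate from player where playerid = $1
   order by playerid desc, atdate desc limit 1;
$$ language sql immutable;

-- with IMMUTABLE, a call with a constant argument can be folded to a plain
-- constant while the query is planned, so the planner effectively sees
--   where playerid = '0' and atdate = 'some literal'
-- and can use both columns of the primary key index:
select * from player
 where playerid = '0' and atdate = player_max_atdate('0');

-- declared STABLE, the call survives into the plan and is evaluated at run
-- time instead; that is the honest declaration, but the plan can differ,
-- which is why it is worth timing both versions.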
 
 PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and
 I'm not sure how much those are pre-compiled, though they are
 syntax-checked at creation). Do you get the same result time when you
 run it a second time? What time do you get from running just the
 function versus the SQL in the function?

plpgsql functions are at least partially compiled (sql functions afaik
are not), in that an internal state is generated following the first
execution.  This is the cause of all those infernal 'invalid table oid'
errors.
 
 Also, remember that every layer you add to the cake means more work
for
 the database. If speed is that highly critical you'll probably want to
 not wrap things in functions, and possibly not use views either.

The overhead of the function/view is totally inconsequential next to the
planner choosing a suboptimal plan.  The purpose of the function is to
coerce the planner into choosing the correct plan.

 Also, keep in mind that getting below 1ms doesn't automatically mean
 you'll be able to scale to 1000TPS. Things will definitely change when
 you load the system down, so if performance is that critical you
should
 start testing with the system under load if you're not already.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-04 Thread Jim C. Nasby
On Fri, Sep 23, 2005 at 08:17:03PM +0800, K C Lau wrote:
 esdt= create or replace function player_max_atdate (varchar(32)) returns 
 varchar(32) as $$
 esdt$  select distinct on (PlayerID) AtDate from player where PlayerID= $1 
 order by PlayerID desc, AtDate desc limit 1;
 esdt$ $$ language sql immutable;
 CREATE FUNCTION

That function is not immutable, it should be defined as stable.

 esdt= create or replace view VCurPlayer3 as select * from Player where 
 AtDate = player_max_atdate(PlayerID);
 CREATE VIEW
 esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
 PlayerID='0';
 
  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
 width=23) (actual time=76.660..76.664 rows=1 loops=1)
Index Cond: ((playerid)::text = '0'::text)
Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
  Total runtime: 76.716 ms
 
 Why wouldn't the function get the row as quickly as the direct sql does?

PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and
I'm not sure how much those are pre-compiled, though they are
syntax-checked at creation). Do you get the same result time when you
run it a second time? What time do you get from running just the
function versus the SQL in the function?
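For example, something along these lines in psql (using the function and the
SQL from earlier in the thread) would show both timings:

\timing
-- the function by itself; run it twice to see any caching effect
select player_max_atdate('0');
select player_max_atdate('0');
-- the SQL that the function wraps, run directly
select distinct on (PlayerID) AtDate from player where PlayerID = '0'
 order by PlayerID desc, AtDate desc limit 1;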

Also, remember that every layer you add to the cake means more work for
the database. If speed is that highly critical you'll probably want to
not wrap things in functions, and possibly not use views either.

Also, keep in mind that getting below 1ms doesn't automatically mean
you'll be able to scale to 1000TPS. Things will definitely change when
you load the system down, so if performance is that critical you should
start testing with the system under load if you're not already.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-04 Thread Jim C. Nasby
On Fri, Sep 23, 2005 at 04:53:55PM +0800, K C Lau wrote:
 Thank you all for your suggestions. I've tried, with some variations too, but 
 still no success. The times given are the best of a few repeated tries on 
 an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.
 
 For reference, only the following gets the record quickly:
 
 esdt= explain analyze select PlayerID,AtDate from Player a
  where PlayerID='0' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 
  1);
 
  Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.057 rows=1 loops=1)
    Index Cond: (((playerid)::text = '0'::text) AND ((atdate)::text = ($0)::text))
    InitPlan
      ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.027..0.028 rows=1 loops=1)
            ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                  Index Cond: ((playerid)::text = '0'::text)
  Total runtime: 0.132 ms

If you're doing that, you should try something like the following:
decibel=# explain analyze select * from t where ctid=(select ctid from rrs order by rrs_id desc limit 1);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Tid Scan on t  (cost=0.44..4.45 rows=1 width=42) (actual time=0.750..0.754 rows=1 loops=1)
   Filter: (ctid = $0)
   InitPlan
     ->  Limit  (cost=0.00..0.44 rows=1 width=10) (actual time=0.548..0.549 rows=1 loops=1)
           ->  Index Scan Backward using rrs_rrs__rrs_id on rrs  (cost=0.00..3.08 rows=7 width=10) (actual time=0.541..0.541 rows=1 loops=1)
 Total runtime: 1.061 ms
(6 rows)

decibel=# select count(*) from t;
 count
--------
 458752

Note that that's on my nice slow laptop to boot (the count took like 10
seconds).

Just remember that ctid *is not safe outside of a transaction*!! So you can't
do something like

SELECT ctid FROM ...
store that in some variable...
SELECT * FROM table WHERE ctid = variable
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-10-04 Thread Jim C. Nasby
On Tue, Oct 04, 2005 at 04:15:41PM -0500, Jim C. Nasby wrote:
 Index Cond: ((playerid)::text = '0'::text)

Also, why is playerid a text field? Comparing ints will certainly be
faster...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-26 Thread K C Lau

At 20:17 05/09/23, K C Lau wrote:

At 19:15 05/09/23, Simon Riggs wrote:

select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='0' order by PlayerId, AtDate Desc;

Does that work for you?

Best Regards, Simon Riggs


esdt= explain analyze select distinct on (PlayerID) PlayerID,AtDate from 
Player a where PlayerID='0' order by PlayerId Desc, AtDate Desc;
 Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.027..8.438 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..4.950 rows=1743 loops=1)
         Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 8.499 ms

That is the fastest of all queries looping the 1743 rows.
I do get the desired result by adding LIMIT 1:

esdt= explain analyze select distinct on (PlayerID) PlayerID,AtDate from 
Player a where PlayerID='0' order by PlayerId Desc, AtDate Desc LIMIT 1;


 Limit  (cost=0.00..663.72 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
   ->  Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.028..0.028 rows=1 loops=1)
         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..0.022 rows=1 loops=1)
               Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.094 ms

However, when I use that within a function in a view, it is slow again:

esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$
esdt$  select distinct on (PlayerID) AtDate from player where PlayerID= 
$1 order by PlayerID desc, AtDate desc limit 1;

esdt$ $$ language sql immutable;
CREATE FUNCTION
esdt= create or replace view VCurPlayer3 as select * from Player where 
AtDate = player_max_atdate(PlayerID);

CREATE VIEW
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';


 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
width=23) (actual time=76.660..76.664 rows=1 loops=1)

   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 76.716 ms

Why wouldn't the function get the row as quickly as the direct sql does?


Results from the following query suggest that the explain analyze output 
above only tells half the story, and that the function is in fact called 
1743 times:


esdt= create or replace view VCurPlayer3 as select distinct on (PlayerID) 
* from Player a where OID = (select distinct on (PlayerID) OID from Player 
b where b.PlayerID = a.PlayerID and b.AtDate = 
player_max_atdate(b.PlayerID) order by PlayerID desc, AtDate desc limit 1) 
order by PlayerId Desc, AtDate desc;

CREATE VIEW
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';
 Subquery Scan vcurplayer3  (cost=0.00..1715846.91 rows=1 width=68) (actual time=0.640..119.124 rows=1 loops=1)
   ->  Unique  (cost=0.00..1715846.90 rows=1 width=776) (actual time=0.633..119.112 rows=1 loops=1)
         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1715846.88 rows=9 width=776) (actual time=0.628..119.104 rows=1 loops=1)
               Index Cond: ((playerid)::text = '0'::text)
               Filter: (oid = (subplan))
               SubPlan
                 ->  Limit  (cost=0.00..976.38 rows=1 width=27) (actual time=0.057..0.058 rows=1 loops=1743)
                       ->  Unique  (cost=0.00..976.38 rows=1 width=27) (actual time=0.052..0.052 rows=1 loops=1743)
                             ->  Index Scan Backward using pk_player on player b  (cost=0.00..976.36 rows=6 width=27) (actual time=0.047..0.047 rows=1 loops=1743)
                                   Index Cond: ((playerid)::text = ($0)::text)
                                   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 119.357 ms

It would also explain the very long time taken by the pl/pgsql function I 
posted a bit earlier.


So I guess it all comes back to the basic question:

For the query select distinct on (PlayerID) * from Player a where 
PlayerID='0' order by PlayerId Desc, AtDate Desc;
can the optimizer recognise the fact that the query is selecting by the primary 
key (PlayerID,AtDate), so it can skip the remaining rows for that PlayerID, 
as if LIMIT 1 is implied?


Best regards, KC.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread K C Lau
Thank you all for your suggestions. I've tried, with some variations too, but 
still no success. The times given are the best of a few repeated tries on 
an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.


For reference, only the following gets the record quickly:

esdt= explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
 where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.057 rows=1 loops=1)
   Index Cond: (((playerid)::text = '0'::text) AND ((atdate)::text = ($0)::text))
   InitPlan
     ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.027..0.028 rows=1 loops=1)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.132 ms

At 02:19 05/09/23, Kevin Grittner wrote:

Have you tried the best choice pattern -- where you select the set of
candidate rows and then exclude those for which a better choice
exists within the set?  I often get better results with this pattern than
with the alternatives.


esdt= explain analyze select PlayerID,AtDate from Player a where 
PlayerID='0'
and not exists (select * from Player b where b.PlayerID = a.PlayerID and 
b.AtDate > a.AtDate);


 Index Scan using pk_player on player a  (cost=0.00..3032.46 rows=878 width=23) (actual time=35.820..35.823 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: (NOT (subplan))
   SubPlan
     ->  Index Scan using pk_player on player b  (cost=0.00..378.68 rows=389 width=776) (actual time=0.013..0.013 rows=1 loops=1743)
           Index Cond: (((playerid)::text = ($0)::text) AND ((atdate)::text > ($1)::text))
 Total runtime: 35.950 ms

Note that it is faster than the LIMIT 1:

esdt= explain analyze select PlayerID,AtDate from Player a where 
PlayerID='0' and AtDate = (select b.AtDate from Pl
ayer b where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate 
desc LIMIT 1);


 Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9 width=23) (actual time=41.366..41.371 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.008..0.008 rows=1 loops=1743)
                 Index Cond: ((playerid)::text = ($0)::text)
 Total runtime: 41.490 ms

At 02:07 05/09/23, Merlin Moncure wrote:

 Here is a trick I use sometimes with views, etc.  This may or may not be
 effective to solve your problem but it's worth a shot.  Create one small
 SQL function taking date, etc. and returning the values and define it
 immutable.  Now in-query it is treated like a constant.


esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$
esdt$  select atdate from player where playerid = $1 order by playerid 
desc, AtDate desc limit 1;

esdt$ $$ language sql immutable;
CREATE FUNCTION
esdt= create or replace view VCurPlayer3 as select * from Player where 
AtDate = player_max_atdate(PlayerID);

CREATE VIEW
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';


 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
width=23) (actual time=65.434..65.439 rows=1 loops=1)

   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 65.508 ms

While it says loops=1, the time suggests that it is going through all 1743 
records for that PlayerID.


I tried to simulate the fast subquery inside the function, but it is taking 
almost twice as much time:


esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$
esdt$  select atdate from player a where playerid = $1 and AtDate = 
(select b.AtDate from Player b

esdt$  where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1);
esdt$ $$ language sql immutable;
CREATE FUNCTION
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';


 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
width=23) (actual time=119.369..119.373 rows=1 loops=1)

   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 119.441 ms

Adding another LIMIT 1 inside the function makes it even slower:

esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$
esdt$  select atdate from player where playerid = $1 and AtDate = (select 
b.AtDate from Player b

esdt$  where b.PlayerID = $1 order by b.PlayerID 

Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread Simon Riggs
On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote: 
   esdt= explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
 where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
   LIMIT 1);

I think you should try:

select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='0' order by PlayerId, AtDate Desc;

Does that work for you?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread K C Lau

At 19:15 05/09/23, Simon Riggs wrote:

select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='0' order by PlayerId, AtDate Desc;

Does that work for you?

Best Regards, Simon Riggs


esdt= explain analyze select distinct on (PlayerID) PlayerID,AtDate from 
Player a where PlayerID='0' order by PlayerId, AtDate Desc;
 Unique  (cost=1417.69..1426.47 rows=2 width=23) (actual time=31.231..36.609 rows=1 loops=1)
   ->  Sort  (cost=1417.69..1422.08 rows=1756 width=23) (actual time=31.129..32.473 rows=1743 loops=1)
         Sort Key: playerid, atdate
         ->  Index Scan using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.035..6.575 rows=1743 loops=1)
               Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 36.943 ms

The sort was eliminated with: order by PlayerId Desc, AtDate Desc:

esdt= explain analyze select distinct on (PlayerID) PlayerID,AtDate from 
Player a where PlayerID='0' order by PlayerId Desc, AtDate Desc;
 Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.027..8.438 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..4.950 rows=1743 loops=1)
         Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 8.499 ms

That is the fastest of all queries looping the 1743 rows.
I do get the desired result by adding LIMIT 1:

esdt= explain analyze select distinct on (PlayerID) PlayerID,AtDate from 
Player a where PlayerID='0' order by PlayerId Desc, AtDate Desc LIMIT 1;


 Limit  (cost=0.00..663.72 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
   ->  Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.028..0.028 rows=1 loops=1)
         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..0.022 rows=1 loops=1)
               Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.094 ms

However, when I use that within a function in a view, it is slow again:

esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$
esdt$  select distinct on (PlayerID) AtDate from player where PlayerID= $1 
order by PlayerID desc, AtDate desc limit 1;

esdt$ $$ language sql immutable;
CREATE FUNCTION
esdt= create or replace view VCurPlayer3 as select * from Player where 
AtDate = player_max_atdate(PlayerID);

CREATE VIEW
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';


 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
width=23) (actual time=76.660..76.664 rows=1 loops=1)

   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 76.716 ms

Why wouldn't the function get the row as quickly as the direct sql does?

Best regards, KC.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread Merlin Moncure
 At 02:07 05/09/23, Merlin Moncure wrote:
 Here is a trick I use sometimes with views, etc.  This may or may not be
 effective to solve your problem but it's worth a shot.  Create one small
 SQL function taking date, etc. and returning the values and define it
 immutable.  Now in-query it is treated like a constant.

 esdt= create or replace function player_max_atdate (varchar(32)) returns
 varchar(32) as $$
 esdt$  select atdate from player where playerid = $1 order by playerid
 desc, AtDate desc limit 1;
 esdt$ $$ language sql immutable;

Can you time just the execution of this function and compare vs. pure
SQL version?  If the times are different, can you do a exaplain analyze
of a prepared version of above?

prepare test(character varying) as select atdate from player where
playerid = $1 order by playerid desc, AtDate desc limit 1;

explain analyze execute test('0');

 CREATE FUNCTION
 esdt= create or replace view VCurPlayer3 as select * from Player
where
 AtDate = player_max_atdate(PlayerID);
 CREATE VIEW

This is wrong, it should have been 
create or replace view VCurPlayer3 as select *,
player_max_atdate(PlayerID) as max_date  from Player;

I did a test on a table with 124k records and a two-part key, ID and date.
esp=# select count(*) from parts_order_file;
 count
--------
 124158
(1 row)


esp=# select count(*) from parts_order_file where pr_dealer_no =
'000500';
 count
---
 27971
(1 row)

created same function, view v, etc.
esp=# explain analyze select * from v where pr_dealer_no = '000500'
limit 1;
 
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=0.00..3.87 rows=1 width=10) (actual time=1.295..1.297 rows=1 loops=1)
   ->  Index Scan using parts_order_file_pr_dealer_no_key on parts_order_file  (cost=0.00..109369.15 rows=28226 width=10) (actual time=1.287..1.287 rows=1 loops=1)
         Index Cond: (pr_dealer_no = '000500'::bpchar)
 Total runtime: 1.413 ms
(4 rows)

Something is not jiving here.  However, if the server plan still does
not come out correct, try the following (p.s. why is function returning
varchar(32) and not date?):

create or replace function player_max_atdate (varchar(32)) returns date
as
$$
DECLARE
   player_record record;
   return_date date;
BEGIN
   for player_record in execute
   'select atdate from player where playerid = \'' || $1 || '\'
order by playerid desc, AtDate desc limit 1;' loop
   return_date = player_record.atdate;
   end loop;

   return return_date;
END;
$$ language plpgsql immutable;

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread K C Lau

Dear Merlin,

At 20:34 05/09/23, Merlin Moncure wrote:

Can you time just the execution of this function and compare vs. pure
SQL version?  If the times are different, can you do a exaplain analyze
of a prepared version of above?


esdt= prepare test(character varying) as select atdate from player where
esdt- playerid = $1 order by playerid desc, AtDate desc limit 1;
PREPARE
esdt= explain analyze execute test('0');
 Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.027..0.027 rows=1 loops=1)
         Index Cond: ((playerid)::text = ($1)::text)
 Total runtime: 0.088 ms

The prepared SQL timing is similar to that of a direct SQL.


 esdt= create or replace view VCurPlayer3 as select * from Player where
 AtDate = player_max_atdate(PlayerID);

This is wrong, it should have been
create or replace view VCurPlayer3 as select *,
player_max_atdate(PlayerID) as max_date  from Player;


Your suggestion returns all the records plus a max AtDate column for each 
PlayerID.
What I want to get with the view is the record that has the max value of 
AtDate for each PlayerID.
The AtDate is a varchar(23) field containing a string date of format 
'mmddhh', not an SQL date field. Sorry if that confused you.



Something is not jiving here.  However, if the server plan still does
not come out correct, try the following (p.s. why is function returning
varchar(32) and not date?):


esdt= create or replace function player_max_atdate (varchar(32)) returns 
varchar(32) as $$

esdt$ DECLARE
esdt$player_record record;
esdt$return_date varchar(32);
esdt$ BEGIN
esdt$for player_record in execute
esdt$'select atdate from player where playerid = \'' || $1 || 
'\' order by playerid desc, AtDate desc limit 1;' loop

esdt$return_date = player_record.atdate;
esdt$end loop;
esdt$return return_date;
esdt$ END;
esdt$ $$ language plpgsql immutable;
CREATE FUNCTION
esdt= create or replace view VCurPlayer3 as select * from Player where 
AtDate = player_max_atdate(PlayerID);

CREATE VIEW
esdt= explain analyze select PlayerID,AtDate from VCurPlayer3 where 
PlayerID='0';


 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 
width=23) (actual time=849.021..849.025 rows=1 loops=1)

   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 849.078 ms

Your suggested plpgsql function seems to be even slower, with a best time 
of 849 ms after several tries. Is that expected?


Thanks again and best regards,
KC.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread Mark Kirkwood

K C Lau wrote:
Thank you all for your suggestions. I've tried, with some variations too, 
but still no success. The times given are the best of a few repeated 
tries on an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.




A small denormalization, where you mark the row with the latest atdate 
for each playerid, may get you the performance you want.


e.g: (8.1beta1)

ALTER TABLE player ADD islastatdate boolean;

UPDATE player SET islastatdate = true where (playerid,atdate) IN
(SELECT playerid, atdate FROM vcurplayer);

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE islastatdate;

CREATE INDEX player_id_lastatdate ON player(playerid, islastatdate)
WHERE islastatdate;

ANALYZE player;

Generating some test data produced:

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='0';

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using player_id_lastatdate on player a  (cost=0.00..4.33 rows=1 width=13) (actual time=0.142..0.149 rows=1 loops=1)
   Index Cond: ((playerid = '0'::text) AND (lastatdate = true))
   Filter: lastatdate
 Total runtime: 0.272 ms
(4 rows)

Whereas with the original view definition:

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE a.atdate =
( SELECT max(b.atdate) FROM player b
  WHERE a.playerid = b.playerid);

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='0';

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using player_id_date on player a  (cost=0.00..7399.23 rows=11 width=13) (actual time=121.738..121.745 rows=1 loops=1)
   Index Cond: (playerid = '0'::text)
   Filter: (atdate = (subplan))
   SubPlan
     ->  Result  (cost=1.72..1.73 rows=1 width=0) (actual time=0.044..0.047 rows=1 loops=2000)
           InitPlan
             ->  Limit  (cost=0.00..1.72 rows=1 width=4) (actual time=0.028..0.031 rows=1 loops=2000)
                   ->  Index Scan Backward using player_id_date on player b  (cost=0.00..3787.94 rows=2198 width=4) (actual time=0.019..0.019 rows=1 loops=2000)
                         Index Cond: ($0 = playerid)
                         Filter: (atdate IS NOT NULL)
 Total runtime: 121.916 ms
(11 rows)

Note that my generated data has too many rows for each playerid, but the 
 difference in performance should illustrate the idea.


Cheers

Mark

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread K C Lau

Dear Mark,

Thank you. That seems like a more manageable alternative if nothing else 
works out. It should cover many of the OLTP update transactions. But it 
does mean quite a few programming changes and adding another index on 
all such tables, and it would not cover those cases when we need to get the 
latest record before a certain time, for example.


I'm wondering if this performance issue is common enough for other users to 
merit a fix in pg, especially as it seems that with MVCC, each of the data 
records needs to be accessed in addition to scanning the index.


Best regards,
KC.

At 09:40 05/09/24, Mark Kirkwood wrote:
A small denormalization, where you mark the row with the latest atdate for 
each playerid may get you the performance you want.


e.g: (8.1beta1)

ALTER TABLE player ADD islastatdate boolean;

UPDATE player SET islastatdate = true where (playerid,atdate) IN
(SELECT playerid, atdate FROM vcurplayer);

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE islastatdate;

CREATE INDEX player_id_lastatdate ON player(playerid, islastatdate)
WHERE islastatdate;

ANALYZE player;

Generating some test data produced:

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='0';

   QUERY PLAN

 Index Scan using player_id_lastatdate on player a  (cost=0.00..4.33 
rows=1 width=13) (actual time=0.142..0.149 rows=1 loops=1)

   Index Cond: ((playerid = '0'::text) AND (lastatdate = true))
   Filter: lastatdate
 Total runtime: 0.272 ms
(4 rows)

Whereas with the original view definition:

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE a.atdate =
( SELECT max(b.atdate) FROM player b
  WHERE a.playerid = b.playerid);

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='0';
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using player_id_date on player a  (cost=0.00..7399.23 rows=11 width=13) (actual time=121.738..121.745 rows=1 loops=1)
   Index Cond: (playerid = '0'::text)
   Filter: (atdate = (subplan))
   SubPlan
     ->  Result  (cost=1.72..1.73 rows=1 width=0) (actual time=0.044..0.047 rows=1 loops=2000)
           InitPlan
             ->  Limit  (cost=0.00..1.72 rows=1 width=4) (actual time=0.028..0.031 rows=1 loops=2000)
                   ->  Index Scan Backward using player_id_date on player b  (cost=0.00..3787.94 rows=2198 width=4) (actual time=0.019..0.019 rows=1 loops=2000)
                         Index Cond: ($0 = playerid)
                         Filter: (atdate IS NOT NULL)
 Total runtime: 121.916 ms
(11 rows)

Note that my generated data has too many rows for each playerid, but 
the  difference in performance should illustrate the idea.


Cheers

Mark



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-23 Thread Mark Kirkwood

K C Lau wrote:

I'm wondering if this performance issue is common enough for other users 
to merit a fix in pg, especially as it seems that with MVCC, each of the 
data records need to be accessed in addition to scanning the index.




Yes - there are certainly cases where index only access (or something 
similar, like b+tree tables) would be highly desirable.


From what I have understood from previous discussions, there are 
difficulties involved with producing a design that does not cause new 
problems...


regards

Mark

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:

 Investigating further on this problem I brought up in June, the following 
 query with pg 8.0.3 on Windows scans all 1743 data records for a player:
 
 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc 
 LIMIT 1);
 

   Total runtime: 51.133 ms
 
 Using a static value in the subquery produces the desired result below, but 
 since we use views for our queries (see last part of this email), we cannot 
 push the static value into the subquery:
 
 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

   Total runtime: 0.149 ms
 
 The Player table has a primary key on PlayerID, AtDate. Is there a way to 
 stop the inner-most index scan looping all 1743 data records for that 
 player?  Is that a bug or known issue?

Currently the planner can't tell whether a subquery is correlated or not
until it has planned the query. So it is unable to push down the
qualification automatically in the way you have achieved manually. The
new min() optimisation doesn't yet work with GROUP BY, which is what you
would use to reformulate the query along those lines, so no luck there either.
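(For reference, the aggregate reformulation I mean would look roughly like the
sketch below; the min()/max() optimisation cannot be applied to the GROUP BY
form.)

select PlayerID, max(AtDate) as AtDate
  from Player
 group by PlayerID;

-- or, for a single player:
select max(AtDate) from Player where PlayerID = '0';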

If you don't want to do this in a view, calculate the values for all
players at once and store the values in a summary table for when you
need them.
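A rough sketch of that idea (the summary table name is invented, and it would
have to be refreshed, or kept in step by the application or a trigger,
whenever Player changes):

create table player_latest as
    select PlayerID, max(AtDate) as AtDate
      from Player
     group by PlayerID;
alter table player_latest add primary key (PlayerID);

-- the current row for one player is then a simple join
select p.*
  from Player p join player_latest l
    on p.PlayerID = l.PlayerID and p.AtDate = l.AtDate
 where p.PlayerID = '0';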

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau
We use similar views as base views throughout our OLTP system to get the 
latest time-based record(s). So it is quite impossible to use summary 
tables etc. Are there other ways to do it?


The subquery would pinpoint the record(s) with the composite primary key. 
Neither MS SQL nor Oracle has this performance problem. So this 
problem is effectively stopping us from migrating to PostgreSQL.


Any suggestions would be most appreciated.

Best regards,
KC.

At 16:40 05/09/22, Simon Riggs wrote:

On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:

 Investigating further on this problem I brought up in June, the following
 query with pg 8.0.3 on Windows scans all 1743 data records for a player:

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
 LIMIT 1);


   Total runtime: 51.133 ms

 Using a static value in the subquery produces the desired result below, 
but
 since we use views for our queries (see last part of this email), we 
cannot

 push the static value into the subquery:

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc 
LIMIT 1);


   Total runtime: 0.149 ms

 The Player table has a primary key on PlayerID, AtDate. Is there a way to
 stop the inner-most index scan looping all 1743 data records for that
 player?  Is that a bug or known issue?

Currently the planner can't tell whether a subquery is correlated or not
until it has planned the query. So it is unable to push down the
qualification automatically in the way you have achieved manually. The
new min() optimisation doesn't yet work with GROUP BY which is what you
would use to reformulate the query that way, so no luck that way either.

If you don't want to do this in a view, calculate the values for all
players at once and store the values in a summary table for when you
need them.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
 We use similar views as base views throughout our OLTP system to get the 
 latest time-based record(s). So it is quite impossible to use summary 
 tables etc. Are there other ways to do it?
 
 The subquery would pinpoint the record(s) with the composite primary key. 
 Both MS Sql and Oracle do not have such performance problem. So this 
 problem is effectively stopping us from migrating to PostgreSQL.
 
 Any suggestions would be most appreciated.

Even if this were fixed for 8.1, which seems unlikely, would you be able
to move to that release immediately?

ISTM you have two choices, in priority, complexity and time/cost order
1) custom mods to your app
2) custom mods to PostgreSQL

Maybe it's possible to reconstruct your query with sub-sub-selects so
that you have a correlated query with manually pushed down clauses,
which also references a more constant base view?

Is a 51ms query really such a problem for you?

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Merlin Moncure
 I previously posted the following as a sequel to my SELECT DISTINCT
 Performance Issue question. We would most appreciate any clue or
 suggestions on how to overcome this show-stopping issue. We are using
 8.0.3 on Windows.
 
 Is it a known limitation when using a view with SELECT ... LIMIT 1?
 
 Would the forthcoming performance enhancement with MAX help when used
 within a view, as in:
 
 create or replace view VCurPlayer as select * from Player a
 where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID =
 b.PlayerID);

Here is a trick I use sometimes with views, etc.  This may or may not be
effective to solve your problem but it's worth a shot.  Create one small
SQL function taking date, etc. and returning the values and define it
immutable.  Now, within the query, it is treated like a constant.

Another useful application for this feature is when you have nested
views (view 1 queries view 2) and you need to filter records based on
fields from view 2 which are not returned in view 1.  Impossible?

In view 2, add a clause 'where v2.f between f_min() and f_max()', with
f_min() and f_max() being immutable functions which can grab filter
criteria based on inputs or values from a table.
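A made-up sketch of that idea (the table, views and column names are invented;
as above, the functions really are stable and are declared immutable only as a
planner hack):

create table filter_criteria (f_lo date, f_hi date);
insert into filter_criteria values ('2005-01-01', '2005-12-31');

create function f_min() returns date as $$
    select f_lo from filter_criteria limit 1;
$$ language sql immutable;

create function f_max() returns date as $$
    select f_hi from filter_criteria limit 1;
$$ language sql immutable;

create view v2 as
    select t.id, t.f
      from some_table t        -- hypothetical table with an id and a date column f
     where t.f between f_min() and f_max();

create view v1 as
    select id from v2;         -- v1 never exposes f, yet its rows are filtered on it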

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau

At 20:48 05/09/22, Simon Riggs wrote:

On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
 We use similar views as base views throughout our OLTP system to get the
 latest time-based record(s). So it is quite impossible to use summary
 tables etc. Are there other ways to do it?

 The subquery would pinpoint the record(s) with the composite primary key.
 Both MS Sql and Oracle do not have such performance problem. So this
 problem is effectively stopping us from migrating to PostgreSQL.

 Any suggestions would be most appreciated.

Even if this were fixed for 8.1, which seems unlikely, would you be able
to move to that release immediately?


Yes. In fact when we first developed our system a few years ago, we tested 
on MS7.0, Oracle 8 and PG 7.1.1 and we did not hit that problem. When we 
tried again with PG 8.0, the performance became unbearable, but other areas 
appear OK and other queries are often faster than MS Sql2k.



Maybe its possible to reconstruct your query with sub-sub-selects so
that you have a correlated query with manually pushed down clauses,
which also references a more constant base view?


We would be most happy to try them if we have some example views or pointers.


Is a 51ms query really such a problem for you?


Unfortunately yes, as our target performance is in the high hundreds of 
transactions per sec. And 51 ms is already the best case for a single 
select, with everything cached in memory immediately after the same select 
which took 390 ms on a quiet system.



Best Regards, Simon Riggs


Best regards,
KC. 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread K C Lau

At 22:37 05/09/22, Merlin Moncure wrote:


 create or replace view VCurPlayer as select * from Player a
 where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID=
 b.PlayerID);

Here is a trick I use sometimes with views, etc.  This may or may not be
effective to solve your problem but it's worth a shot.  Create one small
SQL function taking date, etc. and returning the values and define it
immutable.  Now in-query it is treated like a constant.


We don't use functions as a rule, but I would be glad to give it a try.
I would most appreciate it if you could define a sample function and rewrite 
the VCurPlayer view above. Both PlayerID and AtDate are varchar fields.



Another useful application for this feature is when you have nested
views (view 1 queries view 2) and you need to filter records based on
fields from view 2 which are not returned in view 1.  Impossible?

in view 2 add clause where v2.f between  f_min() and f_max(), them being
immutable functions which can grab filter criteria based on inputs or
values from a table.

Merlin


Best regards,
KC. 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Tom Lane
K C Lau [EMAIL PROTECTED] writes:
 At 20:48 05/09/22, Simon Riggs wrote:
 Even if this were fixed for 8.1, which seems unlikely, would you be able
 to move to that release immediately?

 Yes. In fact when we first developed our system a few years ago, we tested 
 on MS7.0, Oracle 8 and PG 7.1.1 and we did not hit that problem.

It's really not credible that PG 7.1 did any better with this than
current sources do.  The subplan mechanism hasn't changed materially
since about 6.5.  It could be that 7.1's performance was simply so
bad across the board that you didn't notice ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Simon Riggs
On Thu, 2005-09-22 at 22:39 +0800, K C Lau wrote:
 Is a 51ms query really such a problem for you?
 
 Unfortunately yes, as our target performance is in the high hundreds of 
 transactions per sec. And 51 ms is already the best case for a single 
 select, with everything cached in memory immediately after the same select 
 which took 390 ms on a quiet system.

If the current value is used so often, use two tables - one holding only
the current row for each player, maintained using UPDATE. Different
performance issues maybe, but at least not correlated subquery ones.
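A rough sketch of the two-table idea (CurPlayer is an invented name; the
current-row table has to be kept in step by the application, or by a trigger
on Player, whenever a newer row for a player arrives):

-- one row per player, holding only the current record
create table CurPlayer as
    select distinct on (PlayerID) *
      from Player
     order by PlayerID desc, AtDate desc;
alter table CurPlayer add primary key (PlayerID);

create or replace view VCurPlayer as select * from CurPlayer;

-- on each new Player row, update the matching CurPlayer row in place with an
-- UPDATE ... WHERE PlayerID = ... statement (or do the same from a trigger).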

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Merlin Moncure
 Here is a trick I use sometimes with views, etc.  This may or may not be
 effective to solve your problem but it's worth a shot.  Create one small
 SQL function taking date, etc. and returning the values and define it
 immutable.  Now in-query it is treated like a constant.

 We don't use functions as a rule, but I would be glad to give it a try.
 I would most appreciate if you could define a sample function and rewrite
 the VCurPlayer view above. Both PlayerID and AtDate are varchar fields.

 esdt= explain analyze select PlayerID,AtDate from Player a
   where PlayerID='0' and AtDate = (select b.AtDate from Player b
   where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);
try:

create function player_max_at_date (varchar) returns date as 
$$
select atdate from player where playerid = $1 order by playerid
desc, AtDate desc limit 1;
$$ language sql immutable;

create view v as select playerid, player_max_at_date(playerid) from
player;
select * from v where playerid = 'x'; --etc

note: this function is not really immutable.  Try with both 'immutable'
and 'stable'; if performance is the same, use stable.

You're welcome in advance, ;)
Merlin



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-22 Thread Kevin Grittner
Have you tried the best choice pattern -- where you select the set of
candidate rows and then exclude those for which a better choice
exists within the set?  I often get better results with this pattern than
with the alternatives.  Transmuting your query to use this pattern gives:
 
select PlayerID,AtDate from Player a where PlayerID='0'
  and not exists
(select * from Player b
  where b.PlayerID = a.PlayerID and b.AtDate > a.AtDate);
 
 K C Lau [EMAIL PROTECTED] 09/21/05 11:21 PM 

select PlayerID,AtDate from Player a
  where PlayerID='0' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc 
LIMIT 1);


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue

2005-09-21 Thread K C Lau

Hi All,

Investigating further on this problem I brought up in June, the following 
query with pg 8.0.3 on Windows scans all 1743 data records for a player:


esdt= explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
 where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc 
LIMIT 1);


 Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9 width=23) (actual time=51.046..51.049 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.016..0.017 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.011..0.011 rows=1 loops=1743)
                 Index Cond: ((playerid)::text = ($0)::text)
 Total runtime: 51.133 ms

Using a static value in the subquery produces the desired result below, but 
since we use views for our queries (see last part of this email), we cannot 
push the static value into the subquery:


esdt= explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
 where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.058 rows=1 loops=1)
   Index Cond: (((playerid)::text = '0'::text) AND ((atdate)::text = ($0)::text))
   InitPlan
     ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.028..0.029 rows=1 loops=1)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.149 ms

The Player table has a primary key on (PlayerID, AtDate). Is there a way to 
stop the inner-most index scan from looping over all 1743 data records for that 
player?  Is that a bug or a known issue?


BTW, I tried using 8.1 beta2 on Windows and its performance is similar. I 
have also tried other variants such as MAX and DISTINCT, but with no success.


Any help is most appreciated.

Best regards,
KC.


At 10:46 05/06/15, K C Lau wrote:

Hi All,

I previously posted the following as a sequel to my SELECT DISTINCT 
Performance Issue question. We would most appreciate any clue or 
suggestions on how to overcome this show-stopping issue. We are using 
8.0.3 on Windows.


Is it a known limitation when using a view with SELECT ... LIMIT 1?

Would the forthcoming performance enhancement with MAX help when used 
within a view, as in:


create or replace view VCurPlayer as select * from Player a
where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID = 
b.PlayerID);


select PlayerID,AtDate from VCurPlayer where PlayerID='0';

Thanks and regards,
KC.

-

Actually the problem with the LIMIT 1 query appears when we use views with the 
LIMIT 1 construct. The direct SQL is OK:


esdt= explain analyze select PlayerID,AtDate from Player where 
PlayerID='0'

 order by PlayerID desc, AtDate desc LIMIT 1;

 Limit  (cost=0.00..1.37 rows=1 width=23) (actual time=0.000..0.000 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player  (cost=0.00..16074.23 rows=11770 width=23) (actual time=0.000..0.000 rows=1 loops=1)
         Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.000 ms

esdt= create or replace view VCurPlayer3 as select * from Player a
where AtDate = (select b.AtDate from Player b where a.PlayerID = b.PlayerID
order by b.PlayerID desc, b.AtDate desc LIMIT 1);

esdt= explain analyze select PlayerID,AtDate,version from VCurPlayer3 
where PlayerID='0';
 Index Scan using pk_player on player a  (cost=0.00..33072.78 rows=59 width=27) (actual time=235.000..235.000 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..1.44 rows=1 width=23) (actual time=0.117..0.117 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..14023.67 rows=9727 width=23) (actual time=0.108..0.108 rows=1 loops=1743)
                 Index Cond: (($0)::text = (playerid)::text)
 Total runtime: 235.000 ms



---(end of broadcast)---
TIP 6: explain analyze is your friend