Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue
Hi All,

Investigating further on this problem I brought up in June, the following query with pg 8.0.3 on Windows scans all 1743 data records for a player:

esdt=> explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9 width=23) (actual time=51.046..51.049 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.016..0.017 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.011..0.011 rows=1 loops=1743)
                 Index Cond: ((playerid)::text = ($0)::text)
 Total runtime: 51.133 ms

Using a static value in the subquery produces the desired result below, but since we use views for our queries (see the last part of this email), we cannot push the static value into the subquery:

esdt=> explain analyze select PlayerID,AtDate from Player a
 where PlayerID='0' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = '0' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.058 rows=1 loops=1)
   Index Cond: (((playerid)::text = '0'::text) AND ((atdate)::text = ($0)::text))
   InitPlan
     ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.028..0.029 rows=1 loops=1)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.149 ms

The Player table has a primary key on (PlayerID, AtDate). Is there a way to stop the inner-most index scan from looping over all 1743 data records for that player? Is this a bug or a known issue?
BTW, I tried using 8.1 beta2 on Windows and its performance is similar. I have also tried other variants such as MAX and DISTINCT, but with no success. Any help is most appreciated.

Best regards, KC.

At 10:46 05/06/15, K C Lau wrote:

Hi All, I previously posted the following as a sequel to my SELECT DISTINCT Performance Issue question. We would most appreciate any clue or suggestions on how to overcome this show-stopping issue. We are using 8.0.3 on Windows. Is it a known limitation when using a view with SELECT ... LIMIT 1? Would the forthcoming performance enhancement with MAX help when used within a view, as in:

create or replace view VCurPlayer as
 select * from Player a
 where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID = b.PlayerID);
select PlayerID,AtDate from VCurPlayer where PlayerID='0';

Thanks and regards, KC.

Actually the problem with the LIMIT 1 query is when we use views with the LIMIT 1 construct. The direct SQL is ok:

esdt=> explain analyze select PlayerID,AtDate from Player
 where PlayerID='0' order by PlayerID desc, AtDate desc LIMIT 1;

 Limit  (cost=0.00..1.37 rows=1 width=23) (actual time=0.000..0.000 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player  (cost=0.00..16074.23 rows=11770 width=23) (actual time=0.000..0.000 rows=1 loops=1)
         Index Cond: ((playerid)::text = '0'::text)
 Total runtime: 0.000 ms

esdt=> create or replace view VCurPlayer3 as
 select * from Player a
 where AtDate = (select b.AtDate from Player b
  where a.PlayerID = b.PlayerID order by b.PlayerID desc, b.AtDate desc LIMIT 1);
esdt=> explain analyze select PlayerID,AtDate,version from VCurPlayer3 where PlayerID='0';

 Index Scan using pk_player on player a  (cost=0.00..33072.78 rows=59 width=27) (actual time=235.000..235.000 rows=1 loops=1)
   Index Cond: ((playerid)::text = '0'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..1.44 rows=1 width=23) (actual time=0.117..0.117 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..14023.67 rows=9727 width=23) (actual time=0.108..0.108 rows=1 loops=1743)
                 Index Cond: (($0)::text = (playerid)::text)
 Total runtime: 235.000 ms
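One possible rewrite, offered purely as an untested sketch against the schema described above: DISTINCT ON expresses "latest AtDate per PlayerID" directly and can walk the (PlayerID, AtDate) primary key backward, which may let the planner avoid running the per-row subplan. The view name is illustrative; whether 8.0 actually picks the better plan should be verified with EXPLAIN ANALYZE.

```sql
-- Sketch only: 'latest row per player' via DISTINCT ON instead of a
-- correlated ORDER BY ... LIMIT 1 subquery.
create or replace view VCurPlayer2 as
  select distinct on (PlayerID) *
  from Player
  order by PlayerID desc, AtDate desc;

select PlayerID, AtDate from VCurPlayer2 where PlayerID = '0';
```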
Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
"Gurpreet Aulakh" <[EMAIL PROTECTED]> writes:

> What is really interesting is the time it takes for the Hash to occur. For
> the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
> 47ms.

You haven't told us a thing about the column datatypes involved (much less what the query actually is) ... but I wonder if this is a textual datatype and the 8.0 installation is using a non-C locale where the 7.3 installation is using C locale. That could account for a considerable slowdown in text comparison speeds.

regards, tom lane
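For anyone checking Tom's locale hypothesis on their own installation, the cluster's collation setting can be inspected directly (a quick sketch; lc_collate is fixed when the cluster is initialized):

```sql
-- 'C' (or 'POSIX') gives fast byte-wise text comparison; any other locale
-- routes text comparisons through the OS locale routines, which is slower.
SHOW lc_collate;
```

On 8.0 this cannot be changed in place; a non-C cluster would have to be re-initialized (initdb --locale=C) and the data reloaded.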
Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
I have started to break my query down and analyze each piece. What I have discovered is very interesting. First, here is a small piece of my query:

EXPLAIN ANALYZE
SELECT doc.doc_documentid
FROM document AS doc
LEFT JOIN document as root ON doc.doc_internalRootXref = root.doc_documentId
LEFT JOIN folder_document ON doc.doc_documentid = folder_document.doc_documentId
LIMIT 500 OFFSET 0

When I run this on Postgres 8.0.3 running under WIN2K this is the result:

QUERY PLAN
 Limit  (cost=183.49..753.41 rows=500 width=40) (actual time=47.000..79.000 rows=500 loops=1)
   ->  Hash Left Join  (cost=183.49..6702.23 rows=5719 width=40) (actual time=47.000..79.000 rows=500 loops=1)
         Hash Cond: (("outer".doc_documentid)::text = ("inner".doc_documentid)::text)
         ->  Merge Left Join  (cost=0.00..6432.96 rows=5719 width=40) (actual time=0.000..16.000 rows=500 loops=1)
               Merge Cond: (("outer".doc_internalrootxref)::text = ("inner".doc_documentid)::text)
               ->  Index Scan using doc_internalrootxref_index on document doc  (cost=0.00..3172.64 rows=5719 width=80) (actual time=0.000..0.000 rows=500 loops=1)
               ->  Index Scan using pk_document on document root  (cost=0.00..3174.53 rows=5719 width=40) (actual time=0.000..0.000 rows=863 loops=1)
         ->  Hash  (cost=169.19..169.19 rows=5719 width=40) (actual time=47.000..47.000 rows=0 loops=1)
               ->  Seq Scan on folder_document  (cost=0.00..169.19 rows=5719 width=40) (actual time=0.000..16.000 rows=5719 loops=1)
 Total runtime: 79.000 ms

Here is the result of running the same query on Postgres 7.3 running under Cygwin:

QUERY PLAN
 Limit  (cost=183.49..775.31 rows=500 width=160) (actual time=13.00..44.00 rows=500 loops=1)
   ->  Hash Join  (cost=183.49..6952.79 rows=5719 width=160) (actual time=13.00..44.00 rows=501 loops=1)
         Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
         ->  Merge Join  (cost=0.00..6612.03 rows=5719 width=120) (actual time=0.00..29.00 rows=775 loops=1)
               Merge Cond: ("outer".doc_internalrootxref = "inner".doc_documentid)
               ->  Index Scan using doc_internalrootxref_index on document doc  (cost=0.00..3254.39 rows=5719 width=80) (actual time=0.00..7.00 rows=775 loops=1)
               ->  Index Scan using pk_document on document root  (cost=0.00..3257.88 rows=5719 width=40) (actual time=0.00..15.00 rows=1265 loops=1)
         ->  Hash  (cost=169.19..169.19 rows=5719 width=40) (actual time=12.00..12.00 rows=0 loops=1)
               ->  Seq Scan on folder_document  (cost=0.00..169.19 rows=5719 width=40) (actual time=0.00..9.00 rows=5719 loops=1)
 Total runtime: 45.00 msec

What is really interesting is the time it takes for the Hash to occur. For the first hash, on 7.3 it takes only 12ms while on 8.0 it takes 47ms. The databases are created from the same data and I have run vacuumdb -f -z on both. I have read something in the archives suggesting that perhaps the data is in the filesystem (not database) cache. Would this be the case? If so, how would I improve the performance under WIN2K? Anyone have any ideas?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Gurpreet Aulakh
Sent: September 21, 2005 12:38 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)

I currently have a Postgres 7.3 database running under WIN2K using cygwin and want to move to Postgres 8.0.3 (native Windows version). I am finding most simple queries are significantly faster on the native Windows version compared to 7.3 (under cygwin). However, for a complex query that involves multiple JOINs, the 7.3 version is actually faster (about 2X faster). The query that I am running was optimized to run under 7.3. It was specifically modified to control the planner with explicit JOINs. When I run the same query on the 8.0.3 version with join_collapse_limit set to 1, the query is slower. Can someone tell me why setting join_collapse_limit to 1 in the 8.0 version does not produce similar results to the 7.3 version? Does anyone have any suggestions on what I can do?
Do I have to rewrite the query? Here are the results of an explain analyze on the query.

Explain analyze, Postgres 7.3 running on WIN2K using cygwin:

 Hash Join  (cost=21808.27..1946264.80 rows=2982 width=1598) (actual time=2186.00..2320.00 rows=50 loops=1)
   Hash Cond: ("outer".doc_internalparentomxref = "inner".doc_documentid)
   ->  Hash Join  (cost=20948.78..1945323.29 rows=2982 width=1534) (actual time=2110.00..2227.00 rows=50 loops=1)
         Hash Cond: ("outer".doc_internalrootxref = "inner".doc_documentid)
         ->  Hash Join  (cost=20089.29..1944381.79 rows=2982 width=1484) (actual time=2067.00..2179.00 rows=50 loops=1)
               Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
               Join Filter: ("inner".dc_doccontacttype = 'FROM'::character varying)
[PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
I currently have a Postgres 7.3 database running under WIN2K using cygwin and want to move to Postgres 8.0.3 (native Windows version). I am finding most simple queries are significantly faster on the native Windows version compared to 7.3 (under cygwin). However, for a complex query that involves multiple JOINs, the 7.3 version is actually faster (about 2X faster). The query that I am running was optimized to run under 7.3. It was specifically modified to control the planner with explicit JOINs. When I run the same query on the 8.0.3 version with join_collapse_limit set to 1, the query is slower.

Can someone tell me why setting join_collapse_limit to 1 in the 8.0 version does not produce similar results to the 7.3 version? Does anyone have any suggestions on what I can do? Do I have to rewrite the query?

Here are the results of an explain analyze on the query.

Explain analyze, Postgres 7.3 running on WIN2K using cygwin:

 Hash Join  (cost=21808.27..1946264.80 rows=2982 width=1598) (actual time=2186.00..2320.00 rows=50 loops=1)
   Hash Cond: ("outer".doc_internalparentomxref = "inner".doc_documentid)
   ->  Hash Join  (cost=20948.78..1945323.29 rows=2982 width=1534) (actual time=2110.00..2227.00 rows=50 loops=1)
         Hash Cond: ("outer".doc_internalrootxref = "inner".doc_documentid)
         ->  Hash Join  (cost=20089.29..1944381.79 rows=2982 width=1484) (actual time=2067.00..2179.00 rows=50 loops=1)
               Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
               Join Filter: ("inner".dc_doccontacttype = 'FROM'::character varying)
               ->  Hash Join  (cost=7455.14..1928613.59 rows=2982 width=1138) (actual time=1216.00..1539.00 rows=50 loops=1)
                     Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
                     Join Filter: ("inner".dc_doccontacttype = 'TO'::character varying)
                     ->  Hash Join  (cost=183.49..1918519.06 rows=2860 width=792) (actual time=64.00..301.00 rows=50 loops=1)
                           Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
                           ->  Seq Scan on document finaldoc  (cost=0.00..1918256.94 rows=2860 width=717) (actual time=13.00..254.00 rows=50 loops=1)
                                 Filter: (subplan)
                                 SubPlan
                                   ->  Materialize  (cost=335.27..335.27 rows=50 width=160) (actual time=0.00..0.01 rows=50 loops=5719)
                                         ->  Limit  (cost=0.00..335.27 rows=50 width=160) (actual time=3.00..8.00 rows=50 loops=1)
                                               ->  Nested Loop  (cost=0.00..38347.95 rows=5719 width=160) (actual time=3.00..8.00 rows=51 loops=1)
                                                     ->  Merge Join  (cost=0.00..3910.14 rows=5719 width=120) (actual time=3.00..3.00 rows=51 loops=1)
                                                           Merge Cond: ("outer".doc_documentid = "inner".doc_documentid)
                                                           ->  Index Scan using pk_document on document doc  (cost=0.00..3256.48 rows=5719 width=80) (actual time=1.00..1.00 rows=51 loops=1)
                                                           ->  Index Scan using pk_folder_document on folder_document  (cost=0.00..553.91 rows=5719 width=40) (actual time=2.00..2.00 rows=51 loops=1)
                                                     ->  Index Scan using pk_document on document root  (cost=0.00..6.01 rows=1 width=40) (actual time=0.10..0.10 rows=1 loops=51)
                                                           Index Cond: ("outer".doc_internalrootxref = root.doc_documentid)
                           ->  Hash  (cost=169.19..169.19 rows=5719 width=75) (actual time=31.00..31.00 rows=0 loops=1)
                                 ->  Seq Scan on folder_document  (cost=0.00..169.19 rows=5719 width=75) (actual time=0.00..11.00 rows=5719 loops=1)
                     ->  Hash  (cost=1328.80..1328.80 rows=34280 width=346) (actual time=846.00..846.00 rows=0 loops=1)
                           ->  Seq Scan on doccontact dcto  (cost=0.00..1328.80 rows=34280 width=346) (actual time=0.00..175.00 rows=34280 loops=1)
               ->  Hash  (cost=1328.80..1328.80 rows=34280 width=346) (actual time=445.00..445.00 rows=0 loops=1)
                     ->  Seq Scan on doccontact dcfrom  (cost=0.00..1328.80 rows=34280 width=346) (actual time=0.00..223.00 rows=34280 loops=1)
         ->  Hash  (cost=845.19..845.19 rows=5719 width=50) (actual time=42.00..42.00 rows=0 loops=1)
               ->  Seq Scan on document root  (cost=0.00..845.19 rows=5719 width=50) (actual time=0.00..2.00 rows=5719 loops=1)
   ->  Hash  (cost=845.19..845.19 rows=5719 width=64) (actual time=73.00..73.00 rows=0 loops=1)
         ->  Seq Scan on document parentom  (cost=0.00..845.19 rows=5719 width=64) (actual time=0.00..30.00 rows=5719 loops=1)
   SubPlan
     ->  Limit  (co
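For readers trying to reproduce the 7.3-style join order on 8.0, the knob discussed above is set per session; a minimal sketch (the query itself is elided):

```sql
-- Honor the explicit JOIN order written in the query, as 7.3 effectively did:
SET join_collapse_limit = 1;
-- Optionally also keep FROM-list items from being reordered:
SET from_collapse_limit = 1;
EXPLAIN ANALYZE SELECT ...;  -- re-run the query and compare plans
```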
Re: [PERFORM] Performance considerations for very heavy INSERT traffic
On Sep 12, 2005, at 6:02 PM, Brandon Black wrote:

>> - splitting the xlog and the data on distinct physical drives or arrays
>
> That would almost definitely help, I haven't tried it yet. Speaking of the xlog, anyone know anything specific about the WAL tuning parameters for heavy concurrent write traffic? What little I could dig up on WAL tuning was contradictory, and testing some random changes to the parameters hasn't been very conclusive yet. I would imagine the WAL buffers stuff could potentially have a large effect for us.

You will want to make your pg_xlog RAID volume BIG, and then tell postgres to use that space: bump up checkpoint_segments (and suitably the checkpoint timeouts). I run with 256 segments and a timeout of 5 minutes. The timeout reflects your expected crash recovery time, so adjust it wisely.

Also, you should consider how you split your drives across your RAID data channels on your test machine: I put each pair of the RAID10 mirrors on opposite channels, so both channels of my RAID controller are pretty evenly loaded during write.

Vivek Khera, Ph.D. +1-301-869-4449 x806
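The settings described above map onto postgresql.conf roughly as follows. The values are the ones mentioned in the post plus an illustrative wal_buffers setting, not universal recommendations; tune against your own workload:

```
checkpoint_segments = 256   # big pg_xlog volume -> fewer forced checkpoints
checkpoint_timeout = 300    # seconds (~5 min); bounds expected crash-recovery time
wal_buffers = 64            # WAL buffering for heavy concurrent writes (experiment)
```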
Re: [PERFORM] Performance considerations for very heavy INSERT traffic
On Sep 12, 2005, at 6:02 PM, Brandon Black wrote:

>> - using COPY instead of INSERT? (should be easy to do from the aggregators)
>
> Possibly, although it would kill the current design of returning the database transaction status for a single client packet back to the client on transaction success/failure. The aggregator could put several clients' data into a series of delayed multi-row copy statements.

Buffer through the file system on your aggregator. Once you "commit" to a local disk file, return back to your client that you got the data. Then insert into the actual Postgres DB in large batches of inserts inside a single Postgres transaction.

We have our web server log certain tracking requests to a local file. With file locks and append mode, it is extremely quick and has little contention delay. Then every so often, we lock the file, rename it, release the lock, then process it at our leisure to do the inserts to Pg in one big transaction.

Vivek Khera, Ph.D. +1-301-869-4449 x806
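The lock/append/rename pattern described above can be sketched roughly as follows (Python, Unix-style flock; the spool path and record format are hypothetical, and the actual batch load into Postgres would then be a single COPY or transaction over the rotated file):

```python
import fcntl
import os

SPOOL = "/tmp/tracking.log"  # hypothetical spool file on the aggregator


def log_request(line: str) -> None:
    """Append one record under an exclusive lock: cheap, low contention."""
    with open(SPOOL, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write(line + "\n")
        fcntl.flock(f, fcntl.LOCK_UN)


def rotate_for_batch() -> str:
    """Lock, rename, unlock; new appends recreate the spool file, and the
    renamed file can be batch-loaded into Postgres at leisure."""
    batch = SPOOL + ".batch"
    with open(SPOOL, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        os.rename(SPOOL, batch)
        fcntl.flock(f, fcntl.LOCK_UN)
    return batch
```

The rotated file could then be loaded in one shot, e.g. with psql's `\copy tracking from 'tracking.log.batch'`, keeping each batch inside a single transaction as suggested.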
Re: [PERFORM] CHECK vs REFERENCES
On Sep 9, 2005, at 11:23 PM, Marc G. Fournier wrote:

> The case is where I just want to check that a value being inserted is one of a few possible values, with that list of values rarely (if ever) changing, so having a 'flexible list' REFERENCED seems relatively overkill ...

That's what I thought until the first time that list needed to be altered. At that point, it becomes a royal pain. Point to take: do it right the first time, or you have to do it over, and over, and over...

Vivek Khera, Ph.D. +1-301-869-4449 x806
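The trade-off under discussion, sketched with illustrative table names:

```sql
-- CHECK: the allowed list is baked into the DDL; changing it later means
-- ALTER TABLE ... DROP CONSTRAINT / ADD CONSTRAINT on a live table.
create table ticket (
    status varchar not null
        check (status in ('open', 'closed', 'pending'))
);

-- REFERENCES: the allowed list is ordinary data; changing it is just an
-- INSERT or DELETE on the lookup table, at the cost of an FK check per write.
create table ticket_status (status varchar primary key);
create table ticket2 (
    status varchar not null references ticket_status(status)
);
```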