Re: [PERFORM] slow joining very large table to smaller ones
On 7/15/05, Bruno Wolff III <[EMAIL PROTECTED]> wrote: > On Thu, Jul 14, 2005 at 16:29:58 -0600, > Dan Harris <[EMAIL PROTECTED]> wrote: > > > > Ok, I tried this one. My ssh keeps getting cut off by a router > > somewhere between me and the server due to inactivity timeouts, so > > all I know is that both the select and explain analyze are taking > > over an hour to run. Here's the explain select for that one, since > > that's the best I can get. > > Are you using NAT at home? That's probably where the issue is. If you > have control of that box you can probably increase the timeout to a > couple of hours. Some versions of ssh have such a configuration option (in .ssh/config): Host * ServerAliveInterval 600 ...it means that ssh will send a "ping" packet to a sshd every 10 minutes of inactivity. This way NAT will see activity and won't kill the session. I'm using OpenSSH_4.1p1 for this... Oh, and it doesn't have anything to do with TCP keep alive, which is rather for finding dead connections than keeping connections alive. ;) Regards, Dawid ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: > > On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: > >> >> My biggest question is why the planner things the Nested Loop would be >> so expensive. >> Have you tuned any of the parameters? It seems like something is out of >> whack. (cpu_tuple_cost, random_page_cost, etc...) >> > > here's some of my postgresql.conf. Feel free to blast me if I did > something idiotic here. > > shared_buffers = 5 > effective_cache_size = 1348000 > random_page_cost = 3 > work_mem = 512000 Unless you are the only person connecting to this database, your work_mem is very high. And if you haven't modified maintenance_work_mem it is probably very low. work_mem might be causing postgres to think it can fit all of a merge into ram, making it faster, I can't say for sure. > max_fsm_pages = 8 This seems high, but it depends how many updates/deletes you get in-between vacuums. It may not be too excessive. VACUUM [FULL] VERBOSE replies with how many free pages are left, if you didn't use that already for tuning. Though it should be tuned based on a steady state situation. Not a one time test. > log_min_duration_statement = 6 > fsync = true ( not sure if I'm daring enough to run without this ) > wal_buffers = 1000 > checkpoint_segments = 64 > checkpoint_timeout = 3000 > These seem fine to me. Can you include the output of EXPLAIN SELECT both with and without SET join_collapselimit? Since your tables have grown, I can't compare the estimated number of rows, and costs very well. EXPLAIN without ANALYZE is fine, since I am wondering what the planner is thinking things cost. John =:-> > > # FOR PG_AUTOVACUUM --# > stats_command_string = true > stats_row_level = true > signature.asc Description: OpenPGP digital signature
Re: [PERFORM] slow joining very large table to smaller ones
Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get. one word : screen ! one of the most useful little command line utilities... ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] slow joining very large table to smaller ones
On Thu, Jul 14, 2005 at 16:29:58 -0600, Dan Harris <[EMAIL PROTECTED]> wrote: > > Ok, I tried this one. My ssh keeps getting cut off by a router > somewhere between me and the server due to inactivity timeouts, so > all I know is that both the select and explain analyze are taking > over an hour to run. Here's the explain select for that one, since > that's the best I can get. Are you using NAT at home? That's probably where the issue is. If you have control of that box you can probably increase the timeout to a couple of hours. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 15, 2005, at 9:09 AM, Dan Harris wrote: On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true Sorry, I forgot to re-post my hardware specs. HP DL585 4 x 2.2 GHz Opteron 12GB RAM SmartArray RAID controller, 1GB hardware cache, 4x73GB 10k SCSI in RAID 0+1 ext2 filesystem Also, there are 30 databases on the machine, 27 of them are identical schemas. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 10:12 PM, John A Meinel wrote: My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) here's some of my postgresql.conf. Feel free to blast me if I did something idiotic here. shared_buffers = 5 effective_cache_size = 1348000 random_page_cost = 3 work_mem = 512000 max_fsm_pages = 8 log_min_duration_statement = 6 fsync = true ( not sure if I'm daring enough to run without this ) wal_buffers = 1000 checkpoint_segments = 64 checkpoint_timeout = 3000 # FOR PG_AUTOVACUUM --# stats_command_string = true stats_row_level = true ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: > > On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: > >> >> >> Is the distribution of your rows uneven? Meaning do you have more rows >> with a later id than an earlier one? >> > > There are definitely some id's that will have many times more than the > others. If I group and count them, the top 10 are fairly dominant in > the table. That usually skews the estimates. Since the estimate is more of an average (unless the statistics are higher). > >>> >> >> Hmm.. How to do it permanantly? Well you could always issue "set >> join_collapse set 1; select * from " >> But obviously that isn't what you prefer. :) >> >> I think there are things you can do to make merge join more expensive >> than a nested loop, but I'm not sure what they are. > > > Maybe someone else has some ideas to encourage this behavior for future > work? Setting it on a per-connection basis is doable, but would add > some burden to us in code. My biggest question is why the planner things the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) > >> >> What I really don't understand is that the estimates dropped as well. >> The actual number of estimate rows drops to 3k instead of > 1M. >> The real question is why does the planner think it will be so expensive? >> >> >>> select count(*) from k_b join k_r using (incidentid) where k_b.id=107 >>> and k_r.id=94; >>> count >>> --- >>>373 >>> >>> >> >> Well, this says that they are indeed much more selective. >> Each one has > 1k rows, but together you end up with only 400. >> > > Is this a bad thing? Is this not "selective enough" to make it much > faster? Yes, being more selective is what makes it faster. But the planner doesn't seem to notice it properly. > > Overall, I'm much happier now after seeing the new plan come about, if > I can find a way to make that join_collapse behavior permanent, I can > certainly live with these numbers. > I'm sure there are pieces to tune, but I've reached my limits of parameters to tweak :) > Thanks again for your continued efforts. > > -Dan > John =:-> signature.asc Description: OpenPGP digital signature
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: Is the distribution of your rows uneven? Meaning do you have more rows with a later id than an earlier one? There are definitely some id's that will have many times more than the others. If I group and count them, the top 10 are fairly dominant in the table. Hmm.. How to do it permanantly? Well you could always issue "set join_collapse set 1; select * from " But obviously that isn't what you prefer. :) I think there are things you can do to make merge join more expensive than a nested loop, but I'm not sure what they are. Maybe someone else has some ideas to encourage this behavior for future work? Setting it on a per-connection basis is doable, but would add some burden to us in code. What I really don't understand is that the estimates dropped as well. The actual number of estimate rows drops to 3k instead of > 1M. The real question is why does the planner think it will be so expensive? select count(*) from k_b join k_r using (incidentid) where k_b.id=107 and k_r.id=94; count --- 373 Well, this says that they are indeed much more selective. Each one has > 1k rows, but together you end up with only 400. Is this a bad thing? Is this not "selective enough" to make it much faster? Overall, I'm much happier now after seeing the new plan come about, if I can find a way to make that join_collapse behavior permanent, I can certainly live with these numbers. Thanks again for your continued efforts. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 5:12 PM, John A Meinel wrote: Dan Harris wrote: Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. Ok, sorry I missed these the first time through: explain analyze select incidentid from k_b where id = 107; QUERY PLAN Index Scan using k_b_idx on k_b (cost=0.00..1926.03 rows=675 width=14) (actual time=0.042..298.394 rows=2493 loops=1) Index Cond: (id = 107) Total runtime: 299.103 ms select count(*) from k_b; count 698350 ( sorry! I think I said this one only had tens of thousands in it ) explain analyze select incidentid from k_r where id = 94; QUERY PLAN - Index Scan using k_r_idx on k_r (cost=0.00..2137.61 rows=757 width=14) (actual time=0.092..212.187 rows=10893 loops=1) Index Cond: (id = 94) Total runtime: 216.498 ms (3 rows) select count(*) from k_r; count 671670 That one is quite a bit slower, yet it's the same table structure and same index as k_b, also it has fewer records. I did run VACUUM ANALYZE immediately before running these queries. It seems a lot better with the join_collapse set. \ Well, the planner is powerful enough to flatten nested selects. To make it less "intelligent" you can do: SET join_collapse_limit 1; or SET join_collapse_limit 0; Which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do. So again just do it to see what you get. Ok, when join_collapse_limit = 1 I get this now: explain analyze select recordtext from eventactivity join ( select incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 ) a using (incidentid ); QUERY PLAN --- Nested Loop (cost=0.00..156509.08 rows=2948 width=35) (actual time=1.555..340.625 rows=24825 loops=1) -> Nested Loop (cost=0.00..5361.89 rows=6 width=28) (actual time=1.234..142.078 rows=366 loops=1) -> Index Scan using k_b_idx on k_b (cost=0.00..1943.09 rows=681 width=14) (actual time=0.423..56.974 rows=2521 loops=1) Index Cond: (id = 107) -> Index Scan using k_r_idx on k_r (cost=0.00..5.01 rows=1 width=14) (actual time=0.031..0.031 rows=0 loops=2521) Index Cond: ((k_r.id = 94) AND ((k_r.incidentid)::text = ("outer".incidentid)::text)) -> Index Scan using eventactivity1 on eventactivity (cost=0.00..25079.55 rows=8932 width=49) (actual time=0.107..0.481 rows=68 loops=366) Index Cond: ((eventactivity.incidentid)::text = ("outer".incidentid)::text) Total runtime: 347.975 ms MUCH better! Maybe you can help me understand what I did and if I need to make something permanent to get this behavior from now on? If you have analyzed recently can you do: SELECT relname, reltuples FROM pg_class WHERE relname='eventactivity'; It is a cheaper form than "SELECT count(*) FROM eventactivity" to get an approximate estimate of the number of rows. But if it isn't too expensive, please also give the value from SELECT count(*) FROM eventactivity. Again, that helps us know if your tables are up-to-date. Sure: select relname, reltuples from pg_class where relname='eventactivity'; relname| reltuples ---+- eventactivity | 3.16882e+07 select count(*) from eventactivity; count -- 31871142 I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. It depends a lot on how selective your query is. eventactivity currently has around 36 million rows in it. There should only be maybe 200-300 incidentids at most that will be matched with the combination of k_b and k_r. That's why I was thinking I could somehow get a list of just the incidentids that matched the id = 94 and id = 107 in k_b and k_r first. Then, I would only need to grab a few hundred out of 36 million rows from eventactivity. Well, you can also try: SELECT count(*) FROM k_b JOIN k_r USING (incidentid) WHERE k_b.id=?? AND k_r.id=?? ; That will tell you how many rows they have in common. selec
Re: [PERFORM] slow joining very large table to smaller ones
Tom Lane wrote: > John A Meinel <[EMAIL PROTECTED]> writes: > >>What I don't understand is that the planner is actually estimating that >>joining against the new table is going to *increase* the number of >>returned rows. > > > It evidently thinks that incidentid in the k_r table is pretty > nonunique. We really need to look at the statistics data to > see what's going on. > > regards, tom lane > Okay, sure. What about doing this, then: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN (SELECT DISTINCT incidentid FROM k_r JOIN k_b USING (incidentid) WHERE k_r.id = ?? AND k_b.id = ??) USING (incidentid) ; Since I assume that eventactivity is the only table with "recordtext", and that you don't get any columns from k_r and k_b, meaning it would be pointless to get duplicate incidentids. I may be misunderstanding what the query is trying to do, but depending on what is in k_r and k_b, is it possible to use a UNIQUE INDEX rather than just an index on incidentid? There is also the possibility of EXPLAIN ANALYZE SELECT recordtext FROM eventactivtity JOIN (SELECT incidentid FROM k_r WHERE k_r.id = ?? UNION SELECT incidentid FROM k_b WHERE k_b.id = ??) USING (incidentid) ; But both of these would mean that you don't actually want columns from k_r or k_b, just a unique list of incident ids. But first, I agree, we should make sure the pg_stats values are reasonable. John =:-> signature.asc Description: OpenPGP digital signature
Re: [PERFORM] slow joining very large table to smaller ones
John A Meinel <[EMAIL PROTECTED]> writes: > What I don't understand is that the planner is actually estimating that > joining against the new table is going to *increase* the number of > returned rows. It evidently thinks that incidentid in the k_r table is pretty nonunique. We really need to look at the statistics data to see what's going on. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: > > On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: > >> >> >> You might try giving it a little bit more freedom with: >> >> EXPLAIN ANALYZE >> SELECT recordtext FROM eventactivity, k_r, k_b >> WHERE eventactivity.incidentid = k_r.incidentid >>AND eventactivity.incidentid = k_b.incidentid >>AND k_r.id = 94 >>AND k_b.id = 107 >> -- AND k_r.incidentid = k_b.incidentid >> ; >> >> I'm pretty sure that would give identical results, just let the planner >> have a little bit more freedom about how it does it. >> Also the last line is commented out, because I think it is redundant. >> > > Ok, I tried this one. My ssh keeps getting cut off by a router > somewhere between me and the server due to inactivity timeouts, so all > I know is that both the select and explain analyze are taking over an > hour to run. Here's the explain select for that one, since that's the > best I can get. > > explain select recordtext from eventactivity,k_r,k_b where > eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid > = k_b.incidentid and k_r.id = 94 and k_b.id = 107; > QUERY PLAN > > -- > Merge Join (cost=9624.61..4679590.52 rows=151009549 width=35) >Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") >-> Merge Join (cost=4766.92..4547684.26 rows=16072733 width=117) > Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") > -> Index Scan using eventactivity1 on eventactivity > (cost=0.00..4186753.16 rows=46029271 width=49) > -> Sort (cost=4766.92..4771.47 rows=1821 width=68) >Sort Key: (k_b.incidentid)::text >-> Index Scan using k_b_idx on k_b (cost=0.00..4668.31 > rows=1821 width=68) > Index Cond: (id = 107) >-> Sort (cost=4857.69..4862.39 rows=1879 width=68) > Sort Key: (k_r.incidentid)::text > -> Index Scan using k_r_idx on k_r (cost=0.00..4755.52 > rows=1879 width=68) >Index Cond: (id = 94) > (13 rows) > If anything, the estimations have gotten worse. As now it thinks there will be 1800 rows returned each, whereas you were thinking it would be more around 100. Since you didn't say, you did VACUUM ANALYZE recently, right? > ... >> >> You can also try disabling merge joins, and see how that changes things. >> > > Are there any negative sideaffects of doing this? If the planner is estimating things correctly, you want to give it the most flexibility of plans to pick from, because sometimes a merge join is faster (postgres doesn't pick things because it wants to go slower). The only reason for the disable flags is that sometimes the planner doesn't estimate correctly. Usually disabling a method is not the final solution, but a way to try out different methods, and see what happens to the results. Using: SET enable_mergejoin TO off; You can disable it just for the current session (not for the entire database). Which is the recommended way if you have a query that postgres is messing up on. (Usually it is correct elsewhere). >> >> Well, postgres is estimating around 500 rows each, is that way off? Try >> just doing: >> EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; >> EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. >> >> And see if postgres estimates the number of rows properly. >> >> I assume you have recently VACUUM ANALYZEd, which means you might need >> to update the statistics target (ALTER TABLE k_b ALTER COLUMN >> incidientid SET STATISTICS 100) default is IIRC 10, ranges from 1-1000, >> higher is more accurate, but makes ANALYZE slower. >> ... >> EXPLAIN ANALYZE >> SELECT recordtext FROM eventactivity >> JOIN (SELECT incidentid FROM k_r JOIN k_b USING (incidentid) >> WHERE k_r.id = 94 AND k_b.id = 107) >> USING (incidentid); >> > > This one looks like the same plan as the others: > > explain select recordtext from eventactivity join ( select incidentid > from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 > ) a using (incidentid ); Well, the planner is powerful enough to flatten nested selects. To make it less "intelligent" you can do: SET join_collapse_limit 1; or SET join_collapse_limit 0; Which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do. So again just do it to see what you get. The problem is that if you are only using EXPLAIN SELECT, you will probably get something which *looks* worse. Because if it looked better, the planner would have used it. That is why you really need the EXPLAIN ANALYZE, so that you can see where the planner is incorrect
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris <[EMAIL PROTECTED]> writes: > Here's the explain select for that one, since > that's the best I can get. > explain select recordtext from eventactivity,k_r,k_b where > eventactivity.incidentid = k_r.incidentid and > eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id > = 107; >QUERY PLAN > > -- > Merge Join (cost=9624.61..4679590.52 rows=151009549 width=35) > Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") > -> Merge Join (cost=4766.92..4547684.26 rows=16072733 width=117) > Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") > -> Index Scan using eventactivity1 on eventactivity > (cost=0.00..4186753.16 rows=46029271 width=49) > -> Sort (cost=4766.92..4771.47 rows=1821 width=68) > Sort Key: (k_b.incidentid)::text > -> Index Scan using k_b_idx on k_b > (cost=0.00..4668.31 rows=1821 width=68) > Index Cond: (id = 107) > -> Sort (cost=4857.69..4862.39 rows=1879 width=68) > Sort Key: (k_r.incidentid)::text > -> Index Scan using k_r_idx on k_r (cost=0.00..4755.52 > rows=1879 width=68) > Index Cond: (id = 94) > (13 rows) There's something awfully fishy here. The 8.0 planner is definitely capable of figuring out that it ought to join the smaller relations first. As an example, using 8.0.3+ (CVS branch tip) I did regression=# create table eventactivity(incidentid varchar, recordtext text); CREATE TABLE regression=# create table k_r(incidentid varchar); CREATE TABLE regression=# create table k_b(incidentid varchar); CREATE TABLE regression=# explain select recordtext from eventactivity inner join (select incidentid from k_r) a using (incidentid) inner join (select incidentid from k_b) b using (incidentid); (Being too impatient to actually fill the eventactivity table with 36M rows of data, I just did some debugger magic to make the planner think that that was the table size...) The default plan looks like Merge Join (cost=16137814.70..36563453.23 rows=136170 width=32) Merge Cond: (("outer".incidentid)::text = "inner"."?column3?") -> Merge Join (cost=170.85..290.48 rows=7565 width=64) Merge Cond: ("outer"."?column2?" = "inner"."?column2?") -> Sort (cost=85.43..88.50 rows=1230 width=32) Sort Key: (k_r.incidentid)::text -> Seq Scan on k_r (cost=0.00..22.30 rows=1230 width=32) -> Sort (cost=85.43..88.50 rows=1230 width=32) Sort Key: (k_b.incidentid)::text -> Seq Scan on k_b (cost=0.00..22.30 rows=1230 width=32) -> Sort (cost=16137643.84..16227643.84 rows=3600 width=64) Sort Key: (eventactivity.incidentid)::text -> Seq Scan on eventactivity (cost=0.00..108.00 rows=3600 width=64) and if I "set enable_mergejoin TO 0;" I get Hash Join (cost=612.54..83761451.54 rows=136170 width=32) Hash Cond: (("outer".incidentid)::text = ("inner".incidentid)::text) -> Seq Scan on eventactivity (cost=0.00..108.00 rows=3600 width=64) -> Hash (cost=504.62..504.62 rows=7565 width=64) -> Hash Join (cost=25.38..504.62 rows=7565 width=64) Hash Cond: (("outer".incidentid)::text = ("inner".incidentid)::text) -> Seq Scan on k_r (cost=0.00..22.30 rows=1230 width=32) -> Hash (cost=22.30..22.30 rows=1230 width=32) -> Seq Scan on k_b (cost=0.00..22.30 rows=1230 width=32) which is the plan I would judge Most Likely To Succeed based on what we know about Dan's problem. (The fact that the planner is estimating it as twice as expensive as the mergejoin comes from the fact that with no statistics about the join keys, the planner deliberately estimates hash join as expensive, because it can be pretty awful in the presence of many equal keys.) So the planner is certainly capable of finding the desired plan, even without any tweaking of the query text. This means that what we have is mainly a statistical problem. Have you ANALYZEd these tables recently? If so, may we see the pg_stats rows for incidentid in all three tables? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: > > On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: ... Did you try doing this to see how good the planners selectivity estimates are? >> Well, postgres is estimating around 500 rows each, is that way off? Try >> just doing: >> EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; >> EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; These should be fast queries. John =:-> signature.asc Description: OpenPGP digital signature
Re: [PERFORM] slow joining very large table to smaller ones
On Thu, Jul 14, 2005 at 04:29:58PM -0600, Dan Harris wrote: Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Try running the query as a script with nohup & redirect the output to a file. Mike Stone ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: You might try giving it a little bit more freedom with: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity, k_r, k_b WHERE eventactivity.incidentid = k_r.incidentid AND eventactivity.incidentid = k_b.incidentid AND k_r.id = 94 AND k_b.id = 107 -- AND k_r.incidentid = k_b.incidentid ; I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant. Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get. explain select recordtext from eventactivity,k_r,k_b where eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9624.61..4679590.52 rows=151009549 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4766.92..4547684.26 rows=16072733 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4186753.16 rows=46029271 width=49) -> Sort (cost=4766.92..4771.47 rows=1821 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4668.31 rows=1821 width=68) Index Cond: (id = 107) -> Sort (cost=4857.69..4862.39 rows=1879 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4755.52 rows=1879 width=68) Index Cond: (id = 94) (13 rows) You might also try: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN k_r USING (incidentid) JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107 ; Similar results here. The query is taking at least an hour to finish. explain select recordtext from eventactivity join k_r using ( incidentid ) join k_b using (incidentid ) where k_r.id = 94 and k_b.id = 107; QUERY PLAN -- Merge Join (cost=9542.77..4672831.12 rows=148391132 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=4726.61..4542825.87 rows=15930238 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4184145.43 rows=46000104 width=49) -> Sort (cost=4726.61..4731.13 rows=1806 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..4628.92 rows=1806 width=68) Index Cond: (id = 107) -> Sort (cost=4816.16..4820.82 rows=1863 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..4714.97 rows=1863 width=68) Index Cond: (id = 94) (13 rows) Also, if possible give us the EXPLAIN ANALYZE so that we know if the planner is making accurate estimates. (You might send an EXPLAIN while waiting for the EXPLAIN ANALYZE to finish) You can also try disabling merge joins, and see how that changes things. Are there any negative sideaffects of doing this? Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; And see if postgres estimates the number of rows properly. I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidientid SET STATISTICS 100) default is IIRC 10, ranges from 1-1000, higher is more accurate, but makes ANALYZE slower. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. To me, it looks like it is doing an index scan (on k_b.id) through k_b first, sorting the results by incidentid, then merge joining that with eventactivity. I'm guessing you actually want it to merge k_b and k_r to get extra selec
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: > I'm trying to improve the speed of this query: > > explain select recordtext from eventactivity inner join ( select > incidentid from k_r where id = 94 ) a using ( incidentid ) inner join ( > select incidentid from k_b where id = 107 ) b using ( incidentid ); You might try giving it a little bit more freedom with: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity, k_r, k_b WHERE eventactivity.incidentid = k_r.incidentid AND eventactivity.incidentid = k_b.incidentid AND k_r.id = 94 AND k_b.id = 107 -- AND k_r.incidentid = k_b.incidentid ; I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant. You might also try: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN k_r USING (incidentid) JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107 ; Also, if possible give us the EXPLAIN ANALYZE so that we know if the planner is making accurate estimates. (You might send an EXPLAIN while waiting for the EXPLAIN ANALYZE to finish) You can also try disabling merge joins, and see how that changes things. > QUERY PLAN > > -- > Merge Join (cost=2747.29..4249364.96 rows=11968693 width=35) >Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") >-> Merge Join (cost=1349.56..4230052.73 rows=4413563 width=117) > Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") > -> Index Scan using eventactivity1 on eventactivity > (cost=0.00..4051200.28 rows=44519781 width=49) > -> Sort (cost=1349.56..1350.85 rows=517 width=68) >Sort Key: (k_b.incidentid)::text >-> Index Scan using k_b_idx on k_b (cost=0.00..1326.26 > rows=517 width=68) > Index Cond: (id = 107) >-> Sort (cost=1397.73..1399.09 rows=542 width=68) > Sort Key: (k_r.incidentid)::text > -> Index Scan using k_r_idx on k_r (cost=0.00..1373.12 > rows=542 width=68) >Index Cond: (id = 94) > (13 rows) > > > There are many millions of rows in eventactivity. There are a few > ten-thousand rows in k_r and k_b. There is an index on 'incidentid' in > all three tables. There should only be less than 100 rows matched in > k_r and k_b total. That part on its own is very very fast. But, it > should have those 100 or so incidentids extracted in under a second and > then go into eventactivity AFTER doing that. At least, that's my > intention to make this fast. Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; And see if postgres estimates the number of rows properly. I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidientid SET STATISTICS 100) default is IIRC 10, ranges from 1-1000, higher is more accurate, but makes ANALYZE slower. > > Right now, it looks like pg is trying to sort the entire eventactivity > table for the merge join which is taking several minutes to do. Can I > rephrase this so that it does the searching through k_r and k_b FIRST > and then go into eventactivity using the index on incidentid? It seems > like that shouldn't be too hard to make fast but my SQL query skills > are only average. To me, it looks like it is doing an index scan (on k_b.id) through k_b first, sorting the results by incidentid, then merge joining that with eventactivity. I'm guessing you actually want it to merge k_b and k_r to get extra selectivity before joining against eventactivity. I think my alternate forms would let postgres realize this. But if not, you could try: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN (SELECT incidentid FROM k_r JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107) USING (incidentid); I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. It depends a lot on how selective your query is. If you have 100M rows, the above query looks like it expects k_r to restrict it to 44M rows, and k_r + k_b down to 11M rows, which really should be a seq scan (> 10% of the rows = seq scan). But if you are saying the selectivity is mis-estimated it could be different. John =:-> > > Thanks > -Dan > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > signature.asc Description: OpenPGP digital signature
[PERFORM] slow joining very large table to smaller ones
I'm trying to improve the speed of this query: explain select recordtext from eventactivity inner join ( select incidentid from k_r where id = 94 ) a using ( incidentid ) inner join ( select incidentid from k_b where id = 107 ) b using ( incidentid ); QUERY PLAN -- Merge Join (cost=2747.29..4249364.96 rows=11968693 width=35) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Merge Join (cost=1349.56..4230052.73 rows=4413563 width=117) Merge Cond: (("outer".incidentid)::text = "inner"."?column2?") -> Index Scan using eventactivity1 on eventactivity (cost=0.00..4051200.28 rows=44519781 width=49) -> Sort (cost=1349.56..1350.85 rows=517 width=68) Sort Key: (k_b.incidentid)::text -> Index Scan using k_b_idx on k_b (cost=0.00..1326.26 rows=517 width=68) Index Cond: (id = 107) -> Sort (cost=1397.73..1399.09 rows=542 width=68) Sort Key: (k_r.incidentid)::text -> Index Scan using k_r_idx on k_r (cost=0.00..1373.12 rows=542 width=68) Index Cond: (id = 94) (13 rows) There are many millions of rows in eventactivity. There are a few ten-thousand rows in k_r and k_b. There is an index on 'incidentid' in all three tables. There should only be less than 100 rows matched in k_r and k_b total. That part on its own is very very fast. But, it should have those 100 or so incidentids extracted in under a second and then go into eventactivity AFTER doing that. At least, that's my intention to make this fast. Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average. Thanks -Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq