Re: [HACKERS] Correlation in cost_index()
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden [EMAIL PROTECTED] wrote: the problem with your patch was that it picked an index less often than the current code when there was low correlation. Maybe bit rot? What version did you apply the patch against? Here is a new version for Postgres 7.3.4: http://www.pivot.at/pg/16d-correlation_734.diff The only difference from the previous version is that for (nKeys = 1; index->indexkeys[nKeys] != 0; nKeys++) is now replaced with for (nKeys = 1; nKeys < index->ncolumns; nKeys++) I don't know whether the former just worked by chance when I tested the 7.3.2 version :-(. Tests with 7.4Beta1 showed that index correlation comes out too low with the old loop termination condition. In any case, the latter version seems more robust. In my tests the new index_cost_algorithms (1, 2, 3, 4) gave consistently lower cost estimates than the old method (set index_cost_algorithm = 0), except of course for correlations of 1.0 or 0.0, because in those border cases you always get min_IO_cost or max_IO_cost, respectively. Care to re-evaluate? BTW, there's a version of the patch for 7.4Beta1 (http://www.pivot.at/pg/16d-correlation_74b1.diff) which also applies cleanly against the CVS snapshot from 2003-08-17. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Correlation in cost_index()
Which suggests to me that line 3964 in ./src/backend/utils/adt/selfuncs.c isn't right for multi-column indexes, esp for indexes that are clustered. I don't know how to address this though... Tom, any hints? Yes, we knew that already. Oliver had suggested simply dropping the division by nKeys, thus pretending that the first-column correlation is close enough. That seems to me to be going too far in the other direction, But is it really? but clearly dividing by nKeys is far too pessimistic. I'd change this in a moment if someone could point me to a formula with any basis at all ... Got it, alright. I'd never paid attention to prior discussions, as the planner had generally done the right thing (with a lowered random_page_cost ::grin::). In terms of statistics and setting indexCorrelation correctly, something like Spearman's rho calculation comes to mind, though I don't know how applicable that is to database theory. indexCorrelation is 1.0 for the 1st key in a multi-column index. The only thing different about a multi-column index and a single-column index is that the multi-column index takes up more space per key, resulting in fewer index entries per page and more pages being fetched than would be in a single-column index, but the current cost_index() function takes the increased number of page fetches into account when calculating cost. As things stand, however, if a multi-column key is used, the indexCorrelation is penalized by the number of keys found in the multi-column index. As things stand, the qual user_id = 42 on a CLUSTER'ed multi-column index (user_id,utc_date) has an indexCorrelation of 0.5, when in fact the correlation is 1.0. indexCorrelation governs the number of random page fetches, which could be next to free on a solid state drive; in this case, the page fetches aren't random, they're perfectly sequential. 
If it were 'user_id = 42 AND utc_date = NOW()', the correlation of a lookup of the user_id would still be 1.0 and the utc_date would be 1.0 because both values are looked up in the index key. A lookup of just the utc_date can never use the index, and the planner correctly uses a sequential scan. Cost != Correlation. They're proportional, but not the same, and indexCorrelation is the wrong place to handle cost as that's done by the Mackert and Lohman formula. Under what circumstances would it be correct to pessimize the use of indexCorrelation? An indexCorrelation of 0.0 doesn't mean that the index is useless either, just that we take the full hit of a completely random page read as opposed to some fraction of a random page cost. I tossed a different index on my test table to see how well things fare with a low correlation, and this was a bit disturbing: # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 800::BIGINT; INFO: cost_seqscan: run_cost: 21472.687500 startup_cost: 0.00 INFO: cost_index: run_cost: 112165.065458 startup_cost: 0.00 indexCorrelation: 0.183380 QUERY PLAN -- Seq Scan on report_user_cat_count rucc (cost=0.00..21472.69 rows=31893 width=64) (actual time=444.25..2489.27 rows=514 loops=1) Filter: (html_bytes > 800::bigint) Total runtime: 2492.36 msec (3 rows) # SET enable_seqscan = false; SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 800::BIGINT; INFO: cost_seqscan: run_cost: 21472.687500 startup_cost: 1.00 INFO: cost_index: run_cost: 112165.065458 startup_cost: 0.00 indexCorrelation: 0.183380 QUERY PLAN - Index Scan using report_user_cat_count_html_bytes_idx on report_user_cat_count rucc (cost=0.00..112165.07 rows=31893 width=64) (actual time=68.64..85.75 rows=514 loops=1) Index Cond: (html_bytes > 800::bigint) Total runtime: 97.75 msec (3 rows) *shrug* A low indexCorrelation overly pessimizes the cost of an index, but I'm not sure what to attribute this to. 
:-/ -sc -- Sean Chittenden
Re: [HACKERS] Correlation in cost_index()
In both cases ANALYZE will calculate correlation 1.0 for column X, and something near zero for column Y. We would like to come out with index correlation 1.0 for the left-hand case and something much less (but, perhaps, not zero) for the right-hand case. I don't really see a way to do this without actually examining the multi-column ordering relationship during ANALYZE. The only way the second column's correlation will be irrelevant is if the first column is already (nearly) unique (enough so that the second column won't scatter fetches enough to fill the buffer before seeing cache hits). Thus I think when merging correlations you could take nunique into account: corr = corr_1 * (corr_2 * (1 - nunique_1 / nrows)) But I think one (new) correlation metric for the whole index (whole key) and the data pages would actually be sufficient. This metric could IMHO always be used instead of the per-column correlations to calculate index cost. This holds true as long as you walk an index range, and that is what it is all about, no? Andreas
Re: [HACKERS] Correlation in cost_index()
On Fri, 8 Aug 2003 11:06:56 -0700, Sean Chittenden [EMAIL PROTECTED] wrote: [...] it'd seem as though an avg depth of nodes in index * tuples_fetched * (random_io_cost * indexCorrelation) would be closer than where we are now... Index depth does not belong here, because we walk down the index only once per index scan, not once per tuple. It might be part of the startup cost. The rest of your formula doesn't seem right either, because you get higher costs for better correlation. Did you mean random_io_cost * (1 - abs(indexCorrelation))? FWIW, for small effective_cache_size, max_IO_cost is almost equal to tuples_fetched * random_page_cost. So your formula (with the corrections presumed above) boils down to ignoring effective_cache_size and linearly interpolating between 0 and max_IO_cost. It's very possible that cost_index() is wrong, but after some testing it seems as though PostgreSQL _overly_ _favors_ the use of indexes: Was this an unpatched backend? What were the values of effective_cache_size and random_page_cost? # SET enable_seqscan = true; SET enable_indexscan = true; SET SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 21472.687500 startup_cost: 0.00 INFO: cost_index: run_cost: 21154.308116 startup_cost: 0.00 indexCorrelation: 0.999729 QUERY PLAN --- Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1) Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone) Total runtime: 11292.68 msec (3 rows) actual time=91.36..6625.79 but Total runtime: 11292.68 msec! Where did those 4.7 seconds go? 
# SET enable_seqscan = true; SET enable_indexscan = false; SET SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 21472.687500 startup_cost: 0.00 INFO: cost_index: run_cost: 21154.308116 startup_cost: 1.00 indexCorrelation: 0.999729 QUERY PLAN --- Seq Scan on report_user_cat_count rucc (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1) Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone) Total runtime: 10506.44 msec (3 rows) Same here: actual time=1091.45..7441.19 but Total runtime: 10506.44 msec - more than 3 seconds lost. When we ignore total runtime and look at actual time, we get:

               seq scan   index scan
    estimated     21473        21154
    actual         7441         6626

This doesn't look too bad, IMHO. BTW, I believe that with your example (single-column index, almost perfect correlation, index cond selects almost all tuples) all interpolation methods give an index cost estimation that is very close to the seq scan cost, and the actual runtimes show that this is correct. Which I find surprising and humorous given that popular belief, mine included, is contrary to those results. How many tuples are in report_user_cat_count? What are the stats for report_user_cat_count.utc_date? I can say with pretty high confidence that the patch to use a geometric mean isn't correct after having done real world testing, as its break-even point is vastly incorrect: it only uses an index when there are fewer than 9,000 rows to fetch, a far cry from the 490K break-even I found while testing. Could you elaborate, please? The intention of my patch was to favour index scans more than the current implementation. If it does not, you have found a bug in my patch. Did you test the other interpolation methods? 
What I did find interesting, however, was that it does work better at determining the use of multi-column indexes, Yes, because it computes the correlation for a two-column index as correlation_of_first_index_column * 0.95 instead of correlation_of_first_index_column / 2 but I think that's because the geometric mean pessimizes the value of indexCorrelation, which gets pretty skewed when using a multi-column index. I don't understand this.

# CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
# CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
# ANALYZE report_user_cat_count;
# SET enable_seqscan = true; SET enable_indexscan = true; SET SET
# EXPLAIN ANALYZE SELECT * FROM
Re: [HACKERS] Correlation in cost_index()
Sean Chittenden [EMAIL PROTECTED] writes: indexCorrelation is 1.0 for the 1st key in a multi-column index. ... only if it's perfectly correlated. As things stand, however, if a multi-column key is used, the indexCorrelation is penalized by the number of keys found in the multi-column index. As things stand the qual user_id = 42, on a CLUSTER'ed multi-column index (user_id,utc_date) has an indexCorrelation of 0.5, when in fact the correlation is 1.0. Right, in the perfectly-correlated case this calculation is clearly wrong. However, what of cases where the first column shows good correlation with the physical ordering, but the second does not? The nasty part of this is that the correlation stat that ANALYZE computed for the second column is of no value to us. Two examples:

    X Y        X Y
    A A        A B
    A B        A C
    A C        A A
    B A        B A
    B B        B C
    B C        B B
    C A        C C
    C B        C A
    C C        C B

In both cases ANALYZE will calculate correlation 1.0 for column X, and something near zero for column Y. We would like to come out with index correlation 1.0 for the left-hand case and something much less (but, perhaps, not zero) for the right-hand case. I don't really see a way to do this without actually examining the multi-column ordering relationship during ANALYZE. I tossed a different index on my test table to see how well things fare with a low correlation, and this was a bit disturbing: Seems like most of the error in that estimate has to do with the poor rowcount estimation. There's very little percentage in trying to analyze the effect of index correlation in examples where we don't have the first-order stats correct ... regards, tom lane
Re: [HACKERS] Correlation in cost_index()
AFAICS (part of) the real problem is in costsize.c:cost_index(), where IO_cost is calculated from min_IO_cost, pages_fetched, random_page_cost, and indexCorrelation. The current implementation uses indexCorrelation^2 to interpolate between min_IO_cost and max_IO_cost, which IMHO gives results that are too close to max_IO_cost. The indexCorrelation^2 algorithm was only a quick hack with no theory behind it :-(. I've wanted to find some better method to put in there, but have not had any time to research the problem. Could we quick-hack it to a geometric mean instead, since a mean seemed to yield better results than indexCorrelation^2? As nobody knows how each of these proposals performs in real life under different conditions, I suggest we leave the current implementation in, add all three algorithms, and supply a GUC variable to select a cost function. I don't think it's really a good idea to expect users to pick among multiple cost functions that *all* have no guiding theory behind them. I'd prefer to see us find a better cost function and use it. Has anyone trawled the database literature on the subject? Hrm, after an hour of searching and reading, I think one of the better papers on the subject can be found here: http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf Page 13, figure 3-12 is the ticket you were looking for, Tom. It's an interesting read with a pretty good analysis and conclusion. The author notes that his formula begins to fall apart when the number of dimensions reaches 10 and suggests the use of a high-dimension random cost estimate algorithm, but the use of those comes at great expense to the CPU (which is in line with a few other papers that I read). The idea of precomputing values piqued my interest and I thought was very clever, albeit space intensive. *shrug* -sc -- Sean Chittenden
Re: [HACKERS] Correlation in cost_index()
On Fri, 08 Aug 2003 18:25:41 -0400, Tom Lane [EMAIL PROTECTED] wrote: Two examples: [...] One more example:

    X Y
    A A
    a B
    A C
    b A
    B B
    b C
    C A
    c B
    C C

Correlation for column X is something less than 1.0; OTOH correlation for an index on upper(X) is 1.0. I don't really see a way to do this without actually examining the multi-column ordering relationship during ANALYZE. So did we reach consensus to add a TODO item? * Compute index correlation on CREATE INDEX and ANALYZE, use it for index scan cost estimation Servus Manfred
Re: [HACKERS] Correlation in cost_index()
On Thu, 7 Aug 2003 13:44:19 -0700, Sean Chittenden [EMAIL PROTECTED] wrote: The indexCorrelation^2 algorithm was only a quick hack with no theory behind it :-(. I've wanted to find some better method to put in there, but have not had any time to research the problem. Could we quick-hack it to a geometric mean instead, since a mean seemed to yield better results than indexCorrelation^2? Linear interpolation on (1-indexCorrelation)^2 (algorithm 3 in http://members.aon.at/pivot/pg/16-correlation-732.diff) is almost as good as geometric interpolation (algorithm 4 in the patch, proposal 3 in this thread), and its computation is much cheaper because it does not call exp() and log(). Download http://members.aon.at/pivot/pg/cost_index.sxc and play around with your own numbers to get a feeling. (1-indexCorrelation)^2 suffers from the same lack of theory behind it as indexCorrelation^2. But the results look much more plausible. Well, at least to me ;-) Servus Manfred
Re: [HACKERS] Correlation in cost_index()
# SHOW effective_cache_size ; effective_cache_size -- 4456 (1 row) Only 35 MB? Are you testing on such a small machine? Testing on my laptop right now... can't hack on my production DBs the same way I can my laptop. The stats are attached bzip2 compressed. Nothing was attached. Did you upload it to your web site? Gah, not yet, forgot to send it. http://people.FreeBSD.org/~seanc/pg_statistic.txt.bz2 I can say with pretty high confidence that the patch to use a geometric mean isn't correct ... the problem with your patch was that it picked an index less often than the current code when there was low correlation. In cost_index.sxc I get lower estimates for *all* proposed new interpolation methods. Either my C code doesn't implement the same calculations as the spreadsheet, or ... I manually applied bits of it [...] ... could this explain the unexpected behaviour? Don't think so... the run_cost was correct; I didn't modify the indexCorrelation behavior beyond forcing it to 1.0. I'm currently downloading your dump. Can you post the query you mentioned above? SELECT * FROM report_user_cat_count AS rucc WHERE rucc.html_bytes > 2000::BIGINT; SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42 AND utc_date = NOW(); SELECT * FROM report_user_cat_count AS rucc WHERE user_id = 42; SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2003-01-01'::TIMESTAMP WITH TIME ZONE; And various timestamps back to 2002-09-19 and user_id's IN(1,42). -sc -- Sean Chittenden
Re: [HACKERS] Correlation in cost_index()
Hrm, after an hour of searching and reading, I think one of the better papers on the subject can be found here: http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf Interesting paper, but I don't see the connection to index order correlation? Nothing that I found was nearly that specific; as close as I could find was the paper above on calculating the cost of fetching data from a disk, which I thought was the bigger problem at hand, but I digress... In one paper about large-dimension index searches, they did suggest that cost was cumulative for the number of disk reads or nodes in the tree that weren't held in cache, which was the biggest hint that I had found on this specific topic. With that as a guiding light (or something faintly resembling it), it'd seem as though an avg depth of nodes in index * tuples_fetched * (random_io_cost * indexCorrelation) would be closer than where we are now... but now I also think I/we're barking up the right tree with this thread. It's very possible that cost_index() is wrong, but after some testing it seems as though PostgreSQL _overly_ _favors_ the use of indexes: # SET enable_seqscan = true; SET enable_indexscan = true; SET SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 21472.687500 startup_cost: 0.00 INFO: cost_index: run_cost: 21154.308116 startup_cost: 0.00 indexCorrelation: 0.999729 QUERY PLAN --- Index Scan using report_user_cat_count_utc_date_id_idx on report_user_cat_count rucc (cost=0.00..21154.31 rows=705954 width=64) (actual time=91.36..6625.79 rows=704840 loops=1) Index Cond: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone) Total runtime: 11292.68 msec (3 rows) # SET enable_seqscan = true; SET enable_indexscan = false; SET SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE utc_date > '2002-10-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 21472.687500 
startup_cost: 0.00 INFO: cost_index: run_cost: 21154.308116 startup_cost: 1.00 indexCorrelation: 0.999729 QUERY PLAN --- Seq Scan on report_user_cat_count rucc (cost=0.00..21472.69 rows=705954 width=64) (actual time=1091.45..7441.19 rows=704840 loops=1) Filter: (utc_date > '2002-10-01 00:00:00-07'::timestamp with time zone) Total runtime: 10506.44 msec (3 rows) Which I find surprising and humorous given that popular belief, mine included, is contrary to those results. I can say with pretty high confidence that the patch to use a geometric mean isn't correct after having done real world testing, as its break-even point is vastly incorrect: it only uses an index when there are fewer than 9,000 rows to fetch, a far cry from the 490K break-even I found while testing. What I did find interesting, however, was that it does work better at determining the use of multi-column indexes, but I think that's because the geometric mean pessimizes the value of indexCorrelation, which gets pretty skewed when using a multi-column index. 
# CREATE INDEX report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count (user_id,utc_date);
# CLUSTER report_user_cat_count_utc_date_user_id_idx ON report_user_cat_count;
# ANALYZE report_user_cat_count;
# SET enable_seqscan = true; SET enable_indexscan = true; SET SET
# EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 23685.025000 startup_cost: 0.00 INFO: cost_index: run_cost: 366295.018684 startup_cost: 0.00 indexCorrelation: 0.50 QUERY PLAN Seq Scan on report_user_cat_count rucc (cost=0.00..23685.03 rows=133918 width=64) (actual time=0.28..6100.85 rows=129941 loops=1) Filter: ((user_id < 1000) AND (utc_date > '2002-01-01 00:00:00-08'::timestamp with time zone)) Total runtime: 6649.21 msec (3 rows) # SET enable_seqscan = false; SET enable_indexscan = true; SET SET # EXPLAIN ANALYZE SELECT * FROM report_user_cat_count AS rucc WHERE user_id < 1000 AND utc_date > '2002-01-01'::TIMESTAMP WITH TIME ZONE; INFO: cost_seqscan: run_cost: 23685.025000 startup_cost: 1.00 INFO: cost_index: run_cost:
Re: [HACKERS] Correlation in cost_index()
On Fri, 8 Aug 2003 16:53:48 -0700, Sean Chittenden [EMAIL PROTECTED] wrote: # SHOW effective_cache_size ; effective_cache_size -- 4456 (1 row) Only 35 MB? Are you testing on such a small machine? The stats are attached bzip2 compressed. Nothing was attached. Did you upload it to your web site? I can say with pretty high confidence that the patch to use a geometric mean isn't correct ... the problem with your patch was that it picked an index less often than the current code when there was low correlation. In cost_index.sxc I get lower estimates for *all* proposed new interpolation methods. Either my C code doesn't implement the same calculations as the spreadsheet, or ... I manually applied bits of it [...] ... could this explain the unexpected behaviour? I'm currently downloading your dump. Can you post the query you mentioned above? Servus Manfred
Re: [HACKERS] Correlation in cost_index()
Sean Chittenden [EMAIL PROTECTED] writes: Which suggests to me that line 3964 in ./src/backend/utils/adt/selfuncs.c isn't right for multi-column indexes, esp for indexes that are clustered. I don't know how to address this though... Tom, any hints? Yes, we knew that already. Oliver had suggested simply dropping the division by nKeys, thus pretending that the first-column correlation is close enough. That seems to me to be going too far in the other direction, but clearly dividing by nKeys is far too pessimistic. I'd change this in a moment if someone could point me to a formula with any basis at all ... regards, tom lane
Re: [HACKERS] Correlation in cost_index()
Sean Chittenden [EMAIL PROTECTED] writes: Hrm, after an hour of searching and reading, I think one of the better papers on the subject can be found here: http://www.cs.ust.hk/faculty/dimitris/PAPERS/TKDE-NNmodels.pdf Interesting paper, but I don't see the connection to index order correlation? regards, tom lane
Re: [HACKERS] Correlation in cost_index()
On Thu, 03 Oct 2002 14:50:00 -0400, Tom Lane [EMAIL PROTECTED] wrote: indexCorrelation is calculated by dividing the correlation of the first index column by the number of index columns. Yeah, I concluded later that that was bogus. I've been thinking of just using the correlation of the first index column and ignoring the rest; that would not be great, but it's probably better than what's there. Have you got a better formula? Unfortunately not. I think such a formula does not exist for the information we have. What we'd need is a notion of correlation of the nth (n > 1) index column for constant values of the first n-1 index columns; or a combined correlation for the first n index columns (1 < n <= number of index columns). I try to understand the problem with the help of use cases. [ Jump to the end, if this looks too long-winded. ] 1) Have a look at invoice items with an index on (fyear, invno, itemno). Invoice numbers start at 1 for each financial year, item numbers start at 1 for each invoice. In a typical scenario correlations for fyear, (fyear, invno), and (fyear, invno, itemno) are close to 1; invno correlation is expected to be low; and itemno looks totally chaotic to the analyzer. When we SELECT * FROM item WHERE fyear = 2001 AND invno > 1000 dividing the correlation of the first column by the number of columns gives 1/3, which has nothing to do with what we want. (And then the current implementation of cost_index() squares this and gets 1/9, which is even farther away from the truth.) Just using the correlation of the first index column seems right here. 2) OTOH consider bookkeeping with entries identified by (fyear, account, entryno). Again fyear has a correlation near 1. For account we can expect something near 0, and entryno has a distribution comparable to invno in the first use case, i.e. starting at 1 for each year. 
SELECT * from entry WHERE fyear = 2001 AND account = whatever Taking the correlation of fyear would imply that the tuples we are looking for are close to each other, which again can turn out to be wrong. So what do we know now? Even less than before :-( I have just one idea that might help a bit: If the correlation of the first index column is near +/-1, cost_index() should not use baserel->pages, but baserel->pages * selectivity of the first column. Servus Manfred
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002 10:45:08 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote: effective cache size is the default (i.e. commented out) The default is 1000, meaning ca. 8 MB, which seems to be way too low. If your server is (almost) exclusively used by Postgres, try setting it to represent most of your OS cache (as reported by free on Linux). Otherwise you have to estimate the fraction of the OS cache that gets used by Postgres. I'm still trying to get a feeling for how these settings play together, so I'd be grateful if you report back the effects this has on your cost estimates. Caveat: effective_cache_size is supposed to be the number of cache pages available to one query (or is it one scan?). So if you have several concurrent queries (or complex queries with several scans), you should choose a lower value. OTOH if most of your queries operate on the same data, one query could benefit from pages cached by other queries ... You have to experiment a little. Servus Manfred
Re: [HACKERS] Correlation in cost_index()
On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane [EMAIL PROTECTED] wrote: I don't think it's really a good idea to expect users to pick among multiple cost functions The idea is that PG is shipped with a default representing the best of our knowledge, and users are not encouraged to change it. When a user sends a "PG does not use my index" or "Why doesn't it scan sequentially?" message to one of the support lists, we advise her/him to set index_cost_algorithm to 3 (or whatever we feel appropriate) and watch the feedback we get. We don't risk anything if the default is the current behaviour. Servus Manfred
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote: I'd certainly be willing to do some testing on my own data with them. Great! Gotta patch? Not yet. I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables, and I have been running random page cost at 1 lately, as well as running cpu_index_cost at 1/10th the default setting to get good results. May I ask for more information? What are your settings for effective_cache_size and shared_buffers? And which version are you running? Servus Manfred
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote: I'd certainly be willing to do some testing on my own data with them. Gotta patch? Yes, see below. Disclaimer: Apart from make; make check this is completely untested. Use at your own risk. Have fun! Servus Manfred

diff -ruN ../base/src/backend/optimizer/path/costsize.c src/backend/optimizer/path/costsize.c
--- ../base/src/backend/optimizer/path/costsize.c	2002-07-04 18:04:57.0 +0200
+++ src/backend/optimizer/path/costsize.c	2002-10-03 09:53:06.0 +0200
@@ -72,6 +72,7 @@
 double		cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double		cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double		cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
+int			index_cost_algorithm = DEFAULT_INDEX_COST_ALGORITHM;
 
 Cost		disable_cost = 1.0;
 
@@ -213,8 +214,8 @@
 	Cost		indexStartupCost;
 	Cost		indexTotalCost;
 	Selectivity indexSelectivity;
-	double		indexCorrelation,
-				csquared;
+	double		indexCorrelation;
+	Cost		IO_cost;
 	Cost		min_IO_cost,
 				max_IO_cost;
 	Cost		cpu_per_tuple;
@@ -329,13 +330,62 @@
 	min_IO_cost = ceil(indexSelectivity * T);
 	max_IO_cost = pages_fetched * random_page_cost;
 
-	/*
-	 * Now interpolate based on estimated index order correlation to get
-	 * total disk I/O cost for main table accesses.
-	 */
-	csquared = indexCorrelation * indexCorrelation;
+	switch (index_cost_algorithm) {
+		case 1: {
+			/*
+			** Use abs(correlation) for linear interpolation
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
+		}
+
+		case 2: {
+			/*
+			** First estimate number of pages and cost per page,
+			** then multiply the results.  min_IO_cost is used for
+			** min_pages, because seq_page_cost = 1.
+			*/
+			double absC = fabs(indexCorrelation);
+
+			double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
+			double estPCost = absC * 1 + (1 - absC) * random_page_cost;
+
+			IO_cost = estPages * estPCost;
+		}
+
+		case 3: {
+			/*
+			** Interpolate based on independence squared, thus forcing the
+			** result to be closer to min_IO_cost
+			*/
+			double independence = 1 - fabs(indexCorrelation);
+			double isquared = independence * independence;
+
+			IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
+		}
+
+		case 4: {
+			/*
+			** Interpolate geometrically
+			*/
+			double absC = fabs(indexCorrelation);
+
+			IO_cost = exp(absC * log(min_IO_cost) +
+						  (1 - absC) * log(max_IO_cost));
+		}
+
+		default: {
+			/*
+			 * Interpolate based on estimated index order correlation
+			 * to get total disk I/O cost for main table accesses.
+			 */
+			double csquared = indexCorrelation * indexCorrelation;
+
+			IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+		}
+	}
 
-	run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+	run_cost += IO_cost;
 
 	/*
 	 * Estimate CPU costs per tuple.
diff -ruN ../base/src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- ../base/src/backend/utils/misc/guc.c	2002-07-20 17:27:19.0 +0200
+++ src/backend/utils/misc/guc.c	2002-10-03 10:03:37.0 +0200
@@ -644,6 +644,11 @@
 	},
 
 	{
+		{ "index_cost_algorithm", PGC_USERSET }, &index_cost_algorithm,
+		DEFAULT_INDEX_COST_ALGORITHM, 0, INT_MAX, NULL, NULL
+	},
+
+	{
 		{ NULL, 0 }, NULL, 0, 0, 0, NULL, NULL
 	}
 };
diff -ruN ../base/src/include/optimizer/cost.h src/include/optimizer/cost.h
--- ../base/src/include/optimizer/cost.h	2002-06-21 02:12:29.0 +0200
+++ src/include/optimizer/cost.h	2002-10-03 09:56:28.0 +0200
@@ -24,6 +24,7 @@
 #define DEFAULT_CPU_TUPLE_COST	0.01
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.001
 #define DEFAULT_CPU_OPERATOR_COST	0.0025
+#define DEFAULT_INDEX_COST_ALGORITHM 3
 
 /* defaults for function attributes used for expensive function calculations */
 #define BYTE_PCT 100
@@ -43,6 +44,7 @@
 extern double cpu_tuple_cost;
 extern double cpu_index_tuple_cost;
 extern double cpu_operator_cost;
+extern int	index_cost_algorithm;
 extern Cost disable_cost;
 extern bool enable_seqscan;
 extern
Re: [HACKERS] Correlation in cost_index()
On Thu, 03 Oct 2002 12:40:20 +0200, I wrote:
Gotta patch?
Yes, see below.

Oh, did I mention that inserting some break statements after the switch
cases helps a lot?  :-(

Cavus venter non laborat libenter ...

Servus
 Manfred

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables,

Scott, yet another question: are multicolumn indices involved in your estimator problems?

Servus
 Manfred
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002, Manfred Koizar wrote:
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
I'd certainly be willing to do some testing on my own data with them.
Great! Gotta patch?
Not yet.
I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables, and I have been running random page cost at 1 lately, as well as running cpu_index_cost at 1/10th the default setting to get good results.
May I ask for more information? What are your settings for effective_cache_size and shared_buffers? And which version are you running?

I'm running 7.2.2 in production and 7.3b2 in testing. effective_cache_size is the default (i.e. commented out); shared_buffers is at 4000. I've found that increasing shared_buffers past 4000 (32 MB) to 16384 (128 MB) has no great effect on my machine's performance, but I've never really played with effective_cache_size.

I've got a couple of queries that join a 1M+ row table to itself and to a 50k row table, and the result sets are usually 100 rows at a time. Plus some other smaller datasets that generally return larger amounts (i.e. sometimes all rows) of data.
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002, Manfred Koizar wrote:
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables,
Scott, yet another question: are multicolumn indices involved in your estimator problems?

No. Although I use them a fair bit, none of the problems I've encountered so far have involved them. But I'd be willing to set up some test indexes and get some numbers on them.
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002 10:59:54 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
are multicolumn indices involved in your estimator problems?
No. Although I use them a fair bit, none of the problems I've encountered so far have involved them. But I'd be willing to set up some test indexes and get some numbers on them.

Never mind! I just stumbled over those lines in selfuncs.c where indexCorrelation is calculated by dividing the correlation of the first index column by the number of index columns. I have a use case here where this is clearly not the right choice, and I was hoping to find some examples that would help me investigate whether my case is somewhat uncommon ...

Servus
 Manfred
Re: [HACKERS] Correlation in cost_index()
Manfred Koizar [EMAIL PROTECTED] writes:
Never mind! I just stumbled over those lines in selfuncs.c where indexCorrelation is calculated by dividing the correlation of the first index column by the number of index columns.

Yeah, I concluded later that that was bogus. I've been thinking of just using the correlation of the first index column and ignoring the rest; that would not be great, but it's probably better than what's there. Have you got a better formula?

regards, tom lane
[HACKERS] Correlation in cost_index()
You all know this FAQ: Why does Postgres not use my index? Half of the time this problem can easily be solved by casting a literal to the type of the respective column; this is not my topic here. In many other cases it turns out that the planner over-estimates the cost of an index scan. Sometimes this can be worked around by lowering random_page_cost. Of course, that's a hack that is quite unrelated to the real problem. I strongly agree ;-)

AFAICS (part of) the real problem is in costsize.c:cost_index() where IO_cost is calculated from min_IO_cost, pages_fetched, random_page_cost, and indexCorrelation. The current implementation uses indexCorrelation^2 to interpolate between min_IO_cost and max_IO_cost, which IMHO gives results that are too close to max_IO_cost. This conjecture is supported by the fact that actual run times are often much lower than estimated when seqscans are disabled.

So we have to find a cost function such that

. min_IO_cost <= cost <= max_IO_cost  for -1 <= indexCorrelation <= 1
. cost -> min_IO_cost  for indexCorrelation -> +/-1
. cost -> max_IO_cost  for indexCorrelation -> 0
. cost tends more towards min_IO_cost than with the current implementation

After playing around a bit I propose three functions satisfying the above conditions. All proposals use absC = abs(indexCorrelation).

Proposal 1: Use absC for linear interpolation.

	IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;

Proposal 2: First calculate estimates for the number of pages and the cost per page, then multiply the results.

	estPages = absC * minPages + (1 - absC) * maxPages;
	estPCost = absC * 1 + (1 - absC) * random_page_cost;
	/*                 ^ sequential_page_cost */
	IO_cost = estPages * estPCost;

Proposal 3: Interpolate geometrically, using absC.
	IO_cost = exp(absC * ln(min_IO_cost) + (1 - absC) * ln(max_IO_cost));

Here are some numbers for

	seq_page_cost    = 1 (constant)
	random_page_cost = 4 (GUC)
	minPages         = 61
	maxPages         = 1440

	corr     current       p1       p2       p3
	0        5760.00  5760.00  5760.00  5760.00
	0.1      5703.01  5190.10  4817.77  3655.22
	0.2      5532.04  4620.20  3958.28  2319.55
	0.3      5247.09  4050.30  3181.53  1471.96
	0.4      4848.16  3480.40  2487.52   934.08
	0.5      4335.25  2910.50  1876.25   592.76
	0.6      3708.36  2340.60  1347.72   376.16
	0.7      2967.49  1770.70   901.93   238.70
	0.8      2112.64  1200.80   538.88   151.48
	0.9      1143.81   630.90   258.57    96.13
	0.95      616.65   345.95   149.44    76.57
	0.99      174.41   117.99    77.03    63.84
	0.995     117.85    89.50    68.91    62.40
	0.999      72.39    66.70    62.57    61.28
	1          61.00    61.00    61.00    61.00

Another example for

	seq_page_cost    = 1 (constant)
	random_page_cost = 10 (GUC)
	minPages         = 20
	maxPages         = 938.58

	corr     current       p1       p2       p3
	0        9385.79  9385.79  9385.79  9385.79
	0.1      9292.14  8449.21  7705.17  5073.72
	0.2      9011.16  7512.64  6189.88  2742.73
	0.3      8542.87  6576.06  4839.94  1482.65
	0.4      7887.27  5639.48  3655.34   801.48
	0.5      7044.35  4702.90  2636.09   433.26
	0.6      6014.11  3766.32  1782.19   234.21
	0.7      4796.56  2829.74  1093.62   126.61
	0.8      3391.69  1893.16   570.40    68.44
	0.9      1799.50   956.58   212.53    37.00
	0.95      933.16   488.29    95.60    27.20
	0.99      206.38   113.66    31.81    21.27
	0.995     113.42    66.83    25.70    20.62
	0.999      38.72    29.37    21.11    20.12
	1          20.00    20.00    20.00    20.00

(If you want to play around with your own numbers, I can send my OOo spreadsheet privately or to the list.)

The second example shows that especially with proposal 3 we could afford to set random_page_cost to a *higher* value, which, in contrast to previous recommendations, seems to be appropriate; IIRC, benchmark results posted here showed values of up to 60.

As nobody knows how each of these proposals performs in real life under different conditions, I suggest leaving the current implementation in, adding all three algorithms, and supplying a GUC variable to select a cost function.

Comments? Ideas? Objections?
Servus
 Manfred
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002, Manfred Koizar wrote:
As nobody knows how each of these proposals performs in real life under different conditions, I suggest to leave the current implementation in, add all three algorithms, and supply a GUC variable to select a cost function.

I'd certainly be willing to do some testing on my own data with them. Gotta patch?

I've found that when the planner misses, sometimes it misses by HUGE amounts on large tables, and I have been running random page cost at 1 lately, as well as running cpu_index_cost at 1/10th the default setting to get good results.
Re: [HACKERS] Correlation in cost_index()
Manfred Koizar [EMAIL PROTECTED] writes:
AFAICS (part of) the real problem is in costsize.c:cost_index() where IO_cost is calculated from min_IO_cost, pages_fetched, random_page_cost, and indexCorrelation. The current implementation uses indexCorrelation^2 to interpolate between min_IO_cost and max_IO_cost, which IMHO gives results that are too close to max_IO_cost.

The indexCorrelation^2 algorithm was only a quick hack with no theory behind it :-(. I've wanted to find some better method to put in there, but have not had any time to research the problem.

As nobody knows how each of these proposals performs in real life under different conditions, I suggest to leave the current implementation in, add all three algorithms, and supply a GUC variable to select a cost function.

I don't think it's really a good idea to expect users to pick among multiple cost functions that *all* have no guiding theory behind them. I'd prefer to see us find a better cost function and use it. Has anyone trawled the database literature on the subject?

regards, tom lane