On Mon, Aug 13, 2012 at 1:11 AM, Alexander Korotkov <aekorot...@gmail.com>wrote:
> On Thu, Aug 9, 2012 at 12:44 AM, Alexander Korotkov > <aekorot...@gmail.com>wrote: > >> My conclusion is so, that current errors are probably ok for selectivity >> estimation. But taking into attention that generated datasets ideally fits >> assumptions of estimation, there could be room for improvement. Especially, >> it's unclear why estimate for "<@" and "@>" have much greater error than >> estimate for "&&". Possibly, it's caused by some bugs. >> > > ITSM, I found reason of inaccuracy. Implementation of linear interpolation > was wrong. Fixed version is attached. Now, need to rerun tests, possible > refactoring and comments rework. > After fixing few more bugs, I've a version with much more reasonable accuracy. Statistics target = 100. Relatively large result sets (>= 10) test=# select operator, avg(estimate_count::float8/actual_count::float8) as avg_ratio, avg(exp(abs(ln(estimate_count::float8/actual_count::float8)))) - 1.0 as avg_error from datasets d join test_results tr on tr.test_id = d.idwhere d.stat_target = 100 and actual_count >= 10 group by operator; operator | avg_ratio | avg_error ----------+------------------+-------------------- <@ | 1.00404179116863 | 0.0504415454560903 @> | 1.06364108531688 | 0.105646077989812 && | 1.00757984721409 | 0.0420984234933233 (3 rows) Small result sets (1 - 9) test=# select operator, avg(estimate_count::float8/actual_count::float8) as avg_ratio, avg(exp(abs(ln(estimate_count::float8/actual_count::float8)))) - 1.0 as avg_error from datasets d join test_results tr on tr.test_id = d.idwhere d.stat_target = 100 and actual_count between 1 and 9 group by operator; operator | avg_ratio | avg_error ----------+------------------+------------------- <@ | 1.31530838062865 | 0.654886592410495 @> | 2.78708078320147 | 1.94124123003433 && | 1.93268112525538 | 1.09904919063335 (3 rows) Empty result sets test=# select operator, avg(estimate_count) as avg_estimate, count(*) as tests_count from datasets d join test_results tr on tr.test_id = d.id where d.stat_target = 100 and actual_count = 0 group by operator; operator | avg_estimate | tests_count ----------+--------------------+------------- <@ | 1.1437670609645132 | 1099 @> | 1.0479430126460701 | 87458 (2 rows) Statistics target = 1000. Relatively large result sets (>= 10) test=# select operator, avg(estimate_count::float8/actual_count::float8) as avg_ratio, avg(exp(abs(ln(estimate_count::float8/actual_count::float8)))) - 1.0 as avg_error from datasets d join test_results tr on tr.test_id = d.idwhere d.stat_target = 1000 and actual_count >= 10 group by operator; operator | avg_ratio | avg_error ----------+------------------+-------------------- <@ | 1.00073999445381 | 0.045099762607524 @> | 1.05296320350853 | 0.0907489633452971 && | 1.00217602359039 | 0.0353421159150165 (3 rows) Small result sets (1 - 9) test=# select operator, avg(estimate_count::float8/actual_count::float8) as avg_ratio, avg(exp(abs(ln(estimate_count::float8/actual_count::float8)))) - 1.0 as avg_error from datasets d join test_results tr on tr.test_id = d.idwhere d.stat_target = 1000 and actual_count between 1 and 9 group by operator; operator | avg_ratio | avg_error ----------+------------------+------------------- <@ | 1.26946358795998 | 0.577803898836364 @> | 2.69000633430211 | 1.83165424646645 && | 1.48715184186882 | 0.577998652291105 (3 rows) Empty result sets test=# select operator, avg(estimate_count) as avg_estimate, count(*) as tests_count from datasets d join test_results tr on tr.test_id = d.id where d.stat_target = 1000 and actual_count = 0 group by operator; operator | avg_estimate | tests_count ----------+--------------------+------------- <@ | 1.0887096774193548 | 1364 @> | 1.0423876983771183 | 89224 && | 5.0000000000000000 | 1 (3 rows) ------ With best regards, Alexander Korotkov.
range_stat-0.6.patch.gz
Description: GNU Zip compressed data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers