I wrote:
> I spent a fair chunk of the weekend hacking on this patch to make
> it more understandable and fix up a lot of what seemed to me pretty
> clear arithmetic errors in the "upper layers" of the patch.  However,
> I couldn't quite convince myself to commit it, because the business
> around estimation for partial histogram-bucket matches still doesn't
> make any sense to me.

Actually, there's a second large problem with this patch: blindly
iterating through all combinations of MCV and histogram entries makes the
runtime O(N^2) in the statistics target.  I made up some test data (by
scanning my mail logs) and observed the following planning times, as
reported by EXPLAIN ANALYZE:

explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip;
explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip;

stats target            eqjoinsel       networkjoinsel

100                     0.27 ms         1.85 ms
1000                    2.56 ms         167.2 ms
10000                   56.6 ms         13987.1 ms

I don't think it's necessary for network selectivity to be quite as
fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning
time for a query that might need just milliseconds to execute :-(

It seemed to me that it might be possible to reduce the runtime by
exploiting knowledge about the ordering of the histograms, but
I don't have time to pursue that right now.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to