I wrote: > I spent a fair chunk of the weekend hacking on this patch to make > it more understandable and fix up a lot of what seemed to me pretty > clear arithmetic errors in the "upper layers" of the patch. However, > I couldn't quite convince myself to commit it, because the business > around estimation for partial histogram-bucket matches still doesn't > make any sense to me.
Actually, there's a second large problem with this patch: blindly iterating through all combinations of MCV and histogram entries makes the runtime O(N^2) in the statistics target. I made up some test data (by scanning my mail logs) and observed the following planning times, as reported by EXPLAIN ANALYZE: explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip; explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip; stats target eqjoinsel networkjoinsel 100 0.27 ms 1.85 ms 1000 2.56 ms 167.2 ms 10000 56.6 ms 13987.1 ms I don't think it's necessary for network selectivity to be quite as fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning time for a query that might need just milliseconds to execute :-( It seemed to me that it might be possible to reduce the runtime by exploiting knowledge about the ordering of the histograms, but I don't have time to pursue that right now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers