On Tue, Feb 28, 2012 at 10:35 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The flaw in this logic, of course, is that the seqscan might be cheaper
> than the parameterized indexscan, but *it produces a whole lot more
> rows*, meaning that any subsequent join will be a lot more expensive.
> Previously add_path didn't have to worry about that, because all
> ordinary paths for a given relation produce the same number of rows
> (and we studiously kept things like inner indexscan paths out of
> add_path's view of the world).
>
> The most obvious thing to do about this is to consider that one path can
> dominate another on cost only if it is both cheaper *and* produces no
> more rows. But I'm concerned about the cost of inserting yet another
> test condition into add_path, which is slow enough already. Has anyone
> got an idea for a different formulation that would avoid that?

It seems clear that we shouldn't be making that decision at that point.
It would be better to default towards processing fewer rows initially
and then swoop in later to improve decision making on larger plans.

Can't we save the SeqScan costs at every node, then re-add SeqScan
plans as a post-processing step iff the index/nested loop plans appear
costly? That is, have an additional post-processing step that only
cuts in with larger plans.

Seqscan plans are bad for many reasons: they push data out of cache,
they make the result more sensitive to growing data volumes or
selectivity mistakes, and they produce confusing stats for people
trying to add the right indexes.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
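
A minimal sketch of the dominance test proposed in the quoted message, where one path may dominate another only if it is both no more expensive and produces no more rows. This is a toy model in Python, not PostgreSQL's add_path: the real function is written in C and also weighs things like startup cost and sort order. The Path class, its fields, and the cost and row figures are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Path:
    name: str          # hypothetical label, for illustration only
    total_cost: float  # estimated cost of running this path
    rows: float        # estimated number of rows it produces


def dominates(a: Path, b: Path) -> bool:
    """True if a is no more expensive than b and produces no more rows."""
    return a.total_cost <= b.total_cost and a.rows <= b.rows


def add_path(paths: List[Path], new: Path) -> List[Path]:
    """Return the surviving paths after offering `new` to the list."""
    # Reject the new path if some existing path already dominates it.
    if any(dominates(old, new) for old in paths):
        return paths
    # Otherwise drop every existing path the new one dominates, then keep it.
    return [old for old in paths if not dominates(new, old)] + [new]


# Made-up numbers: the seqscan is cheaper but returns far more rows.
seqscan = Path("seqscan", total_cost=100.0, rows=10000.0)
param_indexscan = Path("parameterized indexscan", total_cost=150.0, rows=10.0)

survivors = add_path(add_path([], seqscan), param_indexscan)
```

With these figures both paths survive, because neither is better on both cost and rows. A cost-only comparison would have discarded the parameterized indexscan, which is the flaw described in the quoted text.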