Robert Haas <robertmh...@gmail.com> writes:
> On Fri, Aug 23, 2024 at 11:17 AM Jonathan S. Katz <jk...@postgresql.org> wrote:
>> We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
>> table`[1] is attempting to scan the index on the vector column when
>> `enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.
> It took me a moment to wrap my head around this: the cost estimate is
> 312 decimal digits long. Apparently hnswcostestimate() just returns
> DBL_MAX when there are no scan keys because it really, really doesn't
> want to do that. Before e2225346, that kept this plan from being
> generated because it was (much) larger than disable_cost. But now it
> doesn't, because 1 disabled node makes a path more expensive than any
> possible non-disabled path. Since that was the whole point of the
> patch, I don't feel too bad about it.

Yeah, I don't think it's necessary for v18 to be bug-compatible with
this hack.

> If you don't want to fix hnsw to work the way the core optimizer
> thinks it should, or if there's some reason it can't be done,
> alternatives might include (1) having the cost estimate function hack
> the count of disabled nodes and (2) adding some kind of core support
> for an index cost estimator refusing a path entirely. I haven't tested
> (1) so I don't know for sure that there are no issues, but I think we
> have to do all of our cost estimating before we can think about adding
> the path so I feel like there's a decent chance it would do what you
> want.

It looks like amcostestimate could change the path's disabled_nodes
count, since that's set up before invoking amcostestimate. I guess
it could be set to INT_MAX to have a comparable solution to before.

I agree with you that it is not great that hnsw is refusing this
case rather than finding a way to make it work, so I'm not excited
about putting in support for refusing it in a less klugy way.

			regards, tom lane
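
For readers following along, here is a minimal, self-contained sketch of why the DBL_MAX trick stopped working and why bumping disabled_nodes might restore the old behavior. It is a toy model, not PostgreSQL or pgvector code: `ToyPath`, `toy_path_is_cheaper()`, and `toy_index_costestimate()` are hypothetical names, and the only thing taken from the thread is the ordering rule (a path with fewer disabled nodes wins regardless of cost) and the idea of setting disabled_nodes to INT_MAX.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a planner path: only the two fields the thread discusses. */
typedef struct ToyPath
{
    int     disabled_nodes;   /* number of disabled plan nodes in this path */
    double  total_cost;       /* estimated total cost */
} ToyPath;

/*
 * Assumed comparison rule, as described in the thread: disabled_nodes is
 * compared first, and cost only breaks ties between equally-disabled paths.
 */
static bool
toy_path_is_cheaper(const ToyPath *a, const ToyPath *b)
{
    assert(a != NULL && b != NULL);
    if (a->disabled_nodes != b->disabled_nodes)
        return a->disabled_nodes < b->disabled_nodes;
    return a->total_cost < b->total_cost;
}

/*
 * Hypothetical index cost estimator. With no scan keys it discourages the
 * path either the old way (huge cost) or the way suggested in the thread
 * (huge disabled_nodes count).
 */
static void
toy_index_costestimate(ToyPath *path, int nscankeys, bool bump_disabled_nodes)
{
    path->total_cost = 100.0;               /* placeholder estimate */
    if (nscankeys > 0)
        return;
    if (bump_disabled_nodes)
        path->disabled_nodes = INT_MAX;     /* suggested alternative */
    else
        path->total_cost = DBL_MAX;         /* the old hack */
}
```

In this model, a sequential scan with `disabled_nodes = 1` (standing in for `enable_seqscan = off`) loses to an index path that only has a DBL_MAX cost, but wins once the estimator sets disabled_nodes to INT_MAX instead. Whether doing this from a real amcostestimate has side effects elsewhere in the planner is untested, as the thread itself notes.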