After some more thought about the difference between the two ideas, I find we are resolving two different issues; it is just that in the wrong-index-choice cases, both of them should generally work.
Your idea actually adds some rule-based logic named certainty_factor; it's just that the implementation is very graceful. In the example here, it takes effect *when both indexes have the same cost*. I believe that can resolve the index choice in this case, but what about the row-estimation issue? Since the design will not fudge the cost anyway, I assume it will not fudge the rows or selectivity either. Then, if the optimizer statistics are missing, what can we do for both index choice and row estimation?

I think that's where my idea comes in. Given that optimizer statistics can't be kept up to date by design, and assuming a situation where the customer's queries need those statistics often, how about doing the prediction with the historical statistics? That can cover both index choice and row estimation. The following objections may then be raised: a) we can't decide *automatically* when the missing optimizer statistics are wanted; b) if we predict the estimate from the historical statistics, the value of the MCV information is lost. My answers are: a) it is controlled by a human, with "alter table t alter column a set (force_generic=on)"; b) I think that can't be resolved, but it only matters when the real Const is very different from the ones in history, and I think a generic plan has the same issue.

I just reviewed our bad query plans from the past half year internally, and I found many queries used a Nested Loop, which was the direct cause. Now I think I have found a new reason for this: the missing optimizer statistics make the row estimate for the outer relation come out as 1, which makes the Nested Loop get chosen. I'm not sure whether your idea could help with this, or could help more than mine does, in this respect.
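To make that last point concrete, here is a minimal, hypothetical sketch of the pattern (the table and column names are invented, and the exact estimates depend on the server version; the fallback for an equality predicate on a column without statistics is the default selectivity 1/DEFAULT_NUM_DISTINCT):

  -- hypothetical reproduction; table and column names are invented
  CREATE TABLE orders (customer_id int, region int, status int, priority int);
  INSERT INTO orders
  SELECT i % 1000, i % 10, i % 5, i % 3
  FROM generate_series(1, 100000) i;
  -- deliberately no ANALYZE, so pg_statistic has nothing for these columns

  CREATE TABLE customers (customer_id int PRIMARY KEY, name text);
  INSERT INTO customers SELECT i, 'c' || i FROM generate_series(1, 1000) i;
  ANALYZE customers;

  -- Without statistics on "orders", each equality predicate falls back to
  -- the default selectivity 1/DEFAULT_NUM_DISTINCT (0.005).  A few of them
  -- ANDed together multiply the estimate down to 1 row on the outer side,
  -- which steers the planner toward a Nested Loop even when the real row
  -- count is much larger.
  EXPLAIN SELECT *
  FROM orders o
  JOIN customers c USING (customer_id)
  WHERE o.region = 1 AND o.status = 2 AND o.priority = 0;

The numbers above are only illustrative; the point is the outer-side estimate collapsing to 1, not the exact values.

--
Best Regards
Andy Fan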