The histogram values seem completely meaningless in this context ---
for containment purposes, they are just ten or so randomly chosen
values.  I don't believe that the estimator works better with them.
Certainly, whether the column is unique or not is totally irrelevant
to whether they are representative.

Right, but if the column has a high number of stats, I think the samples found in the histogram could point the estimator in the right direction: in my case, 80% of the values have '1041' as their root leaf, and most of the values in the histogram reflect this.
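To illustrate the idea, here is a rough sketch in Python, not the patch itself. It treats the histogram entries as a plain sample of dotted paths and counts how many fall under a given ancestor. The function name and the sample values are made up for the example; only the '1041' root and the 80% figure come from my case.

```python
def estimate_descendant_selectivity(sample_paths, ancestor):
    """Fraction of sampled dotted paths that equal `ancestor` or lie below it.

    Hypothetical helper: compares whole labels, so "1041" does not match "10410.x".
    """
    if not sample_paths:
        return None  # no sample available, caller must fall back to a default
    prefix = ancestor.split(".")
    hits = sum(
        1 for path in sample_paths
        if path.split(".")[:len(prefix)] == prefix
    )
    return hits / len(sample_paths)


# Made-up sample standing in for ten histogram entries; 8 of 10 start at 1041.
histogram_sample = [
    "1041.5.7", "1041.12", "1041.12.3", "1041.88", "1041.90.1",
    "1041.97", "1041.230.4", "1041.777", "2003.1", "3150.4.2",
]

print(estimate_descendant_selectivity(histogram_sample, "1041"))  # 0.8
```

With only ten or so entries the estimate is obviously coarse, and it is only as good as the sample is representative.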

You're right that the column's uniqueness isn't relevant to the histogram, but if the column is unique, there won't be any MCVs, and the patch becomes useless.

What would seem saner to me is to add a datatype-specific analyze
function that collects some statistics that are actually relevant
to containment, and then make use of those in the estimator.

Perhaps you're right, but unfortunately it's not something I can do myself, because I lack knowledge of both pg and ltree internals :(

Best regards
Matteo Beccati
