Peter Geoghegan writes:
> On Mon, Jun 16, 2014 at 7:09 PM, Ian Barwick wrote:
>> However, in this particular use case, as long as it doesn't produce false
>> positives (I haven't looked at the patch) I don't think it would cause
>> any problems (of the kind which would require actively excluding certain
>> languages/character sets), it just wouldn't be quite as useful.

> I'm not sure what you mean by false positives. The patch just shows a
> HINT, where before there was none. It's possible for any number of
> reasons that it isn't the most useful possible suggestion, since
> Levenshtein distance is used as opposed to any other scheme that might
> be better sometimes. I think that the hint given is a generally useful
> piece of information in the event of an ERRCODE_UNDEFINED_COLUMN
> error. Obviously I think the patch is worthwhile, but fundamentally
> the HINT given is just a guess, as with the existing HINTs.

Not having looked at the patch, but: I think the probability of
useless-noise HINTs could be substantially reduced if the code prints a
HINT only when there is a single available alternative that is clearly
better than the others in Levenshtein distance.  I'm not sure how much
better is "clearly better", but I exclude "zero" from that.  I see that
the original description of the patch says that it will arbitrarily
choose one alternative when there are several with equal Levenshtein
distance, and I'd say that's a bad idea.

You could possibly answer this objection by making the HINT list *all*
the alternatives meeting the minimum Levenshtein distance.  But I think
that's probably overcomplicated and of uncertain value anyhow.  I'd rather
have a rule that "we print only the choice that is at least K units better
than any other choice", where K remains to be determined exactly.
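
To make the proposed rule concrete, here is a minimal sketch of "print a
HINT only when one candidate beats every other by at least K". It is
illustrative Python, not the patch's code: the names levenshtein(),
pick_hint() and the parameter k are hypothetical, and the choice to accept
a lone candidate unconditionally is an assumption, not something settled
in this thread.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pick_hint(misspelled: str, candidates: Iterable[str], k: int = 1) -> Optional[str]:
    """Return the best candidate only if it is at least k units closer
    than the runner-up; otherwise return None (print no HINT).

    Hypothetical helper: k >= 1 means ties never produce a hint.
    """
    scored = sorted((levenshtein(misspelled, c), c) for c in candidates)
    if not scored:
        return None
    best_dist, best = scored[0]
    if len(scored) == 1:
        # Assumption: a single candidate has no competitor to beat.
        return best
    runner_up_dist = scored[1][0]
    return best if runner_up_dist - best_dist >= k else None


if __name__ == "__main__":
    # A clear winner gets a hint; two equally close candidates get none.
    print(pick_hint("usernmae", ["username", "user_id", "created_at"], k=2))
    print(pick_hint("nam", ["name", "nom"], k=1))
```

With k set to 1 or more, two candidates at equal distance never yield a
hint, which is the "exclude zero" case described above.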

                        regards, tom lane

Sent via pgsql-hackers mailing list