On Tue, Jun 17, 2014 at 5:36 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>> (2) If there are multiple columns with the same Levenshtein distance,
>> which one do you suggest?  The current code picks a random one, which
>> I'm OK with.  The other option would be to list all of the columns.
> I objected to that upthread.  I don't think that picking a random one is
> sane at all.  Listing them all might be OK (I notice that that seems to be
> what both bash and git do).

What bash does is annoying and stupid, and any time I find a system
with that obnoxious behavior enabled I immediately disable it, so I
don't consider that a good precedent for anything.  I think what the
bash algorithm demonstrates is that while it may be sane to list more
than one option, listing 10 or 20 or 150 is unbearably obnoxious.
Filling the user's *entire terminal window* with a list of suggestions
when they make a minor typo is more like a punishment than an aid.
git's behavior of limiting itself to one or two options, while
somewhat useless, is at least not annoying.

> Another issue is whether to print only those having exactly the minimum
> observed Levenshtein distance, or to print everything less than some
> cutoff.  The former approach seems to me to be placing a great deal of
> faith in something that's only a heuristic.

Well, we've got lots of heuristics.  Many of them serve us quite well.
I might do something like this:

(1) Set the maximum Levenshtein distance to half the length of the
string, rounded down, but not more than 3.
(2) If there are more than 2 matches, reduce the maximum distance by 1
and repeat this step.
(3) If there are no remaining matches, print no hint; else print the 1
or 2 matching items.
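
To make those three steps concrete, here is a rough sketch in Python
rather than backend C. The names levenshtein() and suggest_columns()
are made up for illustration and are not taken from any patch. It
assumes "the string" in step (1) is the name the user typed, and that
a candidate matches when its distance is at most the current maximum.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete from a
                            curr[j - 1] + 1,       # insert into a
                            prev[j - 1] + cost))   # substitute
        prev = curr
    return prev[-1]


def suggest_columns(typed: str, candidates: list[str]) -> list[str]:
    """Return at most two suggestions, or [] when no hint should be shown."""
    # Step (1): half the typed length, rounded down, capped at 3.
    max_dist = min(len(typed) // 2, 3)
    scored = [(levenshtein(typed, c), c) for c in candidates]

    # Step (2): tighten the cutoff until at most two candidates remain.
    while max_dist >= 0:
        matches = [c for d, c in scored if d <= max_dist]
        if len(matches) <= 2:
            # Step (3): an empty list means "print no hint".
            return matches
        max_dist -= 1
    return []
```

With this sketch, a typo that is close to exactly one column yields
that column, while a short name with three equally close neighbours
yields no hint at all, since tightening the cutoff drops all of them
together.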

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
