On Mon, Dec 22, 2014 at 5:50 AM, Robert Haas <robertmh...@gmail.com> wrote: > Looking over the latest patch, I think we could simplify the code so > that you don't need multiple FuzzyAttrMatchState objects. Instead of > creating a separate one for each RTE and then merging them, just have > one. When you find an inexact-RTE name match, set a field inside the > FuzzyAttrMatchState -- maybe with a name like rte_penalty -- to the > Levenshtein distance between the RTEs. Then call scanRTEForColumn() > and pass down the same state object. Now let > updateFuzzyAttrMatchState() work out what it needs to do based on the > observed inter-column distance and the currently-in-force RTE penalty.
I'm afraid I don't follow. I think doing things that way makes things less clear. Merging is useful because it allows us to consider that an exact match might exist, which this searchRangeTableForCol() is already tasked with today. We now look for the best match exhaustively, or magically return immediately in the event of an exact match, without caring about the alias correctness or distance. Having a separate object makes this pattern apparent from the top level, within searchRangeTableForCol(). I feel that's better. updateFuzzyAttrMatchState() is the wrong place to put that, because that task rightfully belongs in searchRangeTableForCol(), where the high level diagnostic-report-generating control flow lives. To put it another way, creating a separate object obfuscates scanRTEForColumn(), since it's the only client of updateFuzzyAttrMatchState(). scanRTEForColumn() is a very important function, and right now I am only making it slightly less clear by tasking it with caring about distance of names on top of strict binary equality of attribute names. I don't want to push it any further. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers