On Fri, 7 May 2010, Lionello Lunesu wrote: > On 6-5-2010 22:37, Michel Fortin wrote: > > On 2010-05-05 23:45:50 -0400, Walter Bright <newshou...@digitalmars.com> > > said: > > > >> Walter Bright wrote: > >>> Alex Makhotin wrote: > >>>> It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 > >>>> kernel 2.6.32.4), to get the actual error messages about the > >>>> undefined identifier. > >>> > >>> Definitely there's a problem. > >> > >> The problem is the spell checker is O(n*n) on the number of characters > >> in the undefined identifier. > > > > That's an algorithm that can't scale then. > > > > Checking the Levenshtein distance for each known identifier within a > > small difference in length would be a better idea. (Clang is said to use > > the Levenshtein distance, it probably does something of the sort.) > > > > http://en.wikipedia.org/wiki/Levenshtein_distance > > > and especially this line: > > # If we are only interested in the distance if it is smaller than a > threshold k, then it suffices to compute a diagonal stripe of width 2k+1 > in the matrix. In this way, the algorithm can be run in O(kl) time, > where l is the length of the shortest string.
The source for this is pretty isolated.. anyone want to volunteer take a shot at improving this part of dmd? Later, Brad