Hey Peter,

Thanks for the feedback. Some comments inline...
-Nick

On Tue, Mar 24, 2009 at 8:25 PM, Peter Kasting <pkast...@chromium.org> wrote:

> On Tue, Mar 24, 2009 at 2:26 PM, Siddhartha Chattopadhyay <
> sidc...@chromium.org> wrote:
>
>> A new feature to add to Chromium would be automatic spelling correction. A
>> design doc for this feature can be found at
>> http://sites.google.com/a/chromium.org/dev/developers/design-documents/automaticspellingcorrection.
>> It would be great if you could go over it and comment.
>>
>
> As I mentioned when this came up in a real-life meeting:
>
> I think it would be advantageous to get the Hunspell suggestions for a
> misspelled word, in preference order, and then calculate how permuted each
> is from your target word. If the "permute score" is low enough, consider
> auto-correcting (perhaps you wouldn't do this if another suggestion has a
> similarly low score).
>
> There are a couple reasons to prefer this method over your proposed
> algorithm:
> * It limits the search space, which may make a difference if the machine is
> slow or in pathological cases (and saving CPU always seems nice). For
> example, if someone pastes a string of 100,000 consecutive characters into a
> textfield, will your algorithm bog the browser down?
>

We could easily handle this with a limit (above ten characters, don't bother). Agreed that this would be a good thing to have.

>
> * It is more easily extensible to other types of common mistypings we might
> want to later correct for, e.g. replacement of one letter with an adjacent
> letter on the keyboard, accidental insertion or omission of a letter,
> missing uppercase, missing/inserted punctuation, etc. All these can be
> added merely by including them in the scoring function, rather than writing
> another iteration that checks various things.
>

We want to start out with a very conservative approach so as to not be annoying. We considered using the hunspell suggestions, but we wanted something that captured one particular, common type of misspelling.
However, I believe Sid has built it in a way that lets us test other algorithms, including using hunspell's algorithm.

>
> Also, I think this method could be more suitable for reusing existing
> low-level Hunspell methods, or perhaps even including in Hunspell and
> sending upstream.
>

I agree that it would be great to build this in such a way that it can get upstreamed. Not sure if that's the case, or what the effort would be to make it so.

>
> I have some limited experience in using both methods to solve this precise
> problem in a programming challenge at my past employer, so this suggestion
> is not made in the abstract :)
>
> PK
>
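
For reference, here is a rough sketch of the suggestion-scoring idea discussed above. It is not the design doc's algorithm and not Hunspell code. Everything named here is an assumption made for illustration: the "permute score" is approximated by optimal string alignment distance (edit distance that counts an adjacent swap as one edit), the thresholds MAX_SCORE and MIN_MARGIN are made-up values, and the suggestion list is passed in as a plain list rather than fetched from Hunspell. Only the ten-character limit comes from the thread.

```python
from typing import List, Optional

MAX_WORD_LENGTH = 10  # from the thread: above ten characters, don't bother
MAX_SCORE = 1         # assumed threshold: auto-correct only single-edit typos
MIN_MARGIN = 1        # assumed: the runner-up must score at least this much worse


def permute_score(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Insertions, deletions, substitutions, and swaps of two adjacent
    characters each cost 1. This is a stand-in for the "permute score".
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ie" typed as "ei".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


def pick_autocorrection(word: str, suggestions: List[str]) -> Optional[str]:
    """Return a suggestion to auto-correct to, or None to leave the word alone.

    `suggestions` is assumed to be in the spell checker's preference order.
    """
    if len(word) > MAX_WORD_LENGTH or not suggestions:
        return None  # skip long input before doing any scoring work
    # Sort by score; ties keep the original preference order via the rank.
    scored = sorted(
        (permute_score(word, s), rank, s) for rank, s in enumerate(suggestions)
    )
    best_score, _, best = scored[0]
    if best_score > MAX_SCORE:
        return None  # nothing is close enough to be a safe correction
    if len(scored) > 1 and scored[1][0] - best_score < MIN_MARGIN:
        return None  # another suggestion is similarly close: ambiguous
    return best
```

With these assumed thresholds, pick_autocorrection("recieve", ["receive", "recipe"]) picks "receive", while pick_autocorrection("teh", ["the", "ten"]) declines because both candidates are one edit away. Other mistyping types (adjacent keys, missing uppercase) would go into permute_score as different edit costs.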