Hey Peter,

Thanks for the feedback. Some comments inline...

-Nick

On Tue, Mar 24, 2009 at 8:25 PM, Peter Kasting <pkast...@chromium.org> wrote:

> On Tue, Mar 24, 2009 at 2:26 PM, Siddhartha Chattopadhyay <
> sidc...@chromium.org> wrote:
>
>> A new feature to add to Chromium would be automatic spelling correction. A
>> design doc for this feature can be found at
>> http://sites.google.com/a/chromium.org/dev/developers/design-documents/automaticspellingcorrection.
>> It would be great if you could go over it and comment.
>>
>
> As I mentioned when this came up in a real-life meeting:
>
> I think it would be advantageous to get the Hunspell suggestions for a
> misspelled word, in preference order, and then calculate how permuted each
> is from your target word.  If the "permute score" is low enough, consider
> auto-correcting (perhaps you wouldn't do this if another suggestion has a
> similarly low score).
>
> There are a couple reasons to prefer this method over your proposed
> algorithm:
> * It limits the search space, which may make a difference if the machine is
> slow or in pathological cases (and saving CPU always seems nice).  For
> example, if someone pastes a string of 100,000 consecutive characters into a
> textfield, will your algorithm bog the browser down?
>

We could easily handle this with a length limit (above ten characters, don't
bother). Agreed that this would be a good thing to have.
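
For concreteness, here is a rough sketch of how the length cap could sit in
front of the suggestion-scoring idea, in Python for brevity. All names here
(MAX_WORD_LENGTH, permute_score, pick_autocorrection) are hypothetical, the
suggestion list is just a stand-in for whatever Hunspell returns, and the
score is an optimal-string-alignment edit distance I picked as a placeholder
for the "permute score"; none of this is taken from the design doc or from
Hunspell itself.

```python
MAX_WORD_LENGTH = 10  # assumed cap: above ten characters, don't bother


def permute_score(word, candidate):
    """Placeholder "permute score": optimal string alignment distance.

    Counts insertions, deletions, substitutions and swaps of two adjacent
    characters, each at cost 1.
    """
    rows, cols = len(word) + 1, len(candidate) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if word[i - 1] == candidate[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            if (i > 1 and j > 1
                    and word[i - 1] == candidate[j - 2]
                    and word[i - 2] == candidate[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


def pick_autocorrection(word, suggestions, max_score=1):
    """Return a suggestion to auto-correct to, or None.

    `suggestions` is assumed to be in the spellchecker's preference order.
    Words over the length cap are skipped before any scoring happens.
    """
    if len(word) > MAX_WORD_LENGTH or not suggestions:
        return None
    scored = sorted(
        (permute_score(word, s), rank, s) for rank, s in enumerate(suggestions)
    )
    best_score, _, best = scored[0]
    if best_score > max_score:
        return None
    # Stay conservative: give up if a second suggestion scores similarly low.
    if len(scored) > 1 and scored[1][0] <= max_score:
        return None
    return best
```

With this sketch, pick_autocorrection("teh", ["the", "texas"]) picks "the",
while pick_autocorrection("teh", ["the", "tech", "ten"]) declines to
auto-correct because all three suggestions are one edit away. A pasted
100,000-character string is rejected by the length check without ever
reaching the scoring loop.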

>
> * It is more easily extensible to other types of common mistypings we might
> want to later correct for, e.g. replacement of one letter with an adjacent
> letter on the keyboard, accidental insertion or omission of a letter,
> missing uppercase, missing/inserted punctuation, etc.  All these can be
> added merely by including them in the scoring function, rather than writing
> another iteration that checks various things.
>

We want to start out with a very conservative approach so that it isn't
annoying. We considered using the Hunspell suggestions, but we wanted
something that captured one particular, common type of misspelling. However,
I believe Sid has built it in a way that lets us test other algorithms,
including Hunspell's.
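
To illustrate what "lets us test other algorithms" might look like, here is a
small Python sketch. This is not Sid's actual code: the Corrector type,
make_swap_corrector and first_correction are names made up for this example,
and I'm using a swap of two adjacent letters purely as a stand-in for the
single conservative case. A Hunspell-backed scorer could be dropped in as
another function with the same shape.

```python
from typing import Callable, Iterable, Optional

# Hypothetical plug-in shape: take a word, return a correction or None.
Corrector = Callable[[str], Optional[str]]


def make_swap_corrector(dictionary: Iterable[str]) -> Corrector:
    """Build a corrector that only undoes one swap of adjacent letters."""
    words = set(dictionary)

    def correct(word: str) -> Optional[str]:
        if word in words:
            return None  # already spelled correctly
        hits = set()
        for i in range(len(word) - 1):
            swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
            if swapped in words:
                hits.add(swapped)
        # Conservative: only correct when exactly one candidate matches.
        return hits.pop() if len(hits) == 1 else None

    return correct


def first_correction(word: str,
                     correctors: Iterable[Corrector]) -> Optional[str]:
    """Try each corrector in order and return the first correction found."""
    for corrector in correctors:
        fix = corrector(word)
        if fix is not None:
            return fix
    return None
```

Trying a different algorithm then means adding another function to the list
passed to first_correction, rather than touching the callers.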

>
> Also, I think this method could be more suitable for reusing existing
> low-level Hunspell methods, or perhaps even including in Hunspell and
> sending upstream.
>

I agree that it would be great to build this in such a way that it can get
upstreamed. I'm not sure whether that's the case, or how much effort it would
take to make it so.

>
> I have some limited experience in using both methods to solve this precise
> problem in a programming challenge at my past employer, so this suggestion
> is not made in the abstract :)
>
> PK
>
> >
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
