Quoting "Kevin B. Hendricks" <[EMAIL PROTECTED]>:

> Hi,
>
> > 1. pernament (instead of permanent)
> >
> > Hunspell 1.0.9: ornamental, ornament, tournament
> >
> > Hunspell 1.1.0: permanent
> >
> > Note: swap character detection
>
>
> Actually, the union of the suggestions from 1.0.9 and 1.1.0 would
> probably be better yet.
>
> pernament and ornament are actually so close that either one could easily
> be generated by a bad touch typist (the "o" and "p" are right beside
> each other on the keyboard, as are "e" and "r", so one could have
> wanted to type "ornament" and instead hit the "p" in place of the "o"
> and hit both "e" and "r" simultaneously).

Hi,

Nice observation!

I keep planning a better final comparison between the ngram suggestions.

>
> In fact, many of the first real suggestion/correction studies
> actually looked at common "typing mistakes" as opposed to simply
> "weak spelling".  These of course depend on the layout of the keyboard.

That would be no problem with a KEYBOARD affix parameter:

KEYBOARD 4
KEYBOARD 0123456789öüó
KEYBOARD _qwertzuiopőú
KEYBOARD _asdfghjkléáű
KEYBOARD _íycvbnm,.-

(Perhaps one keyboard definition per affix file is enough, because
national keyboard layouts are usually well standardized.)
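
A minimal sketch, in Python, of how rows like these could be turned into
adjacency data. The interpretation is an assumption, not a description of
Hunspell: the first KEYBOARD line is taken to be the row count, "_" is taken
to be a placeholder for a position without a letter, and only left/right
neighbours within a row are considered. The names build_neighbours and
is_neighbour_typo are hypothetical.

```python
# Rows copied as-is from the KEYBOARD lines above (count line skipped).
ROWS = [
    "0123456789öüó",
    "_qwertzuiopőú",
    "_asdfghjkléáű",
    "_íycvbnm,.-",
]


def build_neighbours(rows):
    """Map each key to the set of keys directly left or right of it."""
    neighbours = {}
    for row in rows:
        for i, key in enumerate(row):
            if key == "_":  # assumed placeholder, not a real key
                continue
            adjacent = neighbours.setdefault(key, set())
            for j in (i - 1, i + 1):
                if 0 <= j < len(row) and row[j] != "_":
                    adjacent.add(row[j])
    return neighbours


def is_neighbour_typo(typed, intended, neighbours):
    """True if the words have equal length and every differing
    position is a substitution by a neighbouring key."""
    if len(typed) != len(intended):
        return False
    diffs = [(a, b) for a, b in zip(typed, intended) if a != b]
    return bool(diffs) and all(b in neighbours.get(a, ()) for a, b in diffs)


NEIGHBOURS = build_neighbours(ROWS)
print("p" in NEIGHBOURS["o"])  # "o" and "p" sit side by side
print("r" in NEIGHBOURS["e"])  # so do "e" and "r"
print(is_neighbour_typo("prnament", "ornament", NEIGHBOURS))
```

A suggestion routine could use such a table to prefer candidates that differ
from the typed word only by neighbouring keys. Vertical neighbours would need
the row offsets as well, which this sketch ignores.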

>
> Perhaps we need an interface element button labeled "More Choices"
> that could force the highly aggressive suggestion mechanisms to be
> run that would provide a longer list of suggestions (force ngram and
> other non single edit distance algos to run)

I think we need fewer, more accurate suggestions; that is, we
need more information.

Perhaps ngram-based uppercase and swap-character suggestions could be the default.

There are a lot of useful things for improving quality of suggestions:

- longest run of matching characters at the beginning of the words (I think
  it would be an astonishingly good measure.)
- keyboard layout data
- experimental data from software ergonomics
- frequency data (from a big corpus and from the actual document)
- POS-tagging, sentence analysis (with a sentence or
  paragraph based spell checking)
- etc.
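
The first item in the list can be sketched in a few lines of Python. This is
only an illustration of the measure, assuming the candidates have already been
produced by some other step; common_prefix_len and rank_by_prefix are
hypothetical names, and the candidate list is the union of the 1.0.9 and 1.1.0
suggestions quoted earlier in this thread.

```python
def common_prefix_len(a, b):
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rank_by_prefix(misspelled, candidates):
    """Sort candidates so that the longest shared prefix comes first.
    sorted() is stable, so ties keep their original order."""
    return sorted(candidates, key=lambda c: -common_prefix_len(misspelled, c))


suggestions = ["ornamental", "ornament", "tournament", "permanent"]
print(rank_by_prefix("pernament", suggestions))
# -> ['permanent', 'ornamental', 'ornament', 'tournament']
```

Here "permanent" shares the prefix "per" with the misspelling and moves to the
front, while the other three share no prefix and keep their order. In practice
this score would presumably be combined with the other items on the list
rather than used alone.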

>
> BTW:  Nice Job with 1.1.0!!!

Thanks for your help!

Best regards

Laci


>
> Kevin
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>




