Status: Assigned
Owner: [email protected]
CC: [email protected], [email protected], [email protected]
Labels: Type-Bug Pri-2 OS-All Area-BrowserBackend I18N

New issue 8487 by [email protected]: The SpellcheckWordBreakIterator class should use ICU
http://code.google.com/p/chromium/issues/detail?id=8487

Moved from internal bug b/1224367.

The current SpellcheckWordBreakIterator class does not use ICU because ICU's character attributes do not match those of hunspell, and this inconsistency causes a problem in filtering out non-word characters. On the other hand, ICU implements an excellent algorithm for word segmentation of internationalized text, and we should use it to improve the quality of the spell checker.

Comments from Jungshik:
-----------------------------
Hironori, it might be better to change WordAwareIterator in WebKit (and pass the change upstream) than to make a change on our side. If I understand correctly, when a "word" passed from WebKit to us is actually made up of multiple words, we can only check the spelling of the first of them. By modifying WebKit to pass a real single word, we can avoid that problem, I believe.

BTW, I realized that this is a little tricky for Thai because the Thai segmenter is dictionary-based, and I'm not sure how ICU word breaking behaves when it comes across misspelled words.
