Status: Assigned
Owner: [email protected]
CC: [email protected],  [email protected],  [email protected]
Labels: Type-Bug Pri-2 OS-All Area-BrowserBackend I18N

New issue 8487 by [email protected]: The SpellcheckWordBreakIterator  
class should use ICU
http://code.google.com/p/chromium/issues/detail?id=8487

Moved from internal bug b/1224367.

The current SpellcheckWordBreakIterator class does not use ICU because ICU's
character attributes do not match those of hunspell, and this
inconsistency causes problems when filtering out non-word characters.
On the other hand, ICU implements an excellent algorithm for word
segmentation of internationalized text, and we should use it to improve the
quality of the spell checker.


Comments from Jungshik:
-----------------------------
Hironori, it might be better to change WordAwareIterator in WebKit (and
pass the change upstream) than to make a change on our side. If I
understand correctly, when a "word" passed from WebKit to us is actually
made up of multiple words, we can only check the spelling of the first of
them. By modifying WebKit to pass a real single word, we can avoid that
problem, I believe.

BTW, I realized that it's a little tricky for Thai because the Thai segmenter
is dictionary-based, and I'm not sure how ICU word breaking behaves
when it comes across misspelled words.
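
To illustrate the multi-word "word" problem mentioned in the comments above,
here is a minimal sketch in plain C++. It does not use ICU, WebKit, or the
real SpellcheckWordBreakIterator: SplitIntoWords is a hypothetical helper, and
its rule (runs of ASCII letters and apostrophes are words) is only an assumed
stand-in for proper word-break rules. It shows how a chunk could be split so
that every word is checked, not just the first one.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a word-break iterator. It treats runs of ASCII
// letters and apostrophes as words and everything else as a separator.
// This is an assumption for illustration; ICU uses far richer rules.
std::vector<std::string> SplitIntoWords(const std::string& chunk) {
  std::vector<std::string> words;
  std::string current;
  for (unsigned char c : chunk) {
    if (std::isalpha(c) || c == '\'') {
      current.push_back(static_cast<char>(c));
    } else if (!current.empty()) {
      // A non-word character ends the current word.
      words.push_back(current);
      current.clear();
    }
  }
  if (!current.empty()) {
    words.push_back(current);
  }
  return words;
}
```

With a chunk such as "teh quick-brown fox", checking only the first word
would look at "teh" alone, while splitting first would let each of the four
words be passed to the spell checker separately.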


