What's the best strategy for spellchecking abbreviations that require a
trailing dot?
E.g. in Ukrainian the abbreviation "англ." (short for "English") appears
often, usually followed by some English word. I'd like the speller to
accept "англ", but only if it's followed by a dot.
I was able to handle such cases in segment.srx for sentence tokenizing, but
I would like to do something similar for spellchecking (and potentially
for the tagger and other rules).
It seems that the spellchecker currently does not know about the dot that
follows a word.
I looked at the code and it seems there are a couple of options:
* use the disambiguator and immunize them (I don't quite like this, as I
would still like to spellcheck them and potentially check them in other
rules)
* use the disambiguator and tag them appropriately (and then use special
code in the spellchecker to match them)
* override the spellchecker code to do some look-ahead if the word is in
an abbreviated-with-dot list
Which is the best/right way? Or is there something else I missed?
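To make the look-ahead option more concrete, here is a rough standalone
sketch of what I have in mind. It does not use the actual LanguageTool
classes: the class name, the method isAccepted, the two word sets and the
pre-split token list are all made up for illustration, and I'm assuming the
dot arrives as a separate token right after the word.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical standalone sketch; this is NOT LanguageTool's speller API.
class AbbreviationLookahead {

    // Assumed list of abbreviations that are valid only when followed by a dot.
    static final Set<String> DOTTED_ABBREVIATIONS =
            new HashSet<>(Arrays.asList("англ", "укр"));

    // Stand-in for the real spelling dictionary lookup.
    static final Set<String> DICTIONARY =
            new HashSet<>(Arrays.asList("слово", "мова"));

    // Returns true if the token at index i should NOT be flagged as misspelled.
    static boolean isAccepted(List<String> tokens, int i) {
        String word = tokens.get(i);
        if (DICTIONARY.contains(word)) {
            return true;
        }
        // Look ahead one token: accept the abbreviation only before a dot.
        boolean followedByDot =
                i + 1 < tokens.size() && ".".equals(tokens.get(i + 1));
        return followedByDot && DOTTED_ABBREVIATIONS.contains(word);
    }
}
```

With this, "англ" in the token sequence "англ", ".", "word" would be
accepted, while a bare "англ" without the dot would still be flagged.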
Thanks
Andriy
P.S. I know it's sometimes impossible to tell the dot of an abbreviation
from a full stop at the end of a sentence, but I am just trying to fix the
majority of cases.