Hello,

I've been trying to develop HFST (lexc and twol) files for the Uzbek
language by looking at how similar languages (Tatar, Kazakh, etc.)
have done it. The rules for those languages are very complex, at least
for someone who doesn't know where to start reading. I usually pick a
word and then work backwards through the rule chain to make sense of
it, but the chain gets so long that I forget where it started. So
copying existing solutions and modifying them didn't appeal to me.
Instead, I started with simple rules and have been expanding them one
use case at a time. You can see my progress at [1] and [2]. (My
previous work using the DIX format got so out of hand that I gave up
on it.)

As I keep adding and changing rules to fit new use cases, I realize
that I may be breaking old ones. That's why I'd like to create test
cases first and then change the rules, without worrying that I've
broken any previous work. Are there any tools that you use for this?
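
To make the idea concrete, here is a rough sketch of the kind of
regression check I have in mind. It is only an illustration, not an
existing tool: the function names are my own, the example tags are made
up, and I am assuming that hfst-lookup prints one tab-separated
"surface, analysis, weight" line per result.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Set, Tuple

# An analyzer maps a surface form to the set of analyses it receives.
Analyzer = Callable[[str], Set[str]]


def parse_lookup_output(text: str) -> Dict[str, Set[str]]:
    """Collect analyses per surface form from tab-separated lookup output.

    Assumed line format: surface<TAB>analysis<TAB>weight.
    Lines without at least two tab-separated fields are skipped.
    """
    results: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        results.setdefault(parts[0], set()).add(parts[1])
    return results


def hfst_analyzer(transducer_path: str) -> Analyzer:
    """Build an analyzer that calls hfst-lookup on a compiled transducer.

    Hypothetical helper: it assumes hfst-lookup is on PATH and reads
    words from standard input, one per line.
    """
    def analyze(word: str) -> Set[str]:
        proc = subprocess.run(
            ["hfst-lookup", transducer_path],
            input=word + "\n",
            capture_output=True,
            text=True,
            check=True,
        )
        return parse_lookup_output(proc.stdout).get(word, set())

    return analyze


def find_regressions(
    cases: Dict[str, Iterable[str]], analyze: Analyzer
) -> List[Tuple[str, List[str]]]:
    """Return (surface, missing analyses) for every failing test case."""
    failures = []
    for surface, expected in cases.items():
        missing = set(expected) - analyze(surface)
        if missing:
            failures.append((surface, sorted(missing)))
    return failures
```

The test cases would live in a plain dictionary (or a file loaded into
one) mapping each word to the analyses it must keep producing. After
every rule change I would recompile, run find_regressions with
hfst_analyzer pointed at the new transducer, and treat a non-empty
result as a broken old use case.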


[1] https://github.com/bmansurov/apertium-uzb/blob/master/apertium-uzb.uzb_Cyrl.lexc
[2] https://github.com/bmansurov/apertium-uzb/blob/master/apertium-uzb.uzb_Cyrl.twol


-- 
Bahodir

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff