Dear Mojca,

I have looked at the material. Currently I am halfway through Liang's thesis. My first impression is that one needs to be able to compose a program in binary in order to solve this.
The three Turkish/Basque files give some hope. I noticed that they are all in Ruby. Is that necessary? Could they be in Perl? I have some minimal Perl exposure.

Also, the files deal with hyphenation, whereas Lao doesn't hyphenate. We need to identify where word breaks can and can't occur. I know the rules, but translating them into Ruby or any other computer language is another story. I am not sure how to move forward.

Also, some of the characters will not display properly on their own. Is it better to write the Unicode numbers?

Brian

On Mon, May 3, 2010 at 10:46 PM, Mojca Miklavec <[email protected]> wrote:

> On Thu, Apr 29, 2010 at 06:53, Brian Wilson wrote:
> > It seems that I may have reinvented the wheel (and created an inferior
> > model.)
> >
> > For a pdf explanation of Lao syllabification check this link
> > http://www.tcllab.org/events/uploads/valaxay-lao.pdf
> > Thank you,
> > Brian Wilson
>
> Thanks a lot for the link.
>
> I would not dare to create patterns myself (I would need to study the
> letters, their encoding and rules in deeper detail and install the
> appropriate fonts), but my suggestion would be ...
>
> 1.) Do you know how the hyphenation algorithm works? If you want, I
> can send you some links and some material that I have on my computer.
>
> 2.) Your example calls for rule-based patterns.
>
> Here are some examples of how such patterns are being generated (they
> are only of help once you understand what's under point "1"):
>
> http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tk/generate_patterns_tk.rb
>
> http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tr/generate_patterns_tr.rb
>
> http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/eu/generate_patterns_eu.rb
>
> They all start with something like ...
> # h is not here.
> consonants=%w{b c d f g j k l m n ñ p q r s t v w x y z}
> # Open vowels: a e o
> vop=%w{a e o}
> # Closed vowels: i u
> vcl=%w{i u}
>
> Maybe Arthur would be interested in exotic scripts, but it's best if
> you get a head start and begin with a few simple patterns, and then we
> can help you when you reach a step where you don't know how to proceed.
>
> Mojca

--
Brian Wilson, Director
Asia-Pacific International University Translation Center
_____________
I have a new blog!! http://tc4asia.org/wpblog
"He hath shewed thee, O man, what is good; and what doth the LORD require of thee, but to do justly, and to love mercy, and to walk humbly with thy God." Micah 6:8
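
On the question of writing Unicode numbers: one possible approach, sketched in Ruby only because the linked generators use it, is to list each character class by code point and let the script build the strings. Everything named here is hypothetical (CONSONANTS, VOWELS, toy_patterns), the three consonants and two vowel signs are an arbitrary sample, and the "break allowed before consonant + vowel" rule is a toy stand-in, not an actual Lao word-break rule. The odd digit 1 follows the TeX pattern convention for a permitted break.

```ruby
# Toy sketch: character classes given as Unicode code points, so the
# source stays readable even where Lao glyphs do not display.
# The selection below is an arbitrary sample, not a full inventory.
CONSONANTS = [0x0E81, 0x0E82, 0x0E84].map { |cp| [cp].pack("U") } # KO, KHO SUNG, KHO TAM
VOWELS     = [0x0EB0, 0x0EB2].map { |cp| [cp].pack("U") }         # vowel signs A, AA

# Build one pattern per consonant/vowel pair. The leading "1" marks a
# permitted break before the pair (assumed rule, for illustration only).
def toy_patterns(consonants, vowels)
  consonants.product(vowels).map { |c, v| "1#{c}#{v}" }
end

patterns = toy_patterns(CONSONANTS, VOWELS)
puts patterns.length
```

The same idea carries over to Perl: keep the rules as lists of code points and generate the pattern strings, rather than typing the characters directly.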
