Hi Hieu,

these should be ISO 639-1 codes; see this link:

- http://www.loc.gov/standards/iso639-2/php/code_list.php

ca: Catalan
el: Greek
is: Icelandic
pl: Polish
ro: Romanian
sk: Slovak
sl: Slovenian
su: Sundanese

Cheers,
Chris
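To write the mapping down somewhere, as asked, the list above could be kept as a small lookup table. This is only a sketch: the function name `language_for_prefix_file` and the example file name `nonbreaking_prefix.ca` are assumptions for illustration, not taken from the Moses scripts, and the table covers only the eight codes discussed here.

```python
from pathlib import Path

# ISO 639-1 codes from the list above, mapped to language names.
# Only the eight codes asked about are included.
ISO_639_1_NAMES = {
    "ca": "Catalan",
    "el": "Greek",
    "is": "Icelandic",
    "pl": "Polish",
    "ro": "Romanian",
    "sk": "Slovak",
    "sl": "Slovenian",
    "su": "Sundanese",
}


def language_for_prefix_file(filename):
    """Return the language name for a file whose suffix is an ISO 639-1 code.

    Assumes (hypothetically) file names of the form "<name>.<code>".
    Returns None if the suffix is not in the table.
    """
    suffix = Path(filename).suffix.lstrip(".").lower()
    return ISO_639_1_NAMES.get(suffix)


if __name__ == "__main__":
    # Hypothetical file name, used only to show the lookup.
    print(language_for_prefix_file("nonbreaking_prefix.ca"))
```

Unknown suffixes return `None` rather than raising, so a caller can decide whether a missing entry is an error or just a language not yet documented.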

Hieu Hoang <[email protected]> wrote on 30 November 2011 at 07:43:

> Hi all
>
> I'm looking at the non-breaking prefix files for the tokenizer in the
> directory
> /scripts/tokenizer/nonbreaking_prefixes
>
> I don't quite know what languages the 2-letter file suffixes stand for. I can
> hazard a guess, but it would probably be better to be sure and write it down
> somewhere. Can anybody enlighten me? Specifically
> ca
> el
> is
> pl
> ro
> sk
> sl
> su

--
Dipl.-Inf. Christian Federmann, Researcher, Language Technology Lab
Office +1.09 -- Phone +49-681/857-75-5353, Fax +49-681/857-75-5338
DFKI GmbH, Campus D3 2, Stuhlsatzenhausweg 3, 66123 Saarbruecken
http://www.dfki.de/~cfedermann
-------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Management: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Chairman), Dr. Walter Olthoff
Chairman of the Supervisory Board: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
-------------------------------------------------------------------
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
