Hi Hieu,
 
The suffixes should be ISO 639-1 codes, see this link:
- http://www.loc.gov/standards/iso639-2/php/code_list.php
 
ca: Catalan
el: Greek
is: Icelandic
pl: Polish
ro: Romanian
sk: Slovak
sl: Slovenian
su: Sundanese
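
In case it helps to write this down somewhere, here is a minimal Python sketch of the mapping above. The file name pattern nonbreaking_prefix.<code> and the helper name language_for are assumptions made for illustration, not something taken from the Moses scripts.

```python
from pathlib import Path

# ISO 639-1 codes for the eight suffixes asked about in this thread.
ISO_639_1 = {
    "ca": "Catalan",
    "el": "Greek",
    "is": "Icelandic",
    "pl": "Polish",
    "ro": "Romanian",
    "sk": "Slovak",
    "sl": "Slovenian",
    "su": "Sundanese",
}


def language_for(filename: str) -> str:
    """Return the language name for a file whose suffix is a 2-letter code.

    Hypothetical helper: assumes names like 'nonbreaking_prefix.ca'.
    Suffixes not in the table above yield 'unknown'.
    """
    suffix = Path(filename).suffix.lstrip(".")
    return ISO_639_1.get(suffix, "unknown")
```

Calling language_for("nonbreaking_prefix.el") looks up the "el" suffix in the table; any suffix not listed falls back to "unknown" rather than raising an error.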
 
Cheers,
   Chris
 
 


Hieu Hoang <[email protected]> wrote on 30 November 2011 at 07:43:


> Hi all
> 
> I'm looking at the non-breaking prefix files for the tokenizer in the
> directory
>    /scripts/tokenizer/nonbreaking_prefixes
> 
> I don't quite know what languages the 2-letter file suffixes stand for. I can
> hazard a guess, but it would probably be better to be sure and write it down
> somewhere. Can anybody enlighten me? Specifically
>   ca
>   el
>   is
>   pl
>   ro
>   sk
>   sl
>   su

 

--
Dipl.-Inf. Christian Federmann, Researcher, Language Technology Lab
Office +1.09 -- Phone +49-681/857-75-5353,  Fax +49-681/857-75-5338
DFKI GmbH,  Campus D3 2,  Stuhlsatzenhausweg 3,  66123 Saarbruecken
http://www.dfki.de/~cfedermann

-------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
-------------------------------------------------------------------

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
