> X-Mailer: MessagingEngine.com Webmail Interface
> Date: Sun, 28 Oct 2012 19:35:58 +0100
> From: Per Tunedal <[email protected]>
> To: Apertium Stuff <[email protected]>
> Reply-To: [email protected]
> Subject: [Apertium-stuff] Paradigms in Bidixes
>
> Hi,
> can someone, please, explain the concept of paradigms in bidixes? I just
> cannot understand what they are supposed to do.
>
> In a monodix it's pretty obvious that you need paradigms. But in the
> bidix!?
>
I'll answer briefly, as a newbie (or nearly one).

For me, in a new language pair, paradigms in the bidix would make it easier to generate the bidix from wordlists.

Several months ago, I added words to the fr-es pair and then to the eo-fr pair. I quickly found it tedious to use a text editor to insert new entries, with correct XML syntax, at the right place to keep alphabetical order in a file of around 40000 lines. So I just as quickly wrote two shell scripts for that:
- one to generate new entries,
- one to put them in the files at the right place.

The first script needs 4 parameters:
- the word in the 2 languages,
- the paradigms for these words in the 2 languages.

This information is enough to generate the monodix entries if the words are not yet present. For the bidix, a small and easy transformation of the paradigm names is needed, to keep only the suffix: n, np, adj, vblex.

For the fr-es pair, bidix entries made like that often work. But sometimes it is necessary to state in the bidix that the word is masculine in one language and feminine in the other. For the eo-fr pair, the gender of French nouns must always be put in the bidix.

So, when you transform the paradigm name to keep only the word category, you lose information. But afterwards you still need to state extra things in the bidix for transfer to work correctly! And the information you just lost was in the monodix paradigm.

In the wiki (for instance), monodix paradigms are presented as a simple way to indicate how to build the different surface forms from a lemma, and to indicate the specific attributes of each surface form. But sometimes there are attributes common to all the surface forms. For instance, French nouns are masculine OR feminine, so there are paradigms for masculine nouns and paradigms for feminine nouns.

The information "add an 's' to go from singular to plural" is useful in a monodix, for analysis or generation. The information "this noun is masculine" is more useful in a bidix.
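
To illustrate the transformation described above, here is a minimal Python sketch. It is not the original shell scripts, only an approximation of the idea. It assumes that monodix paradigm names end in `__` followed by the category (for example `cas/a__n`), that a bidix entry has the usual `<e><p><l>...</l><r>...</r></p></e>` shape, and that extra information such as gender is added as additional `<s n="..."/>` symbols. The function names and the example paradigm names are hypothetical.

```python
from xml.sax.saxutils import escape


def category_from_paradigm(paradigm: str) -> str:
    """Keep only the part after the last '__', e.g. 'cas/a__n' -> 'n'."""
    if "__" not in paradigm:
        raise ValueError(f"no '__' separator in paradigm name: {paradigm!r}")
    return paradigm.rsplit("__", 1)[1]


def side(word: str, paradigm: str, extra_tags=()) -> str:
    """Build one side of a bidix pair: the lemma followed by its symbols."""
    tags = [category_from_paradigm(paradigm), *extra_tags]
    symbols = "".join(f'<s n="{tag}"/>' for tag in tags)
    return escape(word) + symbols


def bidix_entry(left_word, left_paradigm, right_word, right_paradigm,
                left_tags=(), right_tags=()) -> str:
    """Build a bidix <e> element from two words and their monodix paradigms.

    left_tags / right_tags carry what the category alone does not say,
    such as the gender of a French noun.
    """
    left = side(left_word, left_paradigm, left_tags)
    right = side(right_word, right_paradigm, right_tags)
    return f"<e><p><l>{left}</l><r>{right}</r></p></e>"


if __name__ == "__main__":
    # Category only: enough when both nouns behave the same way.
    print(bidix_entry("casa", "cas/a__n", "maison", "maison__n"))
    # Gender stated explicitly on the French side (hypothetical eo-fr entry).
    print(bidix_entry("domo", "dom/o__n", "maison", "maison__n",
                      right_tags=("f",)))
```

The `right_tags` argument is exactly the information that is lost when only the category is kept from the paradigm name, which is why it has to be supplied again by hand in this sketch.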
In the coming months I plan to start working (again) on the en-fr pair. One of the first things to do will be to build a large bidix. For that, I expect to work on wordlists for several months and, when they are ready, to use a program to build the bidix using the paradigm names that nouns and adjectives have in their monodix.

What will there be in the paradigms of the bidix? The part common to every surface form of the corresponding monodix paradigm. For French nouns, that will be the gender. For French adjectives, it will be an optional attribute "preadj" meaning "this adjective must come before the noun". As it is not yet in the French monodixes, it will have to be added (in the eo-fr pair, where the eo -> fr direction is not yet written). For French verbs, as I can't think of attributes common to every surface form, I don't plan to use paradigms.

--------------------------------
Bernard Chardonneau (France)
Phone : [33] 1 64 90 87 04 (from Sept to June except holidays)
GSM phone : [33] 6 49 95 13 95 (French school holidays, C zone)

Multilingual websites for my free software :
http://libremail.free.fr and http://libremail.tuxfamily.org
http://cyloop.tuxfamily.org (mainly translated with Apertium)

My general website (in French only)
http://bech.free.fr
