Hi Bernard,

I don't understand why the paradigms académique__adj, mois__n and fois__n would need to be changed as you say. I have never run into any problem with these definitions. They are also standard in Apertium for other Romance languages. Can you give some examples? What does occasionally cause me problems, on the other hand, are the paradigms anglais__adj and affectueu/x__adj, but only very rarely.
Hèctor

Message from Bernard Chardonneau <b...@tuxfamily.org> on Sun, 29 March 2020 at 0:52:

> My point of view now.
>
> About French<->English translation.
>
> There are a lot of candidates wanting to work for Apertium during GSoC, and
> few of them (I don't know what proportion) are accepted. So, for the reason
> Hèctor indicated, the English-French pair may not be taken for GSoC.
>
> But for me, any language pair not available as free software is
> interesting for the Apertium project, so it would be good to develop the
> English-French pair in the future, even if it is outside GSoC. The fact
> that Systran (a French company) offered its translator several years ago
> for around 50 € for the Windows version, but around 5000 $ for the
> GNU/Linux / UNIX version, is a shame that gave me a good reason to join
> the Apertium project.
>
> Now, for the pairs including Esperanto.
>
> When these pairs were developed, each pair included all the data files
> needed to compile a language pair. So these pairs include at least:
> - one morphological dictionary for each language,
> - one disambiguation file for each language,
> - a bilingual dictionary,
> - transfer files,
> - generally one post-generation file for each language (not really needed
>   when the target language is Esperanto, but there is one of these files
>   in the eo-en pair).
>
> For the nature and content of these files, see the wiki pages. Here is a
> list of pages
> http://wiki.apertium.org/wiki/Traductions_en_fran%C3%A7ais
> generally available in both English and French.
> The installation of Apertium has changed a lot, and the English pages are
> the most up to date.
>
> A presentation in French about how Apertium works is also available here:
> http://imagesn.free.fr/apertium/pres-atelier-2014.odp (for the slides)
> https://rmll.ubicast.tv/videos/developper-des-paires-de-langues-pour-la-traduction-automatique-avec-apertium/
> (for the sound).
> (As I have only used my personal computer with a video projector 3 times
> in my life (always for Apertium), I was not used to plugging in the video
> projector cable first and only then starting or rebooting my computer.)
>
> Now, morphological dictionaries, disambiguation files and post-generation
> files are in the language branch, and any (new) language pair using a
> particular language accesses the same files for it.
>
> So, the new language pairs include only the files specific to the pair,
> which are:
> - bilingual dictionary
> - transfer files.
>
> For the French-Esperanto pair, there are 2 entries in the Apertium
> repository:
>   apertium-eo-fr
>   apertium-epo-fra
>
> apertium-eo-fr is the original version, which only translates French into
> Esperanto. I updated its dictionaries when adding words, but I finally
> stopped doing so.
>
> apertium-epo-fra has almost the same files, but the latest dictionary
> updates were done only in this pair. The transfer files for translating
> French into Esperanto are exactly the same (or possibly only their visual
> layout was changed). But this pair also includes transfer files for
> translating Esperanto into French. This direction is not finished, was
> never released and must be improved.
>
> So, for this pair, here is what could be done:
>
> In apertium-fra/apertium-fra.fra.metadix
>
> When a paradigm analyzes a word with gender mf or number sp, add RL
> lines to also accept generation with gender m or f, or number sg or pl.
>
> 2 examples of paradigms to change:
>
> In apertium-fra.fra.metadix we find:
>
> <pardef n="académique__adj">
>   <e><p><l>s</l> <r><s n="adj"/><s n="mf"/><s n="pl"/></r></p></e>
>   <e><p><l></l>  <r><s n="adj"/><s n="mf"/><s n="sg"/></r></p></e>
> </pardef>
>
> <pardef n="mois__n">
>   <e><p><l></l>  <r><s n="n"/><s n="m"/><s n="sp"/></r></p></e>
> </pardef>
>
> <pardef n="fois__n">
>   <e><p><l></l>  <r><s n="n"/><s n="f"/><s n="sp"/></r></p></e>
> </pardef>
>
> A better form, present (and necessary) in apertium-epo-fra (and
> apertium-fra-por), is:
>
> <pardef n="académique__adj">
>   <e r="LR"><p><l></l>  <r><s n="adj"/><s n="mf"/><s n="sg"/></r></p></e>
>   <e r="LR"><p><l>s</l> <r><s n="adj"/><s n="mf"/><s n="pl"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="adj"/><s n="m"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l>s</l> <r><s n="adj"/><s n="m"/><s n="pl"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="adj"/><s n="f"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l>s</l> <r><s n="adj"/><s n="f"/><s n="pl"/></r></p></e>
> </pardef>
>
> <pardef n="fois__n">
>   <e>       <p><l></l>  <r><s n="n"/><s n="f"/><s n="sp"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="n"/><s n="f"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="n"/><s n="f"/><s n="pl"/></r></p></e>
> </pardef>
>
> <pardef n="mois__n">
>   <e>       <p><l></l>  <r><s n="n"/><s n="m"/><s n="sp"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="n"/><s n="m"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="n"/><s n="m"/><s n="pl"/></r></p></e>
> </pardef>
>
> For the first paradigm, the simpler syntax in the fra-por pair is even a
> little better:
>
> <pardef n="académique__adj">
>   <e>       <p><l></l>  <r><s n="adj"/><s n="mf"/><s n="sg"/></r></p></e>
>   <e>       <p><l>s</l> <r><s n="adj"/><s n="mf"/><s n="pl"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="adj"/><s n="m"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l>s</l> <r><s n="adj"/><s n="m"/><s n="pl"/></r></p></e>
>   <e r="RL"><p><l></l>  <r><s n="adj"/><s n="f"/><s n="sg"/></r></p></e>
>   <e r="RL"><p><l>s</l> <r><s n="adj"/><s n="f"/><s n="pl"/></r></p></e>
> </pardef>
>
> So, this kind of change will have to be made everywhere in
> apertium-fra.fra.metadix where a word is analysed as mf (don't know if
> masculine or feminine) or sp (don't know if singular or plural).
>
> That is already done everywhere in apertium-epo-fra.epo.dix (as in
> apertium-fra-por.fra.metadix), so you will just have to carry these
> changes over.
>
> Add the words present in apertium-epo-fra.fra.dix which are not yet
> in apertium-fra.fra.metadix.
>
> Note: for at least one paradigm there is a difference between the two
> files. Masculine nouns that take an "s" in the plural use the paradigm
> livre__n in apertium-fra.fra.metadix but accessoire__n in
> apertium-epo-fra.fra.dix.
>
> I prefer accessoire__n, so that, for the two most common noun paradigms,
> the reference word of the paradigm appears very early in the
> alphabetically sorted list of words.
>
> So, let's replace livre__n with accessoire__n everywhere.
>
> I don't know if there are other paradigms doing the same thing under
> different names in the two files, but if you find any, take as the
> reference word the first of these names in alphabetical order.
>
> That way, the most frequently used paradigms will be the ones that appear
> early in the full, alphabetically sorted list of words. That could help
> when choosing a paradigm, generally without having to read the content of
> a large number of them.
>
> Now for the language epo.
>
> I found a horrible file of more than 200 000 lines of paradigms, and no
> words using them! Completely useless. Only the comments in the sdef
> section could be useful.
>
> So, this file will have to be built again from apertium-epo-fra.epo.dix and
> apertium-eo-en.eo.dix.xml (+ possibly other files of that kind) to get
> all the Esperanto words used in these pairs.
> The paradigms used seem to work the same in both pairs.
>
> After that, you will have to test whether the transfer rules still work
> and correct them.
>
> As he said, for the eo-en pair, ask Jacob Nordfalk.
>
> For the fra => epo translation direction, ask Hèctor Alòs.
>
> For the epo => fra translation direction, ask me.
>
> For this translation direction, step 0 (apertium-epo-fra.epo-fra.t0x)
> adds "unu" to nouns without the determiner "la". That allows the same
> transfer rules to be used for nouns with the determiner "le" "la" "les"
> (or sometimes "l'" after post-generation) and for nouns with the
> determiner "un" "une" "des".
>
> After that, only one stage of transfer is used. Presently, there are no
> rules for adverbs, or for pronouns in the accusative form. Adding them
> would greatly reduce the number of # in a translation.
>
> I also wrote a lot of transfer rules for sentences like
>   <det>? <adj>* <n> "de" <det>? <adj>* <n> estas <adj>
>   <det>? <adj>* <n> "de" <det>? <adj>* <n> <verb>
>
> Example:
>   la kato de la najbarino estas blanka
>   la malgranda katino de la najbaro estas blanka
>   la kanino de la dika najbarino ne estas nigra
>   katoj de la dika granda najbaro estas blankaj
>   ..
>
> With the possibility of having 0, 1 or 2 adjectives for each noun, that
> makes plenty of similar transfer rules, even if in that case step 0
> divides their number by 4.
>
> A good change would be to rewrite the transfer rules for this kind of
> sentence using a 3-stage transfer. That allows processing shorter lists
> of words and sending the gender and number (or other information) of one
> group to another.
>
>
> > Date: Wed, 25 Mar 2020 13:48:17 +0300
> > From: Hèctor Alòs i Font <hectora...@gmail.com>
> > To: "[apertium-stuff]" <apertium-stuff@lists.sourceforge.net>,
> >     Bernard Chardonneau <bechapert...@free.fr>,
> >     Jacob Nordfalk <jacob.nordf...@gmail.com>
> > Reply-To: apertium-stuff@lists.sourceforge.net
> > Subject: Re: [Apertium-stuff] GSoC
> >
> > Saluton, Andrew!
> >
> > I am glad to read about a proposal related to Esperanto. I will continue
> > in English so that everyone can understand.
> >
> > It probably doesn't make any sense to work on the English-French pair in
> > Apertium, since these are two of the languages with the most resources in
> > the world (linguistic and non-linguistic). As a result, there are quite a
> > lot of good translators between them, although most of them are
> > commercial.
> >
> > Esperanto is also included in Google Translate, but I think the
> > Esperanto-French translation can be done at a similar level in Apertium.
> > Moreover, the translation from Esperanto to French could be used to test
> > the new apertium-recursive module.
> >
> > In fact, the current versions of Apertium's four Esperanto pairs
> > (English ⇆ Esperanto, French → Esperanto, Spanish → Esperanto and
> > Catalan → Esperanto) were released ten years ago. They all use the old
> > all-in-one-repository structure. Porting these four pairs into the new
> > structure which shares language resources (using apertium-eng,
> > apertium-fra, apertium-spa and apertium-cat) would result in a big
> > improvement, because a lot of work has been done in Apertium on these
> > languages in the last ten years. But porting is not automatic, since
> > there are differences between the monodixes of the current pairs and the
> > ones in the given four repositories. There are even differences between
> > the Esperanto monodixes in these four pairs.
> >
> > Another question is that @Bernard Chardonneau <bechapert...@free.fr> has
> > been working in his own branch of apertium-fra-esp. So, it'd be
> > interesting to read what he thinks of a GSoC project that would include
> > this pair. Maybe he could find time to mentor the project (and maybe
> > @Jacob Nordfalk <jacob.nordf...@gmail.com>, who created the
> > Esperanto-English pair, too).
> >
> > So, in short, in my opinion:
> >
> > - You should evaluate the quality of the current Google translation,
> >   especially for the French-Esperanto pair, in which Google translates in
> >   two steps (this
> >   <http://wiki.apertium.org/wiki/Hectoralos/GSOC_2019_proposal:_Catalan-Italian_and_Catalan-Portuguese#Current_situation_of_the_language_pairs>
> >   is what I did in a similar case last year)
> >
> > - A part of the project could be a kind of elementary "make a language
> >   pair state-of-the-art
> >   <http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Make_a_language_pair_state-of-the-art>".
> >   At a minimum this would include the French-Esperanto pair
> >
> > - Maybe half of the project could be developing a translator from
> >   Esperanto into French
> >
> > Hèctor
> >
> > Message from Andrew Briand <atb8...@comcast.net> on Wed, 25 March 2020
> > at 11:46:
> >
> > > Hello,
> > >
> > > I am an undergrad interested in adopting an Apertium language pair for
> > > Google Summer of Code 2020. I am most interested in French<->English,
> > > Esperanto->English, and Esperanto<->French. What might a project for
> > > those language pairs look like?
> > >
> > > Thank you,
> > >
> > > Andrew Briand
> > >
> > > _______________________________________________
> > > Apertium-stuff mailing list
> > > Apertium-stuff@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
> --------------------------------
> Bernard Chardonneau (France)
> Phone : [33] 9 72 36 32 90
> GSM phone : [33] 7 69 46 16 31
>
> An alternative Apertium translation website :
> http://apertiumtrad.tuxfamily.org
>
> Multilingual websites for my free softwares :
> http://libremail.free.fr and http://libremail.tuxfamily.org
> http://cyloop.tuxfamily.org (mainly translated with Apertium)
>
> My general website (in french only)
> http://bech.free.fr
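
For illustration only, the paradigm change proposed in the quoted message (adding r="RL" entries with m/f or sg/pl wherever an entry is analysed as mf or sp) could be scripted roughly as below. This is a minimal Python sketch, not an existing Apertium tool: the function name add_generation_entries and the EXPANSIONS table are hypothetical, and it assumes that every <s> tag of an entry is copied unchanged apart from mf and sp.

```python
import copy
import itertools
import xml.etree.ElementTree as ET

# Hypothetical table (an assumption for this sketch): underspecified tags
# and the concrete values that generation should also accept.
EXPANSIONS = {"mf": ("m", "f"), "sp": ("sg", "pl")}


def add_generation_entries(pardef):
    """Append r="RL" copies of every entry tagged mf or sp, one copy per
    concrete gender/number combination. The original entries are kept."""
    for entry in list(pardef.findall("e")):
        if entry.get("r") == "RL":
            continue  # already a generation-only entry
        tags = [s.get("n") for s in entry.iter("s")]
        if not any(tag in EXPANSIONS for tag in tags):
            continue  # nothing underspecified in this entry
        # For each tag, the list of values to emit (itself if not mf/sp).
        options = [EXPANSIONS.get(tag, (tag,)) for tag in tags]
        for combo in itertools.product(*options):
            clone = copy.deepcopy(entry)
            clone.set("r", "RL")
            for s, value in zip(clone.iter("s"), combo):
                s.set("n", value)
            pardef.append(clone)
    return pardef


# The mois__n paradigm as quoted from apertium-fra.fra.metadix.
SAMPLE = """<pardef n="mois__n">
  <e><p><l></l><r><s n="n"/><s n="m"/><s n="sp"/></r></p></e>
</pardef>"""

pardef = add_generation_entries(ET.fromstring(SAMPLE))
print(ET.tostring(pardef, encoding="unicode"))
```

The sketch does not check whether the RL entries already exist, so it should only be run once on a given paradigm, and the result would still need to be compared by hand with the entries in apertium-epo-fra and apertium-fra-por.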