On 16 May 2010 17:46, Mikel L. Forcada <[email protected]> wrote:
> Dear Ian,
>> Mr. Forcada,
>>
> Please call me Mikel...
>> Thank you. I am not sure that I would have the technical skills required
>> to initiate such a project, but I have fairly good organizational and
>> linguistic skills to contribute to coordination of such a project.
> That's excellent news.

You don't really need a whole lot of technical skill; as long as you're willing to contribute, there will usually be somebody willing to help you to contribute.

>> My brother Mark Shaw is also a Linux expert and has developed his own
>> Linux distribution.
>>
> Which one?
>> He could help in this project and has fairly extensive contacts
>> throughout the world with programmers.
> Our project also has capable programmers and linguists who could help.
> I am indeed sending a copy of this message to our mailing list,
> [email protected], so that the Apertium community is informed
> of your intentions and to see if someone out there is also able to lend
> a hand.
>> What is your protocol in developing language pairing?
>>
> There are many ways in which languages are paired, depending on the data
> available. In the case of en-ia or es-ia, we already have rather
> extensive es and en dictionaries. Structural transfer rules for es-ia
> would be rather easy to write, and those for en-ia could be based on
> those we already have for en-ca or en-es. One would have to build
> complete monolingual dictionaries for Interlingua, and also bilingual
> es-ia or en-ia dictionaries. If you wanted translation from ia, then one
> would have to train a part-of-speech tagger for ia.
>
>> How would we be able to work with you to put this on the Apertium platform?
> I can sign you up as developers in our SourceForge platform, and you can
> start working on the language pair there. I think initially the new
> pairs go to some kind of incubator and then to the trunk. Can anyone on
> the list help us with what the protocol would be?
>
> Tu pote leger anque [You can also read]
> http://wiki.apertium.org/wiki/Apertium_New_Language_Pair_HOWTO
>

The usual IRC experience is:

* Make contact (done)
* Read and adapt the New Language Pair HOWTO to the new language pair
* Ask any questions you may have
* Send the work you did on the new language pair to an existing developer; that developer then adds it to the incubator and gives you the URL to the language pair (mostly, that's because Apertium's SVN repository is huge and nobody wants to expect anyone to check out the full ~4 GB)
* You are nominated as a developer; if the nomination is seconded, you're added (this is a relatively new rule; nobody has ever not been seconded)

Usually, someone will know some source of data for the language pair and/or pitch in to help you get under way.

Language pairs are moved to trunk either after they have been released, or if there is a funded group working on them (more or less, if the disappearance of a single person won't be the end of the pair).

I would recommend es-ia first; from Mikel's writing, it looks close enough to Spanish that apertium-transfer-tools could be used (given an analyser, a bilingual lexicon, and a small corpus); similar languages also yield better translations. Converting en-es rules is something an experienced user can do in a matter of days, but you need to get the experience first (which you could, by editing the generated rules) - otherwise, it may become overwhelming quite quickly.

--
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
