Ah, another artificial language enters the scene ... bonvenon al Apertium (welcome to Apertium)! :-)
Hector and I have done Esperanto<-->English (and Hector has also done eo-es), which might be another source of inspiration, although a lot of things could be improved. If you decide to look at eo-en for inspiration, let me know if you need help or want something clarified.

A word of caution before you plan anything involving English: English->Esperanto quality suffers from imperfections in the so-called "tagger" step (part-of-speech disambiguation, for example deciding whether the word "saw" is a noun or a verb in a given context). English just can't be tagged very well with the current tools in Apertium (a bigram tagger), and so far no one has solved that in Apertium.

Yours,
Jacob

2010/5/17 Jimmy O'Regan <[email protected]>

> On 16 May 2010 18:52, Jimmy O'Regan <[email protected]> wrote:
> > On 16 May 2010 17:46, Mikel L. Forcada <[email protected]> wrote:
> >> Dear Ian,
> >>> Mr. Forcada,
> >>>
> >> Please call me Mikel...
> >>> Thank you. I am not sure that I would not have the technical skills required
> >>> to initiate such a project, but I have fairly good organizational and
> >>> linguistic skills to contribute to coordination of such a project.
> >> That's excellent news.
> >
> > You don't really need a whole lot of technical skill; as long as
> > you're willing to contribute, there will usually be somebody willing
> > to help you to contribute.
> >
> >>> My brother
> >>> Mark Shaw is also a Linux expert and has developed his own Linux distribution.
> >>>
> >> Which one?
> >>> He could help in this project and has fairly extensive contacts throughout the
> >>> world with programmers.
> >> Our project has also capable programmers and linguists that could help.
> >> I am indeed sending a copy of this message to our mailing list,
> >> [email protected], so that the Apertium community is informed
> >> of your intentions and to see if someone out there is also able to lend
> >> a hand.
> >>> What is your protocol in developing language pairing?
> >>>
> >> There are many ways in which languages are paired, depending on the data
> >> available. In the case of en-ia or es-ia, we already have rather
> >> extensive es and en dictionaries. Structural transfer rules for es-ia
> >> would be rather easy to write, and those for en-ia could be based in
> >> those we already have for en-ca or en-es. One would have to build
> >> complete monolingual dictionaries for Interlingua, and also bilingual
> >> es-ia or en-ia dictionaries. If you wanted translation from ia, then one
> >> would have to train a part-of-speech tagger for ia.
> >>
> >>> How would we be able to work with you to put this on the Apertium platform?
> >> I can sign you up as developers in our Sourceforge platform, and you can
> >> start working on the language pair there. I think initially the new
> >> pairs go to some kind of incubator and then to the trunk. Can anyone on
> >> the list help us on what would be the protocol?
> >>
> >> Tu pote leger anque
> >> http://wiki.apertium.org/wiki/Apertium_New_Language_Pair_HOWTO
> >>
> >
> > The usual IRC experience is:
> >
> > * Make contact (done)
> > * Read and adapt the New Language Pair HOWTO to the new language pair
> > * Ask any questions you may have
> > * Send the work you did on the new language pair to an existing
> > developer; that developer then adds it to the incubator and gives you
> > the URL to the language pair (mostly, that's because Apertium's SVN
> > repository is huge and nobody wants to expect anyone to check out the
> > full ~4Gb)
> > * You are nominated as a developer; if the nomination is seconded
> > you're added (this is a relatively new rule; nobody has ever not been
> > seconded)
> >
> > Usually, someone will know some source of data for the language pair
> > and/or pitch in to help you get under way.
> >
> > Language pairs are moved to trunk either after they have been
> > released, or if there is a funded group working on them (more or less,
> > if the disappearance of a single person won't be the end of the pair).
> >
> >
> > I would recommend es-ia first; from Mikel's writing, it looks close
> > enough to Spanish that apertium-transfer-tools could be used (given an
> > analyser, a bilingual lexicon, and a small corpus); similar languages
> > also yield better translations. Converting en-es rules is something an
> > experienced user can do in a matter of days, but you need to get the
> > experience first (which you could, by editing the generated rules) -
> > otherwise, it may become overwhelming quite quickly.
>
> I started to put together a rudimentary es-ia translator:
>
> $ echo "veo el gato" |apertium -d . test-es-ia
> Io vide #le catto
>
> As you can see (by the # mark), there's a small problem straight away
> - there are many more where that came from. In any case, it's a
> starting point. You can check it out from SVN:
>
> svn co https://apertium.svn.sourceforge.net/svnroot/incubator/apertium-es-ia/ apertium-es-ia
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>

-- 
Jacob Nordfalk
एस्पेरान्तो के हो? http://www.esperanto.org.np/.
Memoraĵoj de KEF -. http://kef.saluton.dk/memorajoj/
