+1

Ilnar Salimzianov <ilnar.salimzia...@posteo.de> wrote on Fri, 15 Mar 2019, 08:56:
>
> On 15 March 2019, 02:11:19 GMT+03:00, Jonathan Washington
> <jonathan.n.washing...@gmail.com> wrote:
> >Hello, Daniyar! Welcome to our community!
> >
> >Thanks for getting in touch with Ilnar and the rest of the Apertium
> >community about your project idea.
> >
> >Memduh is right that Kazakh-to-Turkish MT is receiving a lot of
> >attention right now in Apertium, and an additional project on it would
> >likely create a bit of a mess. However, I think Turkish-to-Kazakh MT
> >(i.e., the other direction) would be a good way for you to contribute,
> >given your linguistic knowledge. The translation pair and language
> >modules are the same, but a lot of the work would be editing a
> >complementary set of files: disambiguation for Turkish and not Kazakh,
> >and lexical selection and structural transfer for the Turkish-Kazakh
> >direction instead of the Kazakh-Turkish direction.
>
> +1
>
> >I don't see any problems with this, but perhaps others on this list
> >have deeper insights.
> >
> >Another thought is that our Kazakh-to-Tatar MT system is one of our
> >oldest "stable" Turkic pairs, but it does a poor job in the other
> >direction. Perhaps a coherent GSoC proposal could be assembled from
> >making these two existing pairs (kaz-tat and kaz-tur) stable in the
> >opposite directions. I'd be interested to hear what other mentors think
> >about this. (Knowing Kazakh and Turkish well should make Tatar fairly
> >easy to work with.)
> >
> >Two additional little tidbits:
> >
> >Regarding your question about the pipeline involved, you can take a
> >look at how the Apertium pipeline comes together here:
> >http://wiki.apertium.org/wiki/Apertium_system_architecture
> >
> >This page could be updated some, but is probably still helpful as is.
> >
> >Also, I see you managed to catch Ilnar on IRC. Feel free to stay
> >logged in when you can; you'll find different people available at
> >different times.
> >
> >Until next time,
> >
> >--
> >Jonathan
> >
> >
> >On Thu, 14 Mar 2019 at 14:03, Memduh Gökırmak <memd...@gmail.com> wrote:
> >
> >> Hi Nariman,
> >>
> >> The structure of the system is more or less the same across all
> >> pairs, but there are some components that we use in some and don't
> >> use in others. For example, the statistical system for choosing the
> >> correct rule to apply when there is ambiguity is a work in progress,
> >> and is only in a few pairs.
> >>
> >> Your question regarding breaking some system by making changes is a
> >> valid one, but GSoC students don't typically make changes to programs
> >> we have in production. When a new component is written, it is tested
> >> and introduced in a few pairs at first, and so on.
> >>
> >> There are a number of ways to increase the quality of a system, but
> >> what is usually most urgent is things like expanding the dictionary
> >> and writing more transfer rules. Kazakh-Turkish would have been a
> >> nice domain for you to work on given your proficiency in both, but it
> >> has been getting quite a lot of attention recently and perhaps it
> >> would be better to choose some other Turkic pair (I've been thinking
> >> about Bashkurt-Turkish).
> >>
> >> So to recap:
> >>
> >> For improving/creating language pairs, the tools are already there
> >> and you will be making/improving things like a dictionary of words in
> >> both languages, rules to choose the right words, and rules to reorder
> >> and change up the words so they make sense in the target language.
> >> This is something akin to developing language resources and doesn't
> >> require a whole lot of programming expertise, but some scripting is
> >> useful.
> >>
> >> If you are a hardcore programmer, you can develop a new component or
> >> improve some features of the system.
> >>
> >> I'm sure someone has sent you this link, but here is a list of ideas
> >> for projects we'd like to do this summer:
> >> http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code
> >>
> >> Best,
> >>
> >> Memduh
> >>
> >>
> >> On 14-03-2019 15:26, Daniyar Nariman via Apertium-stuff wrote:
> >>
> >> Hi Sevilay,
> >>
> >> In my message, I meant that the Kazakh and Turkish languages are
> >> similar in terms of affixes and sentence structure, while Kazakh and
> >> Russian are more different. So if I increase the translation quality
> >> of the first pair by adding some functionality to the pipeline, there
> >> is a chance that the same change might not work for the second pair.
> >> So the question is: does the pipeline have to be the same for all
> >> language pairs, or can it differ?
> >> ------------------------------
> >> *From:* Sevilay Bayatlı <sevilaybaya...@gmail.com>
> >> *Sent:* Thursday, March 14, 2019 1:13:18 PM
> >> *To:* apertium-stuff@lists.sourceforge.net
> >> *Subject:* Re: [Apertium-stuff] Fwd: RBMT from Kazakh to Turkish
> >>
> >> Hi Daniyar,
> >>
> >> Could you tell us how modifying some parts of the pipeline can
> >> increase accuracy on one pair and decrease it on another?
> >>
> >> Sevilay
> >>
> >>
> >> On Thu, Mar 14, 2019 at 11:26 AM Ilnar Salimzianov
> >> <il...@selimcan.org> wrote:
> >>
> >>> -------- Forwarded Message --------
> >>> Subject: RBMT from Kazakh to Turkish
> >>> Date: Wed, 13 Mar 2019 19:07:42 +0000
> >>> From: Daniyar Nariman <n.dani...@innopolis.ru>
> >>> To: il...@selimcan.org <il...@selimcan.org>
> >>>
> >>>
> >>> Dear Ilnar Salimzianov,
> >>>
> >>> My name is Nariman. I am a third-year bachelor's student at
> >>> Innopolis University (Russia, Tatarstan).
> >>> I am studying Data Science and am really interested in disciplines
> >>> such as machine learning, natural language processing, information
> >>> retrieval, etc.
> >>>
> >>> Recently I read your paper, RBMT from Kazakh to Turkish, which was
> >>> published in EAMT 2018. It was really interesting to read. The thing
> >>> is, I am applying to Apertium for GSoC (Google Summer of Code) this
> >>> year, but I am still deciding on the topic I would like to work on.
> >>> One of the topics was to bring a chosen language pair to
> >>> state-of-the-art quality, and I would like to work on the
> >>> Kazakh-Turkish pair, as Kazakh is my mother tongue and I studied
> >>> Turkish in high school for 5 years.
> >>>
> >>> I would like to ask whether there are any restrictions on how to
> >>> increase the quality of this pair, apart from adding a large number
> >>> of rules or expanding the dictionary (which I take for granted): for
> >>> instance, by optimizing the algorithms used in the pipeline. I am
> >>> asking because by modifying some part of the pipeline we could
> >>> increase accuracy on our pair of languages but decrease it on another
> >>> pair, and constructing a different pipeline for each pair is not a
> >>> good idea in my opinion.
> >>>
> >>> Thanks in advance!
> >>>
> >>> Best Regards,
> >>>
> >>> Daniyar Nariman
> >>>
> >>> _______________________________________________
> >>> Apertium-stuff mailing list
> >>> Apertium-stuff@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
> --
> Sorry for the brevity, sent from K-9 Mail.