Thanks a lot, Hèctor, for the feedback. I will change my proposal to the creation of a Hindi-Bengali language pair that is ready for publication. I'm also working on the corpus coverage of -ben, as Daniel suggested, and I'm focusing on apertium-ben for now, for the Hindi-Bengali language pair. Once again, thanks a lot for the feedback!
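A naive coverage figure of the kind discussed in this thread can be sketched as follows. This is only an illustration, assuming the analyser output is in Apertium stream format, where each token appears as `^surface/analysis$` and an unknown word is marked with a leading `*` in its analysis. The function name `naive_coverage`, the sample string, and the tags in it are hypothetical and are not taken from apertium-ben. Escaped characters in the stream are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification: escaped characters such as \/ or \$ are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/^$]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return (known, total, ratio) for analyser output in stream format."""
    known = 0
    total = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        # An analysis that starts with '*' marks a word the analyser does not know.
        if not match.group(2).startswith("*"):
            known += 1
    ratio = known / total if total else 0.0
    return known, total, ratio


# Hypothetical sample: two analysed tokens and two unknown tokens.
sample = "^আমি/আমি<prn><pers>$ ^xyz/*xyz$ ^বই/বই<n>$ ^abc/*abc$"
known, total, ratio = naive_coverage(sample)
print(f"{known}/{total} tokens analysed ({ratio:.1%})")
```

On a real corpus the input would be the text produced by running the compiled analyser over a large sample, and the ratio gives a baseline to quote in the proposal.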
On Tue, Mar 23, 2021 at 9:41 AM Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Hi Gourab,
>
> Some work was done on Bengali a long time ago:
> Faridee AZM, Tyers FM (2009) Development of a morphological analyser for
> Bengali. In: Pérez-Ortiz J, Sánchez-Martínez F, Tyers F (eds) Proceedings
> of the First International Workshop on Free/Open-Source Rule-Based
> Machine Translation, Universidad de Alicante. Departamento de Lenguajes y
> Sistemas Informáticos, Alicante, Spain, pp 43–50.
>
> You should see how much it covers, as Daniel said. If the basis is done,
> as I imagine, it would be more interesting to orient the proposal towards
> the creation of a pair that is ready for publication. We have quite a few
> parsers in different states of evolution, in particular for Indian
> languages, but relatively few realised pairs. It would be very interesting
> to have a "Bengali - another Indo-Iranian language" pair. Hindi-Bengali
> would probably be the best option, as Hindi and Urdu are, to date, the only
> languages that have been released in Apertium. Given that there is much
> less time available in GSoC this year, one option would be to work mainly
> in one direction. Hindi to Bengali would be the easiest option because
> it would also avoid having to work a lot on morphological disambiguation
> (which should be more or less satisfactorily solved for Hindi). This would
> make the project concentrate on 1) finishing the morphological analysis of
> Bengali, 2) creating/expanding the transfer rules, 3) creating the lexical
> selection rules, 4) adding several thousand words to the bidix, and
> 5) testing on real texts to fine-tune the translator and presenting a
> finished translator with a WER of less than 25%, ready for publication, at
> the end of the project. Last but not least, a Hindi-to-Bengali translator
> should be, as a rule, easier for a Bengali speaker to create than the
> opposite direction.
>
> Hèctor
>
> Message from Daniel Swanson <awesomeevildu...@gmail.com> on Tue, 23 March
> 2021 at 0:11:
>
>> Hi Gourab,
>>
>> My recommendation would be to evaluate the current status of -ben and
>> -bn-en in terms of corpus coverage and WER, and then incorporate into
>> your proposal what those numbers are now and how much you think you
>> can improve them.
>>
>> A pull request to one of the repositories involved would also be
>> worthwhile, both for your understanding of how to accomplish
>> the tasks in your proposal and for the mentors to be able to evaluate
>> your proposal.
>>
>> Daniel
>>
>> On Mon, Mar 22, 2021 at 3:06 PM Gourab Chakraborty IIIT Dharwad
>> <19bcs...@iiitdwd.ac.in> wrote:
>> >
>> > Hi,
>> > I would like to participate in GSoC and am interested in contributing
>> > to improving the transfer system for apertium-bn-en. My work would fall
>> > in the "Develop a morphological analyser" category of the ideas list.
>> > I'm a native speaker of Bengali and am really excited about the project.
>> >
>> > I have gone through the official documentation and have already set up
>> > Apertium on my Ubuntu system.
>> >
>> > I have prepared a draft of my GSoC proposal
>> > (https://docs.google.com/document/d/1S5EY6Eddu4v1ZMqgkM0Kjl_27kBhZkDkEz0Ddmnrotk/edit?usp=sharing).
>> > Since this is my first proposal for GSoC, I would really appreciate any
>> > feedback. Also, what should I do next?
>> >
>> > Thank you
>> > --
>> > Gourab Chakraborty (IRC: gourab337)
>> > _______________________________________________
>> > Apertium-stuff mailing list
>> > Apertium-stuff@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff

--
Gourab Chakraborty
2nd year, CSE @ IIIT Dharwad