Alright, sir! Thanks a lot for your response.

On Sat, 25 Feb 2023 at 17:11, Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Hi Khushi,
>
> As for Hindi, you should first test the coverage.
>
> According to this page (last edited in 2019), the dictionary had some
> 37,000 words (which is quite good, in principle) but only some 83.1%
> coverage: https://wiki.apertium.org/wiki/Languages.
> So, you should check the current state of the package.
>
> You should install Apertium and the Hindi package. A corpus is needed: we
> usually take Wikipedia and randomly select several million sentences from
> it. With this, you can calculate the naive coverage and see whether the
> dictionary has grown significantly since 2019.
>
> Once you have this, you can analyse where the problem comes from: is this
> low coverage basically due to missing words, or to morphological forms that
> are not recognised although the words do exist in the dictionary? With
> 37,000 words and 83% coverage, the latter seems likely (regardless of the
> fact that it is always good to have more words in dictionaries). It is a
> question of understanding what is missing: nominal morphological forms,
> verbal ones?
>
> It is also interesting to see if there are free sources from which the
> dictionary could be expanded automatically or semi-automatically.
>
> On this basis one can see if there is work for a project. Most probably
> there is for a small or, at most, a medium-sized one.
>
> Hèctor
>
>
> Message from Khushi - <12khushi...@gmail.com> on Sat, 25 Feb 2023 at
> 10:02:
>
>> Thanks a lot for your feedback!
>> It would be great if you could tell me how I should get started with this
>> and what milestones I should aim to achieve in order to improve it.
>>
>> Regards,
>> Khushi Harsure
>>
>> On Fri, 24 Feb 2023 at 23:46, Hèctor Alòs i Font <hectora...@gmail.com>
>> wrote:
>>
>>> Hi Khushi,
>>>
>>> First: Hindi-Marathi is already available on Google. I think you should
>>> reason out the usefulness of developing it in Apertium. A priori, it does
>>> not seem like a project that is going to be especially promising.
>>>
>>> As for your current question, why should the pair be created again from
>>> scratch? Have you seen something wrong with it? In principle, I don't see
>>> at all why the work that has been done before should be wasted. I would do
>>> it only if, after analysing it, it turns out that it is appalling (which
>>> would be weird).
>>>
>>> I don't know Hindi, but from what I saw two years ago, the morphological
>>> analyser seems to have a lot of room for improvement. It might make sense
>>> to concentrate on it and its morphological disambiguator. This would help
>>> to subsequently develop translators between low-resource Indo-Aryan
>>> languages and Hindi.
>>>
>>> Hèctor
>>>
>>> Message from Khushi - <12khushi...@gmail.com> on Fri, 24 Feb 2023 at
>>> 19:58:
>>>
>>>> Respected sir,
>>>> Thanks a lot for your response. I am glad that you appreciate it. I
>>>> wanted to clear up some doubts before I start working on it.
>>>> I would like to know whether you want me to work on the existing
>>>> Marathi-Hindi translator or whether I should create a new one from
>>>> scratch. In the former case, what kind of improvements or contributions
>>>> will be expected?
>>>> Looking forward to hearing from you soon!
>>>>
>>>> Regards,
>>>> Khushi Harsure
>>>>
>>>> On Fri, 24 Feb 2023 at 20:05, Daniel Swanson <
>>>> awesomeevildu...@gmail.com> wrote:
>>>>
>>>>> Hi Khushi,
>>>>>
>>>>> Yeah, that sounds like a good project to me.
>>>>>
>>>>> Next steps would be opening a pull request on
>>>>> https://github.com/apertium/apertium-mar-hin and requesting a wiki
>>>>> account to write your workplan.
>>>>>
>>>>> Daniel
>>>>>
>>>>> On Fri, Feb 24, 2023 at 4:09 AM Khushi - <12khushi...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> ---------- Forwarded message ---------
>>>>>> From: Khushi - <12khushi...@gmail.com>
>>>>>> Date: Fri, 24 Feb 2023 at 14:24
>>>>>> Subject: Re : [Apertium-stuff] GSOC 2023
>>>>>> To: <unham...@fsfe.org>
>>>>>>
>>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> This is Khushi Harsure, an undergraduate student from India pursuing
>>>>>> Computer Science. I'd like to participate in Google Summer of Code 2023
>>>>>> at Apertium. The project involving the addition of a new language pair
>>>>>> has caught my interest and, being a native speaker, I was planning to
>>>>>> work on adding the Hindi-Marathi pair. Hindi-English and
>>>>>> English-Marathi pairs have previously been added by past GSoC
>>>>>> participants; however, the Hindi-Marathi pair has not been worked on.
>>>>>> Before starting off, I wanted to get confirmation of whether this would
>>>>>> be a potential GSoC project.
>>>>>> I would also like to know the steps that should be followed after
>>>>>> installing Apertium, other than completing the coding challenge.
>>>>>> Looking forward to hearing from you.
>>>>>>
>>>>>> Regards,
>>>>>> Khushi Harsure
>>>>>>
>>>>>> _______________________________________________
>>>>>> Apertium-stuff mailing list
>>>>>> Apertium-stuff@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
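
For reference, the naive coverage check that Hèctor describes in the quoted message (run the Hindi analyser over a corpus, then count how many tokens it recognises) might be sketched roughly as below. This is only an illustration under assumptions: it expects the analyser output (for instance from `lt-proc`) in the Apertium stream format, where each token looks like `^surface/analysis$` and an unrecognised token has an analysis starting with `*`. The function name `naive_coverage` is hypothetical, and the tags in the sample string are illustrative rather than taken from the Hindi package.

```python
import re
from collections import Counter

# One lexical unit in the Apertium stream format: ^surface/analysis1/analysis2$
# Simplification (assumption): escaped \^, \$ and \/ inside a unit are not handled.
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def naive_coverage(analysed_text):
    """Return (coverage, unknown_counts) for morphological analyser output.

    coverage is the share of tokens that received at least one analysis;
    unknown_counts maps each unrecognised surface form to its frequency.
    """
    total = 0
    unknown = Counter()
    for match in LEXICAL_UNIT.finditer(analysed_text):
        parts = match.group(1).split("/")
        surface, analyses = parts[0], parts[1:]
        total += 1
        # The analyser marks an unrecognised form with a leading asterisk.
        if not analyses or analyses[0].startswith("*"):
            unknown[surface] += 1
    known = total - sum(unknown.values())
    coverage = known / total if total else 0.0
    return coverage, unknown


if __name__ == "__main__":
    # Illustrative sample only; the tags are made up for the example.
    sample = "^घर/घर<n><sg>$ ^abc/*abc$ ^है/हो<vblex>$ ^abc/*abc$"
    coverage, unknown = naive_coverage(sample)
    print(f"naive coverage: {coverage:.1%}")
    print("most frequent unknown forms:", unknown.most_common(5))
```

Sorting the unknown forms by frequency is one way to start answering the question raised in the thread, namely whether the gaps are missing words or unrecognised morphological forms of words that are already in the dictionary.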