Thank you, Hèctor, for the valuable feedback on the draft application. I will work on the coding challenge before the application period closes, and I can dedicate more than 30 hours per week to the project. I agree with you regarding the number of words: I think 50,000 will be really difficult in the case of Torwali. However, I have access to Torwali lexical data with more than 10,000 words and around 1,000 example sentences, along with some other running text. I also have text files listing different categories of words. I will update my proposal with the necessary details and add a couple of weeks for disambiguation.
Regards,
Naeem

On Thu, Apr 8, 2021, 9:41 AM Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Hi Naeem,
>
> Thanks a lot for your very good and interesting draft application. Torwali
> is an excellent language for Apertium. You know the challenges it presents
> and the work on it, and you show that you are committed to the language and
> the project. I am not a specialist in lexc-twol, but I see a few general
> things to improve in your application:
>
> * The coding challenge is very important. It proves you understand how
> Apertium works (not only theoretically) and that you can do the job. So, do
> it as well as you can now. Don't leave it until after the application
> period.
>
> * Your commitment of 30 hours per week is welcome, but bear in mind
> that it is much more than what Google is asking for this year.
>
> * You want to enter 50,000+ words in the morphological analyser. That's a
> huge amount. But in your work plan you don't say when you are going to do
> it. It would be necessary to show how many words and which grammatical
> categories you would add in each time slot (two weeks in your case).
> Usually we start with the closed categories. When you detail these numbers
> in your proposal, we will see how many words you will be able to reach.
>
> * I have no idea how it is in the case of Dardic languages, but the
> assignment of words to categories is not usually trivial in Indo-European
> languages. Do existing works already have lists of words assigned to
> paradigms? For example: lists of verbs following one model or another. If
> not, the time needed for assignment increases. It is necessary to know this
> in order to calculate the feasibility of introducing 50,000, 30,000 or
> 20,000 words.
>
> * Are there extensive lists of words available in electronic format, with
> their grammatical category, which you could use for your work? They should
> be free. If they were copyrighted they could not be (semi-)automatically
> uploaded to Apertium.
>
> * Given the very limited time we have this year for GSoC projects, a
> complete morphological analyser from scratch is very likely a perfectly
> reasonable goal. Still, before putting so many words into it (especially if
> you have to add them manually), I think it would be reasonable to spend a
> couple of weeks training a morphological disambiguator.
>
> Hèctor
>
> Message from Naeemuddin Hadi <naeemuddinh...@gmail.com> on Thu, 8
> Apr 2021 at 1:46:
>
>> Hello everyone,
>>
>> I am Naeem, a student of UET Peshawar. I want to participate in GSoC
>> 2021. I am working to create a morphological analyzer for an endangered
>> language of northern Pakistan called Torwali.
>> I have prepared a draft proposal and would appreciate feedback before
>> final submission. Links related to the coding challenge are included in
>> the draft.
>>
>> Link (draft):
>> https://drive.google.com/file/d/1hnu6gRWVN3LjjxOj0BvimvJ56AIKfe6q/view?usp=sharing
>>
>> Regards,
>> Naeem
>> _______________________________________________
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff