Hi Naeem,

Thanks a lot for your very good and interesting draft application. Torwali is an excellent language for Apertium. You know the challenges it presents and the work on it, and you have shown that you are committed to the language and the project. I am not a specialist in lexc-twol, but I see a few general things that would improve your application:
* The coding challenge is very important. It proves that you understand how Apertium works (not only theoretically) and that you can do the job. So do it as well as you can now; don't leave it until after the application period.
* Your commitment of 30 hours per week is welcome, but bear in mind that it is much more than Google is asking for this year.
* You want to enter 50,000+ words in the morphological analyser. That's a huge amount, but your work plan doesn't say when you are going to do it. You would need to show how many words, and from which grammatical categories, you would add in each time slot (two weeks in your case). Usually we start with the closed categories. Once you detail these numbers in your proposal, we will see how many words you will be able to reach.
* I have no idea how it is in the case of Dardic languages, but assigning words to categories is not usually trivial in Indo-European languages. Do existing works already have lists of words assigned to paradigms, for example lists of verbs following one model or another? If not, the time needed for assignment increases. We need to know this in order to judge the feasibility of adding 50,000, 30,000 or 20,000 words.
* Are there extensive lists of words available in electronic format, with their grammatical category, which you could use for your work? They should be free: if they are copyrighted, they cannot be (semi-)automatically imported into Apertium.
* Given the very limited time we have this year for GSoC projects, a complete morphological analyser from scratch is very likely a perfectly reasonable goal. Still, before putting so many words into it (especially if you have to add them manually), I think it would be reasonable to spend a couple of weeks training a morphological disambiguator.

Hèctor

Message from Naeemuddin Hadi <naeemuddinh...@gmail.com> on Thu, 8 Apr 2021 at 1:46:

> Hello everyone,
>
> I am Naeem, a student of UET Peshawar. I want to participate in GSoC
> 2021. I am working to create a morphological analyzer for an endangered
> language of northern Pakistan called Torwali.
> I have prepared a draft proposal and would appreciate feedback before
> final submission. Links related to the coding challenge are included in
> the draft.
>
> Link (draft):
> https://drive.google.com/file/d/1hnu6gRWVN3LjjxOj0BvimvJ56AIKfe6q/view?usp=sharing
>
> Regards,
> Naeem
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff