On Mon, 22 Apr 2013 at 14:55 +0530, Nikit Saraf wrote:
> Hello

Hello!

> I am Nikit Saraf, a sophomore Computer Science undergraduate from
> Dhirubhai Ambani Institute of Information and Communication
> Technology, India.

Nice to meet you! :)

> I have a strong interest in Natural Language Processing and Information
> Retrieval. I am part of the "Information Retrieval LAB (IR-LAB)" at my
> college. As a part of IR-LAB, I have worked on CLIA (Cross Lingual
> Information Access) funded projects, which involved developing
> "Part-Of-Speech (POS) and Named Entity Taggers" for the Gujarati language.
> Currently, there is no NE tagger available for Gujarati. Both
> the taggers were CRF based.
>
> I hope my experience comes in handy for this project.
>
> Out of the many ideas listed on the Apertium GSOC Page, I found
> "Improved bilingual dictionary induction" to be particularly
> interesting and within my reach.
>
> I have completed all four Coding Challenges for this particular
> idea.

Great!

> The alignment file was generated using the "Europarl" parallel corpus with
> the Bulgarian-English pair, using only the first 1000 lines from each
> corpus to save on time and computing resources. The alignment file:
> https://gist.github.com/nikitsaraf/ffb5e57c267bc05084fc

This looks ok, but it would be better if you aligned tagged versions of
the corpora.

> I have also attached the rewritten "generate-bidex-template.py"
> script, which now uses ElementTree instead of 4suite. The script is
> here: https://gist.github.com/nikitsaraf/b6a1c8c792314aff3a46

This looks mostly ok. It has some strange issues with Unicode, but I'd
say you've passed.

> Please give me comments on the tasks.
>
> There are still small things in the idea about which I am not clear
> now, but I am trying to grasp those minute details. Once I am
> clear on those things, I will start writing the proposal.
>
> Just one more thing: where does this project stand on the priority
> list of "Apertium"?

We don't really have a priority list; it depends on which other
projects get proposals and how strong your proposal is.

> Looking forward to your reply.

Fran

_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
