Hello,
I am Nikit Saraf, a sophomore Computer Science undergraduate at Dhirubhai
Ambani Institute of Information and Communication Technology, India.
I have a strong interest in Natural Language Processing and Information
Retrieval. I am part of the "Information Retrieval LAB (IR-LAB)" at my college.
As part of IR-LAB, I have worked on CLIA (Cross Lingual Information
Access) funded projects, which involved developing "Part-Of-Speech (POS) and
Named Entity Taggers" for the Gujarati language. Currently, there is no NE
tagger available for Gujarati. Both taggers were CRF-based.
I hope my experience comes in handy for this project.
Out of the many ideas listed on the Apertium GSOC page, I found "Improved
bilingual dictionary induction" to be particularly interesting and within my
reach.
I have completed all four coding challenges for this idea.
The alignment file was generated from the "Europarl" parallel corpus for the
Bulgarian-English pair, using only the first 1000 lines from each side of the
corpus to save time and computing resources. The alignment file is here:
https://gist.github.com/nikitsaraf/ffb5e57c267bc05084fc
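The truncation step can be sketched roughly as below. This is only an illustration, not the exact commands used: the function name `head_parallel` is hypothetical, and it assumes the two sides of the corpus are already sentence-aligned, one sentence per line.

```python
from itertools import islice

def head_parallel(src_lines, tgt_lines, n=1000):
    """Return the first n aligned sentence pairs from two parallel line iterables.

    Assumes line i of src_lines corresponds to line i of tgt_lines.
    zip() stops at the shorter side, so both outputs always have equal length.
    """
    pairs = list(islice(zip(src_lines, tgt_lines), n))
    src_head = [src for src, _ in pairs]
    tgt_head = [tgt for _, tgt in pairs]
    return src_head, tgt_head
```

Open file objects can be passed directly as `src_lines` and `tgt_lines`, since they iterate line by line.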
I have also attached the rewritten "generate-bidex-template.py" script,
which now uses ElementTree instead of 4Suite. The script is here:
https://gist.github.com/nikitsaraf/b6a1c8c792314aff3a46
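As a rough illustration of what building a bilingual dictionary entry with ElementTree looks like (this is not the linked script; the helper name `make_entry` and the sample words are made up for the example), a single `<e><p><l>...</l><r>...</r></p></e>` entry could be constructed like this:

```python
import xml.etree.ElementTree as ET

def make_entry(left_lemma, right_lemma, pos_tag):
    """Build one bidix-style entry: <e><p><l>..<s n=".."/></l><r>..<s n=".."/></r></p></e>.

    Hypothetical helper for illustration; both sides get the same POS symbol.
    """
    e = ET.Element("e")
    p = ET.SubElement(e, "p")
    for side, lemma in (("l", left_lemma), ("r", right_lemma)):
        node = ET.SubElement(p, side)
        node.text = lemma                          # lemma text precedes the symbol
        ET.SubElement(node, "s", {"n": pos_tag})   # symbol element, e.g. <s n="n"/>
    return e

entry = make_entry("къща", "house", "n")
print(ET.tostring(entry, encoding="unicode"))
```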
Please give me your comments on the tasks.
There are still a few small details of the idea that are not yet clear to me,
but I am working on understanding them. Once I am clear about
them, I will start writing the proposal.
Just one more thing: where does this project stand on Apertium's
priority list?
Looking forward to your reply.
Thanking You
Regards
Nikit Saraf