Please let us know when your coding challenge is ready, Sachin.

Mikel

On 03/19/2014 12:31 PM, Sachin Shastri wrote:

Here is a copy of my GSOC proposal. I would love to get some feedback on some changes I can make.

Name: Sachin Shastri
E-mail address: sachinsu...@gmail.com
Other information that may be useful to contact you: Contact No: +917760118343;
IRC nick: sachinsurfs in #apertium at irc.freenode.net

*Why is it you are interested in machine translation?*
I am a computer science student from Bangalore, South India, which is known for
its rich cultural diversity (8+ languages are spoken in my college alone). Due
to my background, and my dual interest in computer science and languages,
computational linguistics was a subject I always wanted to pursue. Initially, it
was the thought of a machine being able to translate a very uncommon language
like Tulu that fascinated me. Hence, since a young age, I have been reading
about machine translation and have always felt a strong connection to this
subject. I got my first formal education on this subject when our college
included a special course on Finite Automata and Formal Languages in our
syllabus (I am proud to say that PESIT, the college I am enrolled in, is the
only college in the entire state which offers this subject for 2nd year
students). After this I took a course on Natural Language Processing on
Coursera, and my interest in this subject has been growing since.

*Why is it that you are interested in the Apertium project?*
I am interested in Apertium foremost because it is open source (and free).
Secondly, I have always had a lot of interest in machine translation. I always
wanted to do a project under an MT organization, so Apertium was instantly one
of my first choices in the list of organizations for GSoC. Although I was
immensely interested in working for Apertium, I was initially worried that I
might not have the specific knowledge in MT required for doing a project here
(not confusing interest with knowledge), but after finding a project that is
exactly right for my skill level, I knew this was the right organization for me.

*Which of the published tasks are you interested in? What do you plan to do?*
Everything above being said, I would like to take up the task of
*Integrating Apertium in various chat clients.*
Telegram -- I plan to integrate the Apertium web service (using ScaleMT, based
on the Apertium scalable service) into Telegram (using the source provided on
GitHub). This will probably include the usual tasks of issuing the HTTP
requests, parsing the JSON result strings, using AsyncTask, etc.
Xchat & Pidgin -- Make plugins that will interface with the machine translation
system. (I will most probably make use of the Python scripting interface
provided in both these chat clients.)
(Suggestion -- I might as well do more plugins for other popular chat clients
like Adium and Finch, if I am going to be making use of the libpurple
libraries.)
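
As a rough illustration of the "issue an HTTP request, parse the JSON result"
step mentioned above, here is a minimal Python sketch. It is not taken from
ScaleMT or from any of the chat clients: the base URL, the /translate path, the
langpair and q parameters, and the responseData/responseStatus fields are all
assumptions modeled on a Google-Translate-style JSON API, and they would need
to be checked against the actual service before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical base URL of a translation web service; replace with the real one.
BASE_URL = "http://localhost:2737"


def build_translate_url(text, source, target, base_url=BASE_URL):
    """Build the request URL (assumed parameters: langpair and q)."""
    query = urlencode({"langpair": f"{source}|{target}", "q": text})
    return f"{base_url}/translate?{query}"


def parse_translation(body):
    """Extract the translated text from a JSON response body.

    Assumed response shape:
    {"responseData": {"translatedText": ...}, "responseStatus": 200, ...}
    """
    data = json.loads(body)
    if data.get("responseStatus") != 200:
        raise ValueError(f"translation failed: {data.get('responseDetails')}")
    return data["responseData"]["translatedText"]


def translate(text, source, target, base_url=BASE_URL, timeout=10):
    """Issue the HTTP request and return the translated text."""
    url = build_translate_url(text, source, target, base_url)
    with urlopen(url, timeout=timeout) as response:
        return parse_translation(response.read().decode("utf-8"))
```

In a chat client the translate() call would run off the UI thread (for example
in an AsyncTask on Android), so that a slow network does not freeze the
interface.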

*Reason for choosing the selected task over other tasks-*
The reason I have chosen this task over others (adopting a language pair,
rule-based finite-state disambiguation, etc.) is that, although I have
considerable knowledge of (and a lot of interest in) computational linguistics
and constraint grammar, I don't have as much experience in them as some of my
prospective peers. (I am not discrediting myself, I am just appreciating the
knowledge other people have in that area.) In fact, I have learned a great deal
from the IRC channel while waiting for the coding challenge. (I will take up
the task of adopting a language pair next year, when I am ready.) However, I
have had years of experience with Java, Python and Android, so I felt this task
is right for me. (This way I also get to work with Apertium.)


*Proposal Title* -- *Mathrubhasha*
        (Hindi for mother tongue)
*Reasons why Google and Apertium should sponsor it-*

The number of Pidgin users was estimated to be over 3 million in 2007. The
number has been growing at a steady rate since. The number of Xchat users is
also increasing at a good rate. Hence, a large number of people use these chat
clients every day. Also, Telegram logged 5 million downloads in one day
following the WhatsApp sale. So this is a prime time to integrate Apertium into
these chat clients, adding a powerful tool to these messengers which will not
only help popularize machine translation platforms but also help people
communicate better and make these chat clients more user friendly.
(Since Pidgin is named after "pidgin language", having a machine translation
tool in order to break the language barrier between people makes a lot of
sense.)

*A description of how and who it will benefit in society-*
There are 10+ million users using at least one of these chat clients
*every day*. Though English is a universal language, the majority of people are
most comfortable in their own mother tongue. This tool will help this majority
express themselves better (which is a highly desired quality in chat clients).
It will also help break the language barrier, letting users from different
parts of the world communicate with each other effectively.
It will mainly help the rural and urban communities (esp. in India), since many
here know how to operate a computer and a phone but don't know English.

Although the other projects (like developing language pairs) benefit specific
societies, the number of people in those societies who benefit is small unless
the end result of the project is put to use in day-to-day situations. Since
chat messengers have become quite common and are now one of the prime means of
communication, this project will be beneficial to the majority of people in
different societies. This is especially true for mobile messaging apps, since
people carry their phones everywhere. Last but not least, for the same reason,
it will also help the awareness and expansion of the open-source communities,
since Apertium and all the chat clients are open-source software.


    Work plan

*WEEK*            *TASK*

Pre-week 1-4      Getting to know the mentor better.
                  Analyzing the source code of the different chat clients and
                  Apertium.
                  Forecasting some of the more common constraints that I will
                  face and deciding on a plan of action for these.
                  Collecting necessary information, reading documentation,
                  doing in-depth research and analysis, and preparing
                  completely for starting the project.
                  Getting the source code of, and reading more on,
                  Apertium-caffeine (which will give a much better idea for
                  making the plugins).

Week 1            Start with Telegram. Using the available source code, modify
                  the manifests to integrate the available Apertium web
                  service.

Week 2            Work on the code, making use of existing APIs like the JSON
                  REST API (for issuing HTTP requests, parsing, etc.).

Week 3            Final UI work, including use of AsyncTask for the threads,
                  and finally debugging. (If everything goes well, I will have
                  Apertium ready in Telegram during this time.)

Week 4            More debugging and making any changes required.

Deliverable #1    Integration of Apertium with Telegram

Week 5            Start making the plugin for Xchat. Start writing scripts.

Week 6            Coding. Working on the interface.

Week 7            More coding and debugging. (If everything goes right, I will
                  have the plugin code ready by this time.)

Week 8            Debugging and writing makefiles and config files for easy
                  compilation.
                  Start working on the plugin for Pidgin.

Deliverable #2    Apertium plugin for Xchat Chat Client

Week 9            Continue working on the plugin for Pidgin.

Week 10           Coding.

Week 11           Debugging.

Deliverable #3    Apertium plugin for Pidgin Chat Client

Week 12           First 5 days: extra time, in case I come across some major
                  issue.
                  Last 2 days: final presentation.

Post-week         Tidying up.



*Important dates*:

April 22nd -- Commencing work on the project
May 19th -- Commencing work on the project
June 16th -- Deliverable #1
July 12th -- Deliverable #2
~August 6th -- Deliverable #3
August 10th -- Project completion
August 18th-22nd -- Project evaluation

*Time commitments:*

Pre-week 1-3: 3-5 hours per day (I have my semester-end exams then, which end
by the 4th week)
Pre-week 4: at least 12 hours per day
Weeks 1-3: 7-9 hours per day
Weeks 5-7: 7-9 hours per day
Weeks 9-10: 12 hours per day (since I am allotting relatively less time for
this part)
Weeks 4, 8, 11: 10-12 hours per day (since debugging usually takes the most
time)
Post-week: 4-6 hours per day (since my summer holidays end at this time)



*List your skills and give evidence of your qualifications*.
I am currently doing my B.Tech (Branch: Computer Science and Engineering) at
PES Institute of Technology, Bangalore.
Programming skills related to this project: C, Java, Python, XML, JavaScript
I have taken various courses on Java, including advanced data structures and
algorithm design. I have been working on Android app development for a few
years now and have taken up a few Android projects (I can provide scanned
copies of certificates). I have worked on application integration and web
services before. I have done my research on the different chat clients and feel
this project is doable with only knowledge of Python (and maybe JavaScript) as
the scripting language. Also, from the coding challenge, I now have a good idea
of how to make plugins (esp. for Pidgin and Xchat) and write scripts for these
clients. Hence, I think I can complete this project in the allotted time.
(I am also fluent in 6+ natural languages, although I doubt that will be of
much use in the specific task I have chosen.)

Coding Challenge: In progress. Expected to complete before deadline.
Link: https://github.com/sachinsurfs/apertium-code-challenge-chat-client-plugins.git


*Previous experience in open-source project*: No. :( However, I am currently
working on an open-source cloud-benchmarking tool, making use of Apache
Geronimo and DayTrader, which benchmarks the performance of public, private and
hybrid clouds and reports values while varying parameters like the number of
clients and WLAN speed.

*List any non-Summer-of-Code plans you have for the Summer-*
None. Therefore I am ready to devote the entire 12 weeks to this project.



_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326
