Hello,

I am really interested in joining Apertium to participate in the Google Summer 
of Code. I am a 2nd year Electronics and Computer Science student at the 
University of Edinburgh.

I find the task named in the title of this thread particularly appealing. I 
have read its description on the GSoC ideas page, and I have some questions and 
thoughts I would like to discuss here.

Would the information we want to extract from the log files be used to 
automatically generate new translation rules? Or would it be presented (perhaps 
in some summarised form, for example giving the most frequent or significant 
changes) to someone in charge of the translation rules?

Would it be possible to use statistical tools on this information? For example, 
we could estimate the probability that a translated word is correct in a given 
context. If users always change a given translated word when it occurs in a 
certain context, then the engine could use this information to improve future 
translations. In this way the generation of new rules could be at least partly 
automated. Is this a sensible idea?
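
To make the idea more concrete, here is a minimal sketch in Python. It is only 
an illustration: the log format (tuples of context, machine-translated word and 
the word the user finally kept), the function names and the thresholds are all 
my own assumptions, not anything taken from Apertium or its real log files.

```python
from collections import Counter, defaultdict


def correction_stats(edits):
    """Count how often each (context, translated word) pair was seen and changed.

    `edits` is a hypothetical log: an iterable of
    (context, translated_word, final_word) tuples.
    """
    seen = Counter()                # (context, word) -> number of occurrences
    changed = defaultdict(Counter)  # (context, word) -> replacement -> count
    for context, translated, final in edits:
        key = (context, translated)
        seen[key] += 1
        if final != translated:
            changed[key][final] += 1
    return seen, changed


def change_probability(seen, changed, context, translated):
    """Estimate P(user changes the word | context) as a relative frequency."""
    key = (context, translated)
    total = seen[key]
    if total == 0:
        return None  # no evidence for this pair
    return sum(changed[key].values()) / total


def suggest_replacement(seen, changed, context, translated,
                        threshold=0.8, min_count=3):
    """Return the most frequent replacement if users change the word often enough.

    `threshold` and `min_count` are arbitrary illustrative values.
    """
    key = (context, translated)
    p = change_probability(seen, changed, context, translated)
    if p is None or seen[key] < min_count or p < threshold:
        return None
    if not changed[key]:
        return None  # the word was never changed, so there is nothing to suggest
    return changed[key].most_common(1)[0][0]


# Made-up example data: four users replaced "bank" with "shore", one kept it.
edits = [("river", "bank", "shore")] * 4 + [("river", "bank", "bank")]
seen, changed = correction_stats(edits)
print(change_probability(seen, changed, "river", "bank"))
print(suggest_replacement(seen, changed, "river", "bank"))
```

A relative frequency like this would only be a starting point. With few 
observations it is unreliable, which is why the sketch ignores pairs seen fewer 
than `min_count` times.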

However, this task seems to concentrate on the data mining rather than on its 
potential uses. Is this right? In what ways should this information be 
presented? Would the graphical environment (described on the task page) be 
similar to the way Google Translate works at the moment, where users get a 
drop-down list of popular alternatives when they click on a word?

I am not sure about this, but why does Apertium AWI not seem to support all of 
Apertium's language pairs?

What programming languages would I need to know to help with this task? What 
would you recommend I do in order to gain a better understanding of Apertium 
and this task?

Please ask me for any information you may need. More details on my programming 
knowledge and past experience can be found in the thread "GSoC" posted by 
Jacob Nordfalk on 2012-04-02 
(https://sourceforge.net/mailarchive/forum.php?thread_name=CAKckPXZF0Bq_PWkk9s1rzFtZN%3DQ9s254fnQgWDgBwx35weU0kA%40mail.gmail.com&forum_name=apertium-stuff)

Thank you very much for your help,
José Emilio Muñoz

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
