Thank you very much. I have now submitted my application.
On 05/04/2012, at 02:47, Luis Villarejo wrote:
> Hello José,
>
> Thank you for your interesting comments on the task. You will find my notes
> on them below.
>
> 2012/4/5 José Emilio Muñoz López <[email protected]>
> Hello,
>
> I am really interested in joining Apertium to participate in the Google
> Summer of Code. I am a 2nd year Electronics and Computer Science student at
> the University of Edinburgh.
>
> I find the task named in the title of this thread particularly appealing.
> I have read its description on the GSoC ideas page, and I have some questions
> and thoughts I would like to discuss here.
>
> Would the information we want to extract from the log files be used to
> automatically generate new translation rules? Or would it be presented
> (perhaps in some summarised form, for example giving the most
> frequent/significant changes) to someone in charge of the translation rules?
>
> We can consider both. Most users do not find rules easy to understand, so a
> summarised form may be useful. Once a rule is validated, generating it
> automatically in the Apertium format would be very interesting. Bear in mind
> that new rules could conflict with existing ones; we must plan for that as
> well.
>
>
> Would it be possible to use statistical tools on this information? For
> example, we could calculate the probability of the correctness of a word in a
> given context. If the users always change a given translated word when it
> occurs in a certain context, then the engine could use this information to
> improve future translations. In this way the generation of new rules could be
> automated in some way. Is this a sensible idea?
>
> I think we should rely on statistical thresholds to filter the information we
> want to deal with.
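
The threshold idea above could be sketched roughly as follows. This is only an illustration: the log format (a context, the word Apertium produced, and the word the user kept), the function names and the threshold values are all assumptions, not anything that exists in Apertium or the AWI.

```python
from collections import Counter, defaultdict


def correction_stats(edits):
    """Count, per (context, MT word), what users left in place after editing.

    `edits` is an iterable of (context, mt_word, post_edited_word) tuples.
    This tuple layout is a hypothetical log format used for illustration.
    """
    counts = defaultdict(Counter)
    for context, mt_word, fixed_word in edits:
        counts[(context, mt_word)][fixed_word] += 1
    return counts


def frequent_corrections(edits, min_count=3, min_prob=0.8):
    """Keep only corrections seen often enough and consistently enough.

    min_count and min_prob are illustrative thresholds, not tuned values.
    """
    suggestions = []
    for (context, mt_word), fixes in correction_stats(edits).items():
        total = sum(fixes.values())
        best, n = fixes.most_common(1)[0]
        prob = n / total  # estimated P(correction | context, MT word)
        if best != mt_word and total >= min_count and prob >= min_prob:
            suggestions.append((context, mt_word, best, prob))
    return suggestions
```

Corrections that pass the filter would be candidates to show to a person for validation, rather than rules to apply directly.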
>
>
> However, this task seems to concentrate on the data mining rather than on its
> potential uses. Is this right?
>
> The approach should be complete; we should consider both.
>
> In what ways should this information be presented? Would the graphical
> environment (described in the task page) be similar to the way Google
> Translate works at the moment, where users get a drop-down list of popular
> alternatives when they click on a word?
>
> I do not really see this kind of interaction for this task. The
> interaction you describe is the one that takes place when users use the AWI
> and select the correct words for their translations. I think this task should
> concentrate on extracting valuable information from the post-editing logs and
> then explore ways to transform it into valuable material to improve the
> Apertium engine.
>
>
> I am not sure about this, but why does Apertium AWI not seem to support all
> of Apertium's language pairs?
>
> It is just a matter of installing the other pairs on the server.
>
>
> What programming languages would I need to help develop this task? What would
> you recommend I do in order to gain a better understanding of Apertium
> and this task?
>
> The task would require knowledge of a scripting language, a little XML to
> understand Apertium rules and dictionaries, and whatever would be useful to
> build a small environment where the user could interact with the extracted
> information and generate rules or dictionary entries.
>
> Best,
> Luis
>
>
> Please ask me for any information you may need. More details on my programming
> knowledge and past experiences can be found in the thread "GSoC" posted by
> Jacob Nordfalk on 2012-04-02
> (https://sourceforge.net/mailarchive/forum.php?thread_name=CAKckPXZF0Bq_PWkk9s1rzFtZN%3DQ9s254fnQgWDgBwx35weU0kA%40mail.gmail.com&forum_name=apertium-stuff)
>
> Thank you very much for your help,
> José Emilio Muñoz
>
>
> ------------------------------------------------------------------------------
> Better than sec? Nothing is better than sec when it comes to
> monitoring Big Data applications. Try Boundary one-second
> resolution app monitoring today. Free.
> http://p.sf.net/sfu/Boundary-dev2dev
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff