Hi all,

What might some of the deliverables look like for this task?

Deliverable examples:
- list of words added to improve coverage
- rules added to take into account erroneous target constructions
- source text and post-edited translations used as reference, as
  recommended in the coding challenge
Could another deliverable be code used to graph quality, e.g. the word
coverage of a language against its WER, and another correlation (correct me
if this is misguided) between transfer rules and the PWER? I can probably
set up some visualization of quality in Octave/Matlab if one is not
available yet.
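
To make the idea concrete, here is a minimal sketch of the two numbers
such a graph would need: a word error rate per document and a correlation
between two series of measurements. It is written in plain Python rather
than Octave/Matlab, and the function names (edit_distance, wer, pearson)
are hypothetical helpers for illustration, not part of
apertium-eval-translator or any other Apertium tool.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)
```

With one coverage value and one WER value per test document, calling
pearson(coverages, wers) would give the single number to report alongside
the plot. Whether whitespace tokenization matches what the Apertium
evaluation tools do is an assumption that would need checking.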

From
http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Make_a_language_pair_state-of-the-art
:
*This will involve working with dictionaries, transfer rules, scripting,
corpora.*
Does "scripting" here mean scripting calls to lt-proc,
apertium-eval-translator and other tools, such as those in lttoolbox?

I am trying to figure out how to run the lttoolbox tools more verbosely at
the command line, to get the POS tags and rules, and how to save the output
to text files. I will continue to troubleshoot on IRC, so as not to
balloon this e-mail message.

So far, I have evaluated the translation of a ~800-word article from El País
into English and found some issues with the future tense ("realizar") and
some vocab. Although the coverage was well over 90%, the grammar could be
much better; to work on it, I need to see the tags and rules used at the
command line. This could be easier for me to do in Apertium-Viewer, but I
prefer the control of the command line.
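
As a rough illustration of how a coverage figure like the one above can
be measured, the sketch below counts the tokens that Apertium marks as
unknown with a leading asterisk in its output. The function name
(coverage) and the sample sentence are made up for the example, and
splitting on whitespace is a simplifying assumption.

```python
def coverage(translated_text):
    """Return (coverage, unknown_tokens) for Apertium output text.

    Assumes unknown words carry a leading '*', as in Apertium's
    default output, and that tokens are separated by whitespace.
    """
    tokens = translated_text.split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    unknown = [t for t in tokens if t.startswith("*")]
    known = len(tokens) - len(unknown)
    return known / len(tokens), unknown


# Hypothetical output line with one unknown word.
ratio, unknown_words = coverage("the *gato sat on the mat")
```

The list of unknown tokens doubles as a starting point for the "words
added to improve coverage" deliverable.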

For sure, I would want to save the steps, commands and measures I use to
improve quality. Developing a mini-wiki to this effect, as a step toward
getting others to develop their pairs to a competitive level of quality,
could be nice, if it doesn't already exist!

Many thanks,
Alex



On 5 February 2014 13:09, Kevin Brubeck Unhammer <[email protected]> wrote:

> Francis Tyers <[email protected]> writes:
>
>
> [...]
>
> > In general we try to have no more than 50% language pairs as GSOC
> > projects, which I think is a pretty good idea. If we think that this
> > task is a good idea then we could decide to make at least one language
> > pair a "state of the art" one.
>
> +1
>
>
> --
> Kevin Brubeck Unhammer
>
> GPG: 0x766AC60C
>
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
>


-- 
Alex