Alex Aruj <[email protected]> writes:

> Hello, I was still unable to see the updates to dictionaries taking
> full effect even after trying the -d . es-en solution, but I will try
> running lt-comp again, checking the lr and rl directionality and
> automorf and autogen bin files.

en-es has a complete Makefile.am, so running ./autogen.sh at least once
and then make after each change to .dix files should be enough. That is,
you don't need to manually call lt-comp with en-es, the make system does
that for you.

> I have shared part of the GSOC proposal that I think is most directly
> relevant to the task. I would like some feedback on it if anyone has
> time. If any ideas about the project are misguided, please suggest
> alternatives. The formatting options are a little wacky on Windows 8
> MSWord--will certainly adjust later.
>
> https://drive.google.com/file/d/0By8YUPGatqZZb2NxV0NCUHdWTlU/edit?

"accuracy of grammar, not just in terms of vocabulary matching, which 
I will already attempt to increase by 15-20%."

What does this mean? Increasing vocabulary by 15-20% could in itself be
totally useless if that increase is in infrequent forms. What we want is
to increase the _coverage_ on real text. The naive method is to run a
large text corpus through the translator and count the words marked
with * (the unknown words) against the words that get a translation.
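
To make the naive method concrete, here is a minimal sketch in Python. It
assumes you have already run your corpus through the translator and have
the output as a string, and that unknown words carry a leading * in that
output. The function name naive_coverage and the simple tokenisation
(runs of word characters) are my own choices for illustration, not part
of any Apertium tool.

```python
import re


def naive_coverage(translated_text):
    """Return (known, unknown, coverage) for translator output.

    Assumption: unknown words are marked with a leading '*'.
    Tokenisation is deliberately crude: a token is a run of word
    characters, optionally preceded by '*'.
    """
    tokens = re.findall(r"\*?\w+", translated_text)
    unknown = sum(1 for t in tokens if t.startswith("*"))
    total = len(tokens)
    known = total - unknown
    # Avoid division by zero on empty input.
    coverage = known / total if total else 0.0
    return known, unknown, coverage


if __name__ == "__main__":
    sample = "the *gato sleeps on the *alfombra"
    known, unknown, coverage = naive_coverage(sample)
    print(f"known={known} unknown={unknown} coverage={coverage:.1%}")
```

Note that this counts tokens, not distinct word forms, so frequent
unknown words weigh more than rare ones, which is what you want when
measuring coverage on real text.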

Since your task is to make the pair state of the art, you should also
look at the true coverage: if one form gets an analysis as adjective, it
might still not be covered if that form can also be an adverb. (This
requires more manual work than naive coverage.)


"• Write script that allows quick recompilation of dix and bins, thus avoiding 
user to input all the lt-comp calls to update the dix."

Already done, see above ;)


Regarding a "Quick Vocab Add" program, that's either a very simple
script or a huge project. We've had earlier GSoC projects on this[1];
making it work in general is a huge task.

Making helper scripts and such for quickly adding words is a good idea,
and you should certainly do it, but it's not really a deliverable.



[1] http://wiki.apertium.org/wiki/Easy_dictionary_maintenance

-- 
Kevin Brubeck Unhammer

GPG: 0x766AC60C

