Re: [Moses-support] Running Giza++ on subsets of data

Kenneth Heafield Wed, 15 Jun 2011 10:52:43 -0700

Try using MGIZA: http://geek.kyloo.net/software/doku.php/mgiza:overview


On 06/15/11 04:51, Prasanth K wrote:
> Hello All,
> 
> I am conducting a series of experiments to build translation systems
> using Moses in which the corpus of the current experiment is a subset of
> the corpora used in the previous experiment. I have started with the
> Europarl corpora and am likely to repeat this process about 20 times.
> Unless I am mistaken, this is going to take me nearly a month and I am
> looking for ways to speed up the whole process.
> 
> Is there any optimal way to run Giza++ on these different subsets of
> data without having to run it again and again? 
> "I do not want to use the alignments obtained from running Giza++ on the
> entire Europarl corpora, for the other experiments (by selecting the
> alignment information from aligned.grow-diag-final-and for the sentences
> in the subsets)."
> 
> The order of the experiments does not matter, so the experiments can be
> done on the smallest dataset followed by supersets of the previous
> dataset, provided there is a way to modify the translation probabilities
> from Giza++ using just the additional data alone and this does not
> affect the performance of Giza++ in comparison to when Giza++ is run on
> the corpus in stand-alone mode. 
> 
> Kindly let me know if there is some way to do this and I am missing it.
> 
> - regards,
> Prasanth 
> 
> 
> -- 
> "Theories have four stages of acceptance. i) this is worthless nonsense;
> ii) this is an interesting, but perverse, point of view, iii) this is
> true, but quite unimportant; iv) I always said so."
> 
>   --- J.B.S. Haldane
> 
> 
> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
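
For reference, the shortcut the question rules out (reusing the full-corpus alignments for a subset) amounts to filtering the alignment file by sentence index. The sketch below is only an illustration under stated assumptions: the alignment file has one line per sentence pair, in the same order as the corpus, and the subset is given as 0-based sentence indices. The function name `select_alignments` and the sample data are hypothetical, not part of Moses or Giza++.

```python
from typing import Iterable, List, Set


def select_alignments(alignment_lines: Iterable[str], keep: Set[int]) -> List[str]:
    """Return the alignment lines whose 0-based sentence index is in `keep`.

    Assumes line i of the alignment file belongs to sentence pair i of the corpus.
    """
    return [
        line.rstrip("\n")
        for index, line in enumerate(alignment_lines)
        if index in keep
    ]


if __name__ == "__main__":
    # Made-up alignment lines in the usual "source-target" index-pair format.
    full_corpus_alignments = ["0-0 1-1\n", "0-1 1-0\n", "0-0\n", "0-0 1-1 2-2\n"]
    subset_indices = {0, 2}  # hypothetical subset of sentence pairs
    for line in select_alignments(full_corpus_alignments, subset_indices):
        print(line)
```

Note that alignments selected this way were still estimated on the full corpus, so they are not equivalent to running Giza++ on the subset alone, which is the reason given above for not wanting this approach.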
