Hi,

yes, that is correct: step 1 only does the data preparation for GIZA++.
The most time-consuming part of it is running mkcls to create the word
classes for the relative distortion models.
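
To make "data preparation" concrete, here is a rough sketch in Python of
the vocabulary-building and sentence-numbering part only. It is not the
code from the training script: the function names (build_vocab, numberize),
the toy corpus, and the choice to start word IDs at 2 are all assumptions
made for illustration, and the mkcls word-class step is not covered at all.

```python
from collections import Counter

def build_vocab(sentences):
    """Map each word to an integer ID, most frequent word first.

    Hypothetical helper: starting the IDs at 2 is an assumption made for
    this sketch, not a statement about the real vocabulary file layout.
    """
    counts = Counter(word for s in sentences for word in s.split())
    # Sort by descending frequency, then alphabetically so the result is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: i for i, (word, _) in enumerate(ordered, start=2)}

def numberize(sentence, vocab):
    """Replace every token of a whitespace-tokenised sentence by its ID."""
    return [vocab[w] for w in sentence.split()]

# Toy parallel corpus, invented for the example.
src = ["the house", "the book"]
tgt = ["la maison", "le livre"]

src_vocab = build_vocab(src)
tgt_vocab = build_vocab(tgt)

# One (source IDs, target IDs) pair per sentence pair.
pairs = [(numberize(s, src_vocab), numberize(t, tgt_vocab))
         for s, t in zip(src, tgt)]
```

This part is a single pass over the corpus plus a sort of the vocabulary,
so it should be cheap compared with the mkcls run.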

-phi

On Mon, Aug 31, 2009 at 4:39 PM, James Read<[email protected]> wrote:
> Hi,
>
> does anyone know what step 1 of the moses training script does other
> than produce the dictionaries and the numerical sentences that enable
> GIZA++ to do its job? The reason I ask is that on my machine step 1
> takes just over 70 mins for the en-fr Europarl corpus.
>
> My optimised version of data preparation and EM IBM Model 1 completes
> in 121 seconds for five iterations of EM, that's just over 2 minutes.
> Before publishing these results I just wanted to make sure there's
> nothing I've missed about step 1 of the training process. Does it do
> anything at all that influences GIZA++ other than preparing the
> digital sentences?
>
> James
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
