Hi Marcin,

Yes, splitting the files at points where the source phrase changes and
scoring the parts in parallel is fine.
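
As a rough illustration of the split (a minimal sketch, not one of the Moses scripts): the function names below are hypothetical, and it assumes each extract line starts with the source phrase followed by the " ||| " separator. A part is only closed once the source phrase changes, so all lines for a given source phrase end up in the same part.

```python
from itertools import groupby


def source_phrase(line, sep=" ||| "):
    # Assumption: the source phrase is the first " ||| "-separated field.
    return line.split(sep, 1)[0]


def split_at_source_boundaries(lines, max_lines_per_part):
    """Yield lists of lines, cutting only where the source phrase changes.

    `lines` must already be sorted by source phrase. A part may exceed
    max_lines_per_part if a single source phrase has more lines than that.
    """
    part = []
    for _, group in groupby(lines, key=source_phrase):
        group = list(group)
        # Close the current part before it would grow past the limit.
        if part and len(part) + len(group) > max_lines_per_part:
            yield part
            part = []
        part.extend(group)
    if part:
        yield part
```

For files of this size you would write each part to disk instead of holding it in a list, but the boundary logic stays the same.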

This breaks the count-of-counts needed for Good-Turing smoothing, since
each part only sees its own counts. However, if you're not using
Good-Turing, this can be fixed later.
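
To make the count-of-counts problem concrete, here is a small sketch (hypothetical helper names, not the Moses scorer's code). The count-of-counts table maps r to the number of distinct phrase pairs seen exactly r times, so it is a statistic over the whole file. Assuming each distinct phrase pair lands in exactly one part, which holds when the cuts fall on phrase boundaries, the per-part tables can simply be summed to recover the global one.

```python
from collections import Counter


def count_of_counts(pair_counts):
    """Map r -> number of distinct phrase pairs seen exactly r times."""
    return Counter(pair_counts.values())


def merge_count_of_counts(per_part_tables):
    """Sum per-part tables; valid only if no phrase pair spans two parts."""
    total = Counter()
    for table in per_part_tables:
        total.update(table)
    return total
```

Any smoothing that depends on these totals would then have to be applied after the merge, not inside each part.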

If you need any help, let me know. I've also added some scripts for
parallelizing the extraction, and for parallel IRSTLM training, to
    scripts/generic

On 09/05/2012 07:18, Marcin Junczys-Dowmunt wrote:
> Hi all,
> my extract.sorted and extract.inv.sorted files are around 250G each, can
> I split them manually at points where the source phrase changes and
> score the parts independently? Will the scores be the same as for a
> single run?
>
> Thanks,
> Marcin
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
