Yes, the phrase penalty can be implemented internally rather than
stored as a constant in the phrase table.
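
A minimal sketch of the idea, not actual Moses code: since the stored value is always exp(1), its log is 1 for every phrase pair, so the decoder could add the feature weight once per applied phrase instead of reading the value from the table. The function name and the weight of 0.2 are hypothetical; the 330M phrase pairs and the 4-byte float come from the message quoted below.

```python
import math

# The value currently stored with every phrase pair.
STORED_PHRASE_PENALTY = math.exp(1)

def phrase_penalty_score(num_phrases, weight):
    """Log-linear contribution of the phrase penalty for one hypothesis.

    log(exp(1)) == 1, so each applied phrase contributes exactly `weight`.
    """
    return weight * num_phrases * math.log(STORED_PHRASE_PENALTY)

# A hypothesis built from 3 phrases, with an illustrative weight of 0.2.
score = phrase_penalty_score(3, 0.2)

# Disk space taken by the constant: one 4-byte float per phrase pair.
bytes_saved = 330_000_000 * 4
```

Computed this way, the score depends only on the number of phrases used, so nothing needs to be stored per phrase pair.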

However, if you want to reduce the memory used by the binary phrase
table, there are other changes that will reduce it even more:
    1. All vocabulary entries are 64 bit. Reducing them to 32 bit or
less will probably halve the size.
    2. The target phrases are duplicated over and over for every source
phrase. Deduplicating the target phrases will also substantially reduce
the file size.
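
The two points above can be illustrated with a toy encoder. This is only a sketch under assumed layouts, not the real Moses binary format: the byte layout, the function names, and the example table are all hypothetical. It compares storing every target phrase inline with 64-bit word IDs against storing each distinct target phrase once with 32-bit word IDs and referring to it by a 32-bit index.

```python
import struct

def encode_naive(table, vocab):
    """Store each target phrase inline, using 64-bit word IDs."""
    out = bytearray()
    for _source, targets in table:
        for target in targets:
            out += struct.pack("<H", len(target))      # phrase length
            for word in target:
                out += struct.pack("<Q", vocab[word])  # 64-bit word ID
    return bytes(out)

def encode_dedup(table, vocab):
    """Store each distinct target phrase once, using 32-bit word IDs,
    and reference it from the source phrases by a 32-bit pool index."""
    pool = {}                 # target phrase -> index in the pool
    pool_bytes = bytearray()  # the deduplicated target phrases
    refs = bytearray()        # one index per (source, target) pair
    for _source, targets in table:
        for target in targets:
            key = tuple(target)
            if key not in pool:
                pool[key] = len(pool)
                pool_bytes += struct.pack("<H", len(key))
                for word in key:
                    pool_bytes += struct.pack("<I", vocab[word])
            refs += struct.pack("<I", pool[key])
    return bytes(pool_bytes), bytes(refs)

# Hypothetical toy phrase table: several source phrases share a target.
table = [
    ("das haus", [["the", "house"], ["the", "home"]]),
    ("dem haus", [["the", "house"]]),
    ("des hauses", [["the", "house"], ["of", "the", "house"]]),
]
vocab = {w: i for i, w in enumerate(["the", "house", "home", "of"])}

naive = encode_naive(table, vocab)
pool_bytes, refs = encode_dedup(table, vocab)
print(len(naive), len(pool_bytes) + len(refs))
```

How much is saved on a real table depends on how often target phrases repeat and on how the rest of the entry (scores, alignments) is stored.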

On 17/02/2012 13:54, Marcin Junczys-Dowmunt wrote:
> Hi all,
> I am looking for different ways to decrease the file size of phrase
> tables, and only now I realized that the phrase penalty is always exp(1).
> Stored as a float, this one value alone takes up around 1.3G of disk
> space for a phrase table with 330M phrase pairs. Do you think there is
> a good reason for keeping the phrase penalty in the phrase table instead
> of in the moses.ini file, similar to the word penalty?
>
> Best,
> Marcin
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>