Yes, the phrase penalty can be implemented internally, rather than stored
as a constant in the phrase table.
However, if you want to reduce the memory used by the binary phrase table,
there are other things which will reduce it even more:
1. All vocabulary entries are 64 bit. Reducing this to 32 bit or
less will probably halve the size.
2. The target phrases are duplicated over and over for every source
phrase. Deduplicating the target phrases will also substantially reduce
the file size.
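
To make the two points above concrete, here is a rough Python sketch. It is
not the Moses binary format or its code: the function names (build_tables,
packed_size) and the toy phrase pairs are made up for illustration. It stores
every distinct target phrase once and lets source phrases refer to it by
index, then compares the bytes needed for the word ids at 64 bit and 32 bit.

```python
import struct


def build_tables(phrase_pairs):
    """Map words to integer ids and store each distinct target phrase once.

    phrase_pairs: iterable of (source_words, target_words) tuples.
    Returns (vocab, target_list, table), where table maps a source phrase
    (tuple of ids) to a list of indexes into target_list.
    """
    vocab = {}        # word -> integer id
    target_index = {} # target phrase (tuple of ids) -> index in target_list
    target_list = []  # distinct target phrases, each stored once
    table = {}        # source phrase -> indexes of its target phrases

    def to_ids(words):
        return tuple(vocab.setdefault(w, len(vocab)) for w in words)

    for source_words, target_words in phrase_pairs:
        src, tgt = to_ids(source_words), to_ids(target_words)
        idx = target_index.get(tgt)
        if idx is None:
            idx = len(target_list)
            target_index[tgt] = idx
            target_list.append(tgt)
        table.setdefault(src, []).append(idx)
    return vocab, target_list, table


def packed_size(phrases, fmt_char):
    """Bytes needed to pack the word ids of all phrases.

    fmt_char is a struct code: 'Q' = 64-bit ids, 'I' = 32-bit ids
    (standard sizes, little-endian).
    """
    return sum(len(struct.pack("<%d%s" % (len(p), fmt_char), *p))
               for p in phrases)


# Toy data (hypothetical): "the house" is the target of two source phrases.
pairs = [
    (("das", "haus"), ("the", "house")),
    (("dem", "haus"), ("the", "house")),
    (("haus",), ("house",)),
    (("das", "haus"), ("the", "home")),
]

vocab, target_list, table = build_tables(pairs)

# Target side as stored without deduplication: one copy per phrase pair.
duplicated = [tuple(vocab[w] for w in tgt) for _, tgt in pairs]

size_dup_64 = packed_size(duplicated, "Q")    # 64-bit ids, duplicated
size_dedup_32 = packed_size(target_list, "I") # 32-bit ids, deduplicated
```

On this toy input the target side shrinks from 56 bytes to 20 bytes. A real
table would also need to store the indexes and scores, so the actual saving
depends on how often target phrases repeat.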
On 17/02/2012 13:54, Marcin Junczys-Dowmunt wrote:
> Hi all,
> I am looking for different ways to decrease the file size of phrase
> tables and only now I realized that the phrase penalty is always exp(1).
> Stored in a float, this one value alone takes up around 1.3G of disk
> space for a phrase-table with 330M phrase pairs. Do you think there is
> a good reason for keeping the phrase penalty in the phrase table instead
> of the moses.ini file, similar to the word penalty?
>
> Best,
> Marcin
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>