Hi Barry,

That's what I meant by filtering. The 300 GB figure is the memory usage
of the pruned phrase table at runtime, when it is loaded as an instance
of PhraseDictionaryMemory. As a text file, the pruned phrase table is
roughly 20 GB. The original unpruned phrase table takes up around 50 GB
as a text file, but we are not using that one anyway.

19/9/2011, "Barry Haddow" <[email protected]> wrote:

>> Pruning is also not enough, our filtered phrase table still takes around
>> 300 GB when loaded into memory, I did not even dare to try and load the
>> unfiltered phrase-table into memory :). But I will take a look at the
>> implementation from the marathon, thanks.
>
>I think Hieu was referring to this
>http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc16
>rather than filtering, which may be of some use. It's hard to imagine that a
>500G phrase table doesn't contain a lot of noise. I'm surprised that filtering
>doesn't remove more though - are you decoding large batches of sentences?

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
