Hi,

a much better solution is to use sparse feature functions, which compute the feature values on the fly and store them efficiently in the decoder.
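To illustrate the idea, here is a minimal Python sketch. It is not the Moses or MIRA-branch code; the feature templates, feature names, and weights are hypothetical and chosen only for illustration. Only the features that actually fire for a phrase pair are computed and stored, so the zeros never take up memory.

```python
from typing import Dict, Tuple

# A sparse vector maps feature names to non-zero values only.
SparseVector = Dict[str, float]


def sparse_features(source: Tuple[str, ...], target: Tuple[str, ...]) -> SparseVector:
    """Compute, on the fly, only the features that fire for this phrase pair.

    The two templates below (word-pair and length-difference indicators)
    are assumptions made for this example.
    """
    feats: SparseVector = {}
    # One indicator per co-occurring source/target word pair.
    for s in source:
        for t in target:
            name = f"wp_{s}_{t}"
            feats[name] = feats.get(name, 0.0) + 1.0
    # One indicator for the target/source length difference.
    feats[f"lendiff_{len(target) - len(source)}"] = 1.0
    return feats


def score(feats: SparseVector, weights: SparseVector) -> float:
    """Dot product that touches only the non-zero feature entries."""
    return sum(value * weights.get(name, 0.0) for name, value in feats.items())


# Hypothetical tuned weights; any feature without a weight contributes 0.
weights = {"wp_maison_house": 0.5, "lendiff_0": -0.1}
feats = sparse_features(("la", "maison"), ("the", "house"))
print(len(feats), score(feats, weights))
```

With a dense phrase table, each entry would carry a float for every one of the few thousand features; here each phrase pair holds only the handful that are non-zero.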
We have already created some such sparse feature functions in the MIRA branch of the decoder. I am currently not sure which repository holds a version of this; maybe Barry Haddow or Eva Hasler has a better answer.

-phi

On Wed, Sep 7, 2011 at 8:34 AM, Anne Schuth <[email protected]> wrote:
> Hi all,
>
> We are in the process of reimplementing some of the 11,001 new features of
> the Chiang et al. 2009 paper. We are adding a few thousand features to our
> phrase table, causing it to blow up significantly. For tuning purposes we
> filter the table to only include phrases used by our tuning dataset which
> brings the size on disk down to about 200MB (gzipped). However, as soon as
> we load this table into memory with Moses, it takes more than 60GB. This is
> not really a surprise I guess since Moses will represent all our 0's as
> floating points, but it is a problem since not all machines I would like to
> run this on have that much memory.
> This leads to my question: does Moses support some form of sparse
> representation of phrase tables? Or, how is this issue generally solved, as
> I am quite sure we are not the first to try this.
>
> Any comments, pointers to documentation are very much appreciated!
>
> Best,
> Anne
>
> --
> Anne Schuth
> ILPS - ISLA - FNWI
> University of Amsterdam
> Science Park 904, C3.230
> 1098 XH AMSTERDAM
> The Netherlands
> 0031 (0) 20 525 5357
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
