Hi Hieu,

On Dec 1, 2011, at 8:34 AM, Hieu Hoang wrote:
> hi all
>
> can anyone tell me if creating LM with the IRST toolkit is integrated into
> the EMS yet?
>

I'll let someone else answer this point.

> if not, is this the entirety of what has to be run?
>    cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
>    $IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
>    $IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
>

Yes, this is the procedure to train an LM with IRSTLM.
If your corpus is not too big and fits in memory, you can use the tlm command to estimate the LM and store it directly in binary format (skipping the compile-lm step).

Please see the IRSTLM manual for details on its usage, and send further questions directly to the IRSTLM mailing list: user-irs...@list.fbk.eu

best
Nicola

>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
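
The three quoted commands can be collected into a single shell function. What follows is only a minimal sketch: the function name train_irstlm_lm, its argument order, and the DRY_RUN switch are assumptions made for illustration and are not part of IRSTLM or Moses. The IRSTLM invocations and their options are copied unchanged from the message above.

```shell
# Sketch: wrap the three IRSTLM steps from the quoted message in one function.
# Hypothetical helper: the name, arguments and DRY_RUN switch are illustrative.
# The body runs in a subshell, so its variables do not leak into the caller.
train_irstlm_lm() (
    corpus=$1            # plain-text monolingual corpus
    lmfile=$2            # output LM file (gzipped)
    workdir=${3:-temp}   # directory for intermediate files

    if [ -z "${IRSTLM:-}" ]; then
        echo "IRSTLM must point to the IRSTLM install directory" >&2
        return 1
    fi

    tagged=$workdir/monolingual.setagged.gz
    iarpa=$workdir/iarpa.gz

    # With DRY_RUN set, only print the commands that would be run.
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "cat $corpus | $IRSTLM/bin/add-start-end.sh | gzip -c > $tagged"
        printf '%s\n' "$IRSTLM/bin/build-lm.sh -t stat4 -i \"gunzip -c $tagged\" -n 5 -p -o $iarpa -k 10"
        printf '%s\n' "$IRSTLM/bin/compile-lm $iarpa --text yes /dev/stdout | gzip -c > $lmfile"
        return 0
    fi

    mkdir -p "$workdir" || return 1

    # Step 1: add sentence start/end markers and compress the result.
    # Note: only the exit status of the last command in a pipeline is checked.
    cat "$corpus" | "$IRSTLM/bin/add-start-end.sh" | gzip -c > "$tagged" || return 1

    # Step 2: build the LM, with the same options as in the message.
    "$IRSTLM/bin/build-lm.sh" -t stat4 -i "gunzip -c $tagged" -n 5 -p -o "$iarpa" -k 10 || return 1

    # Step 3: write the LM out as text and compress it.
    "$IRSTLM/bin/compile-lm" "$iarpa" --text yes /dev/stdout | gzip -c > "$lmfile" || return 1
)
```

With IRSTLM exported, a call such as `DRY_RUN=1 train_irstlm_lm "$CORPUSFILE" "$LMFILE"` prints the three commands for inspection; without DRY_RUN the same call runs them.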