Hi Hieu

On Dec 1, 2011, at 8:34 AM, Hieu Hoang wrote:

> hi all
> 
> can anyone tell me if creating LM with the IRST toolkit is integrated into 
> the EMS yet?
> 

I'll let someone else answer this point.

> if not, is this the entirety of what has to be run?
>   cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
>   $IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
>   $IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
> 

Yes, this is the procedure for training an LM with IRSTLM.
If your corpus is not too big and fits in memory, you
can use the tlm command to estimate the LM and store it
directly in binary format (skipping the compile-lm step).
Please see the IRSTLM manual for details on its usage,
and send further questions directly to the IRSTLM mailing list:
user-irs...@list.fbk.eu
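
For convenience, the three steps quoted above could be wrapped in a single
shell function. This is only a sketch: the function name train_irstlm_lm,
its argument order, and the use of mktemp for the working directory are
assumptions made for illustration, while the IRSTLM commands and options
are copied from the quoted message. Option syntax (for example
"--text yes") may differ between IRSTLM versions, so check the manual for
the release you have installed.

```shell
# Sketch: train a gzipped text-format LM with IRSTLM.
# Hypothetical usage: train_irstlm_lm IRSTLM_DIR CORPUS LMFILE [ORDER]
# The body runs in a subshell "( ... )" so variables and "set -eu"
# do not leak into the calling shell.
train_irstlm_lm() (
    set -eu
    IRSTLM="$1"; export IRSTLM   # exported so the IRSTLM scripts can see it
    corpus="$2"                  # plain-text corpus, one sentence per line
    lmfile="$3"                  # output file (gzipped)
    order="${4:-5}"              # n-gram order; 5 as in the quoted command

    # Working directory (assumption: mktemp -d is available).
    # It is removed only on success, so a failed run can be inspected.
    tmp="$(mktemp -d)"

    # Step 1: add sentence boundary symbols and compress.
    # Note: without "pipefail", a failure on the left side of a pipe
    # is not detected by "set -e".
    "$IRSTLM/bin/add-start-end.sh" < "$corpus" \
        | gzip -c > "$tmp/monolingual.setagged.gz"

    # Step 2: build the intermediate LM (options as in the quoted message).
    "$IRSTLM/bin/build-lm.sh" -t "$tmp/stat" \
        -i "gunzip -c $tmp/monolingual.setagged.gz" \
        -n "$order" -p -o "$tmp/iarpa.gz" -k 10

    # Step 3: convert the intermediate LM to text format and compress.
    "$IRSTLM/bin/compile-lm" "$tmp/iarpa.gz" --text yes /dev/stdout \
        | gzip -c > "$lmfile"

    rm -rf "$tmp"
)
```

It would then be called as, for example:
train_irstlm_lm "$IRSTLM" "$CORPUSFILE" "$LMFILE"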


best
Nicola

> 
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

