Hi Hieu

On Dec 1, 2011, at 8:34 AM, Hieu Hoang wrote:

> hi all
> 
> can anyone tell me if creating LM with the IRST toolkit is integrated into 
> the EMS yet?
> 

I'll let someone else answer this point.

> if not, is this the entirety of what has to be run?
>   cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
>   $IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
>   $IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
> 

Yes, this is the procedure for training an LM with IRSTLM.
If your corpus is not too big and fits in memory, you
can use the tlm command to estimate the LM and store it
directly in binary format (skipping the compile-lm step).
Please see the IRSTLM manual for details on its usage,
and send further questions directly to the irstlm mailing list:
[email protected]
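
For convenience, the three commands quoted above could be wrapped in a
small shell function along these lines. This is only a sketch: the
commands and flags are copied from the thread, while the train_lm and
run names and the DRY_RUN switch are hypothetical helpers of mine, not
part of IRSTLM or Moses. It assumes IRSTLM, CORPUSFILE and LMFILE are
set by the caller and that the paths contain no spaces.

```shell
# Sketch of the three-step IRSTLM recipe quoted in this thread.
# Assumptions: IRSTLM, CORPUSFILE and LMFILE are set by the caller;
# train_lm, run and DRY_RUN are hypothetical names used for illustration.

run() {
    # Echo the command line, then execute it unless DRY_RUN=1.
    printf '%s\n' "$1"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        sh -c "$1"
    fi
}

train_lm() {
    tagged=temp/monolingual.setagged.gz
    iarpa=temp/iarpa.gz

    # Create the working directory only when commands are really run.
    [ "${DRY_RUN:-0}" = "1" ] || mkdir -p temp

    # 1. Add sentence-boundary markers and compress the corpus.
    run "cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > $tagged" || return 1

    # 2. Estimate the LM; flags are copied verbatim from the thread.
    run "$IRSTLM/bin/build-lm.sh -t stat4 -i \"gunzip -c $tagged\" -n 5 -p -o $iarpa -k 10" || return 1

    # 3. Write the final LM in text format, gzipped, to $LMFILE.
    run "$IRSTLM/bin/compile-lm $iarpa --text yes /dev/stdout | gzip -c > $LMFILE" || return 1
}
```

Setting DRY_RUN=1 before calling train_lm only prints the three command
lines, which is a cheap way to check the paths before a long training
run.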


best
Nicola

> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

