This is clearly an issue related to IRSTLM.

Please send your answer to the IRSTLM mailing list (user-irstlm AT list.fbk.eu).

Maybe Hieu is right: you may have run out of memory or disk space.

How much RAM does your machine have?
Do you have free disk space in the temporary directory?
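
A quick way to check both on a Linux machine, for example:

```shell
# How much RAM (and swap) does the machine have?
free -h

# How much free disk space is left? Check the filesystem that holds
# the temporary directory passed via -t (shown here for all mounts)
df -h
```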

Please also try running the "build-lm.sh" command with these additional
parameters: "-k 10 -verbose".
Collect the stdout and stderr and send them to me.
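
For example, a re-run with the extra flags and both output streams captured in one log file could look like this (paths taken from the command in the original report; -k presumably splits the work into more, smaller parts):

```shell
export IRSTLM=$HOME/g2p/irstlm

# Re-run with -k 10 (more, smaller splits) and -verbose logging,
# sending stdout and stderr into a single log file to mail to the list
~/g2p/irstlm/bin/build-lm.sh -i file.sb.tr -t ~/g2p/flm/tmp \
    -p -s improved-kneser-ney -o file.lm.tr \
    -k 10 -verbose > build-lm.log 2>&1
```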

Nicola Bertoldi
On behalf of IRSTLM team




On Apr 2, 2013, at 2:19 PM, Hieu Hoang wrote:

I'm not an expert with IRSTLM, but you should double-check that you haven't
run out of disk space.

Also, in
   build-lm.sh
comment out the lines that delete temporary files and directories to see what
the script has created. This will help you debug the problem:
  rm $tmpdir/* 2> /dev/null
and
  rmdir $tmpdir 2> /dev/null
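
A non-destructive way to do this is to comment out those two lines in a copy of the script and run the copy instead; a sketch (the `build-lm-debug.sh` name and the sed patterns are illustrative, matching the lines quoted above):

```shell
# Copy the script so the original stays intact
cp ~/g2p/irstlm/bin/build-lm.sh build-lm-debug.sh

# Prefix the two cleanup lines with '#' so the temporary files survive
sed -i -e '\|rm $tmpdir/\* 2> /dev/null|s|^|#|' \
       -e '\|rmdir $tmpdir 2> /dev/null|s|^|#|' build-lm-debug.sh
```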



On 1 April 2013 16:27, Swapnil Jadhav <saj1...@hotmail.com> wrote:
I am getting stuck at the following step.

export IRSTLM=$HOME/g2p/irstlm; ~/g2p/irstlm/bin/build-lm.sh -i file.sb.tr
-t ~/g2p/flm/tmp -p -s improved-kneser-ney -o file.lm.tr

The .gz file is not getting created.
I have used Moses 3-4 times now and have never got stuck at this step.
The only change is that my previous training files were smaller than 10 MB,
while this one is 36 MB. Could that be the cause? When I try my previous
files, I get past this step successfully.
Please help.

Output :

saj@Jadhavs:~/g2p/flm$ ~/g2p/irstlm/bin/add-start-end.sh < ~/g2p/fcorpus/file.true.tr > file.sb.tr

saj@Jadhavs:~/g2p/flm$ ls
file.sb.tr

saj@Jadhavs:~/g2p/flm$ export IRSTLM=$HOME/g2p/irstlm; ~/g2p/irstlm/bin/build-lm.sh -i file.sb.tr -t ~/g2p/flm/tmp -p -s improved-kneser-ney -o file.lm.tr
Temporary directory /home/saj/g2p/flm/tmp does not exist
creating /home/saj/g2p/flm/tmp
Extracting dictionary from training corpus
Splitting dictionary into 3 lists
Extracting n-gram statistics for each word list
Important: dictionary must be ordered according to order of appearance of words 
in data
used to generate n-gram blocks,  so that sub language model blocks results 
ordered too
dict.000
dict.001
dict.002
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > 
$tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary 
-iknstat="$tmpdir/ikn.stat.$sdict" >> $logfile 2>&1
Estimating language models for each word list
dict.000
dict.001
dict.002
$scr/build-sublm.pl $verbose $prune $smoothing "cat $tmpdir/ikn.stat.dict.*" --size $order --ngrams "$gunzip -c $tmpdir/ngram.${sdict}.gz" -sublm $tmpdir/lm.$sdict >> $logfile 2>&1
Merging language models into file.lm.tr
Cleaning temporary directory /home/saj/g2p/flm/tmp
Removing temporary directory /home/saj/g2p/flm/tmp

saj@Jadhavs:~/g2p/flm$ ls
file.sb.tr

saj@Jadhavs:~/g2p/flm$ ~/g2p/irstlm/bin/compile-lm --text yes file.lm.tr.gz file.arpa.tr
inpfile: file.lm.tr.gz
loading up to the LM level 1000 (if any)
dub: 10000000
Failed to open file.lm.tr.gz!


From
Swapnil A Jadhav
MTech CSE-IS
NIT Warangal

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support




--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu



