Hi all,
I am trying to build my LM with the IRSTLM toolkit. First, I added <s> tags
with 'add-start-end.sh'; my data is already tokenized and lowercased.
When I run 'build-lm.sh' it appears to work fine, but at the end of
the process no output file is found. Here's the log:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
[EMAIL PROTECTED]:~/irstlm/bin$ bash build-lm.sh -i ~/corpus/tag.es -o ~/corpus/ca.lm -n 3 -k 5 -s kneser-ney
Cleaning temporary directory stat
Extracting dictionary from training corpus
Splitting dictionary into 5 lists
Extracting n-gram statistics for each word list
dict.000
dict.001
dict.002
dict.003
dict.004
Estimating language models for each word list
dict.000
dict.001
dict.002
dict.003
dict.004
Merging language models into /home/esca/corpus/ca.lm
Cleaning temporary directory stat
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I've tried different corpus sizes, but that didn't help either. By the way,
I am running the scripts under Ubuntu 7.04 (32-bit).
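A minimal sketch of a check for the output file, assuming the script may have written it under a slightly different name than the one passed to -o (for example with an added suffix; that is only a guess). The helper name list_outputs is hypothetical and not part of IRSTLM:

```shell
# Hypothetical helper (not part of IRSTLM): print every existing file whose
# name starts with the given prefix. Returns 0 if at least one file was
# found, 1 otherwise.
list_outputs() {
    prefix=$1
    status=1
    for f in "$prefix"*; do
        # If the glob matches nothing it stays literal, so test for existence.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            status=0
        fi
    done
    return $status
}

# The path below is the one given to -o in the log above.
list_outputs "$HOME/corpus/ca.lm" || echo "no file starting with $HOME/corpus/ca.lm"
```

If this prints nothing but the "no file" message, the merge step really did not write anything under that prefix.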
Regards,
Miguel
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support