Hi Suzy,

I could reproduce this error in a way that I assume matches what you did: you changed the specification of the CORPUS, but you did not disable the truecaser.
You need to comment out the following settings:

[TRUECASER]

### script to train truecaser models
#
#trainer = $moses-script-dir/recaser/train-truecaser.perl

[GENERAL]

# truecasers
#input-truecaser = $moses-script-dir/recaser/truecase.perl
#output-truecaser = $moses-script-dir/recaser/truecase.perl
#detruecaser = $moses-script-dir/recaser/detruecase.perl

If these are not disabled, the script still thinks that it has to build a truecaser model, and hence needs to find unpreprocessed data.

-phi

On Tue, May 25, 2010 at 10:47 AM, Suzy Howlett <[email protected]> wrote:
> Hi,
>
> I'm trying to run a system through the EMS where all of the
> preprocessing (tokenization, lowercasing) has already been done for all
> of the training, tuning and evaluation data. The intermediate steps are
> not available, and I just provide the ultimate lowercased data. In my
> config file I have e.g.
>
> [CORPUS:combined]
> lowercased-stem = $wmt10preproc-data/training/lowercased
>
> where the directory $wmt10preproc-data/training contains two files,
> lowercased.de and lowercased.en. The variables raw-stem, tokenized-stem,
> clean-stem are not set.
>
> However when I run the system, it looks like it's still trying to run
> the get-corpus/tokenize/clean steps - it produces files like
> steps/1/CORPUS_combined_get-corpus.1* which contain error messages about
> not being able to find files. What am I missing?
>
> Thanks,
> Suzy
>
> --
> Suzy Howlett
> http://web.science.mq.edu.au/~showlett/
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
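
As a quick way to check whether any of the truecaser settings named in the reply are still active in a config file, something like the following Python sketch might help. It is not part of Moses or the EMS. The function name `active_truecaser_settings` is hypothetical, and the parsing assumes a simple layout of `[SECTION]` headers, `key = value` lines, and `#` comments, which may not cover every EMS config.

```python
import re

# Setting names taken from the reply above.
TRUECASER_KEYS = ("trainer", "input-truecaser", "output-truecaser", "detruecaser")


def active_truecaser_settings(config_text):
    """Return (section, key) pairs for truecaser settings that are not commented out.

    Hypothetical helper: assumes '[SECTION]' headers, 'key = value' lines
    and '#' comment lines.
    """
    active = []
    section = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or commented-out setting
        header = re.match(r"\[([^\]]+)\]", stripped)
        if header:
            section = header.group(1)
            continue
        key = stripped.split("=", 1)[0].strip()
        # 'trainer' is only of interest inside the [TRUECASER] section.
        if key == "trainer" and section != "TRUECASER":
            continue
        if key in TRUECASER_KEYS:
            active.append((section, key))
    return active
```

Calling it on the text of the config file (for example `active_truecaser_settings(open(path).read())`, with `path` pointing at your own config) should return an empty list once all four settings are commented out.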
