Hi Suzy,

I could reproduce this error, in what I assume is the same way you
triggered it: you changed the specification of the CORPUS, but you did
not disable the truecaser.

You need to comment out the following settings:

[TRUECASER]

### script to train truecaser models
#
#trainer = $moses-script-dir/recaser/train-truecaser.perl

[GENERAL]
# truecasers
#input-truecaser = $moses-script-dir/recaser/truecase.perl
#output-truecaser = $moses-script-dir/recaser/truecase.perl
#detruecaser = $moses-script-dir/recaser/detruecase.perl

If these are not disabled, the script still thinks that it has to build
a truecaser model, and hence looks for unpreprocessed data. That is why
it still tries to run the get-corpus/tokenize/clean steps.
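
If you want to double-check a config before re-running, something like
the rough sketch below lists truecaser settings that are still active.
The function name active_truecaser_settings and the TRUECASER_KEYS table
are made up for this illustration and are not part of EMS; the setting
names are the ones listed above. It assumes a simplified line-based
reading of the config (section headers alone on a line, "#" for
comments), not the EMS parser itself.

```python
import re

# Setting names are taken from the config excerpt above.
# The helper and this table are hypothetical, not part of EMS.
TRUECASER_KEYS = {
    "TRUECASER": {"trainer"},
    "GENERAL": {"input-truecaser", "output-truecaser", "detruecaser"},
}


def active_truecaser_settings(config_text):
    """Return (section, key) pairs for truecaser settings not commented out."""
    found = []
    section = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line, comment, or commented-out setting
        header = re.fullmatch(r"\[([^\]]+)\]", stripped)
        if header:
            section = header.group(1)
            continue
        key = stripped.split("=", 1)[0].strip()
        if key in TRUECASER_KEYS.get(section, set()):
            found.append((section, key))
    return found
```

Calling it on the text of your config file should return an empty list
once all four settings are commented out; anything it returns is a
setting that may still trigger the truecaser steps.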

-phi


On Tue, May 25, 2010 at 10:47 AM, Suzy Howlett <[email protected]> wrote:
> Hi,
>
> I'm trying to run a system through the EMS where all of the
> preprocessing (tokenization, lowercasing) has already been done for all
> of the training, tuning and evaluation data. The intermediate steps are
> not available, and I just provide the ultimate lowercased data. In my
> config file I have e.g.
>
> [CORPUS:combined]
> lowercased-stem = $wmt10preproc-data/training/lowercased
>
> where the directory $wmt10preproc-data/training contains two files,
> lowercased.de and lowercased.en. The variables raw-stem, tokenized-stem,
> clean-stem are not set.
>
> However when I run the system, it looks like it's still trying to run
> the get-corpus/tokenize/clean steps - it produces files like
> steps/1/CORPUS_combined_get-corpus.1* which contain error messages about
> not being able to find files. What am I missing?
>
> Thanks,
> Suzy
>
> --
> Suzy Howlett
> http://web.science.mq.edu.au/~showlett/
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>