Hi, David,

thanks for reporting the issue with train-factored... and 
filter-and-binarize... 
It's just that the renaming by Philip Williams was not quite complete; it
should be fixed now.

The new names of the scripts are simpler: train-model.perl and 
filter-model-given-input.perl
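
For reference, a typical single-factor run would look roughly like this
(the paths, language codes and LM settings below are only placeholders,
so adjust them to your own setup):

  train-model.perl --root-dir /path/to/working-dir \
    --corpus corpus/train --f ar --e en \
    --lm 0:3:/path/to/lm/english.lm \
    --alignment grow-diag-final-and --reordering msd-bidirectional-fe

  filter-model-given-input.perl filtered-dir \
    /path/to/working-dir/model/moses.ini input.ar

Apart from the names, the options should be unchanged.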

Cheers, O.

David Edelstein wrote:
> Hello,
> 
> I'm using Moses to do some SMT on Arabic, experimenting with
> diacritized vs. undiacritized Arabic training corpora. (I am using
> MADA+TOKAN to perform automatic diacritization.) So, if anyone happens
> to be specifically interested in Arabic, has some tips on using Moses
> for Arabic (right now I am just trying to get a baseline system
> running, so I haven't even begun exploring which parameters I need to
> tweak from the defaults), or can give me any other insights, I'd be
> very pleased to talk to you about it off-list; please email me.
> 
> Now, I have a specific question and a specific problem, to which I
> have not found a solution by searching the archives.
> 
> 1. There are two scripts referenced in scripts/released-files (read by
> the scripts Makefile):
>    training/train-factored-phrase-model.perl
>    training/filter-and-binarize-model-given-input.pl
> 
> These scripts do not exist in the most recent SVN release, so 'make
> release' reports an error, since it obviously cannot install them.
> 
> The tutorials variously reference train-factored-phrase-model.perl
> and train-model.perl; reading the latter, it seems to do factored
> training. Is this just an error (something that should be updated
> in the online docs and released-files), and should I only be using
> train-model.perl? Or is there a difference between the two scripts?
> And is the same true of
> training/filter-and-binarize-model-given-input.pl vs.
> filter-model-given-input.pl?
> 
> 2. I went through the entire tutorial using the French-English
> Europarl data sets, and got it working. Now I'm going through the same
> process with my Arabic-English parallel corpora. I've gotten as far as
> tuning. I've been trying to use train-model.perl, and it gets to this
> part:
> 
> "<my-moses-dir>/moses-cmd/src/moses -v 0 -config
> <my-model-dir>/moses.ini -inputtype 0 -w 0.000000 -lm 0.333333 -d
> 0.333333 -tm 0.100000 0.066667 0.100000 0.066667 0.000000
> -n-best-list run1.best100.out 100 -i <my-arabic-input-file> > run1.out"
> 
> It generates run1.best100.out and run1.out, but then chokes with this
> error message:
> 
> Translation took 0.060 seconds
> Finished translating
> [ERROR] Malformed input at
>   Expected input to have words composed of 1 factor(s) (form FAC1|FAC2|...)
>   but instead received input with 2 factor(s).
> Aborted
> 
> So I gather somewhere I have a setting wrong, but I cannot figure out
> where it is. I basically followed the exact same steps with my
> Arabic-English corpora as in the tutorial, just substituting my own
> training data. I'm not trying to do factored training at this time.
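> 
> To illustrate what (I think) the error means: with a single factor,
> each input token should be a plain word, e.g.
> 
>   w1 w2 w3
> 
> whereas a line such as
> 
>   w1|x1 w2|x2 w3|x3
> 
> is read as two factors per word (w1, x1, etc. are just hypothetical
> tokens). So possibly the MADA+TOKAN output contains a stray '|'
> character somewhere, though I have not verified that.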
> 
> Any advice appreciated. Thanks!
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

-- 
Ondrej Bojar (mailto:[email protected] / [email protected])
http://www.cuni.cz/~obo
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
