Hi Tom I added some extra debug and I get the following error:
[ERROR] Malformed input: '|' In '.voto en contra de la resolución b6-0067 | 2004 del parlamento europeo sobre los procedimientos de ratificación del tratado por el que se establece una constitución para europa y la estrategia de comunicación relativa a dicho tratado .' Expected input to have words composed of 1 factor(s) (form FAC1|FAC2|...) but instead received input with 0 factor(s). Aborted This is at line 2230 in your input file, and now it's clear what the problem is - a stray pipe which moses is interpreting as a factor delimiter. It seems that if threads are enabled then moses will read in and queue the whole input file at start up. This is not generally a problem as the input files we use are normally only a few thousand sentences, but it explains why the error was much further down the file than expected. I'll check in the extra debug code because it should be quite useful in this context. Getting the line number would be useful too, but would require more work, cheers - Barry On Tuesday 28 June 2011 15:59, Tom Hoar wrote: > I'm tuning a new ES-EN translation model. The tables were trained > with about 1.75 million pairs from the Europarl v6 data using Moses > w/KenLM SVN rev 4011 and IRSTLM 5.60.03. The attachments herewith > include the run1.moses.ini file and the output log from mert-moses.pl > that also includes the command line. > > If I run from a terminal command > line: > > "$ moses -f run1.moses.ini < mert.es > run0.out" > > Moses > terminates with the same error in the mert-moses.pl.log file. Piping any > other file into moses as above also terminates with the same error. I > also removed the [threads] value to run single threaded, and again, same > terminal error. > > If I run in a terminal: > > "$ moses -f run1.moses.ini" > > > then, copy lines from the mert.es file and paste into the terminal, > they translate fine. > > Also, three days ago, a tuning/training session > with the same moses build competed fine. 
> It used a different training corpus, started from the same data, and used
> clean-corpus-n.perl with max tokens = 78. This corpus uses max tokens = 65
> and extracted a different 2500 pairs for tuning. Those are the only
> differences in the two training corpora.
>
> I'm baffled. Any suggestions?
>
> Tom

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
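[Editor's note] Since Moses treats `|` as a factor delimiter, a quick pre-flight check on the tuning input can surface this kind of problem (and the offending line number) before a run. A minimal sketch with standard grep/sed; the file paths and the sample data below are stand-ins, not from the original thread:

```shell
# Sample data standing in for the real tuning input (mert.es in Tom's report):
printf 'primera frase limpia\nvoto en contra de la resolución b6-0067 | 2004\n' > /tmp/mert.es

# grep -n prints the 1-based line number of every line containing a '|',
# i.e. every line Moses would parse as multi-factor input:
grep -n '|' /tmp/mert.es

# To sanitize rather than inspect, replace each pipe with a placeholder
# (a hyphen here; anything that is not a factor delimiter will do):
sed 's/|/-/g' /tmp/mert.es > /tmp/mert.clean.es
```

Running the same `grep -n` on the real 2500-line tuning set should point straight at line 2230.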
