Wow. I should have simply searched the entire file for the | character, 
 but my preparation scripts should have escaped that. Now, I have to 
 debug my escaping filter to see why that character wasn't escaped.
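
 For the archive, here is a rough sketch of the check I should have run
 before tuning. It is Python and not part of Moses. The function names
 find_pipe_lines and escape_pipes are made up for illustration, and the
 "&#124;" replacement is an assumption; substitute whatever escape your
 own pipeline expects. It lists every line that contains a pipe, with its
 line number, and shows one way to escape it.

```python
import sys

PIPE = "|"
# Assumption: an HTML-style entity is used as the escaped form.
# Replace this with whatever your preparation scripts are meant to emit.
ESCAPED_PIPE = "&#124;"


def find_pipe_lines(lines):
    """Return (line_number, text) pairs for every line containing a pipe."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PIPE in line
    ]


def escape_pipes(line):
    """Replace every pipe so it cannot be read as a factor delimiter."""
    return line.replace(PIPE, ESCAPED_PIPE)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Report offending lines in the file given on the command line.
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, text in find_pipe_lines(handle):
            print(f"{number}: {text}", file=sys.stderr)
```

 Saved under any name and run with mert.es as its only argument, this
 should report the line Barry found, plus any other line with a stray
 pipe.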

 Thanks for the work. Good learning experience for both of us.

 Tom

 On Wed, 29 Jun 2011 09:09:14 +0100, Barry Haddow <[email protected]> wrote:
> Hi Tom
>
> I added some extra debug and I get the following error:
>
> [ERROR] Malformed input: '|'
> In '.voto en contra de la resolución b6-0067 | 2004 del parlamento europeo sobre los procedimientos de ratificación del tratado por el que se establece una constitución para europa y la estrategia de comunicación relativa a dicho tratado .'
>   Expected input to have words composed of 1 factor(s) (form FAC1|FAC2|...)
>   but instead received input with 0 factor(s).
> Aborted
>
> This is at line 2230 in your input file, and now it's clear what the
> problem is - a stray pipe which moses is interpreting as a factor
> delimiter.
>
> It seems that if threads are enabled then moses will read in and queue
> the whole input file at start up. This is not generally a problem as the
> input files we use are normally only a few thousand sentences, but it
> explains why the error was much further down the file than expected.
> I'll check in the extra debug code because it should be quite useful in
> this context. Getting the line number would be useful too, but would
> require more work.
>
> cheers - Barry
>
> On Tuesday 28 June 2011 15:59, Tom Hoar wrote:
>> I'm tuning a new ES-EN translation model. The tables were trained
>> with about 1.75 million pairs from the Europarl v6 data using Moses
>> w/KenLM SVN rev 4011 and IRSTLM 5.60.03. The attachments herewith
>> include the run1.moses.ini file and the output log from mert-moses.pl
>> that also includes the command line.
>>
>> If I run from a terminal command line:
>>
>> "$ moses -f run1.moses.ini < mert.es > run0.out"
>>
>> Moses terminates with the same error as in the mert-moses.pl.log file.
>> Piping any other file into moses as above also terminates with the same
>> error. I also removed the [threads] value to run single threaded, and
>> again, same terminal error.
>>
>> If I run in a terminal:
>>
>> "$ moses -f run1.moses.ini"
>>
>> and then copy lines from the mert.es file and paste them into the
>> terminal, they translate fine.
>>
>> Also, three days ago, a tuning/training session with the same moses
>> build completed fine. It used a different training corpus that started
>> from the same data and used clean-corpus-n.perl with max tokens = 78.
>> This corpus uses max tokens = 65 and extracted a different 2500 pairs
>> for tuning. Those are the only differences in the two training corpora.
>>
>> I'm baffled. Any suggestions?
>>
>> Tom


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
