A sentence mismatch error is definitely an important error. Is there a
problem with your corpus? Dodgy encoding, Windows carriage returns, ran
out of disk space, etc.?
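
A rough way to check the first two of those is sketched below. It is only an
illustration, not part of Moses: the function names are made up, and it
assumes the corpus is meant to be UTF-8 with one sentence per line. It reads
each side as raw bytes, flags lines containing a carriage return or invalid
UTF-8, and compares the line counts of the two sides.

```python
def scan_corpus_side(path):
    """Return (line_count, problems) for one side of a parallel corpus.

    Hypothetical helper: problems is a list of (line_number, description).
    """
    problems = []
    count = 0
    # Read bytes so that a badly encoded line cannot abort the scan.
    with open(path, "rb") as handle:
        for count, raw in enumerate(handle, start=1):
            if b"\r" in raw:
                problems.append((count, "carriage return"))
            try:
                raw.decode("utf-8")  # assumption: the corpus should be UTF-8
            except UnicodeDecodeError:
                problems.append((count, "invalid UTF-8"))
    return count, problems


def check_parallel_corpus(source_path, target_path):
    """Scan both sides and report whether their line counts agree."""
    src_count, src_problems = scan_corpus_side(source_path)
    tgt_count, tgt_problems = scan_corpus_side(target_path)
    return {
        "line_counts": (src_count, tgt_count),
        "line_counts_match": src_count == tgt_count,
        "source_problems": src_problems,
        "target_problems": tgt_problems,
    }
```

With the command line quoted below, the two files would presumably be
bitext.es and bitext.en_us under the --corpus stem. A clean report here does
not rule out other causes such as a full disk during the GIZA run.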

Also, don't use the = character in directory names any more. It's now used
to separate key=value pairs, e.g. in the refactored ini file, a phrase-table
entry
  0 0 0 5 file
becomes
  PhraseDictionaryMemory path=file input-factor=0 output-factor=0
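
To see why an = inside a path is a problem for this style of line, here is a
toy parser. It is a hypothetical sketch and not the actual Moses code: it
assumes each argument is split strictly on = and must yield exactly one key
and one value.

```python
def parse_feature_line(line):
    """Parse 'Name key=value key=value ...' into (name, options).

    Hypothetical illustration only; the real decoder may parse differently.
    """
    name, *args = line.split()
    options = {}
    for arg in args:
        pieces = arg.split("=")
        # A value that itself contains '=' produces more than two pieces,
        # so a strict parser cannot tell where the key ends.
        if len(pieces) != 2:
            raise ValueError(f"ambiguous argument: {arg!r}")
        options[pieces[0]] = pieces[1]
    return name, options
```

Under that assumption the line above parses cleanly, while an argument such
as path=/some/dir-t=es/phrase-table is rejected as ambiguous.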

It's not the cause of your error, but it will affect you further down the
line. Sorry, I should have highlighted this potential problem a little more.

On 15 July 2013 02:07, Tom Hoar <[email protected]> wrote:

> Here is the command line when I ran train-model.perl.
>
> /usr/bin/perl -w /usr/local/bin/train-model.perl \
>   --do-steps 3 \
>   --cores 6 \
>   --corpus /opt/domy/BUILDS/lm/es-test-retokr/bitext \
>   --e en_us \
>   --external-bin-dir /usr/local/bin \
>   --f es \
>   --lm 0:0:/tmp/placeholder.lm:0 \
>   --max-phrase-length 10 \
>   --mgiza \
>   --mgiza-cpus 6 \
>   --model-dir /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10 \
>   --root-dir /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10
>
> The log output has a non-fatal error "Sentence mismatch error!" Any
> ideas about the cause or importance?
>
> (3) generate word alignment @ Mon Jul 15 07:44:56 ICT 2013
> Combining forward and inverted alignment from files:
> /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10/giza.es-en_us/es-en_us.A3.final.{bz2,gz}
> /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10/giza.en_us-es/en_us-es.A3.final.{bz2,gz}
> Executing: mkdir -p /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10
> Executing: /usr/local/lib/mosesdecoder/scripts/training/giza2bal.pl -d "gzip -cd /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10/giza.en_us-es/en_us-es.A3.final.gz" -i "gzip -cd /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10/giza.es-en_us/es-en_us.A3.final.gz" |/usr/local/lib/mosesdecoder/scripts/../bin/symal -alignment="grow" -diagonal="yes" -final="yes" -both="no" > /opt/domy/TRAININGS/merts/mert-t=es-l=es-test-retokr-T=irstlmken-n=12-a=giza-g=10/aligned.grow-diag-final
> symal: computing grow alignment: diagonal (1) final (1)both-uncovered (0)
> Sentence mismatch error! Line #1179689
> skip=<0> counts=<1227038>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>



-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
