It's complaining about a sentence that is too long in characters (over
1000), rather than in words. Is there possibly a very, very long word?
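
One way to check, as a rough sketch only: scan the cleaned corpus files for
any single whitespace-separated token that is 999 bytes or longer. The limit
is taken from the "MAX_WORD-1=999" in the error message below. The function
names are made up for illustration, and measuring UTF-8 bytes rather than
characters is an assumption based on the strlen() call in the assertion.

```python
MAX_WORD = 1000  # assumed from "MAX_WORD-1=999" in the symal error message


def find_long_tokens(lines, limit=MAX_WORD - 1):
    """Yield (line_number, byte_length) for every whitespace-separated
    token whose UTF-8 encoding is at least `limit` bytes long."""
    for line_no, line in enumerate(lines, start=1):
        for token in line.split():
            byte_length = len(token.encode("utf-8"))
            if byte_length >= limit:
                yield line_no, byte_length


def report(paths):
    """Print every over-long token found in the given corpus files."""
    for path in paths:
        # errors="replace" keeps the scan going past badly encoded bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line_no, byte_length in find_long_tokens(handle):
                print(f"{path}:{line_no}: token of {byte_length} bytes")
```

Calling report() with the paths of both sides of the parallel corpus lists
the lines worth inspecting. A filter that only counts tokens per sentence
would let such a line through, since one huge token still counts as one.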

On 27 January 2011 02:33, Joachim Van den Bogaert
<[email protected]> wrote:

> Hi everyone,
>
> I encountered a problem during training in step 3 (Align words) with the
> following message:
>
> Using SCRIPTS_ROOTDIR:
> /opt/moses-tools/moses-scripts/scripts-20100707-1101/
> Using single-thread GIZA
> (3) generate word alignment @ Wed Jan 26 19:12:08 UTC 2011
> Combining forward and inverted alignment from files:
>  /mnt/data//giza.es-en/es-en.A3.final.{bz2,gz}
>  /mnt/data//giza.en-es/en-es.A3.final.{bz2,gz}
> Executing: mkdir -p /mnt/data//model
> Executing:
>
> /opt/moses-tools/moses-scripts/scripts-20100707-1101//training/symal/giza2bal.pl
> -d "gzip -cd /mnt/data//giza.en-es/e
> gzip -cd /mnt/data//giza.es-en/es-en.A3.final.gz"
> |/opt/moses-tools/moses-scripts/scripts-20100707-1101//training/symal/symal
> -a
> nal="yes" -final="yes" -both="yes" >
> /mnt/data//model/aligned.grow-diag-final-and
> symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
> 1500066: target len=999 is not less than MAX_WORD-1=999
> symal: symal.cpp:83: int getals(std::fstream&, int&, int*, int&, int*):
> Assertion `strlen(w)<1000-1' failed.
> Aborted
> Exit code: 134
> ERROR: Can't generate symmetrized alignment file
>
> Has anyone encountered this message before?
> And does anyone have a clue what it means?
>
> I cleaned the corpus to contain sentences with only 0-50 tokens.
> I checked this after the cleaning procedure, so isn't it strange that I get
> the message:
>
> target len=999 is not less than MAX_WORD-1=999
>
> Thanks,
> Joachim
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support