Well, when I look in symal.cpp (where the assertion was), I see the 
assertion is

assert(strlen(w)<MAX_WORD-1);

and near the top of the file it says

#define MAX_WORD 10000 // maximum lengthsource/target strings

so it does sound like your target sentence is longer than it was 
expecting, although the limit in my version is 10,000 instead of 1,000.
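
To make the numbers concrete, here is a minimal sketch of the condition that 
assertion tests. It is not the symal code itself: fits_limit and require_fits 
are hypothetical helpers I made up for illustration, and I am assuming w is a 
plain NUL-terminated string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not in symal.cpp): the same condition as the
// quoted assertion, strlen(w) < MAX_WORD-1, with the limit as a parameter.
bool fits_limit(const char* w, std::size_t max_word) {
    return std::strlen(w) < max_word - 1;
}

// Hypothetical wrapper: aborts the program when the check fails, which is
// how a failed assert ends a run (unless NDEBUG is defined).
void require_fits(const char* w, std::size_t max_word) {
    assert(fits_limit(w, max_word));
}
```

With a 999-character string, fits_limit(w, 1000) is false (999 is not less 
than 999), which matches the message in your log, while fits_limit(w, 10000) 
is true.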

I have no idea why your limit would be different, and I don't really 
know anything about this part of the translation process, so what I'm 
about to suggest may be a really bad idea. What happens if you change 
the limit to 10,000?
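
If you would rather find the offending string before touching the limit, 
something like the following might help. It is only a sketch under my own 
assumptions: it treats the input as whitespace-separated tokens, one sentence 
per line, and find_long_tokens and LongToken are names I invented; nothing 
here comes from Moses.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: where an over-long token was seen and how long it is.
struct LongToken {
    std::size_t line;    // 1-based line number in the input
    std::size_t length;  // token length in bytes
};

// Hypothetical scanner: reports every whitespace-separated token whose
// length is not less than max_word - 1, i.e. one that would fail a check
// of the form strlen(w) < MAX_WORD-1.
std::vector<LongToken> find_long_tokens(std::istream& in, std::size_t max_word) {
    assert(max_word > 1);  // max_word - 1 must not wrap around
    std::vector<LongToken> hits;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::istringstream tokens(line);
        std::string tok;
        while (tokens >> tok) {
            if (tok.size() >= max_word - 1) {
                hits.push_back({line_no, tok.size()});
            }
        }
    }
    return hits;
}
```

You could call it on a std::ifstream opened on whichever file you suspect, 
with max_word set to 1000, and print the line numbers it returns.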

Best,
Suzy

On 27/01/11 8:52 AM, Hieu Hoang wrote:
> Sorry, it crashed while symmetrising the alignment, so it has nothing to
> do with word size. I'm not sure why it crashed.
>
> On 27 January 2011 04:32, Hieu Hoang wrote:
>
>     It's complaining about a long sentence with over 1000 characters,
>     rather than words. Is there possibly a very, very long word?
>
>     On 27 January 2011 02:33, Joachim Van den Bogaert wrote:
>
>         Hi everyone,
>
>         I encountered a problem during training in step 3 (Align words)
>         with the following message:
>
>         Using SCRIPTS_ROOTDIR:
>         /opt/moses-tools/moses-scripts/scripts-20100707-1101/
>         Using single-thread GIZA
>         (3) generate word alignment @ Wed Jan 26 19:12:08 UTC 2011
>         Combining forward and inverted alignment from files:
>           /mnt/data//giza.es-en/es-en.A3.final.{bz2,gz}
>           /mnt/data//giza.en-es/en-es.A3.final.{bz2,gz}
>         Executing: mkdir -p /mnt/data//model
>         Executing:
>         /opt/moses-tools/moses-scripts/scripts-20100707-1101//training/symal/giza2bal.pl
>         -d "gzip -cd /mnt/data//giza.en-es/e
>         gzip -cd /mnt/data//giza.es-en/es-en.A3.final.gz"
>         |/opt/moses-tools/moses-scripts/scripts-20100707-1101//training/symal/symal
>         -a
>         nal="yes" -final="yes" -both="yes" >
>         /mnt/data//model/aligned.grow-diag-final-and
>         symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
>         1500066: target len=999 is not less than MAX_WORD-1=999
>         symal: symal.cpp:83: int getals(std::fstream&, int&, int*, int&, int*):
>         Assertion `strlen(w)<1000-1' failed.
>         Aborted
>         Exit code: 134
>         ERROR: Can't generate symmetrized alignment file
>
>         Has anyone encountered this message before?
>         And does anyone have a clue what it means?
>
>         I cleaned the corpus to contain sentences with only 0-50 tokens.
>         I checked this after the cleaning procedure, so isn't it strange
>         that I get the message:
>
>         target len=999 is not less than MAX_WORD-1=999
>
>         Thanks,
>         Joachim
>
>
>
>
>         _______________________________________________
>         Moses-support mailing list
>         [email protected] <mailto:[email protected]>
>         http://mailman.mit.edu/mailman/listinfo/moses-support

-- 
Suzy Howlett
http://www.showlett.id.au/