Hi,

I recently had a very similar error message when binarizing the
reordering-table, which in my case contained not inappropriate
characters, but missing ones.

See http://www.mail-archive.com/[email protected]/msg02869.html

A more informative error message would certainly help to 'repair' the
'corrupt' files.
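
For the square-bracket case described in the quoted message below, one way
to locate the offending lines before training is a small scan of the
training text. This is only a minimal sketch: it assumes whitespace-separated
tokens and treats any token that starts with "[" and ends with "]" as
suspicious. That rule and the function names are my own assumptions, not
anything taken from the Moses scripts.

```python
def is_bracketed(token):
    """Return True for tokens such as "[...]" that start with "[" and end with "]"."""
    # Assumption: a token of this shape is what upsets the hierarchical scorer.
    return len(token) >= 2 and token.startswith("[") and token.endswith("]")


def find_bracketed_tokens(lines):
    """Yield (line_number, token) for each bracketed token; lines are numbered from 1."""
    for line_number, line in enumerate(lines, start=1):
        for token in line.split():
            if is_bracketed(token):
                yield line_number, token
```

Passing an open corpus file to find_bracketed_tokens and printing the
results gives the line numbers to inspect or clean up by hand.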

Rico Sennrich wrote:
> Hi all,
>
> I recently got this error message when trying to train a hierarchical 
> model in Moses:
>
> r...@rico-work:~
> $
> /home/rico/bin/moses-scripts/scripts-20100920-1324//training/phrase-extract
> /score /home/rico/smtworkspace/SACde-fr141//model/extract.sorted
> /home/rico/smtworkspace/SACde-fr141//model/lex.f2e
> /home/rico/smtworkspace/SACde-fr141//model/rule-table.half.f2e  --Hierarchical
> Score v2.0 written by Philipp Koehn
> scoring methods for extracted rules
> processing hierarchical rules
> Loading lexical translation table
> from /home/rico/smtworkspace/SACde-fr141//model/lex.f2e.........
> score: score.cpp:434: void
> outputPhrasePair(std::vector<PhraseAlignment*,
> std::allocator<PhraseAlignment*> >&, float): Assertion
> `bestAlignment->alignedToT[ j ].size() == 1' failed.
> Aborted
>
> After a bit of searching, I found that Moses doesn't like words in square
> brackets very much. This is the line in extract.sorted that caused the crash:
>
> ( [X][X] [X] ||| , « [X][X] [...] [X] ||| 0-1 1-2 ||| 0.111111
>
> I just committed a patch that returns a (hopefully) more informative error
> message. Still, people need to make sure that the training texts do not 
> contain any words in square brackets.
>
> Does anyone have a better idea how to handle this?
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>   


-- 
Thomas Meyer
E-Mail: [email protected]
Web:    www.idiap.ch



