Hi,

To close the loop on this one, in case anyone else runs into this.

Turns out the reordering table contained a handful of offending lines which 
triggered the abort:

>>
^K ||| ^K ||| 0.818182 0.0909091 0.0909091 0.818182 0.0909091 0.0909091
^K ||| désactivés ||| 0.6 0.2 0.2 0.6 0.2 0.2
^K ||| en ||| 0.2 0.2 0.6 0.2 0.2 0.6
^K ||| la ||| 0.714286 0.142857 0.142857 0.714286 0.142857 0.142857
<<

(The file *is* in UTF-8 but somehow ended up containing some garbage such as 
those control chars).

I'm not sure what exactly the problem with these lines is; possibly the fact 
that the first field is empty except for the control character, ^K. Note that 
the control character as such is not what causes this problem: it also occurs 
in other parts of the same file, and removing just the control chars was not 
sufficient to fix this.

Also, in my experience the "line already inserted" error cannot just be 
ignored, because it prevents the process from continuing. I resorted to 
manually removing those lines.
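
For anyone who would rather script the cleanup than edit a 6.6GB file by hand, 
here is a minimal Python sketch. It is not part of Moses, and the function 
names are my own. It assumes fields are separated by "|||" and that the 
trigger really is a first field that is empty once control characters and 
whitespace are stripped, which is only my guess from the lines above. Lines 
that merely contain a control character somewhere else are kept.

```python
import unicodedata

FIELD_SEP = "|||"  # assumed field separator of the reordering table


def is_blank_after_cleanup(field):
    """Return True if nothing is left once control chars and whitespace are dropped."""
    return all(unicodedata.category(ch) == "Cc" or ch.isspace() for ch in field)


def is_suspect(line):
    """Flag a line whose first field holds only control characters or whitespace."""
    first_field = line.rstrip("\n").split(FIELD_SEP)[0]
    return is_blank_after_cleanup(first_field)


def filter_table(lines):
    """Split lines into (kept, dropped) without modifying any of them."""
    kept, dropped = [], []
    for line in lines:
        (dropped if is_suspect(line) else kept).append(line)
    return kept, dropped
```

Feeding the table through filter_table line by line and writing the kept lines 
to a new file should reproduce the manual fix; it is worth looking at the 
dropped lines before binarising again.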

Mirko

-----Original Message-----
From: Hieu Hoang [mailto:[email protected]] 
Sent: Sunday, May 17, 2009 1:02 AM
To: Mirko Plitt
Cc: [email protected]
Subject: Re: [Moses-support] processLexicalTable throws std::bad_alloc error

hey mirko

4gb ram should be enough to run everything. the problem is most 
likely to be the way the program was executed, or with the data.

i would always recommend encoding all files in utf8. in which 
case, you should ensure that the following command is run before you do 
the binarising:
export LC_ALL=C

btw, the warning "line already inserted" doesn't seem to be a problem - 
just ignore it

Mirko Plitt wrote:
>
> Hi,
>
> When trying to convert my reordering table into binary format I hit 
> the following error on a fairly high-spec (virtual) CentOS Linux machine:
>
> terminate called after throwing an instance of 'std::bad_alloc'
>
> what(): St9bad_alloc
>
> Aborted
>
> Note that I manually removed a couple of lines from the reordering 
> table previously because I got "line already inserted" errors 
> beforehand. Could this be the reason? Or is this just a question of 
> available memory? The size of the compressed reordering table is 500MB 
> (uncompressed 6.6GB). I believe the virtual machine grants me 4GB of RAM.
>
> Thanks in advance for any help!
>
> Mirko
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>   

