I did some more debugging, and it turns out that certain characters in the input, such as the pipe "|" or square brackets "[ ]", can crash the decoder with a segmentation fault. As a simple test with the sample models, the following inputs crash the decoder:
- das||ist||ein||kleines||haus
- [emailto:[email protected] <emailto%[email protected]>]

In the code, this seems to come from the logic that looks for annotated words for hiero syntax rules in the CreateFromString method in moses/Phrase.cpp. I understand these characters have special meaning to the decoder (the pipe is the factor separator, and square brackets mark non-terminals in hiero/syntax rules), so I have filtered these characters from my input text.

Hope this helps if others stumble upon this error.

Cheers,
Sameer.

On Wed, Aug 24, 2016 at 6:42 PM, Sameer Bhadouria <[email protected]> wrote:

> Hello,
>
> I'm trying to run mosesdecoder via a Java JNI call to the Moses library,
> and I am hitting some segmentation fault issues.
>
> ```siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f11b7fff527```
>
> I have run into this multiple times, and most of the time the problem
> happens in one of the following two methods:
>
> ```
> 1. Moses::PhraseDecoder::CreateTargetPhraseCollection(Moses::Phrase
>    const&, bool, bool)
> 2. Moses::Phrase::CreateFromString(Moses::FactorDirection,
>    std::vector<unsigned long, std::allocator<unsigned long> > const&,
>    StringPiece const&, Moses::Word**)
> ```
>
> Any pointers on this?
>
> Thanks.
>
> --
> Regards,
> Sameer Bhadouria.

--
Regards,
Sameer Bhadouria.
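
A minimal sketch of the kind of input filtering described above, in case it is useful to others. It is written in Python for brevity rather than in the Java that calls the decoder. The function name sanitize_for_decoder, the exact character set, and the choice to replace each character with a space (rather than delete it) are all assumptions for illustration, not the code that was actually run.

```python
import re

# Characters reported in this thread to crash the decoder.
# Assumption: only the pipe and square brackets need filtering.
UNSAFE_CHARS = re.compile(r"[|\[\]]")


def sanitize_for_decoder(text: str) -> str:
    """Replace each unsafe character with a space, then collapse whitespace.

    Hypothetical helper: substituting a space keeps neighbouring words
    apart instead of gluing them together when a pipe is removed.
    """
    cleaned = UNSAFE_CHARS.sub(" ", text)
    # split() with no arguments drops leading/trailing and repeated whitespace.
    return " ".join(cleaned.split())


print(sanitize_for_decoder("das||ist||ein||kleines||haus"))
# prints: das ist ein kleines haus
```

Filtering like this loses the original characters. If they need to survive translation, escaping them to a placeholder before decoding and restoring them afterwards would be an alternative, at the cost of a matching un-escape step on the output.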
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
