I did some more debugging, and it turns out that certain characters in the
input, such as the pipe "|" or square brackets "[ ]", can crash the decoder
with a segmentation fault.
As a simple test with the sample models, the following inputs will crash
the decoder:

   - das||ist||ein||kleines||haus
   - [emailto:[email protected] <emailto%[email protected]>]

In the code, this seems to come from the logic that looks for annotated
words for hiero syntax rules in the CreateFromString method in
moses/Phrase.cpp. I understand these characters have special meaning
in the language models, so I have filtered these characters from my input
text.
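
For anyone who wants to do the same, here is a minimal sketch of that kind
of filter. It is written in Python only for brevity and is not the code I
actually run; the function name strip_special_chars is made up for this
example, and the character set is just the three characters mentioned above.

```python
import re

# Characters reported above to crash the decoder: pipe and square brackets.
# This set is an assumption based on this thread, not an exhaustive list.
_SPECIAL = re.compile(r"[|\[\]]")


def strip_special_chars(line: str) -> str:
    """Replace '|', '[' and ']' with spaces, then collapse runs of whitespace."""
    cleaned = _SPECIAL.sub(" ", line)
    return " ".join(cleaned.split())


print(strip_special_chars("das||ist||ein||kleines||haus"))
# prints: das ist ein kleines haus
```

Replacing with a space rather than deleting keeps "das||ist" from turning
into "dasist". If the characters need to survive translation, escaping them
may be a better option than dropping them; as far as I know, the Moses
scripts directory includes scripts/tokenizer/escape-special-chars.perl for
that purpose.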

Hope this helps if others stumble upon this error.

Cheers,
Sameer.

On Wed, Aug 24, 2016 at 6:42 PM, Sameer Bhadouria <
[email protected]> wrote:

> Hello,
>
> I am trying to run the mosesdecoder via a Java JNI call to the moses
> library, and I am hitting some segmentation fault issues.
>
> ```siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr:
> 0x00007f11b7fff527```
> I have run into this multiple times, and most of the time the problem
> happens in one of the following two methods:
>
> ```
>
> 1. Moses::PhraseDecoder::CreateTargetPhraseCollection(Moses::Phrase
> const&, bool, bool)
>
> 2. Moses::Phrase::CreateFromString(Moses::FactorDirection,
> std::vector<unsigned long, std::allocator<unsigned long> > const&,
> StringPiece const&, Moses::Word**)
> ```
>
> Any pointers on this?
>
> Thanks.
>
> --
> Regards,
> Sameer Bhadouria.
>



-- 
Regards,
Sameer Bhadouria.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
