Re: [Moses-support] segmentation fault with lattice decoding

Joerg Tiedemann Thu, 04 Mar 2010 09:52:13 -0800

I did that and this is what I get:

Program received signal SIGSEGV, Segmentation fault.
0x000000000047dd7a in __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct (
     this=0x12a44e0, __p=0xb1, __v...@0x7fff499d0480)
     at /usr/include/c++/4.3/ext/new_allocator.h:108
108           { ::new((void *)__p) _Tp(__val); }



Hm, maybe there really is something wrong with my lattice input. I
couldn't see any empty nodes, though. Moses did print a list of lattice
segments like this:

...
925 -- (gewoond , -0.000, 105) (gewoond , -0.000, 45)
926 -- (stupid , -100.000, 101)
927 -- (gewoond , -0.000, 44)
928 -- (you , -100.000, 102)
929 -- (zwitserland , -0.000, 25)
930 -- (zwitserland , -0.000, 25)
931 -- (zwitserland , -0.000, 26)
...

The -0.000 looks a bit strange, but that's probably OK. I also tried
a more recent version of Moses and got the segmentation fault there too
(though not the lattice output).

Well, I will have a careful look at the input again ....
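
One way to make that check systematic might be a small script along these
lines. It is only a sketch: it assumes one PLF lattice per input line, written
as nested Python-style tuples of (word, score, distance-to-next-node) edges,
and check_plf is a made-up helper name, not part of Moses.

```python
import ast

def check_plf(plf_line):
    """Return a list of problems found in one PLF lattice line (empty if none)."""
    lattice = ast.literal_eval(plf_line.strip())
    n = len(lattice)      # nodes 0..n-1 carry edges; node n is the implicit final node
    problems = []
    reachable = {0}       # nodes are in topological order, so one pass is enough
    for i, node in enumerate(lattice):
        if i not in reachable:
            problems.append(f"node {i}: not reachable from the start node")
        if not node:
            problems.append(f"node {i}: no outgoing edges")
        for edge in node:
            if len(edge) != 3:
                problems.append(f"node {i}: edge {edge!r} is not (word, score, distance)")
                continue
            word, score, dist = edge
            if not isinstance(word, str) or not word.strip():
                problems.append(f"node {i}: empty word")
            if not isinstance(dist, int) or dist < 1:
                problems.append(f"node {i}: bad distance {dist!r}")
            elif i + dist > n:
                problems.append(f"node {i}: edge jumps past the final node")
            elif i in reachable:
                reachable.add(i + dist)
    if n not in reachable:
        problems.append("final node is not reachable")
    return problems
```

Running it over the input file line by line and printing any non-empty
result should at least rule out empty nodes and spans that don't line up.
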
If you have any other ideas - please let me know.
Thanks!

Jörg


On 3/4/10 4:32 PM, Barry Haddow wrote:
> Hi Jorg
>
> The stacktrace looks a little strange because of the compiler optimisations.
> If you edit moses/src/Makefile, changing
> CXXFLAGS = -g -O2
> to
> CXXFLAGS = -g
> do a 'make clean all', then rerun, you should get a more readable stacktrace.
> Try rerunning just on the sentence that gave you the problems to see if you
> can reproduce the problem.
>
> regards
> Barry
>
> On Thursday 04 March 2010 15:24, Chris Dyer wrote:
>> I'm not certain what's causing this.  From the part of the stack trace
>> you're showing, it looks like it probably happens when translation options
>> are being gathered for the spans in the lattice.  Perhaps the lattice
>> is malformed (i.e., spans don't line up, there are empty nodes, etc.)?
>>
>> 2010/3/4 Jörg Tiedemann<[email protected]>:
>>> I get a segmentation fault when decoding (large) word lattices. Moses
>>> seems to parse well through the input but crashes after a while. Tracing
>>> with gdb gave me this info:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
>>>      this=<value optimized out>, translationOption=0x18a12a0)
>>>      at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
>>> 104     /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
>>>          in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
>>>
>>> Indeed, the header file does not exist on my system.
>>> Do I need to install some additional packages and re-compile Moses in a
>>> certain way to get rid of this error?
>>>
>>> Jörg
>>>
>>> Chris Dyer wrote:
>>>> Moses transition costs can be converted to probabilities (i.e., you
>>>> can make a search graph into a stochastic FSA), but they do need to be
>>>> renormalized. You can do this by computing the posterior probability
>>>> of each edge (using the forward-backward algorithm), and then
>>>> normalizing all of the out-going edges at each node.
>>>>
>>>> One caveat: the way Moses is usually trained (with MERT) means that
>>>> the resulting transition probabilities might be scaled in funny ways
>>>> (i.e., the best edge might have 99.99% of the probability mass, or it
>>>> might be just a minuscule amount over the next best), so you may need
>>>> to do some things (like rescaling the probabilities) to make them
>>>> useful.
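
The renormalization described above could be sketched roughly as follows.
This is an illustration, not Moses code: renormalize and logadd are
hypothetical names, edge scores are assumed to be in the log domain, node ids
are assumed to be integers in topological order with every node on some
start-to-end path, and the scale parameter is just one possible way to do the
rescaling mentioned in the caveat.

```python
import math

def logadd(a, b):
    """Return log(exp(a) + exp(b)), tolerating -inf arguments."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def renormalize(edges, scale=1.0):
    """edges: list of (src, dst, log_score) with src < dst.
    Returns one probability per input edge; the out-edges of each node sum to 1."""
    scored = [(u, v, scale * w) for u, v, w in edges]
    start = min(u for u, _, _ in scored)
    final = max(v for _, v, _ in scored)
    by_src = sorted(range(len(scored)), key=lambda i: scored[i][0])

    # Forward pass: log mass of all paths from the start node to each node.
    fwd = {start: 0.0}
    for i in by_src:
        u, v, w = scored[i]
        fwd[v] = logadd(fwd.get(v, -math.inf), fwd.get(u, -math.inf) + w)

    # Backward pass: log mass of all paths from each node to the final node.
    bwd = {final: 0.0}
    for i in reversed(by_src):
        u, v, w = scored[i]
        bwd[u] = logadd(bwd.get(u, -math.inf), w + bwd.get(v, -math.inf))

    # Edge posterior = (paths through the edge) / (all paths).
    total = fwd[final]
    post = [math.exp(fwd.get(u, -math.inf) + w + bwd.get(v, -math.inf) - total)
            for u, v, w in scored]

    # Normalize the posteriors over the out-going edges of each node.
    out_mass = {}
    for (u, _, _), p in zip(scored, post):
        out_mass[u] = out_mass.get(u, 0.0) + p
    return [p / out_mass[u] for (u, _, _), p in zip(scored, post)]
```

Positive transition scores are not a problem here, since only the relative
path mass matters; a scale below 1 flattens the distribution and a scale
above 1 sharpens it.
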
>>>>
>>>> -C
>>>>
>>>> 2010/3/4 Jörg Tiedemann<[email protected]>:
>>>>> One more time about the conversion from search graphs to word lattices:
>>>>> In the word lattice I would like to use probabilities for each edge, but
>>>>> I guess that transition costs cannot easily be interpreted as log
>>>>> probabilities. For example, I have seen quite a few positive transition
>>>>> values in my sample output, which would definitely create some problems.
>>>>>
>>>>> Anyway, what I am trying to do is use Moses output to create word lattice
>>>>> input for another translation step. Maybe the values on the input lattice
>>>>> edges do not strictly have to be probabilities and I shouldn't care too
>>>>> much?
>>>>>
>>>>> Jörg
>>>>>
>>>>> Loïc BARRAULT wrote:
>>>>>> Hi Jörg,
>>>>>>
>>>>>> I'll take an example to explain my point of view.
>>>>>>
>>>>>> Here is an example of a recombined hypothesis:
>>>>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647 recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm looking for a , pC=-0.518872, c=-0.31244
>>>>>>
>>>>>> In my case, hypothesis numbers are the nodes of the graph and phrases are
>>>>>> represented on links.
>>>>>> In this case, to preserve the graph topology, the only thing that can
>>>>>> be done is to merge node 319 with node 181, which results in creating a
>>>>>> link between node 1 (the back node) and 181 (the recombined node).
>>>>>>
>>>>>> (X) ---------->(181)
>>>>>> (1)------------->(319)
>>>>>>
>>>>>> results in
>>>>>> (X) ---------->(181)
>>>>>> (1)---------------^
>>>>>>
>>>>>> In your example, you can't merge 5 and 1 because their history is not
>>>>>> the same (you pointed this out).
>>>>>> But if 6 is recombined and pointing to 4, then the only thing you can
>>>>>> do safely is to merge 6 and 4, which means creating a link between 5
>>>>>> and 4.
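
A rough sketch of this merging, assuming each hypothesis sits on one line of
the search graph output and uses the field names seen in the example line
above (hyp, back, transition, recombined, out). The helper names parse_line
and search_graph_edges are hypothetical.

```python
import re

def parse_line(line):
    """Split one search graph line into its key=value fields and output phrase."""
    head, _, out = line.partition(" out=")
    fields = dict(re.findall(r"(\w+)=(\S+)", head))
    # Drop the trailing ", pC=..., c=..." scores that follow the phrase, if present.
    phrase = re.sub(r"\s*,\s*pC=.*$", "", out).strip()
    return fields, phrase

def search_graph_edges(lines):
    """Return (from_node, to_node, phrase, transition) links.
    A recombined hypothesis is merged into the hypothesis it points to,
    so its link ends at the 'recombined' node instead of its own id."""
    edges = []
    for line in lines:
        fields, phrase = parse_line(line)
        if "back" not in fields:
            continue  # the initial empty hypothesis has no incoming link
        target = fields.get("recombined", fields["hyp"])
        edges.append((int(fields["back"]), int(target), phrase,
                      float(fields["transition"])))
    return edges
```

For the example line this gives a single link from node 1 to node 181
carrying the phrase and its transition score.
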
>>>>>>
>>>>>> Good luck.
>>>>>>
>>>>>> Loïc
>>>>>>
>>>>>>
>>>>>> 2010/3/3 Jörg Tiedemann <[email protected]>:
>>>>>>
>>>>>>
>>>>>>      I am now trying to use the search graph output to produce a word
>>>>>>      lattice in PLF style. I'm still a bit confused about how to use the
>>>>>>      recombined hypotheses and their pointers to superior hypotheses. Do I
>>>>>>      have to copy the relevant parts from the superior hypotheses into
>>>>>>      the lattice, or should I join the hypotheses that point to
>>>>>>      recombined hypotheses with the existing graph? To give an example:
>>>>>>
>>>>>>        who   is    bill    ?
>>>>>>      (0)-->(1)-->(2)--->(3)-->(4)
>>>>>>       |
>>>>>>       |--->(5)------------->(6)
>>>>>>       how  |   is bill ?
>>>>>>            |
>>>>>>            |---->(7)----->(8)
>>>>>>             is the   bill
>>>>>>
>>>>>>      where (6) is a recombined hypothesis pointing to (4) and covering
>>>>>>      tokens 1-3, and (8) is a recombined hypothesis that points to (3).
>>>>>>
>>>>>>      Should I copy the relevant parts of (4) that cover the same tokens
>>>>>>      to the graph as a link to (5), or can I safely join (5) and (1)?
>>>>>>      Probably not, because this would produce "who is the bill", which is
>>>>>>      not necessarily an option ...
>>>>>>
>>>>>>      Thanks a lot for clarifying this to me!
>>>>>>      Jörg
>>>>>>
>>>>>>
>>>>>>
>>>>>>      Chris Dyer wrote:
>>>>>>
>>>>>>          As long as you're just splitting, keeping the weights consistent
>>>>>>          isn't too hard: just keep all the weight in one segment and make
>>>>>>          all the rest of the segments have no impact when they multiply
>>>>>>          (i.e., a probability of 1, or a cost of 0).  The OpenFst or AT&T
>>>>>>          tools can help you manipulate lattices if you want to do more
>>>>>>          interesting things with weights, such as pushing them to the
>>>>>>          start of paths.
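
As a minimal sketch of that splitting scheme (split_phrase_edges is a
hypothetical helper; edges are assumed to be (from_node, to_node, phrase,
cost) tuples with log-domain costs, and first_new_node is assumed to be an
id not used anywhere in the graph yet):

```python
import itertools

def split_phrase_edges(edges, first_new_node):
    """Split multi-word edges into chains of single-word edges.
    The first word keeps the whole cost; the remaining words get cost 0,
    so the total cost of every path is unchanged."""
    fresh = itertools.count(first_new_node)
    result = []
    for src, dst, phrase, cost in edges:
        words = phrase.split()
        if not words:
            raise ValueError(f"edge {src}->{dst} has an empty phrase")
        # One new intermediate node between each pair of adjacent words.
        nodes = [src] + [next(fresh) for _ in words[1:]] + [dst]
        for i, word in enumerate(words):
            result.append((nodes[i], nodes[i + 1], word, cost if i == 0 else 0.0))
    return result
```

Note that the new intermediate nodes are numbered after the existing ones, so
the nodes would still need to be put into topological order before the graph
is written out as a lattice.
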
>>>>>>
>>>>>>          Chris
>>>>>>
>>>>>>          On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT <[email protected]> wrote:
>>>>>>
>>>>>>              Indeed, splitting is not hard, but the trickiest thing is
>>>>>>              how much of the probability/score you give to each part of
>>>>>>              the split. Maybe it has no real impact in the end, or does it?
>>>>>>
>>>>>>              Loïc
>>>>>>
>>>>>>              2010/3/1 Chris Dyer <[email protected]>:
>>>>>>
>>>>>>                  I guess word-graph doesn't split phrases either (I was
>>>>>>                  just guessing). It appears to be in SLF format, which is
>>>>>>                  used by a number of tools (like HTK and the SRI tools).
>>>>>>                  SRILM can split lattices with multi-word arcs into
>>>>>>                  lattices, or you can write your own code to do it.
>>>>>>                  It's not terribly hard.
>>>>>>
>>>>>>                  Chris
>>>>>>
>>>>>>                  On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann <[email protected]> wrote:
>>>>>>
>>>>>>                      OK, thanks. I will use the output-word-graph option.
>>>>>>                      However, I also get phrases with that option (in the
>>>>>>                      w attribute), for example here:
>>>>>>
>>>>>>                      ....
>>>>>>                      J=42    S=0     E=53    a=0, 0, 0, -0.693147, 0.999896  l=-13.695  r=-20, 0, -1.60944, 0, 0, 0     w=bill clinton , pC=0.0613498, c=-3.23392
>>>>>>                      ...
>>>>>>
>>>>>>                      I'm not sure if I'm using the command line argument
>>>>>>                      correctly:
>>>>>>                      echo 'who is bill clinton ?' | \
>>>>>>                      moses -f moses.ini -output-word-graph test.graph 0
>>>>>>
>>>>>>                      Jörg
>>>>>>
>>>>>>
>>>>>>                      On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>>>
>>>>>>                          I don't have such a tool, but it wouldn't be too
>>>>>>                          difficult to write one.  I think the difference
>>>>>>                          between word graph and search graph is that the
>>>>>>                          search graph has full phrases on the edges,
>>>>>>                          whereas the word graph has single words on the
>>>>>>                          edges.  For the input, you need single word
>>>>>>                          edges.
>>>>>>                          -Chris
>>>>>>
>>>>>>                          2010/3/1 Jörg Tiedemann <[email protected]>:
>>>>>>
>>>>>>                              Is there a tool to convert output search
>>>>>>                              graphs to word lattices in PLF (Moses lattice
>>>>>>                              input format)? It's the option
>>>>>>                              -output-search-graph that I should use for
>>>>>>                              getting the relevant information, right? I'm
>>>>>>                              not really sure I understand the difference
>>>>>>                              between -output-word-graph and
>>>>>>                              -output-search-graph.
>>>>>>                              Thanks!
>>>>>>
>>>>>>                              Jörg
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                              *******/\/\/\/\/\/\/\/\/\/\/\*****************************************
>>>>>>                               Jörg Tiedemann                      [email protected]
>>>>>>                               Visiting Professor                  http://stp.lingfil.uu.se/~joerg/
>>>>>>                               Dep. of Linguistics and Philology
>>>>>>                               Uppsala University                  tel: +46 (0)18 - 471 1412
>>>>>>                               Box 635, SE-751 26 Uppsala/SWEDEN   fax: +46 (0)18 - 471 1094
>>>>>>                              *********************************/\/\/\/\/\/\/\/\/\/\/\***************
>>>>>>                              _______________________________________________
>>>>>>                              Moses-support mailing list
>>>>>>                              [email protected]
>>>>>>                              http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>
>>>>>>
>>>>>>
>>>>>>              --
>>>>>>              ---
>>>>>>              Loïc BARRAULT
>>>>>>              Post-doctoral researcher
>>>>>>              LIUM - University of Le Mans
>>>>>>              Tél. +33/0 2 43 83 38 52
>>>>>>              http://www-lium.univ-lemans.fr/~barrault
>>>>>>              MANY : Open Source MT System Combination
>>>>>>              http://www-lium.univ-lemans.fr/~barrault/MANY
>>>>>>              ---
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Regards,
>>>>>
>>>>> Jörg
>>>>>
>>>
>>> --
>>>
>>> Regards,
>>>
>>> Jörg
>>>
>>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
