[Moses-support] segmentation fault with lattice decoding

Jörg Tiedemann Thu, 04 Mar 2010 06:32:35 -0800

I get a segmentation fault when decoding (large) word lattices. Moses
seems to read the input without problems but crashes after a while.
Running it under gdb gave me this information:


Program received signal SIGSEGV, Segmentation fault.
0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
     this=<value optimized out>, translationOption=0x18a12a0)
     at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
104     /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
         in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h

Indeed, the header file does not exist on my system.
Do I need to install some additional packages and re-compile Moses in a 
certain way to get rid of this error?

Jörg


Chris Dyer wrote:
> Moses transition costs can be converted to probabilities (i.e., you
> can make a search graph into a stochastic FSA), but they do need to be
> renormalized. You can do this by computing the posterior probability
> of each edge (using the forward-backward algorithm), and then
> normalizing all of the out-going edges at each node.
> 
> One caveat: the way Moses is usually trained (with MERT) means that
> the resulting transition probabilities might be scaled in odd ways
> (i.e., the best edge might have 99.99% of the probability mass, or it
> might be just a minuscule amount ahead of the next best), so you may
> need to do some things (like rescaling the probabilities) to make
> them useful.
> 
> -C
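
A minimal Python sketch of the renormalization described above, under stated assumptions: the lattice is acyclic, node ids increase along every edge, and edge scores are treated as unnormalized log-weights (so positive values are allowed). The function names and the `scale` parameter are illustrative only and are not part of Moses; `scale` is one hypothetical way to do the rescaling mentioned in the caveat.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def renormalize(edges, start, final, scale=1.0):
    """Turn edge scores into locally normalized probabilities.

    edges: list of (src, dst, log_score); assumes src < dst for every edge.
    scale: hypothetical factor applied to every score (values < 1 flatten).
    Returns one probability per edge, in the same order as `edges`.
    """
    scored = [(s, d, scale * w) for s, d, w in edges]
    nodes = sorted({n for s, d, _ in scored for n in (s, d)})
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for s, d, w in scored:
        incoming[d].append((s, w))
        outgoing[s].append((d, w))

    # Forward pass: alpha[n] = log-sum of all path scores from start to n.
    alpha = {start: 0.0}
    for n in nodes:
        terms = [alpha[s] + w for s, w in incoming[n] if s in alpha]
        if terms:
            alpha[n] = logsumexp(terms)

    # Backward pass: beta[n] = log-sum of all path scores from n to final.
    beta = {final: 0.0}
    for n in reversed(nodes):
        terms = [w + beta[d] for d, w in outgoing[n] if d in beta]
        if terms:
            beta[n] = logsumexp(terms)

    if final not in alpha:
        raise ValueError("final node is not reachable from start")
    total = alpha[final]

    # Edge posterior: share of the total path mass passing through the edge.
    posterior = [
        math.exp(alpha[s] + w + beta[d] - total) if s in alpha and d in beta else 0.0
        for s, d, w in scored
    ]

    # Normalize the posteriors of the outgoing edges at each node.
    mass = defaultdict(float)
    for (s, _, _), p in zip(scored, posterior):
        mass[s] += p
    return [p / mass[s] if mass[s] > 0 else 0.0
            for (s, _, _), p in zip(scored, posterior)]
```

Edges that lie on no complete path from `start` to `final` get probability 0 in this sketch; whether to drop them instead is a design choice.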
> 
> 2010/3/4 Jörg Tiedemann <[email protected]>:
>> One more time about the conversion from search graphs to word lattices:
>> In the word lattice I would like to use probabilities for each edge, but
>> I guess that transition costs cannot easily be interpreted as log
>> probabilities. For example, I have seen quite a few positive transition
>> values in my sample output, which would definitely create some problems.
>>
>> Anyway, what I am trying to do is to use Moses output to create word
>> lattice input for another translation step. Maybe the values on input
>> lattice edges do not strictly have to be probabilities and I shouldn't
>> care too much?
>>
>> Jörg
>>
>>
>> Loïc BARRAULT wrote:
>>> Hi Jörg,
>>>
>>> I'll take an example to explain my point of view.
>>>
>>> Here is an example of a recombined hypothesis:
>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647 recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm looking for a , pC=-0.518872, c=-0.31244
>>>
>>> In my case, hypothesis numbers are the nodes of the graph and phrases
>>> are represented on the links.
>>> In this case, to preserve the graph topology, the only thing that can
>>> be done is to merge node 319 with node 181, which results in a link
>>> between node 1 (the back node) and node 181 (the recombined node).
>>>
>>> (X) ---------->(181)
>>> (1)------------->(319)
>>>
>>> results in
>>> (X) ---------->(181)
>>> (1)---------------^
>>>
>>> In your example, you can't merge 5 and 1 because their history is not
>>> the same (you pointed this out).
>>> But if 6 is recombined and pointing to 4, then the only thing you can do
>>> safely is to merge 6 and 4, which means creating a link between 5 and 4.
>>>
>>> Good luck.
>>>
>>> Loïc
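
The merging rule described above can be sketched in Python as follows. This is only an illustration: it assumes the search-graph lines have already been parsed into dicts whose keys mirror the field names in the quoted line (`hyp`, `back`, `recombined`, `transition`, `out`), and the function name and the placeholder scores are made up for the example.

```python
def merge_recombined(hyps):
    """Build lattice edges, redirecting recombined hypotheses to their survivor.

    hyps: list of dicts with keys 'hyp', 'back', 'out', 'transition' and,
    for recombined hypotheses only, 'recombined'. The dict layout is an
    assumption made for this sketch, not a Moses data structure.
    Returns a list of (src, dst, phrase, transition) tuples.
    """
    edges = []
    for h in hyps:
        # A recombined hypothesis is merged into the node it points to,
        # so its incoming link is attached to that node instead.
        dst = h.get("recombined", h["hyp"])
        edges.append((h["back"], dst, h["out"], h["transition"]))
    return edges


# The "who is bill ?" example from this thread.
# Transition scores are placeholders, not real decoder output.
example = [
    {"hyp": 1, "back": 0, "out": "who", "transition": 0.0},
    {"hyp": 2, "back": 1, "out": "is", "transition": 0.0},
    {"hyp": 3, "back": 2, "out": "bill", "transition": 0.0},
    {"hyp": 4, "back": 3, "out": "?", "transition": 0.0},
    {"hyp": 5, "back": 0, "out": "how", "transition": 0.0},
    {"hyp": 6, "back": 5, "out": "is bill ?", "transition": 0.0, "recombined": 4},
    {"hyp": 7, "back": 5, "out": "is the", "transition": 0.0},
    {"hyp": 8, "back": 7, "out": "bill", "transition": 0.0, "recombined": 3},
]
edges = merge_recombined(example)
```

With this input, the link out of node 5 labelled "is bill ?" ends in node 4, and the link out of node 7 labelled "bill" ends in node 3, so nodes 6 and 8 no longer appear as targets.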
>>>
>>>
>>> 2010/3/3 Jörg Tiedemann <[email protected]>
>>>
>>>
>>>     I am now trying to use the search graph output to produce a word
>>>     lattice in PLF style. I'm still a bit confused about how to use the
>>>     recombined hypotheses and their pointers to superior hypotheses. Do I
>>>     have to copy the relevant parts from the superior hypotheses into
>>>     the lattice, or should I join the hypotheses that point to recombined
>>>     hypotheses with the existing graph? To give an example:
>>>
>>>       who   is    bill    ?
>>>     (0)-->(1)-->(2)--->(3)-->(4)
>>>      |
>>>      |--->(5)------------->(6)
>>>      how  |   is bill ?
>>>           |
>>>           |---->(7)----->(8)
>>>            is the   bill
>>>
>>>     where (6) is a recombined hypo pointing to (4) and covering tokens 1-3
>>>     and (8) is a recombined hypo that points to (3)
>>>
>>>     Should I copy the relevant parts of (4) that cover the same tokens
>>>     to the graph as a link to (5) or can I safely join (5) and (1)?
>>>     Probably not because this would produce "who is the bill" which is
>>>     not necessarily an option ...
>>>
>>>     Thanks a lot for clarifying this to me!
>>>     Jörg
>>>
>>>
>>>
>>>     Chris Dyer wrote:
>>>
>>>         As long as you're just splitting, keeping the weights consistent
>>>         isn't too hard: just keep all the weight in one segment and make
>>>         all the rest of the segments have no impact when they multiply
>>>         (i.e., a probability of 1, or a cost of 0).  The OpenFst or AT&T
>>>         tools can help you manipulate lattices if you want to do more
>>>         interesting things with weights, such as pushing them to the
>>>         start of paths.
>>>
>>>         Chris
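
A minimal Python sketch of the splitting scheme described above: the first single-word arc keeps the whole weight and every following arc gets probability 1, so the product along the path is unchanged. The function name, the tuple layout and the way fresh node ids are allocated are assumptions made for illustration.

```python
def split_arc(src, dst, phrase, prob, next_id):
    """Split one multi-word arc into a chain of single-word arcs.

    Returns (arcs, next_id) where arcs is a list of (src, dst, word, prob)
    and next_id is the first node id that is still unused afterwards.
    """
    words = phrase.split()
    if not words:
        raise ValueError("cannot split an arc with an empty phrase")
    arcs = []
    current = src
    for i, word in enumerate(words):
        is_last = i == len(words) - 1
        if is_last:
            target = dst
        else:
            # Intermediate node introduced by the split.
            target = next_id
            next_id += 1
        # All of the weight stays on the first arc; the rest are neutral.
        arcs.append((current, target, word, prob if i == 0 else 1.0))
        current = target
    return arcs, next_id
```

For costs instead of probabilities, the neutral value would be 0 rather than 1.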
>>>
>>>         On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT
>>>         <[email protected]> wrote:
>>>
>>>             Indeed, splitting is not hard, but the trickiest part is
>>>             deciding how much of the probability/score to give to each
>>>             part of the split. Maybe it has no real impact in the end,
>>>             or does it?
>>>             Loïc
>>>
>>>             2010/3/1 Chris Dyer <[email protected]>
>>>
>>>                 I guess word-graph doesn't split phrases either (I was
>>>                 just guessing).  It appears to be in SLF format, which is
>>>                 used by a number of tools (like HTK and the SRI tools).
>>>                 SRILM can split lattices with multi-word arcs into
>>>                 lattices with single-word arcs, or you can write your own
>>>                 code to do it.  It's not terribly hard.
>>>
>>>                 Chris
>>>
>>>                 On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann
>>>                 <[email protected]> wrote:
>>>
>>>                     Ok thanks. I will use the output-word-graph option.
>>>                     However, I also get phrases with that option (in the
>>>                     w attribute), for example here:
>>>
>>>                     ....
>>>                     J=42    S=0     E=53    a=0, 0, 0, -0.693147, 0.999896  l=-13.695  r=-20, 0, -1.60944, 0, 0, 0     w=bill clinton , pC=0.0613498, c=-3.23392
>>>                     ...
>>>
>>>                     I'm not sure if I'm using the command line argument
>>>                     correctly:
>>>                     echo 'who is bill clinton ?' | \
>>>                     moses -f moses.ini -output-word-graph test.graph 0
>>>
>>>                     Jörg
>>>
>>>
>>>                     On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>
>>>                         I don't have such a tool, but it wouldn't be too
>>>                         difficult to write one.  I think the difference
>>>                         between word graph and search graph is that the
>>>                         search graph has full phrases on the edges, whereas
>>>                         the word graph has single words on the edges.  For
>>>                         the input, you need single-word edges.
>>>                         -Chris
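
For reference, a lattice with single-word edges could be written out in PLF roughly as in the Python sketch below. It relies on the PLF layout as documented for Moses word lattice input: one tuple per node in topological order, each holding `('word', score, distance)` entries, where the distance is the offset to the target node. The function names, the `(src, dst, word, prob)` arc layout and the quote escaping are assumptions of this sketch, so check the output against your Moses version.

```python
from collections import defaultdict, deque


def escape(word):
    # Backslash-escaping of quotes is an assumption; verify it with Moses.
    return word.replace("\\", "\\\\").replace("'", "\\'")


def to_plf(arcs, start):
    """Serialize single-word arcs (src, dst, word, prob) as a PLF string."""
    outgoing = defaultdict(list)
    indegree = defaultdict(int)
    nodes = {start}
    for src, dst, word, prob in arcs:
        outgoing[src].append((dst, word, prob))
        indegree[dst] += 1
        nodes.update((src, dst))

    # Topological order (Kahn's algorithm): PLF lists nodes left to right.
    order, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for dst, _, _ in outgoing[node]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    if len(order) != len(nodes):
        raise ValueError("cycle, or nodes not reachable from the start node")
    if any(not outgoing[n] for n in order[:-1]):
        raise ValueError("expected exactly one final node")

    index = {node: i for i, node in enumerate(order)}
    columns = []
    for node in order[:-1]:  # the final node has no outgoing arcs
        cells = "".join(
            "('%s',%s,%d)," % (escape(word), prob, index[dst] - index[node])
            for dst, word, prob in outgoing[node]
        )
        columns.append("(" + cells + "),")
    return "(" + "".join(columns) + ")"
```

Node ids are renumbered during serialization, so intermediate nodes created while splitting phrases can use any unused ids.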
>>>
>>>                         2010/3/1 Jörg Tiedemann <[email protected]>:
>>>
>>>                             Is there a tool to convert output search
>>>                             graphs to word lattices in PLF (the Moses
>>>                             lattice input format)? It's the option
>>>                             -output-search-graph that I should use for
>>>                             getting the relevant information, right? I'm
>>>                             not really sure I understand the difference
>>>                             between -output-word-graph and
>>>                             -output-search-graph.
>>>                             Thanks!
>>>
>>>                             Jörg
>>>
>>>
>>>
>>>
>>>                             
>>>                             *******/\/\/\/\/\/\/\/\/\/\/\******************************************
>>>                              Jörg Tiedemann                      [email protected]
>>>                              Visiting Professor                  http://stp.lingfil.uu.se/~joerg/
>>>                              Dep. of Linguistics and Philology
>>>                              Uppsala University                  tel: +46 (0)18 - 471 1412
>>>                              Box 635, SE-751 26 Uppsala/SWEDEN   fax: +46 (0)18 - 471 1094
>>>                             *********************************/\/\/\/\/\/\/\/\/\/\/\****************
>>>                             _______________________________________________
>>>                             Moses-support mailing list
>>>                             [email protected]
>>>                             http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>>
>>>
>>>             --
>>>             ---
>>>             Loïc BARRAULT
>>>             Post-doctoral researcher
>>>             LIUM - University of Le Mans
>>>             Tél. +33/0 2 43 83 38 52
>>>             http://www-lium.univ-lemans.fr/~barrault
>>>             MANY : Open Source MT System Combination
>>>             http://www-lium.univ-lemans.fr/~barrault/MANY
>>>             ---
>>>
>>>
>> --
>>
>> Hälsningar,
>>
>> Jörg
>>
>>

-- 

Hälsningar,

Jörg

