Re: [Moses-support] segmentation fault with lattice decoding

Chris Dyer Thu, 04 Mar 2010 07:24:44 -0800

I'm not certain what's causing this.  From the part of the stack trace
you're showing, the crash probably happens while translation options
are being gathered for the spans in the lattice.  Perhaps the lattice
is malformed (e.g., spans don't line up, there are empty nodes, etc.)?
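
One way to test the malformed-lattice idea before decoding is to sanity-check the input. The sketch below assumes the lattice is in PLF, i.e. a Python-style nested tuple with one entry per node, where each edge is (word, score, distance to the target node). The name `check_plf` and its messages are hypothetical, not part of Moses, and "empty node" is read here as a node with no outgoing edges.

```python
import ast


def check_plf(plf_text):
    """Return a list of structural problems in a PLF lattice (empty if none).

    Assumes each node is a tuple of (word, score, distance) edges and that
    the final node is implicit, one position past the last listed node.
    """
    lattice = ast.literal_eval(plf_text)
    n = len(lattice)          # nodes 0..n-1 are listed; node n is the final node
    problems = []
    reachable = {0}           # edges only point forward, so one pass suffices
    for i, edges in enumerate(lattice):
        if not edges:
            problems.append(f"node {i} has no outgoing edges")
        for edge in edges:
            if len(edge) != 3:
                problems.append(f"node {i}: edge {edge!r} does not have 3 fields")
                continue
            word, _score, dist = edge
            if word == "":
                problems.append(f"node {i}: edge with empty word")
            if not isinstance(dist, int) or dist < 1:
                problems.append(f"node {i}: bad distance {dist!r}")
            elif i + dist > n:
                problems.append(f"node {i}: edge '{word}' points past the final node")
            elif i in reachable:
                reachable.add(i + dist)
    for i in range(1, n + 1):
        if i not in reachable:
            problems.append(f"node {i} is unreachable from the start")
    return problems
```

Running this over each input line and looking at the first lattice that reports a problem may narrow down where the decoder crashes; an empty result does not rule out other causes.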


2010/3/4 Jörg Tiedemann:
>
> I get a segmentation fault when decoding (large) word lattices. Moses
> seems to read the input without problems but crashes after a while.
> Tracing with gdb gave me this info:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
>     this=<value optimized out>, translationOption=0x18a12a0)
>     at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
> 104     /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
>         in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
>
> Indeed, the header file does not exist on my system.
> Do I need to install some additional packages and re-compile Moses in a
> certain way to get rid of this error?
>
> Jörg
>
>
> Chris Dyer wrote:
>> Moses transition costs can be converted to probabilities (i.e., you
>> can make a search graph into a stochastic FSA), but they do need to be
>> renormalized. You can do this by computing the posterior probability
>> of each edge (using the forward-backward algorithm), and then
>> normalizing all of the out-going edges at each node.
>>
>> One caveat: the way Moses is usually trained (with MERT) means that
>> the resulting transition probabilities might be scaled in funny ways
>> (i.e., the best edge might have 99.99% of the probability mass, or it
>> might be just a minuscule amount above the next best), so you may need
>> to do some things (like rescaling the probabilities) to make them
>> useful.
>>
>> -C
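
The renormalization described above can be sketched roughly as follows. This is a minimal illustration, not Moses code: it assumes the graph has already been read into a list of (src, dst, log_weight) edges over nodes numbered in topological order, with node 0 as the start and the highest-numbered node as the only, reachable, final node. The function name `renormalize` and the `scale` parameter (one simple way to flatten or sharpen MERT-scaled scores) are assumptions made for this example.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def renormalize(edges, num_nodes, scale=1.0):
    """Turn edge scores into per-node outgoing probabilities.

    edges: list of (src, dst, log_weight) with src < dst (topological order).
    Returns one probability per edge, in the same order as `edges`.
    """
    neg_inf = float("-inf")
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for idx, (src, dst, _w) in enumerate(edges):
        outgoing[src].append(idx)
        incoming[dst].append(idx)

    def score(idx):
        return scale * edges[idx][2]

    # Forward pass: alpha[v] = log of the total weight of all paths start -> v.
    alpha = [neg_inf] * num_nodes
    alpha[0] = 0.0
    for v in range(1, num_nodes):
        terms = [alpha[edges[i][0]] + score(i) for i in incoming[v]
                 if alpha[edges[i][0]] > neg_inf]
        if terms:
            alpha[v] = logsumexp(terms)

    # Backward pass: beta[u] = log of the total weight of all paths u -> final.
    beta = [neg_inf] * num_nodes
    beta[num_nodes - 1] = 0.0
    for u in range(num_nodes - 2, -1, -1):
        terms = [score(i) + beta[edges[i][1]] for i in outgoing[u]
                 if beta[edges[i][1]] > neg_inf]
        if terms:
            beta[u] = logsumexp(terms)

    # Edge posterior = weight of all complete paths through the edge / total.
    total = alpha[num_nodes - 1]
    posterior = [math.exp(alpha[s] + score(i) + beta[d] - total)
                 for i, (s, d, _w) in enumerate(edges)]

    # Normalize the posteriors of the outgoing edges at each node.
    probs = [0.0] * len(edges)
    for idxs in outgoing.values():
        z = sum(posterior[i] for i in idxs)
        for i in idxs:
            probs[i] = posterior[i] / z if z > 0 else 0.0
    return probs
```

Because the scores are treated as unnormalized log weights, positive transition values are not a problem here. Values of `scale` below 1 flatten the distribution and values above 1 sharpen it; a suitable value would have to be found empirically.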
>>
>> 2010/3/4 Jörg Tiedemann:
>>> One more time about the conversion from search graphs to word lattices:
>>> In the word lattice I would like to use probabilities for each edge, but
>>> I guess that transition costs cannot easily be interpreted as log
>>> probabilities. For example, I have seen quite a few positive transition
>>> values in my sample output, which would definitely create some problems.
>>>
>>> Anyway, what I am trying to do is use Moses output to create word lattice
>>> input for another translation step. Maybe the values on input lattice
>>> edges do not strictly have to be probabilities and I shouldn't care too
>>> much?
>>>
>>> Jörg
>>>
>>>
>>> Loïc BARRAULT wrote:
>>>> Hi Jörg,
>>>>
>>>> I'll use an example to explain my point of view.
>>>>
>>>> Here is an example of a recombined hypothesis:
>>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647
>>>> recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm
>>>> looking for a , pC=-0.518872, c=-0.31244
>>>>
>>>> In my case, hypothesis numbers are the nodes of the graph and phrases
>>>> are represented on the links.
>>>> Here, to preserve the graph topology, the only thing that can be done
>>>> is to merge node 319 into node 181, which results in a link between
>>>> node 1 (the back node) and node 181 (the recombined node).
>>>>
>>>> (X) ---------->(181)
>>>> (1)------------->(319)
>>>>
>>>> results in
>>>> (X) ---------->(181)
>>>> (1)---------------^
>>>>
>>>> In your example, you can't merge 5 and 1 because their history is not
>>>> the same (you pointed this out).
>>>> But if 6 is recombined and pointing to 4, then the only thing you can do
>>>> safely is to merge 6 and 4, which means creating a link between 5 and 4.
>>>>
>>>> Good luck.
>>>>
>>>> Loïc
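
The merge described above amounts to redirecting the incoming link of a recombined hypothesis to the hypothesis it was recombined with. A rough sketch follows, assuming search-graph lines look like the example quoted in this message (key=value fields, then `out=` followed by the phrase, with `, pC=` starting the trailing fields). `parse_hypothesis` and `build_edges` are hypothetical helper names, and the parsing is only as reliable as that assumption about the line layout.

```python
import re


def parse_hypothesis(line):
    """Parse one search-graph line into a dict of string fields."""
    head, _, tail = line.partition(" out=")
    fields = dict(re.findall(r"(\w+)=(\S+)", head))
    # Assumption: the output phrase ends where the ", pC=" field begins.
    fields["out"] = tail.split(", pC=")[0].strip()
    return fields


def build_edges(lines):
    """Return (from_node, to_node, phrase, transition) links for a lattice.

    A recombined hypothesis is merged into the hypothesis it points to, so
    its incoming link is attached to that node instead of to itself.
    """
    edges = []
    for line in lines:
        hyp = parse_hypothesis(line)
        if "back" not in hyp:          # e.g. the initial, empty hypothesis
            continue
        target = hyp.get("recombined", hyp["hyp"])
        edges.append((int(hyp["back"]), int(target),
                      hyp["out"], float(hyp["transition"])))
    return edges
```

For the example line, this yields a single link from node 1 to node 181 labelled with the phrase, and node 319 never appears in the lattice.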
>>>>
>>>>
>>>> 2010/3/3 Jörg Tiedemann:
>>>>
>>>>
>>>>     I am now trying to use the search graph output to produce a word
>>>>     lattice in PLF style. I'm still a bit confused about how to use the
>>>>     recombined hypotheses and their pointers to superior hypotheses. Do I
>>>>     have to copy the relevant parts from the superior hypotheses into
>>>>     the lattice, or should I join the hypotheses that point to recombined
>>>>     hypotheses with the existing graph? To give an example:
>>>>
>>>>       who   is    bill    ?
>>>>     (0)-->(1)-->(2)--->(3)-->(4)
>>>>      |
>>>>      |--->(5)------------->(6)
>>>>      how  |   is bill ?
>>>>           |
>>>>           |---->(7)----->(8)
>>>>            is the   bill
>>>>
>>>>     where (6) is a recombined hypo pointing to (4) and covering tokens 1-3
>>>>     and (8) is a recombined hypo that points to (3)
>>>>
>>>>     Should I copy the relevant parts of (4) that cover the same tokens
>>>>     to the graph as a link to (5) or can I safely join (5) and (1)?
>>>>     Probably not because this would produce "who is the bill" which is
>>>>     not necessarily an option ...
>>>>
>>>>     Thanks a lot for clarifying this to me!
>>>>     Jörg
>>>>
>>>>
>>>>
>>>>     Chris Dyer wrote:
>>>>
>>>>         As long as you're just splitting, keeping the weights consistent
>>>>         isn't too hard: just keep all the weight in one segment and make
>>>>         all the rest of the segments have no impact when they multiply
>>>>         (i.e., a probability of 1, or a cost of 0).  The OpenFst or AT&T
>>>>         tools can help you manipulate lattices if you want to do more
>>>>         interesting things with weights, such as pushing them to the
>>>>         start of paths.
>>>>
>>>>         Chris
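
As a rough illustration of this splitting rule: a multi-word arc becomes a chain of single-word arcs, the first carrying the whole cost and the rest carrying cost 0 (probability 1). The function name `split_arc`, the tuple layout, and the node numbers and cost in the usage note are assumptions for the example, not taken from Moses or SRILM.

```python
def split_arc(src, dst, words, cost, next_free_node):
    """Split one multi-word arc into a chain of single-word arcs.

    words must be non-empty. All of the cost stays on the first arc; the
    remaining arcs get cost 0.0 so the path total is unchanged.
    Returns (arcs, next_free_node), where each arc is
    (from_node, to_node, word, cost) and next_free_node is the first node
    id that is still unused after the split.
    """
    arcs = []
    current = src
    for k, word in enumerate(words):
        is_last = k == len(words) - 1
        if is_last:
            target = dst                 # the chain ends at the original target
        else:
            target = next_free_node      # fresh intermediate node
            next_free_node += 1
        arcs.append((current, target, word, cost if k == 0 else 0.0))
        current = target
    return arcs, next_free_node
```

For instance, `split_arc(0, 53, ["bill", "clinton"], -1.5, 100)` gives one arc from 0 to a new node 100 for "bill" with cost -1.5 and one arc from 100 to 53 for "clinton" with cost 0.0. The new intermediate nodes are numbered out of order, so the graph would still need to be renumbered topologically before being written out as PLF.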
>>>>
>>>>         On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT wrote:
>>>>
>>>>             Indeed, splitting is not hard, but the trickiest part is how
>>>>             much of the probability/score to give to each part of the
>>>>             split. Maybe it has no real impact in the end, or does it?
>>>>             Loïc
>>>>
>>>>             2010/3/1 Chris Dyer:
>>>>
>>>>                 I guess word-graph doesn't split phrases either (I was
>>>>                 just guessing).  It appears to be in SLF format, which
>>>>                 is used by a number of tools (like HTK and the SRI
>>>>                 tools).  SRILM can split lattices with multi-word arcs
>>>>                 into lattices with single-word arcs, or you can write
>>>>                 your own code to do it.  It's not terribly hard.
>>>>
>>>>                 Chris
>>>>
>>>>                 On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann wrote:
>>>>
>>>>                     Ok thanks. I will use the output-word-graph option.
>>>>                     However, I also get phrases with that option (in the
>>>>                     w attribute), for example here:
>>>>
>>>>                     ....
>>>>                     J=42    S=0     E=53    a=0, 0, 0, -0.693147, 0.999896  l=-13.695  r=-20, 0, -1.60944, 0, 0, 0     w=bill clinton , pC=0.0613498, c=-3.23392
>>>>                     ...
>>>>
>>>>                     I'm not sure if I'm using the command line argument correctly:
>>>>                     echo 'who is bill clinton ?' | \
>>>>                     moses -f moses.ini -output-word-graph test.graph 0
>>>>
>>>>                     Jörg
>>>>
>>>>
>>>>                     On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>
>>>>                         I don't have such a tool, but it wouldn't be too
>>>>                         difficult to write one.  I think the difference
>>>>                         between word graph and search graph is that the
>>>>                         search graph has full phrases on the edges,
>>>>                         whereas the word graph has single words on the
>>>>                         edges.  For the input, you need single-word edges.
>>>>                         -Chris
>>>>
>>>>                         2010/3/1 Jörg Tiedemann:
>>>>
>>>>                             Is there a tool to convert output search
>>>>                             graphs to word lattices in PLF (the Moses
>>>>                             lattice input format)? It's the option
>>>>                             -output-search-graph that I should use for
>>>>                             getting the relevant information, right? I'm
>>>>                             not really sure I understand the difference
>>>>                             between -output-word-graph and
>>>>                             -output-search-graph.
>>>>                             Thanks!
>>>>
>>>>                             Jörg
>>>>
>>>>
>>>>
>>>>
>>>>                             
>>>> *******/\/\/\/\/\/\/\/\/\/\/\******************************************
>>>>                              Jörg Tiedemann
>>>>                              [email protected]
>>>>                             <mailto:[email protected]>
>>>>                              Visiting Professor
>>>>                              http://stp.lingfil.uu.se/~joerg/
>>>>                              Dep. of Linguistics and Philology
>>>>                              Uppsala University                  tel:
>>>>                             +46 (0)18 - 471 1412
>>>>                              Box 635, SE-751 26 Uppsala/SWEDEN   fax:
>>>>                             +46 (0)18 - 471 1094
>>>>
>>>>                             
>>>> *********************************/\/\/\/\/\/\/\/\/\/\/\****************
>>>>                             _______________________________________________
>>>>                             Moses-support mailing list
>>>>                             [email protected]
>>>>                             <mailto:[email protected]>
>>>>                             
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>             --
>>>>             ---
>>>>             Loïc BARRAULT
>>>>             Post-doctoral researcher
>>>>             LIUM - University of Le Mans
>>>>             Tél. +33/0 2 43 83 38 52
>>>>             http://www-lium.univ-lemans.fr/~barrault
>>>>             MANY : Open Source MT System Combination
>>>>             http://www-lium.univ-lemans.fr/~barrault/MANY
>>>>             ---
>>>>
>>>>
>>> --
>>>
>>> Regards,
>>>
>>> Jörg
>>>
>
> --
>
> Regards,
>
> Jörg
>

