I get a segmentation fault when decoding (large) word lattices. Moses
seems to get well into the input but crashes after a while. Tracing
with gdb gave me this info:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
    this=<value optimized out>, translationOption=0x18a12a0)
    at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
104     /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
        in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
Indeed, the header file does not exist on my system.
Do I need to install some additional packages and re-compile Moses in a
certain way to get rid of this error?
Jörg
Chris Dyer wrote:
> Moses transition costs can be converted to probabilities (i.e., you
> can make a search graph into a stochastic FSA), but they do need to be
> renormalized. You can do this by computing the posterior probability
> of each edge (using the forward-backward algorithm), and then
> normalizing all of the out-going edges at each node.
>
> One caveat: the way moses is usually trained (with MERT) means that
> the resulting transition probabilities might be scaled in funny ways
> (i.e., the best edge might have 99.99% of the probability mass, or it
> might just be a minuscule amount over the next best), so you may need
> to do some things (like rescaling the probabilities) to make them
> useful.
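
A minimal sketch of the renormalization described above, assuming the graph is acyclic and is given as (src, dst, log_score) triples with one start and one final node. The function names and the `scale` parameter (a simple way to flatten or sharpen MERT-scaled scores before normalizing) are illustrative assumptions, not part of Moses. Positive transition values are not a problem here because only score differences matter after normalization.

```python
import math
from collections import defaultdict, deque

NEG_INF = float("-inf")

def logsumexp(values):
    """Numerically stable log(sum(exp(v))); returns -inf for an empty input."""
    values = list(values)
    top = max(values, default=NEG_INF)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))

def topological_order(nodes, edges):
    """Kahn's algorithm; raises ValueError if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for src, dst, _ in edges:
        indegree[dst] += 1
        successors[src].append(dst)
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("graph has a cycle")
    return order

def edge_probabilities(edges, start, final, scale=1.0):
    """Return one probability per edge so that the out-going edges of each
    node sum to 1. `scale` multiplies every log score first (assumption:
    a hand-tuned knob, e.g. 0.1 to flatten very peaked distributions)."""
    scaled = [(s, d, scale * w) for s, d, w in edges]
    nodes = {start, final} | {n for s, d, _ in scaled for n in (s, d)}
    order = topological_order(nodes, scaled)
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for i, (s, d, _) in enumerate(scaled):
        outgoing[s].append(i)
        incoming[d].append(i)
    # Forward pass: log mass of all partial paths from start to each node.
    forward = {}
    for n in order:
        forward[n] = 0.0 if n == start else logsumexp(
            forward[scaled[i][0]] + scaled[i][2] for i in incoming[n])
    # Backward pass: log mass of all partial paths from each node to final.
    backward = {}
    for n in reversed(order):
        backward[n] = 0.0 if n == final else logsumexp(
            scaled[i][2] + backward[scaled[i][1]] for i in outgoing[n])
    # Unnormalized log posterior of each edge; the global constant
    # forward[final] cancels in the per-node normalization below.
    posterior = [forward[s] + w + backward[d] for s, d, w in scaled]
    probs = [0.0] * len(scaled)
    for idxs in outgoing.values():
        total = logsumexp(posterior[i] for i in idxs)
        if total == NEG_INF:
            continue  # node is not on any start-to-final path
        for i in idxs:
            probs[i] = math.exp(posterior[i] - total)
    return probs
```

For example, `edge_probabilities([(0, 1, 0.0), (0, 2, 0.0), (1, 3, 0.0), (1, 3, 0.0), (2, 3, 0.0)], 0, 3)` gives the edge 0->1 twice the probability of 0->2, because twice as much path mass continues through node 1.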
>
> -C
>
> 2010/3/4 Jörg Tiedemann <[email protected]>:
>> One more time about the conversion from search graphs to word lattices:
>> In the word lattice I would like to use probabilities for each edge, but
>> I guess that transition costs cannot easily be interpreted as log
>> probabilities. For example, I have seen quite a few positive transition
>> values in my sample output, which would definitely create some problems.
>>
>> Anyway, what I am trying to do is use Moses output to create word lattice
>> input for another translation step. Maybe the values on input lattice
>> edges do not strictly have to be probabilities and I shouldn't care too
>> much?
>>
>> Jörg
>>
>>
>> Loïc BARRAULT wrote:
>>> Hi Jörg,
>>>
>>> I'll take an example to explain my point of view.
>>>
>>> Here is an example of a recombined hypo :
>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647
>>> recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm
>>> looking for a , pC=-0.518872, c=-0.31244
>>>
>>> In my case, hypothesis numbers are the nodes of the graph and phrases
>>> are represented on links.
>>> In this case, to preserve the graph topology, the only thing that can
>>> be done is to merge node 319 with node 181, which results in creating a
>>> link between node 1 (the back node) and 181 (the recombined node).
>>>
>>> (X) ---------->(181)
>>> (1)------------->(319)
>>>
>>> results in
>>> (X) ---------->(181)
>>> (1)---------------^
>>>
>>> In your example, you can't merge 5 and 1 because their history is not
>>> the same (you pointed this out).
>>> But if 6 is recombined and pointing to 4, then the only thing you can do
>>> safely is to merge 6 and 4, which means creating a link between 5 and 4.
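
The merge rule above can be sketched as follows. This is only an illustration: the regular expression is an assumption based on the single example line quoted above (taken as one unwrapped line), and `LINE_RE`, `parse_line` and `build_edges` are hypothetical names. Each hypothesis becomes an edge from its back node, and a recombined hypothesis is replaced by the node it points to.

```python
import re

# Assumed layout, inferred from the example line in this thread only.
LINE_RE = re.compile(
    r"^\d+ hyp=(?P<hyp>\d+) stack=\d+ back=(?P<back>\d+) "
    r"score=\S+ transition=(?P<trans>\S+) "
    r"(?:recombined=(?P<recomb>\d+) )?"
    r".*?out=(?P<out>.*?)(?:, pC=.*)?$"
)

def parse_line(line):
    """Return a dict for one search-graph line, or None if it does not match
    (e.g. the initial hypothesis, which has no back pointer)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "hyp": int(match.group("hyp")),
        "back": int(match.group("back")),
        "transition": float(match.group("trans")),
        "recombined": int(match.group("recomb")) if match.group("recomb") else None,
        "out": match.group("out").strip(),
    }

def build_edges(lines):
    """Turn search-graph lines into (src, dst, phrase, transition) edges,
    merging every recombined hypothesis into the node it points to."""
    records = [r for r in map(parse_line, lines) if r is not None]
    merged = {r["hyp"]: r["recombined"] for r in records
              if r["recombined"] is not None}

    def resolve(node):
        # Follow recombination pointers until a surviving node is reached.
        while node in merged:
            node = merged[node]
        return node

    return [(resolve(r["back"]), resolve(r["hyp"]), r["out"], r["transition"])
            for r in records]
```

With the example line above, `build_edges` yields a single edge from node 1 to node 181 carrying the phrase and the transition score, which is the link (1) -> (181) in the second drawing.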
>>>
>>> Good luck.
>>>
>>> Loïc
>>>
>>>
>>> 2010/3/3 Jörg Tiedemann <[email protected]>
>>>
>>>
>>> I am now trying to use the search graph output to produce a word
>>> lattice in PLF style. I'm still a bit confused about how to use the
>>> recombined hypotheses and their pointers to superior hypotheses. Do I
>>> have to copy the relevant parts from the superior hypotheses into
>>> the lattice, or should I join the hypotheses that point to recombined
>>> hypotheses with the existing graph? To give an example:
>>>
>>>      who    is    bill    ?
>>> (0)-->(1)-->(2)--->(3)-->(4)
>>>  |
>>>  |--->(5)------------->(6)
>>>   how  |    is bill ?
>>>        |
>>>        |---->(7)----->(8)
>>>           is    the bill
>>>
>>> where (6) is a recombined hypo pointing to (4) and covering tokens 1-3
>>> and (8) is a recombined hypo that points to (3)
>>>
>>> Should I copy the relevant parts of (4) that cover the same tokens
>>> to the graph as a link to (5) or can I safely join (5) and (1)?
>>> Probably not because this would produce "who is the bill" which is
>>> not necessarily an option ...
>>>
>>> Thanks a lot for clarifying this to me!
>>> Jörg
>>>
>>>
>>>
>>> Chris Dyer wrote:
>>>
>>> As long as you're just splitting, keeping the weights consistent isn't
>>> too hard: just keep all the weight in one segment and make all the
>>> rest of the segments have no impact when they multiply (i.e., a
>>> probability of 1, or a cost of 0). The OpenFst or AT&T tools can help
>>> you manipulate lattices if you want to do more interesting things with
>>> weights, such as pushing them to the start of paths.
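
A small sketch of that splitting scheme, assuming edges are (src, dst, phrase, log_prob) tuples and that fresh intermediate node ids come from a caller-supplied iterator. The name `split_edge` is hypothetical. The whole weight stays on the first word and every following word gets log-probability 0 (probability 1), so the score of any path through the phrase is unchanged.

```python
from itertools import count

def split_edge(src, dst, phrase, log_prob, fresh_ids):
    """Split one phrase edge into a chain of single-word edges.

    The first word carries the full log_prob; the remaining words carry 0.0.
    `fresh_ids` is an iterator of node ids not yet used in the graph.
    """
    words = phrase.split()
    if len(words) <= 1:
        # Nothing to split (single word or empty phrase).
        return [(src, dst, phrase.strip(), log_prob)]
    # One new node between each pair of adjacent words.
    nodes = [src] + [next(fresh_ids) for _ in words[1:]] + [dst]
    return [
        (nodes[i], nodes[i + 1], word, log_prob if i == 0 else 0.0)
        for i, word in enumerate(words)
    ]

# Example with made-up node ids starting at 1000 for the new nodes.
chain = split_edge(42, 53, "bill clinton ,", -3.23392, count(1000))
for edge in chain:
    print(edge)
```

The new node ids are arbitrary, so a lattice built this way would still need its nodes renumbered in topological order before being written out.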
>>>
>>> Chris
>>>
>>> On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT <[email protected]> wrote:
>>>
>>> Indeed, splitting is not hard, but the trickiest part is deciding how
>>> much of the probability/score to give to each part of the split.
>>> Maybe it has no real impact in the end, or does it?
>>> Loïc
>>>
>>> 2010/3/1 Chris Dyer <[email protected]>
>>>
>>> I guess word-graph doesn't split phrases either (I was just guessing).
>>> It appears to be in SLF format, which is used by a number of tools
>>> (like HTK and the SRI tools). SRILM can split lattices with
>>> multi-word arcs into lattices with single-word arcs, or you can write
>>> your own code to do it. It's not terribly hard.
>>>
>>> Chris
>>>
>>> On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann <[email protected]> wrote:
>>>
>>> Ok thanks. I will use the output-word-graph option. However, I also get
>>> phrases with that option (in the w attribute), for example here:
>>>
>>> ....
>>> J=42 S=0 E=53 a=0, 0, 0, -0.693147, 0.999896 l=-13.695 r=-20, 0, -1.60944, 0, 0, 0 w=bill clinton , pC=0.0613498, c=-3.23392
>>> ...
>>>
>>> I'm not sure if I'm using the command line argument correctly:
>>> echo 'who is bill clinton ?' | \
>>> moses -f moses.ini -output-word-graph test.graph 0
>>>
>>> Jörg
>>>
>>>
>>> On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>
>>> I don't have such a tool, but it wouldn't be too difficult to write
>>> one. I think the difference between word graph and search graph is
>>> that the search graph has full phrases on the edges, whereas the word
>>> graph has single words on the edges. For the input, you need
>>> single-word edges.
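
For the last step, here is a rough sketch of writing single-word edges out in PLF, assuming the nodes are already numbered 0..N-1 in topological order and every edge goes from a lower to a higher number. In PLF each node except the last lists its out-going arcs as ('word',score,distance), where distance is the number of nodes to jump ahead. The function name `to_plf` is hypothetical, and the backslash escaping of quotes is an assumption about what the lattice reader accepts.

```python
def to_plf(num_nodes, edges):
    """Serialize (src, dst, word, prob) edges as a one-line PLF string.

    Assumes 0 <= src < dst < num_nodes for every edge (topological
    numbering) and that every node except the last has an out-going edge.
    """
    outgoing = [[] for _ in range(num_nodes - 1)]
    for src, dst, word, prob in edges:
        if not 0 <= src < dst < num_nodes:
            raise ValueError("edge %r violates topological numbering" % ((src, dst),))
        # Assumption: quotes and backslashes are escaped with a backslash.
        escaped = word.replace("\\", "\\\\").replace("'", "\\'")
        outgoing[src].append("('%s',%r,%d)" % (escaped, float(prob), dst - src))
    if any(not arcs for arcs in outgoing):
        raise ValueError("every non-final node needs at least one out-going edge")
    # Each node is a tuple of arcs with a trailing comma, as is the lattice.
    return "(" + "".join("(" + ",".join(arcs) + ",)," for arcs in outgoing) + ")"

example = to_plf(5, [
    (0, 1, "who", 0.6), (0, 1, "how", 0.4),
    (1, 2, "is", 1.0), (2, 3, "bill", 1.0), (3, 4, "?", 1.0),
])
print(example)
```

The probabilities in the example are made up for illustration; the string it prints is a five-node lattice with two alternatives for the first word.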
>>> -Chris
>>>
>>> 2010/3/1 Jörg Tiedemann <[email protected]>:
>>>
>>> Is there a tool to convert output search graphs to word lattices in PLF
>>> (Moses lattice input format)? It's the option -output-search-graph
>>> that I should use for getting the relevant information, right? I'm not
>>> really sure I understand the difference between -output-word-graph
>>> and -output-search-graph.
>>> Thanks!
>>>
>>> Jörg
>>>
>>>
>>>
>>>
>>>
>>> *******/\/\/\/\/\/\/\/\/\/\/\******************************************
>>> Jörg Tiedemann [email protected]
>>> Visiting Professor http://stp.lingfil.uu.se/~joerg/
>>> Dep. of Linguistics and Philology
>>> Uppsala University tel: +46 (0)18 - 471 1412
>>> Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094
>>> *********************************/\/\/\/\/\/\/\/\/\/\/\****************
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>>
>>>
>>> --
>>> ---
>>> Loïc BARRAULT
>>> Post-doctoral researcher
>>> LIUM - University of Le Mans
>>> Tél. +33/0 2 43 83 38 52
>>> http://www-lium.univ-lemans.fr/~barrault
>>> MANY : Open Source MT System Combination
>>> http://www-lium.univ-lemans.fr/~barrault/MANY
>>> ---
>>>
>>>
>>>
>>>
>>>
>>>
>> --
>>
>> Regards,
>>
>> Jörg
>>
>>
--
Regards,
Jörg