I did that and this is what I get:
Program received signal SIGSEGV, Segmentation fault.
0x000000000047dd7a in
__gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
(this=0x12a44e0, __p=0xb1, __val=@0x7fff499d0480)
at /usr/include/c++/4.3/ext/new_allocator.h:108
108 { ::new((void *)__p) _Tp(__val); }
Hm, maybe there really is something wrong with my lattice input. I
couldn't see any empty nodes. Moses did print a list of lattice segments
like this:
...
925 -- (gewoond , -0.000, 105) (gewoond , -0.000, 45)
926 -- (stupid , -100.000, 101)
927 -- (gewoond , -0.000, 44)
928 -- (you , -100.000, 102)
929 -- (zwitserland , -0.000, 25)
930 -- (zwitserland , -0.000, 25)
931 -- (zwitserland , -0.000, 26)
...
The -0.000 scores look a bit strange, but that's probably OK. I also tried
a more recent version of Moses and got the segmentation fault there too
(though without the lattice output).
Well, I will have a careful look at the input again ....
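For checking the input, a minimal sketch of a PLF sanity check might help. It assumes each lattice is one line that parses as a nested Python tuple of nodes, each node holding (word, score, distance) edges; check_plf is a hypothetical helper name, not part of Moses. It looks for the problems mentioned in this thread: empty nodes, empty labels, and spans that don't line up.

```python
import ast

def check_plf(plf_text):
    """Return a list of problems found in one PLF lattice line (empty if none)."""
    # Assumption: the lattice is a Python-style nested tuple, one entry per node.
    lattice = ast.literal_eval(plf_text.strip())
    n = len(lattice)            # the final node has index n
    problems = []
    reachable = {0}             # nodes reachable from the start node
    for i, edges in enumerate(lattice):
        if not edges:
            problems.append(f"node {i}: no outgoing edges")
        if i not in reachable:
            problems.append(f"node {i}: not reachable from node 0")
        for word, score, dist in edges:
            if not str(word).strip():
                problems.append(f"node {i}: empty word label")
            if not isinstance(dist, int) or dist < 1:
                problems.append(f"node {i}: bad distance {dist!r}")
            elif i + dist > n:
                problems.append(f"node {i}: edge '{word}' jumps past the final node")
            elif i in reachable:
                # Edges only point forward, so one left-to-right pass suffices.
                reachable.add(i + dist)
    if n not in reachable:
        problems.append("final node is not reachable")
    return problems

if __name__ == "__main__":
    line = "((('who',1.0,1),),(('is',0.5,1),('was',0.5,1),),(('bill',1.0,1),),)"
    print(check_plf(line) or "lattice looks consistent")
```

This only checks the structure, not the scores, so a lattice that passes can still trigger the crash for other reasons.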
If you have any other ideas - please let me know.
Thanks!
Jörg
On 3/4/10 4:32 PM, Barry Haddow wrote:
> Hi Jorg
>
> The stacktrace looks a little strange because of the compiler optimisations.
> If you edit moses/src/Makefile, changing
> CXXFLAGS = -g -O2
> to
> CXXFLAGS = -g
> do a 'make clean all', then rerun, you should get a more readable stacktrace.
> Try rerunning just on the sentence that gave you the problems to see if you
> can reproduce the problem.
>
> regards
> Barry
>
> On Thursday 04 March 2010 15:24, Chris Dyer wrote:
>> I'm not certain what's causing this. From the part of the stack trace
>> you're showing, it looks like it's probably when translation options
>> are being gathered for the spans in the lattice. Perhaps the lattice
>> is malformed (i.e., spans don't line up, there are empty nodes, etc)?
>>
>> 2010/3/4 Jörg Tiedemann<[email protected]>:
>>> I get a segmentation fault when decoding (large) word lattices. Moses
>>> seems to parse well through the input but crashes after a while. Tracing
>>> with gdb gave me this info:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
>>> this=<value optimized out>, translationOption=0x18a12a0)
>>> at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
>>> 104 /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
>>> in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
>>>
>>> Indeed, the header file does not exist on my system.
>>> Do I need to install some additional packages and re-compile Moses in a
>>> certain way to get rid of this error?
>>>
>>> Jörg
>>>
>>> Chris Dyer wrote:
>>>> Moses transition costs can be converted to probabilities (i.e., you
>>>> can make a search graph into a stochastic FSA), but they do need to be
>>>> renormalized. You can do this by computing the posterior probability
>>>> of each edge (using the forward-backward algorithm), and then
>>>> normalizing all of the out-going edges at each node.
>>>>
>>>> One caveat: the way moses is usually trained (with MERT) means that
>>>> the resulting transition probabilities might be scaled in funny ways
>>>> (i.e., the best edge might have 99.99% of the probability mass, or it
>>>> might just be a minuscule amount over the next best), so you may need
>>>> to do some things (like rescaling the probabilities) to make them
>>>> useful.
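The renormalization described above could be sketched roughly as follows. This is only an illustration under stated assumptions: edges are (src, dst, score) triples with src < dst, node 0 is the start, the highest-numbered node is the single final node, and scores are treated as unnormalized log weights. The function name renormalize and the scale parameter (a simple way to flatten or sharpen the distribution) are made up for this example.

```python
import math
from collections import defaultdict

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def renormalize(edges, scale=1.0):
    """Return one probability per edge; outgoing edges of each node sum to 1."""
    final = max(dst for _, dst, _ in edges)
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for idx, (src, dst, _) in enumerate(edges):
        incoming[dst].append(idx)
        outgoing[src].append(idx)
    w = [scale * score for _, _, score in edges]

    # Forward pass: log weight of all paths from the start to each node.
    alpha = {0: 0.0}
    for node in range(1, final + 1):
        terms = [alpha[edges[i][0]] + w[i]
                 for i in incoming[node] if edges[i][0] in alpha]
        if terms:
            alpha[node] = logsumexp(terms)

    # Backward pass: log weight of all paths from each node to the final node.
    beta = {final: 0.0}
    for node in range(final - 1, -1, -1):
        terms = [w[i] + beta[edges[i][1]]
                 for i in outgoing[node] if edges[i][1] in beta]
        if terms:
            beta[node] = logsumexp(terms)

    # Edge posteriors; edges that lie on no complete path keep posterior 0.
    posterior = [0.0] * len(edges)
    for i, (src, dst, _) in enumerate(edges):
        if src in alpha and dst in beta:
            posterior[i] = math.exp(alpha[src] + w[i] + beta[dst] - alpha[final])

    # Normalize the outgoing edges at each node.
    probs = [0.0] * len(edges)
    for node, idxs in outgoing.items():
        total = sum(posterior[i] for i in idxs)
        for i in idxs:
            probs[i] = posterior[i] / total if total > 0 else 0.0
    return probs
```

Working in log space means positive transition values are not a problem here, and a scale below 1 spreads the probability mass more evenly when the best edge would otherwise take nearly all of it.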
>>>>
>>>> -C
>>>>
>>>> 2010/3/4 Jörg Tiedemann<[email protected]>:
>>>>> One more time about the conversion from search graphs to word lattices:
>>>>> In the word lattice I would like to use probabilities for each edge but
>>>>> I guess that transition costs cannot be easily interpreted as log
>>>>> prob's. For example, I have seen quite a few positive transition values
>>>>> in my sample output which would definitely create some problems.
>>>>>
>>>>> Anyway, what I'm trying to do is use Moses output to create word lattice
>>>>> input for another translation step. Maybe the values on input lattice
>>>>> edges do not strictly have to be probabilities and I shouldn't care too
>>>>> much?
>>>>>
>>>>> Jörg
>>>>>
>>>>> Loïc BARRAULT wrote:
>>>>>> Hi Jörg,
>>>>>>
>>>>>> I'll take an example to explain my point of view.
>>>>>>
>>>>>> Here is an example of a recombined hypo :
>>>>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647
>>>>>> recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm
>>>>>> looking for a , pC=-0.518872, c=-0.31244
>>>>>>
>>>>>> In my case, hypothesis numbers are the nodes of the graph and phrases
>>>>>> are represented on the links.
>>>>>> In this case, to preserve the graph topology, the only thing that can
>>>>>> be done is to merge node 319 with node 181, which results in creating
>>>>>> a link between node 1 (the back node) and 181 (the recombined node).
>>>>>>
>>>>>> (X) ---------->(181)
>>>>>> (1)------------->(319)
>>>>>>
>>>>>> results in
>>>>>> (X) ---------->(181)
>>>>>> (1)---------------^
>>>>>>
>>>>>> In your example, you can't merge 5 and 1 because their history is not
>>>>>> the same (you pointed this out).
>>>>>> But if 6 is recombined and pointing to 4, then the only thing you can
>>>>>> do safely is to merge 6 and 4, which means creating a link between 5
>>>>>> and 4.
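The merge rule above can be sketched in a few lines. This assumes search-graph lines look like the example in this message (hyp=, back=, transition=, optional recombined=, and out= followed by " , pC="); the real output may carry more fields or differ between Moses versions, and parse_hypothesis and build_edges are hypothetical names.

```python
import re

ID_FIELD = re.compile(r"\b(hyp|back|recombined)=(\d+)")
TRANSITION = re.compile(r"\btransition=(\S+)")
OUT = re.compile(r"\bout=(.*?)(?:\s*, pC=|$)")

def parse_hypothesis(line):
    """Extract the fields needed for the lattice from one search-graph line."""
    line = line.strip()
    # Only look for ids before out=, so the phrase text cannot confuse the parser.
    head = line.split(" out=", 1)[0]
    fields = {key: int(value) for key, value in ID_FIELD.findall(head)}
    transition = TRANSITION.search(head)
    out = OUT.search(line)
    fields["transition"] = float(transition.group(1)) if transition else 0.0
    fields["out"] = out.group(1).strip() if out else ""
    return fields

def build_edges(lines):
    """Return (from_node, to_node, phrase, transition) edges with recombined
    hypotheses merged into the hypothesis they point to."""
    hyps = [parse_hypothesis(line) for line in lines if "hyp=" in line]
    merged = {h["hyp"]: h["recombined"] for h in hyps if "recombined" in h}

    def resolve(node):
        # Follow recombination pointers until a surviving hypothesis is found.
        while node in merged:
            node = merged[node]
        return node

    edges = []
    for h in hyps:
        if "back" not in h:
            continue  # the initial hypothesis has no incoming edge
        edges.append((resolve(h["back"]), resolve(h["hyp"]),
                      h["out"], h["transition"]))
    return edges
```

For the example line, this yields a single edge from node 1 to node 181 carrying the phrase and the transition score, which is the link described above.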
>>>>>>
>>>>>> Good luck.
>>>>>>
>>>>>> Loïc
>>>>>>
>>>>>>
>>>>>> 2010/3/3 Jörg Tiedemann <[email protected]>
>>>>>>
>>>>>>
>>>>>> I try to use the search graph output now for producing a word
>>>>>> lattice in PLF style. I'm still a bit confused on how to use the
>>>>>> recombined hypotheses and their pointers to superior hypo's. Do I
>>>>>> have to copy the relevant parts from the superior hypotheses into
>>>>>> the lattice or should I join the hypotheses that point to
>>>>>> recombined hypo's with the existing graph? To give an example:
>>>>>>
>>>>>>      who   is    bill    ?
>>>>>> (0)-->(1)-->(2)--->(3)-->(4)
>>>>>>  |
>>>>>>  |--->(5)------------->(6)
>>>>>>  how   |    is bill ?
>>>>>>        |
>>>>>>        |---->(7)----->(8)
>>>>>>          is      the bill
>>>>>>
>>>>>> where (6) is a recombined hypo pointing to (4) and covering tokens
>>>>>> 1-3 and (8) is a recombined hypo that points to (3)
>>>>>>
>>>>>> Should I copy the relevant parts of (4) that cover the same tokens
>>>>>> to the graph as a link to (5) or can I safely join (5) and (1)?
>>>>>> Probably not because this would produce "who is the bill" which is
>>>>>> not necessarily an option ...
>>>>>>
>>>>>> Thanks a lot for clarifying this to me!
>>>>>> Jörg
>>>>>>
>>>>>>
>>>>>>
>>>>>> Chris Dyer wrote:
>>>>>>
>>>>>> As long as you're just splitting, keeping the weights consistent isn't
>>>>>> too hard: just keep all the weight in one segment and make all the
>>>>>> rest of the segments have no impact when they multiply (i.e., a
>>>>>> probability of 1, or a cost of 0). The OpenFst or AT&T tools can help
>>>>>> you manipulate lattices if you want to do more interesting things with
>>>>>> weights, such as pushing them to the start of paths.
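A small sketch of that splitting scheme, assuming edges are (src, dst, phrase, log_weight) tuples and that fresh node ids can be taken from a counter; split_phrase_edges is a made-up name. The whole weight stays on the first word and the remaining words get a cost of 0.

```python
def split_phrase_edges(edges, next_node):
    """Split multi-word edges into chains of single-word edges.

    edges:     list of (src, dst, phrase, log_weight)
    next_node: first unused node id for the new intermediate nodes
    Returns (new_edges, next_unused_node_id).
    """
    result = []
    for src, dst, phrase, weight in edges:
        words = phrase.split()
        if len(words) <= 1:
            result.append((src, dst, phrase, weight))
            continue
        prev = src
        for k, word in enumerate(words):
            is_last = k == len(words) - 1
            node = dst if is_last else next_node
            if not is_last:
                next_node += 1
            # All weight on the first segment; the rest have no impact (cost 0).
            result.append((prev, node, word, weight if k == 0 else 0.0))
            prev = node
    return result, next_node

if __name__ == "__main__":
    edges = [(0, 1, "bill clinton", -3.23392), (0, 1, "who", -1.0)]
    for edge in split_phrase_edges(edges, next_node=2)[0]:
        print(edge)
```

The new intermediate nodes are numbered after the existing ones, so the nodes would still need to be renumbered in topological order before writing PLF.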
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT <[email protected]> wrote:
>>>>>>
>>>>>> Indeed, splitting is not hard, but the trickiest part is how much
>>>>>> probability/score to give to each part of the split. Maybe it has
>>>>>> no real impact in the end, or does it?
>>>>>> Loïc
>>>>>>
>>>>>> 2010/3/1 Chris Dyer <[email protected]>
>>>>>>
>>>>>> I guess word-graph doesn't split phrases either (I was just guessing).
>>>>>> It appears to be in SLF format, which is used by a number of tools
>>>>>> (like HTK and the SRI tools). SRILM can split lattices with multi-word
>>>>>> arcs into lattices with single-word arcs, or you can write your own
>>>>>> code to do it. It's not terribly hard.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann <[email protected]> wrote:
>>>>>>
>>>>>> Ok thanks. I will use the output-word-graph option. However, I also
>>>>>> get phrases with that option (in the w attribute), for example here:
>>>>>>
>>>>>> ....
>>>>>> J=42 S=0 E=53 a=0, 0, 0, -0.693147, 0.999896 l=-13.695 r=-20, 0, -1.60944, 0, 0, 0 w=bill clinton , pC=0.0613498, c=-3.23392
>>>>>> ...
>>>>>>
>>>>>> I'm not sure if I'm using the command line argument correctly:
>>>>>> echo 'who is bill clinton ?' | \
>>>>>> moses -f moses.ini -output-word-graph test.graph 0
>>>>>>
>>>>>> Jörg
>>>>>>
>>>>>>
>>>>>> On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>>>
>>>>>> I don't have such a tool, but it wouldn't be too difficult to write
>>>>>> one. I think the difference between word graph and search graph is
>>>>>> that the search graph has full phrases on the edges, whereas the word
>>>>>> graph has single words on the edges. For the input, you need
>>>>>> single-word edges.
>>>>>> -Chris
>>>>>>
>>>>>> 2010/3/1 Jörg Tiedemann <[email protected]>:
>>>>>>
>>>>>> Is there a tool to convert output search graphs to word lattices in
>>>>>> PLF (the Moses lattice input format)? It's the option
>>>>>> -output-search-graph that I should use for getting the relevant
>>>>>> information, right? I'm not really sure I understand the difference
>>>>>> between -output-word-graph and -output-search-graph.
>>>>>> Thanks!
>>>>>>
>>>>>> Jörg
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *******/\/\/\/\/\/\/\/\/\/\/\*****************************************
>>>>>> Jörg Tiedemann [email protected]
>>>>>> Visiting Professor http://stp.lingfil.uu.se/~joerg/
>>>>>> Dep. of Linguistics and Philology
>>>>>> Uppsala University tel: +46 (0)18 - 471 1412
>>>>>> Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094
>>>>>> *********************************/\/\/\/\/\/\/\/\/\/\/\***************
>>>>>> _______________________________________________
>>>>>> Moses-support mailing list
>>>>>> [email protected]
>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>
>>>>>> --
>>>>>> ---
>>>>>> Loïc BARRAULT
>>>>>> Post-doctoral researcher
>>>>>> LIUM - University of Le Mans
>>>>>> Tél. +33/0 2 43 83 38 52
>>>>>> http://www-lium.univ-lemans.fr/~barrault
>>>>>> MANY : Open Source MT System Combination
>>>>>> http://www-lium.univ-lemans.fr/~barrault/MANY
>>>>>> ---
>>>>>
>>>>> --
>>>>>
>>>>> Regards,
>>>>>
>>>>> Jörg
>>>
>>> --
>>>
>>> Regards,
>>>
>>> Jörg
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support