I don't see anything obvious, but the error occurs while an unknown word is being processed, so you might try to identify exactly which span is being handled when the crash happens and check whether the lattice has anything strange at that point. Of course, SEGVs in new are often delayed effects of memory corruption, so this may be a red herring. Still, it's a place to start.
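A rough sketch of that kind of lattice check, assuming the input is in PLF, i.e. one Python-style nested tuple per sentence in which every node is a tuple of (word, score, jump) arcs and jump is the distance to the node the arc ends at. The function name check_plf and the particular checks are only illustrative; this is not part of Moses:

```python
import ast


def check_plf(plf_text):
    """Return a list of (node_index, message) problems for one PLF lattice."""
    # PLF is a Python literal, so literal_eval parses it without running code.
    lattice = ast.literal_eval(plf_text.strip())
    n = len(lattice)          # nodes 0..n-1 have arcs; node n is the final node
    has_incoming = {0}        # the start node needs no incoming arc
    problems = []
    for i, arcs in enumerate(lattice):
        if len(arcs) == 0:
            problems.append((i, "node has no outgoing arcs"))
        if i not in has_incoming:
            problems.append((i, "no arc ends at this node"))
        for arc in arcs:
            if len(arc) != 3:
                problems.append((i, "arc is not (word, score, jump): %r" % (arc,)))
                continue
            word, _score, jump = arc
            if not isinstance(jump, int) or jump < 1:
                problems.append((i, "bad jump %r on arc %r" % (jump, word)))
            elif i + jump > n:
                problems.append((i, "arc %r jumps past the final node" % (word,)))
            else:
                has_incoming.add(i + jump)
    return problems


if __name__ == "__main__":
    import sys
    # One lattice per line on stdin; report the sentence and node of each problem.
    for line_no, line in enumerate(sys.stdin, 1):
        if line.strip():
            for node, message in check_plf(line):
                print("sentence %d, node %d: %s" % (line_no, node, message))
```

Running it over the lattice file and looking at whatever it reports for the failing sentence should at least rule out empty nodes and spans that don't line up.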
On Thu, Mar 4, 2010 at 2:10 PM, Joerg Tiedemann <[email protected]> wrote:
>
> Here it is:
>
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000047dd7a in
> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
> (this=0x12a44e0, __p=0xb1, __v...@0x7fffd1187c30)
> at /usr/include/c++/4.3/ext/new_allocator.h:108
> 108 { ::new((void *)__p) _Tp(__val); }
> (gdb) backtrace
> #0 0x000000000047dd7a in
> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
> (this=0x12a44e0, __p=0xb1, __v...@0x7fffd1187c30)
> at /usr/include/c++/4.3/ext/new_allocator.h:108
> #1 0x000000000047fee9 in std::vector<Moses::TranslationOption*,
> std::allocator<Moses::TranslationOption*> >::push_back (this=0x12a44e0,
> _...@0x7fffd1187c30)
> at /usr/include/c++/4.3/bits/stl_vector.h:690
> #2 0x000000000049ce23 in Moses::TranslationOptionList::Add
> (this=0x12a44e0, transOpt=0x1a00d00) at TranslationOptionList.h:49
> #3 0x00000000004e84ec in Moses::TranslationOptionCollection::Add (
> this=0x12f0670, translationOption=0x1a00d00)
> at TranslationOptionCollection.cpp:566
> #4 0x00000000004e9fe5 in
> Moses::TranslationOptionCollection::ProcessOneUnknownWord
> (this=0x12f0670, sourcewo...@0x12b33b0, sourcePos=0, length=21)
> at TranslationOptionCollection.cpp:265
> #5 0x000000000049c7b1 in
> Moses::TranslationOptionCollectionConfusionNet::ProcessUnknownWord
> (this=0x12f0670, sourcePos=0)
> at TranslationOptionCollectionConfusionNet.cpp:31
> #6 0x00000000004e8163 in
> Moses::TranslationOptionCollection::ProcessUnknownWord (this=0x12f0670,
> decodestep...@0x7a0600)
> at TranslationOptionCollection.cpp:195
> #7 0x00000000004e9867 in
> Moses::TranslationOptionCollection::CreateTranslationOptions
> (this=0x12f0670, decodestep...@0x7a0600)
> at TranslationOptionCollection.cpp:389
> ---Type <return> to continue, or q <return> to quit---
> #8 0x000000000044657f in Moses::Manager::ProcessSentence
> (this=0x7fffd1188160)
> at Manager.cpp:90
> #9 0x00000000004079a6 in main (argc=<value optimized out>,
> argv=0x7fffd1188338) at Main.cpp:143
>
>
> On 3/4/10 7:32 PM, Chris Dyer wrote:
>> Can you show the complete stack trace?
>>
>> On Thu, Mar 4, 2010 at 12:48 PM, Joerg Tiedemann
>> <[email protected]> wrote:
>>>
>>> I did that and this is what I get:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x000000000047dd7a in
>>> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
>>> (this=0x12a44e0, __p=0xb1, __v...@0x7fff499d0480)
>>> at /usr/include/c++/4.3/ext/new_allocator.h:108
>>> 108 { ::new((void *)__p) _Tp(__val); }
>>>
>>>
>>> hm - maybe there's really something wrong with my lattice input. I
>>> couldn't see any empty nodes. Moses did print a list of lattice segments
>>> like this:
>>>
>>> ...
>>> 925 -- (gewoond , -0.000, 105) (gewoond , -0.000, 45)
>>> 926 -- (stupid , -100.000, 101)
>>> 927 -- (gewoond , -0.000, 44)
>>> 928 -- (you , -100.000, 102)
>>> 929 -- (zwitserland , -0.000, 25)
>>> 930 -- (zwitserland , -0.000, 25)
>>> 931 -- (zwitserland , -0.000, 26)
>>> ...
>>>
>>> Looks a bit strange with the -0.000 but that's probably ok. I also tried
>>> with a more recent version of Moses and also got the segmentation fault
>>> (not the lattice output though)
>>>
>>> Well, I will have a careful look at the input again ....
>>> If you have any other ideas - please let me know.
>>> Thanks!
>>>
>>> Jörg
>>>
>>>
>>> On 3/4/10 4:32 PM, Barry Haddow wrote:
>>>> Hi Jorg
>>>>
>>>> The stacktrace looks a little strange because of the compiler
>>>> optimisations.
>>>> If you edit moses/src/Makefile, changing
>>>> CXXFLAGS = -g -O2
>>>> to
>>>> CXXFLAGS = -g
>>>> do a 'make clean all', then rerun, you should get a more readable
>>>> stacktrace.
>>>> Try rerunning just on the sentence that gave you the problems to see if you
>>>> can reproduce the problem.
>>>>
>>>> regards
>>>> Barry
>>>>
>>>> On Thursday 04 March 2010 15:24, Chris Dyer wrote:
>>>>> I'm not certain what's causing this. From the part of the stack trace
>>>>> you're showing, it looks like it's probably when translation options
>>>>> are being gathered for the spans in the lattice. Perhaps the lattice
>>>>> is malformed (i.e., spans don't line up, there are empty nodes, etc)?
>>>>>
>>>>> 2010/3/4 Jörg Tiedemann <[email protected]>:
>>>>>> I get a segmentation fault when decoding (large) word lattices. Moses
>>>>>> seems to parse well through the input but crashes after a while. Tracing
>>>>>> with gdb gave me this info:
>>>>>>
>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>> 0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
>>>>>> this=<value optimized out>, translationOption=0x18a12a0)
>>>>>> at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
>>>>>> 104 /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
>>>>>> in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
>>>>>>
>>>>>> Indeed, the header file does not exist on my system.
>>>>>> Do I need to install some additional packages and re-compile Moses in a
>>>>>> certain way to get rid of this error?
>>>>>>
>>>>>> Jörg
>>>>>>
>>>>>> Chris Dyer wrote:
>>>>>>> Moses transition costs can be converted to probabilities (i.e., you
>>>>>>> can make a search graph into a stochastic FSA), but they do need to be
>>>>>>> renormalized. You can do this by computing the posterior probability
>>>>>>> of each edge (using the forward-backward algorithm), and then
>>>>>>> normalizing all of the out-going edges at each node.
>>>>>>>
>>>>>>> One caveat: the way moses is usually trained (with MERT) means that
>>>>>>> the resulting transition probabilities might be scaled in funny ways
>>>>>>> (i.e., the best edge might have 99.99% of the probability mass, or it
>>>>>>> might just be a minuscule amount over the next best), so you may need
>>>>>>> to do some things (like rescaling the probabilities) to make them
>>>>>>> useful.
>>>>>>>
>>>>>>> -C
>>>>>>>
>>>>>>> 2010/3/4 Jörg Tiedemann <[email protected]>:
>>>>>>>> One more time about the conversion from search graphs to word lattices:
>>>>>>>> In the word lattice I would like to use probabilities for each edge but
>>>>>>>> I guess that transition costs cannot be easily interpreted as log
>>>>>>>> prob's. For example, I have seen quite a few positive transition values
>>>>>>>> in my sample output which would definitely create some problems.
>>>>>>>>
>>>>>>>> Anyway, what I try to do is to use Moses output to create word lattice
>>>>>>>> input for another translation step. Maybe the values at input lattice
>>>>>>>> edges do not strictly have to be probabilities and I shouldn't care too
>>>>>>>> much?
>>>>>>>>
>>>>>>>> Jörg
>>>>>>>>
>>>>>>>> Loïc BARRAULT wrote:
>>>>>>>>> Hi Jörg,
>>>>>>>>>
>>>>>>>>> I'll take an example to explain my point of view.
>>>>>>>>>
>>>>>>>>> Here is an example of a recombined hypo:
>>>>>>>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647 recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I 'm looking for a , pC=-0.518872, c=-0.31244
>>>>>>>>>
>>>>>>>>> In my case, hypo numbers are the nodes of the graph and phrases are
>>>>>>>>> represented on links.
>>>>>>>>> In this case, to preserve the graph topology, the only thing which can
>>>>>>>>> be done is to merge the nodes 319 with 181, which results in creating a
>>>>>>>>> link between node 1 (back node) and 181 (the recombined node).
>>>>>>>>>
>>>>>>>>> (X) ---------->(181)
>>>>>>>>> (1)------------->(319)
>>>>>>>>>
>>>>>>>>> results in
>>>>>>>>> (X) ---------->(181)
>>>>>>>>> (1)---------------^
>>>>>>>>>
>>>>>>>>> In your example, you can't merge 5 and 1 because their history is not
>>>>>>>>> the same (you pointed this out).
>>>>>>>>> But if 6 is recombined and pointing to 4, then the only thing you can
>>>>>>>>> do safely is to merge 6 and 4, which means creating a link between 5
>>>>>>>>> and 4.
>>>>>>>>>
>>>>>>>>> Good luck.
>>>>>>>>>
>>>>>>>>> Loïc
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/3/3 Jörg Tiedemann <[email protected]>
>>>>>>>>>
>>>>>>>>> I try to use the search graph output now for producing a word
>>>>>>>>> lattice in PLF style. I'm still a bit confused on how to use the
>>>>>>>>> recombined hypotheses and their pointers to superior hypo's. Do I
>>>>>>>>> have to copy the relevant parts from the superior hypotheses into
>>>>>>>>> the lattice or should I join the hypotheses that point to
>>>>>>>>> recombined hypo's with the existing graph? To give an example:
>>>>>>>>>
>>>>>>>>>       who   is    bill   ?
>>>>>>>>> (0)-->(1)-->(2)--->(3)-->(4)
>>>>>>>>>  |
>>>>>>>>>  |--->(5)------------->(6)
>>>>>>>>>   how  |   is bill ?
>>>>>>>>>        |
>>>>>>>>>        |---->(7)----->(8)
>>>>>>>>>          is    the bill
>>>>>>>>>
>>>>>>>>> where (6) is a recombined hypo pointing to (4) and covering tokens
>>>>>>>>> 1-3 and (8) is a recombined hypo that points to (3)
>>>>>>>>>
>>>>>>>>> Should I copy the relevant parts of (4) that cover the same tokens
>>>>>>>>> to the graph as a link to (5) or can I safely join (5) and (1)?
>>>>>>>>> Probably not because this would produce "who is the bill" which is
>>>>>>>>> not necessarily an option ...
>>>>>>>>>
>>>>>>>>> Thanks a lot for clarifying this to me!
>>>>>>>>> Jörg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Chris Dyer wrote:
>>>>>>>>>
>>>>>>>>> As long as you're just splitting, keeping the weights consistent isn't
>>>>>>>>> too hard- just keep all the weight in one segment and make all
>>>>>>>>> the rest of the segments have no impact when they multiply (i.e., a
>>>>>>>>> probability of 1, or a cost of 0). The openFST or AT&T tools can help
>>>>>>>>> you manipulate lattices if you want to do more interesting things with
>>>>>>>>> weights, such as pushing them to the start of paths.
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Indeed, splitting is not hard, but the trickiest thing is how much
>>>>>>>>> probability/score amount do you give to each part of the
>>>>>>>>> split ? Maybe it has not any real impact in the end, or has it ?
>>>>>>>>> Loïc
>>>>>>>>>
>>>>>>>>> 2010/3/1 Chris Dyer <[email protected]>
>>>>>>>>>
>>>>>>>>> I guess word-graph doesn't split phrases either (I was just guessing).
>>>>>>>>> It appears to be in SLF format, which is used by a number of tools
>>>>>>>>> (like HTK and the SRI tools). SRILM can split
>>>>>>>>> lattices with multi-word arcs into lattices, or you can write your own
>>>>>>>>> code to do it. It's not terribly hard.
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Ok thanks. I will use the output-word-graph option. However, I also get
>>>>>>>>> phrases with that option (in the w attribute), for example here:
>>>>>>>>>
>>>>>>>>> ....
>>>>>>>>> J=42 S=0 E=53 a=0, 0, 0, -0.693147, 0.999896 l=-13.695 r=-20, 0, -1.60944, 0, 0, 0 w=bill clinton , pC=0.0613498, c=-3.23392
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> I'm not sure if I'm using the command line argument correctly:
>>>>>>>>> echo 'who is bill clinton ?' | \
>>>>>>>>> moses -f moses.ini -output-word-graph test.graph 0
>>>>>>>>>
>>>>>>>>> Jörg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>>>>>>
>>>>>>>>> I don't have such a tool, but it wouldn't be too difficult to write
>>>>>>>>> one. I think the difference between word graph and search graph is
>>>>>>>>> the search graph has full phrases on the edges, whereas the word graph
>>>>>>>>> has single words on the edges. For the input, you need single word
>>>>>>>>> edges.
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>> 2010/3/1 Jörg Tiedemann <[email protected]>:
>>>>>>>>>
>>>>>>>>> Is there a tool to convert output search graphs to word lattices in
>>>>>>>>> PLF (moses lattice input format)? It's the option -output-search-graph
>>>>>>>>> that I should use for getting the relevant information, right? I'm not
>>>>>>>>> really sure if I understand the difference between -output-word-graph
>>>>>>>>> and -output-search-graph
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Jörg
>>>>>>>>>
>>>>>>>>> *******/\/\/\/\/\/\/\/\/\/\/\******************************************
>>>>>>>>> Jörg Tiedemann [email protected]
>>>>>>>>> Visiting Professor http://stp.lingfil.uu.se/~joerg/
>>>>>>>>> Dep. of Linguistics and Philology
>>>>>>>>> Uppsala University tel: +46 (0)18 - 471 1412
>>>>>>>>> Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094
>>>>>>>>> *********************************/\/\/\/\/\/\/\/\/\/\/\****************
>>>>>>>>> _______________________________________________
>>>>>>>>> Moses-support mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ---
>>>>>>>>> Loïc BARRAULT
>>>>>>>>> Post-doctoral researcher
>>>>>>>>> LIUM - University of Le Mans
>>>>>>>>> Tél. +33/0 2 43 83 38 52
>>>>>>>>> http://www-lium.univ-lemans.fr/~barrault
>>>>>>>>> MANY : Open Source MT System Combination
>>>>>>>>> http://www-lium.univ-lemans.fr/~barrault/MANY
>>>>>>>>> ---
