OK, thanks for the tip. I've never used valgrind. Let's see if I can find some time to look into this further tomorrow ...
Jorg

On 3/4/10 8:20 PM, Barry Haddow wrote:
> Hi Chris, Jorg
>
> The fact that the SEGV occurs in new suggests heap corruption, which means
> that the error may have occurred earlier on. The way to check for this is to
> use the memcheck tool in valgrind.
>
> regards
> Barry
>
> On Thursday 04 March 2010 19:14, Chris Dyer wrote:
>> I don't see anything obvious, but this error is occurring when
>> processing an unknown word, so you might try to identify exactly what
>> span is being dealt with when the error occurs and look to see if the
>> lattice has anything strange at that point. Of course, SEGVs in new
>> are often delayed effects of memory corruption, so this may be a red
>> herring. Still, it's a place to start.
>>
>> On Thu, Mar 4, 2010 at 2:10 PM, Joerg Tiedemann <[email protected]> wrote:
>>> Here it is:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x000000000047dd7a in
>>> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
>>> (this=0x12a44e0, __p=0xb1, __v...@0x7fffd1187c30)
>>> at /usr/include/c++/4.3/ext/new_allocator.h:108
>>> 108 { ::new((void *)__p) _Tp(__val); }
>>> (gdb) backtrace
>>> #0 0x000000000047dd7a in
>>> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
>>> (this=0x12a44e0, __p=0xb1, __v...@0x7fffd1187c30)
>>> at /usr/include/c++/4.3/ext/new_allocator.h:108
>>> #1 0x000000000047fee9 in std::vector<Moses::TranslationOption*,
>>> std::allocator<Moses::TranslationOption*> >::push_back (this=0x12a44e0,
>>> _...@0x7fffd1187c30)
>>> at /usr/include/c++/4.3/bits/stl_vector.h:690
>>> #2 0x000000000049ce23 in Moses::TranslationOptionList::Add
>>> (this=0x12a44e0,
>>> transOpt=0x1a00d00) at TranslationOptionList.h:49
>>> #3 0x00000000004e84ec in Moses::TranslationOptionCollection::Add (
>>> this=0x12f0670, translationOption=0x1a00d00)
>>> at TranslationOptionCollection.cpp:566
>>> #4 0x00000000004e9fe5 in
>>> Moses::TranslationOptionCollection::ProcessOneUnknownWord
>>> (this=0x12f0670,
>>> sourcewo...@0x12b33b0, sourcePos=0, length=21)
>>> at TranslationOptionCollection.cpp:265
>>> #5 0x000000000049c7b1 in
>>> Moses::TranslationOptionCollectionConfusionNet::ProcessUnknownWord
>>> (this=0x12f0670, sourcePos=0)
>>> at TranslationOptionCollectionConfusionNet.cpp:31
>>> #6 0x00000000004e8163 in
>>> Moses::TranslationOptionCollection::ProcessUnknownWord (this=0x12f0670,
>>> decodestep...@0x7a0600)
>>> at TranslationOptionCollection.cpp:195
>>> #7 0x00000000004e9867 in
>>> Moses::TranslationOptionCollection::CreateTranslationOptions
>>> (this=0x12f0670, decodestep...@0x7a0600)
>>> at TranslationOptionCollection.cpp:389
>>> ---Type<return> to continue, or q<return> to quit---
>>> #8 0x000000000044657f in Moses::Manager::ProcessSentence
>>> (this=0x7fffd1188160)
>>> at Manager.cpp:90
>>> #9 0x00000000004079a6 in main (argc=<value optimized out>,
>>> argv=0x7fffd1188338) at Main.cpp:143
>>>
>>> On 3/4/10 7:32 PM, Chris Dyer wrote:
>>>> Can you show the complete stack trace?
>>>>
>>>> On Thu, Mar 4, 2010 at 12:48 PM, Joerg Tiedemann <[email protected]> wrote:
>>>>> I did that and this is what I get:
>>>>>
>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>> 0x000000000047dd7a in
>>>>> __gnu_cxx::new_allocator<Moses::TranslationOption*>::construct
>>>>> (this=0x12a44e0, __p=0xb1, __v...@0x7fff499d0480)
>>>>> at /usr/include/c++/4.3/ext/new_allocator.h:108
>>>>> 108 { ::new((void *)__p) _Tp(__val); }
>>>>>
>>>>> hm - maybe there's really something wrong with my lattice input. I
>>>>> couldn't see any empty nodes. Moses did print a list of lattice
>>>>> segments like this:
>>>>>
>>>>> ...
>>>>> 925 -- (gewoond , -0.000, 105) (gewoond , -0.000, 45)
>>>>> 926 -- (stupid , -100.000, 101)
>>>>> 927 -- (gewoond , -0.000, 44)
>>>>> 928 -- (you , -100.000, 102)
>>>>> 929 -- (zwitserland , -0.000, 25)
>>>>> 930 -- (zwitserland , -0.000, 25)
>>>>> 931 -- (zwitserland , -0.000, 26)
>>>>> ...
>>>>>
>>>>> Looks a bit strange with the -0.000 but that's probably ok. I also
>>>>> tried with a more recent version of Moses and also got the segmentation
>>>>> fault (not the lattice output though)
>>>>>
>>>>> Well, I will have a careful look at the input again ....
>>>>> If you have any other ideas - please let me know.
>>>>> Thanks!
>>>>>
>>>>> Jörg
>>>>>
>>>>> On 3/4/10 4:32 PM, Barry Haddow wrote:
>>>>>> Hi Jorg
>>>>>>
>>>>>> The stacktrace looks a little strange because of the compiler
>>>>>> optimisations. If you edit moses/src/Makefile, changing
>>>>>> CXXFLAGS = -g -O2
>>>>>> to
>>>>>> CXXFLAGS = -g
>>>>>> do a 'make clean all', then rerun, you should get a more readable
>>>>>> stacktrace. Try rerunning just on the sentence that gave you the
>>>>>> problems to see if you can reproduce the problem.
>>>>>>
>>>>>> regards
>>>>>> Barry
>>>>>>
>>>>>> On Thursday 04 March 2010 15:24, Chris Dyer wrote:
>>>>>>> I'm not certain what's causing this. From the part of the stack
>>>>>>> trace you're showing, it looks like it's probably when translation
>>>>>>> options are being gathered for the spans in the lattice. Perhaps the
>>>>>>> lattice is malformed (i.e., spans don't line up, there are empty
>>>>>>> nodes, etc)?
>>>>>>>
>>>>>>> 2010/3/4 Jörg Tiedemann <[email protected]>:
>>>>>>>> I get a segmentation fault when decoding (large) word lattices.
>>>>>>>> Moses seems to parse well through the input but crashes after a
>>>>>>>> while. Tracing with gdb gave me this info:
>>>>>>>>
>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>> 0x00000000004a2888 in Moses::TranslationOptionCollection::Add (
>>>>>>>> this=<value optimized out>, translationOption=0x18a12a0)
>>>>>>>> at /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h:104
>>>>>>>> 104 /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h: No such file or directory.
>>>>>>>> in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h
>>>>>>>>
>>>>>>>> Indeed, the header file does not exist on my system.
>>>>>>>> Do I need to install some additional packages and re-compile Moses
>>>>>>>> in a certain way to get rid of this error?
>>>>>>>>
>>>>>>>> Jörg
>>>>>>>>
>>>>>>>> Chris Dyer wrote:
>>>>>>>>> Moses transition costs can be converted to probabilities (i.e., you
>>>>>>>>> can make a search graph into a stochastic FSA), but they do need to
>>>>>>>>> be renormalized. You can do this by computing the posterior
>>>>>>>>> probability of each edge (using the forward-backward algorithm),
>>>>>>>>> and then normalizing all of the out-going edges at each node.
>>>>>>>>>
>>>>>>>>> One caveat: the way moses is usually trained (with MERT) means that
>>>>>>>>> the resulting transition probabilities might be scaled in funny
>>>>>>>>> ways (i.e., the best edge might have 99.99% of the probability
>>>>>>>>> mass, or it might just be a minuscule amount over the next best),
>>>>>>>>> so you may need to do some things (like rescaling the
>>>>>>>>> probabilities) to make them useful.
>>>>>>>>>
>>>>>>>>> -C
>>>>>>>>>
>>>>>>>>> 2010/3/4 Jörg Tiedemann <[email protected]>:
>>>>>>>>>> One more time about the conversion from search graphs to word
>>>>>>>>>> lattices: In the word lattice I would like to use probabilities
>>>>>>>>>> for each edge but I guess that transition costs cannot be easily
>>>>>>>>>> interpreted as log prob's. For example, I have seen quite a few
>>>>>>>>>> positive transition values in my sample output which would
>>>>>>>>>> definitely create some problems.
>>>>>>>>>>
>>>>>>>>>> Anyway, what I try to do is to use Moses output to create word
>>>>>>>>>> lattice input for another translation step. Maybe the values at
>>>>>>>>>> input lattice edges do not strictly have to be probabilities and I
>>>>>>>>>> shouldn't care too much?
>>>>>>>>>>
>>>>>>>>>> Jörg
>>>>>>>>>>
>>>>>>>>>> Loïc BARRAULT wrote:
>>>>>>>>>>> Hi Jörg,
>>>>>>>>>>>
>>>>>>>>>>> I'll take an example to explain my point of view.
>>>>>>>>>>>
>>>>>>>>>>> Here is an example of a recombined hypo :
>>>>>>>>>>> 0 hyp=319 stack=3 back=1 score=-0.831512 transition=-0.641647
>>>>>>>>>>> recombined=181 forward=3766 fscore=-205.134 covered=1-2 out=. I
>>>>>>>>>>> 'm looking for a , pC=-0.518872, c=-0.31244
>>>>>>>>>>>
>>>>>>>>>>> In my case, hypo number are the nodes of the graph and phrases
>>>>>>>>>>> are represented on links.
>>>>>>>>>>> In this case, to preserve the graph topology, the only thing
>>>>>>>>>>> which can be done is to merge the nodes 319 with 181, which
>>>>>>>>>>> result in creating a link between node 1 (back node) and 181 (the
>>>>>>>>>>> recombined node).
>>>>>>>>>>>
>>>>>>>>>>> (X) ---------->(181)
>>>>>>>>>>> (1)------------->(319)
>>>>>>>>>>>
>>>>>>>>>>> result in
>>>>>>>>>>> (X) ---------->(181)
>>>>>>>>>>> (1)---------------^
>>>>>>>>>>>
>>>>>>>>>>> In your example, you can't merge 5 and 1 because their history is
>>>>>>>>>>> not the same (you pointed this out).
>>>>>>>>>>> But if 6 is recombined and pointing to 4, then the only thing you
>>>>>>>>>>> can do safely is to merge 6 and 4, which means creating a link
>>>>>>>>>>> between 5 and 4.
>>>>>>>>>>>
>>>>>>>>>>> Good luck.
>>>>>>>>>>>
>>>>>>>>>>> Loïc
>>>>>>>>>>>
>>>>>>>>>>> 2010/3/3 Jörg Tiedemann <[email protected]>
>>>>>>>>>>>
>>>>>>>>>>> I try to use the search graph output now for producing a
>>>>>>>>>>> word lattice in PLF style. I'm still a bit confused on how to use
>>>>>>>>>>> the recombined hypotheses and their pointers to superior hypo's.
>>>>>>>>>>> Do I have to copy the relevant parts from the superior hypotheses
>>>>>>>>>>> into the lattice or should I join the hypotheses that point to
>>>>>>>>>>> recombined hypo's with the existing graph?
>>>>>>>>>>> To give an example:
>>>>>>>>>>>
>>>>>>>>>>>      who   is    bill    ?
>>>>>>>>>>>   (0)-->(1)-->(2)--->(3)-->(4)
>>>>>>>>>>>    |
>>>>>>>>>>>    |--->(5)------------->(6)
>>>>>>>>>>>     how  |  is bill ?
>>>>>>>>>>>          |
>>>>>>>>>>>          |---->(7)----->(8)
>>>>>>>>>>>            is     the bill
>>>>>>>>>>>
>>>>>>>>>>> where (6) is a recombined hypo pointing to (4) and covering
>>>>>>>>>>> tokens 1-3 and (8) is a recombined hypo that points to (3)
>>>>>>>>>>>
>>>>>>>>>>> Should I copy the relevant parts of (4) that cover the same
>>>>>>>>>>> tokens to the graph as a link to (5) or can I safely join (5) and
>>>>>>>>>>> (1)? Probably not because this would produce "who is the bill"
>>>>>>>>>>> which is not necessarily an option ...
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot for clarifying this to me!
>>>>>>>>>>> Jörg
>>>>>>>>>>>
>>>>>>>>>>> Chris Dyer wrote:
>>>>>>>>>>>
>>>>>>>>>>> As long as you're just splitting, keeping the weights consistent
>>>>>>>>>>> isn't too hard- just keep all the weight in one segment and
>>>>>>>>>>> make all the rest of the segments have no impact when they
>>>>>>>>>>> multiply (i.e., a probability of 1, or a cost of 0). The openFST
>>>>>>>>>>> or AT&T tools can help you manipulate lattices if you want to do
>>>>>>>>>>> more interesting things with weights, such as pushing them to the
>>>>>>>>>>> start of paths.
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Indeed, splitting is not hard, but the trickiest thing is how much
>>>>>>>>>>> probability/score amount do you give to each part of the split ?
>>>>>>>>>>> Maybe it has not any real impact in the end, or has it ?
>>>>>>>>>>> Loïc
>>>>>>>>>>>
>>>>>>>>>>> 2010/3/1 Chris Dyer <[email protected]>
>>>>>>>>>>>
>>>>>>>>>>> I guess word-graph doesn't split phrases either
>>>>>>>>>>> (I was just guessing).
>>>>>>>>>>> It appears to be in SLF format, which is used by a number of tools
>>>>>>>>>>> (like HTK and the SRI tools). SRILM can split
>>>>>>>>>>> lattices with multi-word arcs into lattices, or you can write
>>>>>>>>>>> your own code to do it. It's not terribly hard.
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Ok thanks. I will use the output-word-graph option. However, I
>>>>>>>>>>> also get phrases with that option (in the w attribute), for
>>>>>>>>>>> example here:
>>>>>>>>>>>
>>>>>>>>>>> ....
>>>>>>>>>>> J=42 S=0 E=53 a=0, 0, 0, -0.693147, 0.999896 l=-13.695 r=-20, 0, -1.60944, 0, 0, 0 w=bill clinton , pC=0.0613498, c=-3.23392
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure if I'm using the command line argument correctly:
>>>>>>>>>>> echo 'who is bill clinton ?' | \
>>>>>>>>>>> moses -f moses.ini -output-word-graph test.graph 0
>>>>>>>>>>>
>>>>>>>>>>> Jörg
>>>>>>>>>>>
>>>>>>>>>>> On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>>>>>>>>
>>>>>>>>>>> I don't have such a tool, but it wouldn't be too difficult to write
>>>>>>>>>>> one. I think the difference between word graph and search graph is
>>>>>>>>>>> the search graph has full phrases on the edges, whereas the word
>>>>>>>>>>> graph has single words on the edges. For the input, you need
>>>>>>>>>>> single word edges.
>>>>>>>>>>> -Chris
>>>>>>>>>>>
>>>>>>>>>>> 2010/3/1 Jörg Tiedemann <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>> Is there a tool to convert output search graphs to word lattices
>>>>>>>>>>> in PLF (moses lattice input format)?
>>>>>>>>>>> It's the option -output-search-graph that I should use for getting
>>>>>>>>>>> the relevant information, right? I'm not really sure if I
>>>>>>>>>>> understand the difference between -output-word-graph and
>>>>>>>>>>> -output-search-graph
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Jörg
>>>>>>>>>>>
>>>>>>>>>>> *******/\/\/\/\/\/\/\/\/\/\/\*****************************************
>>>>>>>>>>> Jörg Tiedemann [email protected]
>>>>>>>>>>> Visiting Professor http://stp.lingfil.uu.se/~joerg/
>>>>>>>>>>> Dep. of Linguistics and Philology
>>>>>>>>>>> Uppsala University tel: +46 (0)18 - 471 1412
>>>>>>>>>>> Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094
>>>>>>>>>>> *********************************/\/\/\/\/\/\/\/\/\/\/\***************
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Moses-support mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ---
>>>>>>>>>>> Loïc BARRAULT
>>>>>>>>>>> Post-doctoral researcher
>>>>>>>>>>> LIUM - University of Le Mans
>>>>>>>>>>> Tél.
>>>>>>>>>>> +33/0 2 43 83 38 52
>>>>>>>>>>> http://www-lium.univ-lemans.fr/~barrault
>>>>>>>>>>> MANY : Open Source MT System Combination
>>>>>>>>>>> http://www-lium.univ-lemans.fr/~barrault/MANY
>>>>>>>>>>> ---
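
A note on the renormalization step Chris describes in the thread (posterior of each edge via forward-backward, then normalizing the out-going edges at each node): here is a minimal Python sketch of that idea, not anything taken from Moses itself. It assumes a small acyclic lattice whose node ids are already in topological order (src < dst) and whose edge scores are log-domain transition scores, which may be positive. The function names, the edge tuple layout and the `scale` parameter (a stand-in for the rescaling Chris mentions) are all illustrative assumptions, and the scores in the toy lattice are made up.

```python
import math
from collections import defaultdict


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def edge_probabilities(edges, start, end, scale=1.0):
    """Turn log-domain edge scores into locally normalized probabilities.

    edges: list of (src, dst, word, score) with src < dst (topological ids).
    Returns a list of (src, dst, word, prob) in the same order, where the
    probabilities of the edges leaving each node sum to 1.
    """
    scored = [(s, d, w, scale * c) for s, d, w, c in edges]

    # Forward pass: alpha[n] = log of the summed score of all paths start -> n.
    alpha = defaultdict(lambda: -math.inf)
    alpha[start] = 0.0
    for s, d, _, c in sorted(scored, key=lambda e: e[0]):
        alpha[d] = log_add(alpha[d], alpha[s] + c)

    # Backward pass: beta[n] = log of the summed score of all paths n -> end.
    beta = defaultdict(lambda: -math.inf)
    beta[end] = 0.0
    for s, d, _, c in sorted(scored, key=lambda e: e[0], reverse=True):
        beta[s] = log_add(beta[s], c + beta[d])

    total = alpha[end]
    if total == -math.inf:
        raise ValueError("end node is not reachable from start node")

    # Posterior of an edge: share of the total path mass passing through it.
    posterior = [math.exp(alpha[s] + c + beta[d] - total) for s, d, _, c in scored]

    # Normalize the posteriors over the outgoing edges of each node.
    out_mass = defaultdict(float)
    for (s, _, _, _), p in zip(scored, posterior):
        out_mass[s] += p
    return [
        (s, d, w, p / out_mass[s] if out_mass[s] > 0.0 else 0.0)
        for (s, d, w, _), p in zip(scored, posterior)
    ]


# Toy lattice with invented scores; note the positive score on the last edge.
lattice = [
    (0, 1, "who", -0.5),
    (0, 1, "how", -1.5),
    (1, 2, "is", -0.2),
    (1, 2, "was", -0.9),
    (2, 3, "bill", 0.3),
]
for src, dst, word, prob in edge_probabilities(lattice, start=0, end=3):
    print(f"{src} -> {dst}  {word:5s} {prob:.3f}")
```

Passing a `scale` below 1.0 flattens the resulting distributions and a value above 1.0 sharpens them, which is one possible way to deal with the MERT scaling caveat; a sensible value would have to be found empirically.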

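
A note on splitting multi-word arcs, following Chris's advice in the thread to keep all the weight in one segment and give the remaining segments a cost of 0: the sketch below is a hypothetical helper in Python, not the SRILM or OpenFst route he mentions. The arc tuple layout `(src, dst, phrase, cost)` and the function name are assumptions made for illustration, and all costs in the example are invented except `-3.23392`, which is the `c` value of the "bill clinton" arc shown in the thread.

```python
def split_multiword_arcs(arcs, num_nodes):
    """Split arcs labelled with several words into chains of one-word arcs.

    arcs: list of (src, dst, phrase, cost); cost is a log-domain score.
    num_nodes: number of nodes in use (ids 0 .. num_nodes - 1).
    Returns (new_arcs, new_num_nodes). The first word of a split phrase keeps
    the whole cost and the remaining words get cost 0.0, so every path keeps
    its original total.
    """
    new_arcs = []
    next_node = num_nodes
    for src, dst, phrase, cost in arcs:
        words = phrase.split()
        if len(words) <= 1:
            # Single-word (or empty) arcs are copied unchanged.
            new_arcs.append((src, dst, phrase, cost))
            continue
        prev = src
        for i, word in enumerate(words):
            if i == len(words) - 1:
                nxt = dst            # last word ends at the original target
            else:
                nxt = next_node      # fresh intermediate node
                next_node += 1
            new_arcs.append((prev, nxt, word, cost if i == 0 else 0.0))
            prev = nxt
    return new_arcs, next_node


arcs = [
    (0, 1, "who", -0.4),
    (1, 2, "is", -0.1),
    (2, 3, "bill clinton", -3.23392),
    (3, 4, "?", -0.2),
]
new_arcs, n = split_multiword_arcs(arcs, num_nodes=5)
for arc in new_arcs:
    print(arc)
```

The intermediate nodes get ids appended after the existing ones, so they are no longer in topological order. Before writing the result out as PLF, the nodes would presumably need to be renumbered topologically, since (as far as I understand the format) each PLF edge stores the distance to its target node rather than an absolute id.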