Doors the reordering model actually use the source phrase? Wouldn't it similarly segfault if it did?
Barry Haddow <[email protected]> wrote: Hi Folks This bug was discovered here: http://www.mail-archive.com/[email protected]/msg03496.html but for some reason a fix never made it into trunk. I think I "fixed" it in the sparse features branch by storing a Phrase object instead of a pointer, but obviously that uses extra memory. cheers - Barry On Saturday 15 Oct 2011 08:58:15 Hieu Hoang wrote: > On 15/10/2011 13:41, Kenneth Heafield wrote: > > The bug is this line in PhraseDictionaryMemory: > > > > targetPhrase.SetSourcePhrase(&sourcePhrase); > > > > That calls this piece of code including a comment that seems to indicate > > awareness of the problem: > > > > //! TODO - why is this needed and is it set correctly by every phrase > > dictionary class ? should be set in constructor > > void SetSourcePhrase(Phrase const* p) { > > m_sourcePhrase=p; > > } > > > > So the pointer to sourcePhrase is stored in TargetPhrase. But > > sourcePhrase is a stack-allocated object in > > PhraseDictionaryMemory::Load. It has the same address every iteration > > and goes out of scope at the end of the function. So every TargetPhrase > > object is storing the same dangling pointer. > > It's the same sourcephrase for sentence input, but different for > lattices and confusion networds. The in-memory phrase table doesn't > support lattices or confusion net so it's prob not strictly necessary > there, but it is in the binary pt implementation. It's a hack dating > back to JHU 2006 which never got cleaned up > > > Furthermore, > > PhraseDictionaryMemory does not explicitly store every source Phrase > > object (and it shouldn't have to), so there's nothing for these to point > > to. > > > > It seems the correct solution is to remove the pointer from TargetPhrase > > and properly track where things came from. Given how deeply embedded > > this is into code that isn't mine, I'm not offering to fix it. > > yeah, prob best to leave it unless it's broken > > > Kenneth > > > > On 10/15/11 05:03, Hieu Hoang wrote: > >> yep, the source phrase is carried as part of the translation rule *only* > >> to calculate lexicalized reordering score. > >> > >> If I can remember correctly, it's implemented slightly differently > >> depending on the whether the input is a sentence, confusion net or > >> lattice. It's optimised since you can easily get the source phrase from > >> a sentence but the source has to be calculated as you go along in the > >> other 2 types. > >> > >> I would prefer it if the implementation was consistent, even at a cost > >> of slightly degraded speed. > >> > >> I've added a regression test > >> phrase.lexicalized-reordering-bin > >> just in case you decide to play about with it > >> > >> I don't see the bug, wadda you want me to fix? > >> > >> On 15/10/2011 03:17, Kenneth Heafield wrote: > >>> Actually, the incorrect source phrase is used for more than logging. > >>> It makes it all the way to lexical reordering: > >>> > >>> const Phrase *sourcePhrase = transOpt.GetSourcePhrase(); > >>> if (sourcePhrase) { > >>> Scores score = lexreordering.GetProb(*sourcePhrase, > >>> transOpt.GetTargetPhrase()); > >>> if (!score.empty()) > >>> transOpt.CacheScores(lexreordering, score); > >>> > >>> These are parts of Moses far afield. Hieu, the bug goes back to you in > >>> 1338. Want to fix it? > >>> > >>> On 10/14/11 20:48, Kenneth Heafield wrote: > >>>> Hi, > >>>> > >>>> Apparently target phrases keep track of their source phrases? This > >>>> is incorrect with e.g. PhraseTableMemory (this is the code from > >>>> revision 4364): > >>>> > >>>> Phrase sourcePhrase(Input, 0); > >>>> sourcePhrase.CreateFromString( input, phraseVector); > >>>> //target > >>>> TargetPhrase targetPhrase(Output); > >>>> targetPhrase.SetSourcePhrase(&sourcePhrase); > >>>> > >>>> I'm currently refactoring it, but have provided the old version to > >>>> make it clear this isn't my fault. > >>>> > >>>> The last line stores a pointer to a stack allocated object in > >>>> targetPhrase. Furthermore, the sourcePhrase object does not even > >>>> exist in copy form, so there's nothing correct for it to point to. > >>>> > >>>> I tried to remove it, apparently it makes its way to TranslationOption > >>>> and Hypothesis, GetSourcePhraseStringRep in particular. That is > >>>> called only from Manager.cpp and used for logging purposes in e.g.: > >>>> > >>>> outputWordGraphStream<< "\tw="<< > >>>> hypo->GetSourcePhraseStringRep()<< "|"<< > >>>> hypo->GetCurrTargetPhrase(); > >>>> > >>>> On the basis that no logging is better than incorrect logging, I'm > >>>> going to stop tracking the source phrase in the target phrase. > >>>> > >>>> Kenneth > >>>>_____________________________________________ > >>>> Moses-support mailing list > >>>> [email protected] > >>>> http://mailman.mit.edu/mailman/listinfo/moses-support > >>> > >>>_____________________________________________ > >>> Moses-support mailing list > >>> [email protected] > >>> http://mailman.mit.edu/mailman/listinfo/moses-support > >> > >>_____________________________________________ > >> Moses-support mailing list > >> [email protected] > >> http://mailman.mit.edu/mailman/listinfo/moses-support > > > >_____________________________________________ > > Moses-support mailing list > > [email protected] > > http://mailman.mit.edu/mailman/listinfo/moses-support > _____________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
