Doors the reordering model actually use the source phrase? Wouldn't it 
similarly segfault if it did?

Barry Haddow <[email protected]> wrote:

Hi Folks

This bug was discovered here:

http://www.mail-archive.com/[email protected]/msg03496.html

but for some reason a fix never made it into trunk. I think I "fixed" it in the 
sparse features branch by storing a Phrase object instead of a pointer, but 
obviously that uses extra memory.

cheers - Barry


On Saturday 15 Oct 2011 08:58:15 Hieu Hoang wrote:
> On 15/10/2011 13:41, Kenneth Heafield wrote:
> > The bug is this line in PhraseDictionaryMemory:
> >
> > targetPhrase.SetSourcePhrase(&sourcePhrase);
> >
> > That calls this piece of code including a comment that seems to indicate
> > awareness of the problem:
> >
> > //! TODO - why is this needed and is it set correctly by every phrase
> > dictionary class ? should be set in constructor
> > void SetSourcePhrase(Phrase const* p) {
> > m_sourcePhrase=p;
> > }
> >
> > So the pointer to sourcePhrase is stored in TargetPhrase. But
> > sourcePhrase is a stack-allocated object in
> > PhraseDictionaryMemory::Load. It has the same address every iteration
> > and goes out of scope at the end of the function. So every TargetPhrase
> > object is storing the same dangling pointer.
> 
> It's the same sourcephrase for sentence input, but different for
> lattices and confusion networds. The in-memory phrase table doesn't
> support lattices or confusion net so it's prob not strictly necessary
> there, but it is in the binary pt implementation. It's a hack dating
> back to JHU 2006 which never got cleaned up
> 
> > Furthermore,
> > PhraseDictionaryMemory does not explicitly store every source Phrase
> > object (and it shouldn't have to), so there's nothing for these to point
> > to.
> >
> > It seems the correct solution is to remove the pointer from TargetPhrase
> > and properly track where things came from. Given how deeply embedded
> > this is into code that isn't mine, I'm not offering to fix it.
> 
> yeah, prob best to leave it unless it's broken
> 
> > Kenneth
> >
> > On 10/15/11 05:03, Hieu Hoang wrote:
> >> yep, the source phrase is carried as part of the translation rule *only*
> >> to calculate lexicalized reordering score.
> >>
> >> If I can remember correctly, it's implemented slightly differently
> >> depending on the whether the input is a sentence, confusion net or
> >> lattice. It's optimised since you can easily get the source phrase from
> >> a sentence but the source has to be calculated as you go along in the
> >> other 2 types.
> >>
> >> I would prefer it if the implementation was consistent, even at a cost
> >> of slightly degraded speed.
> >>
> >> I've added a regression test
> >> phrase.lexicalized-reordering-bin
> >> just in case you decide to play about with it
> >>
> >> I don't see the bug, wadda you want me to fix?
> >>
> >> On 15/10/2011 03:17, Kenneth Heafield wrote:
> >>> Actually, the incorrect source phrase is used for more than logging. 
> >>> It makes it all the way to lexical reordering:
> >>>
> >>> const Phrase *sourcePhrase = transOpt.GetSourcePhrase();
> >>> if (sourcePhrase) {
> >>> Scores score = lexreordering.GetProb(*sourcePhrase,
> >>> transOpt.GetTargetPhrase());
> >>> if (!score.empty())
> >>> transOpt.CacheScores(lexreordering, score);
> >>>
> >>> These are parts of Moses far afield. Hieu, the bug goes back to you in
> >>> 1338. Want to fix it?
> >>>
> >>> On 10/14/11 20:48, Kenneth Heafield wrote:
> >>>> Hi,
> >>>>
> >>>>  Apparently target phrases keep track of their source phrases? This
> >>>> is incorrect with e.g. PhraseTableMemory (this is the code from
> >>>> revision 4364):
> >>>>
> >>>> Phrase sourcePhrase(Input, 0);
> >>>> sourcePhrase.CreateFromString( input, phraseVector);
> >>>> //target
> >>>> TargetPhrase targetPhrase(Output);
> >>>> targetPhrase.SetSourcePhrase(&sourcePhrase);
> >>>>
> >>>> I'm currently refactoring it, but have provided the old version to
> >>>> make it clear this isn't my fault.
> >>>>
> >>>> The last line stores a pointer to a stack allocated object in
> >>>> targetPhrase. Furthermore, the sourcePhrase object does not even
> >>>> exist in copy form, so there's nothing correct for it to point to.
> >>>>
> >>>> I tried to remove it, apparently it makes its way to TranslationOption
> >>>> and Hypothesis, GetSourcePhraseStringRep in particular. That is
> >>>> called only from Manager.cpp and used for logging purposes in e.g.:
> >>>>
> >>>> outputWordGraphStream<< "\tw="<< 
> >>>> hypo->GetSourcePhraseStringRep()<< "|"<< 
> >>>> hypo->GetCurrTargetPhrase();
> >>>>
> >>>> On the basis that no logging is better than incorrect logging, I'm
> >>>> going to stop tracking the source phrase in the target phrase.
> >>>>
> >>>> Kenneth
> >>>>_____________________________________________

> >>>> Moses-support mailing list
> >>>> [email protected]
> >>>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>>
> >>>_____________________________________________

> >>> Moses-support mailing list
> >>> [email protected]
> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>
> >>_____________________________________________

> >> Moses-support mailing list
> >> [email protected]
> >> http://mailman.mit.edu/mailman/listinfo/moses-support
> >
> >_____________________________________________

> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> 
_____________________________________________

Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

Reply via email to