Hi, I hope it is not too late to add to this discussion.
If you are comfortable with weighted deduction, Adam Lopez's 2009 EACL
paper is a very good reference for phrase-based reordering spaces. If I
remember correctly, the implementation in Moses does exactly what he shows
with the logic program WLd.

http://alopez.github.io/papers/eacl2009-lopez.pdf

Cheers,
Wilker

On 16 December 2015 at 00:56, Matthias Huck <[email protected]> wrote:
> Hi Lane,
>
> Well, you can find excellent descriptions of phrase-based decoding
> algorithms in the literature, though possibly not all details of this
> specific implementation.
>
> I like this description:
>
> R. Zens and H. Ney. Improvements in Dynamic Programming Beam Search for
> Phrase-based Statistical Machine Translation. In International Workshop
> on Spoken Language Translation (IWSLT), pages 195-205, Honolulu, HI,
> USA, October 2008.
>
> http://www.hltpr.rwth-aachen.de/publications/download/618/Zens-IWSLT-2008.pdf
>
> It's what's implemented in Jane, RWTH's open source statistical machine
> translation toolkit.
>
> J. Wuebker, M. Huck, S. Peitz, M. Nuhn, M. Freitag, J. Peter, S.
> Mansour, and H. Ney. Jane 2: Open Source Phrase-based and Hierarchical
> Statistical Machine Translation. In International Conference on
> Computational Linguistics (COLING), pages 483-491, Mumbai, India,
> December 2012.
>
> http://www.hltpr.rwth-aachen.de/publications/download/830/Wuebker-COLING-2012.pdf
>
> However, I believe that the distinction between coverage hypotheses and
> lexical hypotheses is a unique property of the RWTH systems.
>
> The formalization in the Zens & Ney paper is very nicely done. With hard
> distortion limits or coverage-based reordering constraints, you may need
> a few more steps in the algorithm. E.g., if you have a hard distortion
> limit, you will probably want to avoid leaving a gap and then extending
> your sequence in a way that puts your current position further away from
> the gap than your maximum jump width.
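[The gap rule Matthias describes above can be sketched in a few lines. This is an illustrative toy, not Moses' or Jane's actual code; the function name, the plain `std::vector<bool>` coverage representation, and the exact inequality are assumptions made for the example.]

```cpp
#include <vector>

// Hypothetical helper (not the real Moses API): after translating a
// phrase ending at source position newPhraseEnd, check whether the
// leftmost uncovered position (the "gap") is still reachable under a
// hard distortion limit. If the jump from the end of the current
// phrase back to the gap already exceeds the limit, no continuation
// can ever cover it, so the extension should be pruned.
bool canStillCoverLeftmostGap(const std::vector<bool>& coverage,
                              int newPhraseEnd,
                              int distortionLimit) {
    // Find the leftmost uncovered source position.
    int firstGap = -1;
    for (int i = 0; i < static_cast<int>(coverage.size()); ++i) {
        if (!coverage[i]) { firstGap = i; break; }
    }
    if (firstGap == -1) return true;  // fully covered, nothing left to reach
    // The next phrase must eventually start at firstGap; the jump back
    // from the position after the current phrase must respect the limit.
    return (newPhraseEnd + 1 - firstGap) <= distortionLimit;
}
```

[So with coverage 11011 and the current phrase ending at position 4, the required jump back to the gap at position 2 has width 3: legal under a distortion limit of 6, pruned under a limit of 2.]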
> Other people should know more about how exactly Moses' phrase-based
> decoder deals with this.
>
> I can recommend Richard Zens' PhD thesis as well.
> http://www.hltpr.rwth-aachen.de/publications/download/562/Zens--2008.pdf
>
> I also remember that the following publication from Microsoft Research
> is pretty helpful:
>
> Robert C. Moore and Chris Quirk. Faster Beam-Search Decoding for Phrasal
> Statistical Machine Translation. In Proceedings of MT Summit XI,
> European Association for Machine Translation, September 2007.
> http://research.microsoft.com/pubs/68097/mtsummit2007_beamsearch.pdf
>
> Cheers,
> Matthias
>
>
> On Tue, 2015-12-15 at 22:33 +0000, Hieu Hoang wrote:
> > I've been looking at this, and it is surprisingly complicated. I think
> > the code is designed to predetermine whether extending a hypothesis
> > will lead it down a path that won't ever be completed.
> >
> > I don't know of any slides that explain the reasoning; Philipp Koehn
> > explained it to me once, and it seems pretty reasonable.
> >
> > I wouldn't mind seeing this code cleaned up a bit, abstracted, and
> > formalised. I've made a start with the cleanup in my new decoder:
> >
> > https://github.com/moses-smt/mosesdecoder/blob/perf_moses2/contrib/other-builds/moses2/Search/Search.cpp#L36
> > Search::CanExtend()
> >
> > There was an Aachen paper from years ago comparing different
> > distortion limit heuristics - can't remember the authors or title.
> > Maybe someone knows more.
> >
> > Hieu Hoang
> > http://www.hoang.co.uk/hieu
> >
> > On 15 December 2015 at 20:59, Lane Schwartz <[email protected]> wrote:
> > Hey all,
> >
> > So the SearchNormal::ProcessOneHypothesis() method in
> > SearchNormal.cpp is responsible for taking an existing
> > hypothesis, creating all legal new extension hypotheses, and
> > adding those new hypotheses to the appropriate decoder stacks.
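[The core of what a ProcessOneHypothesis-style expansion step does can be sketched as follows. Again an illustrative toy, not the actual Moses or moses2 code: the function name `legalExtensions`, the coverage representation, and the simple distortion test |start - (prevEnd + 1)| <= limit are assumptions; the real decoder additionally applies the look-ahead pruning discussed in this thread.]

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Illustrative sketch: enumerate the source spans [start, end] that a
// partial hypothesis may translate next. Every position in the span
// must be uncovered, the span must fit a maximum phrase length, and the
// jump from the end of the previously translated phrase to the span's
// start must respect a hard distortion limit.
std::vector<std::pair<int, int>> legalExtensions(
        const std::vector<bool>& coverage,
        int prevEnd,          // end of the previously translated phrase
        int distortionLimit,
        int maxPhraseLen) {
    std::vector<std::pair<int, int>> spans;
    const int n = static_cast<int>(coverage.size());
    for (int start = 0; start < n; ++start) {
        if (coverage[start]) continue;           // must start on a gap
        // Distortion is the displacement from the monotone continuation
        // point (prevEnd + 1); enforce the hard limit on it.
        if (std::abs(start - (prevEnd + 1)) > distortionLimit) continue;
        for (int end = start; end < n && end < start + maxPhraseLen; ++end) {
            if (coverage[end]) break;            // span must stay uncovered
            spans.emplace_back(start, end);
        }
    }
    return spans;
}
```

[For coverage 100 with prevEnd = 0, a distortion limit of 4, and a maximum phrase length of 2, this yields the spans (1,1), (1,2), and (2,2). Each such span would then become a new hypothesis pushed onto the stack for its coverage cardinality.]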
> > First off, the method is actually reasonably well commented,
> > so kudos to whoever did that. :)
> >
> > That said, does anyone happen to have any slides that actually
> > walk through this process, specifically slides that take into
> > account the interaction with the distortion limit? That
> > interaction is where most of the complexity of this method
> > comes from. I don't know about others, but even having a
> > pretty good notion of what's going on here, the discussion of
> > "the closest thing to the left" is still a bit opaque.
> >
> > Anyway, if anyone knows of a good set of slides, or even a
> > good description in a paper, of what's going on here, I'd
> > appreciate any pointers.
> >
> > Thanks,
> > Lane
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

--
Wilker Aziz
http://wilkeraziz.github.io
