Hi, this is correct: the description on the Wiki was in fact mistaken. I have fixed it to say "or" instead of "and". As far as I know, there has not been much work comparing different heuristics of this kind; the Och & Ney Computational Linguistics journal paper comes to mind. Most of the work that improves on such a heuristic also looks at posterior probabilities and so on.
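To make the difference between the two conditions concrete, here is a minimal C++ sketch. The function names and the `e_aligned`/`f_aligned` vectors are hypothetical and chosen only for illustration; the actual check in symal.cpp is the one-line expression on `ea` and `fa` quoted further down in this thread, and this sketch is not a copy of that code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the "or" condition, i.e. the candidate point (e, f)
// connects at least one word that is not yet aligned. This mirrors the
// quoted symal.cpp check: !(ea[point.first] && fa[point.second]).
bool connects_uncovered_word(const std::vector<bool>& e_aligned,
                             const std::vector<bool>& f_aligned,
                             std::size_t e, std::size_t f) {
    return !(e_aligned[e] && f_aligned[f]);
}

// Hypothetical helper: the stricter "and" condition from the old wiki text,
// i.e. both words of the candidate point must still be unaligned.
bool both_words_uncovered(const std::vector<bool>& e_aligned,
                          const std::vector<bool>& f_aligned,
                          std::size_t e, std::size_t f) {
    return !e_aligned[e] && !f_aligned[f];
}
```

For example, if 1-2 is already in the alignment, the candidate point 1-3 passes the first check (word 3 on the f side is still unaligned) but fails the second. That is why the "or" version can produce the non-distinct pairs mentioned in the original question, while the "and" version cannot.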
-phi

On Tue, May 12, 2009 at 3:12 PM, Nicola Bertoldi <[email protected]> wrote:
> Dear Attila,
>
> you are right that the algorithm described into the Moses website
> differs from what is implemented in symal.
>
>
> But in the moses website (http://www.statmt.org/moses/?n=Moses.Background)
> there is also this description
> "Our heuristic proceeds as follows: We start with intersection of the two
> word alignments. We only add new alignment points that exist in the union of
> two word alignments. We also always require that a new alignment point
> connects at least one previously unaligned word."
> which does match the symal implementation.
>
> As both procedures are heuristic, both are good in principle.
> Nevertheless I personally prefer symal one,
> because it adds points earlier which will be added later in the final step.
>
> Philip (and other developers), what do you think of?
> Which is the most reliable heuristic procedure?
>
> best regards,
> Nicola
>
>
> On 5/12/09 12:16 PM, "Attila Zséder" <[email protected]> wrote:
>
> Hello,
>
> I wanted to understand grow-diag() algorithm described here:
> http://www.statmt.org/moses/?n=FactoredTraining.AlignWords , in order
> to check where the script generates non-distinct alignment pairs ( I
> mean: 1-2 1-3 2-3 is non-distinct; 1-2 2-2 1-3 2-3 ( and no more with
> these words) is a distinct pairing)
>
> in this description there is a line:
> if ( e-new not aligned and f-new not aligned )
>
> It would mean that we can add new aligns only if both of new words
> weren't covered before by the alignment (and so, only distinct pairs
> would have been generated). But in the source code (symal.cpp) these
> are the corresponding lines:
> //check if it connects at least one uncovered word
> if (!(ea[point.first] && fa[point.second]))
>
> So i think the description of this algorithm is not fully proper. I'm
> not absolutely sure, but its worth a review.
>
> Btw, any comments to my original intentions regarding to distinct
> pairing are appreciated.
>
> Thank you!
>
> Br,
> Attila
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
