Re: [Moses-support] grow algorithm on wiki

Philipp Koehn Tue, 12 May 2009 10:04:50 -0700

Hi,

This is correct: the description on the Wiki was in fact mistaken.
I have changed it to "or" instead of "and". As far as I know, there has not
been much work comparing different heuristics of this kind; the Och & Ney
Computational Linguistics journal paper comes to mind. Most work that improves
on such heuristics also looks at posterior probabilities and so on.
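
For illustration, the growing step with the "or" condition might be sketched as
below. This is a minimal Python sketch, not the code in symal.cpp: the function
name grow_diag, the NEIGHBORS table, and the require_both_unaligned flag are
all hypothetical, and the final step is left out. The flag switches between the
"and" wording that used to be on the Wiki and the "or" behaviour that
corresponds to the quoted test !(ea[point.first] && fa[point.second]), which by
De Morgan's law is the same as "e-new not aligned or f-new not aligned".

```python
# Hypothetical sketch of the growing step only; names are illustrative
# and are not taken from symal.cpp.
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def grow_diag(intersection, union, require_both_unaligned=False):
    """Grow an alignment from the intersection towards the union.

    Points are (e, f) index pairs. With require_both_unaligned=False a
    neighbouring point is added if at least one of its words is still
    unaligned ("or"); with True, both words must be unaligned ("and").
    """
    alignment = set(intersection)
    e_aligned = {e for e, _ in alignment}
    f_aligned = {f for _, f in alignment}

    added = True
    while added:  # repeat until a full pass adds nothing
        added = False
        for e, f in sorted(alignment):  # snapshot, so adding is safe
            for de, df in NEIGHBORS:
                point = (e + de, f + df)
                if point in alignment or point not in union:
                    continue
                e_free = point[0] not in e_aligned
                f_free = point[1] not in f_aligned
                if require_both_unaligned:
                    accept = e_free and f_free
                else:
                    accept = e_free or f_free
                if accept:
                    alignment.add(point)
                    e_aligned.add(point[0])
                    f_aligned.add(point[1])
                    added = True
    return alignment


intersection = {(0, 0)}
union = {(0, 0), (0, 1), (1, 1)}
with_or = grow_diag(intersection, union)
with_and = grow_diag(intersection, union, require_both_unaligned=True)
```

In this toy input the "or" variant accepts the point (0, 1), whose e word is
already aligned, while the "and" variant rejects it. That is the difference
discussed in this thread.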


-phi

On Tue, May 12, 2009 at 3:12 PM, Nicola Bertoldi <[email protected]> wrote:
> Dear Attila,
>
> you are right that the algorithm described on the Moses website
> differs from what is implemented in symal.
>
>
> But on the Moses website (http://www.statmt.org/moses/?n=Moses.Background)
> there is also this description:
> "Our heuristic proceeds as follows: We start with intersection of the two 
> word alignments. We only add new alignment points that exist in the union of 
> two word alignments. We also always require that a new alignment point 
> connects at least one previously unaligned word."
> which does match the symal implementation.
>
> As both procedures are heuristics, both are reasonable in principle.
> Nevertheless, I personally prefer the symal one,
> because it adds earlier the points that would otherwise be added later, in the final step.
>
> Philipp (and other developers), what do you think?
> Which heuristic procedure is the most reliable?
>
> best regards,
> Nicola
>
>
> On 5/12/09 12:16 PM, "Attila Zséder" <[email protected]> wrote:
>
> Hello,
>
> I wanted to understand the grow-diag() algorithm described here:
> http://www.statmt.org/moses/?n=FactoredTraining.AlignWords , in order
> to check where the script generates non-distinct alignment pairs (I
> mean: 1-2 1-3 2-3 is non-distinct; 1-2 2-2 1-3 2-3 (with no other links
> for these words) is a distinct pairing).
>
> In this description there is the line:
> if ( e-new not aligned and f-new not aligned )
>
> This would mean that a new alignment point can be added only if neither of
> its words was already covered by the alignment (and so only distinct pairs
> would be generated). But in the source code (symal.cpp) these
> are the corresponding lines:
> //check if it connects at least one uncovered word
> if (!(ea[point.first] && fa[point.second]))
>
> So I think the description of this algorithm is not quite accurate. I'm
> not absolutely sure, but it's worth a review.
>
> Btw, any comments on my original question about distinct
> pairings are appreciated.
>
> Thank you!
>
> Br,
> Attila
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
