There's been discussion offline about adding word alignment back into training and decoding. It's a useful feature for all sorts of reasons and is often asked for, so we'll work on it next week.
We'll make it an optional feature & try to avoid the same mistakes as last time.

On 15/07/2010 00:42, Raphael Payen wrote:
> Yes it's what I'd like to do also.
>
> The idea that I wrote earlier: having the value as a factor, was
> naive, since moses works on phrases, not on tokens. I think we need to
> have information on the word alignments inside the phrases. A phrase
> like:
> I am NUM years old and have NUM cats -> Tengo NUM años y tengo NUM gatos
> should also contain the info that the third token in source is aligned
> with the second in target, and the eighth with the sixth. Then
> postprocessing could assign the values.
>
> I saw this on the ML:
> Barry Haddow wrote:
>
>> The word alignment info code got removed as it was using too much memory. If
>> you really need it, then you could go back in svn to the time before the
>> multi-threaded code was merged in (before r2477, I think)
>>
> Currently, the word alignment info is not even written in the phrase table.
>
> It might be feasible to reintroduce the word alignment info, but only
> for specific tokens ? Would this keep the memory use lower than having
> it for all tokens ?
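
For what it's worth, the postprocessing step described in the quoted message could look roughly like the sketch below. This is only an illustration, not Moses code: the function name restore_placeholders is made up, the alignment is assumed to arrive as 0-based (source index, target index) pairs, and the values 25 and 3 are invented for the example. The alignment points (2, 1) and (7, 5) are the "third with second" and "eighth with sixth" links from the example phrase.

```python
def restore_placeholders(source_tokens, target_tokens, alignment, placeholder="NUM"):
    """Copy original source values into aligned placeholder slots of the target.

    alignment is assumed to be a list of 0-based (source_index, target_index)
    pairs; this representation is hypothetical, not the Moses format.
    """
    output = list(target_tokens)
    for src_idx, tgt_idx in alignment:
        # Only overwrite target tokens that really are placeholders.
        if target_tokens[tgt_idx] == placeholder:
            output[tgt_idx] = source_tokens[src_idx]
    return output


# Original source sentence, before the numbers were replaced by NUM.
source = "I am 25 years old and have 3 cats".split()
# Decoder output for the placeholder version of the sentence.
target = "Tengo NUM años y tengo NUM gatos".split()
# Third source token -> second target token, eighth -> sixth (0-based here).
alignment = [(2, 1), (7, 5)]

restored = " ".join(restore_placeholders(source, target, alignment))
print(restored)
```

Note that this only needs the alignment points that touch placeholder tokens, which is the "only for specific tokens" case raised in the quoted message.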
