The alignment models are going to struggle quite a bit when the source-to-target length ratio is this skewed. I would recommend finding a way to retokenize or resegment the source and/or target language so as to bring the ratio closer to even. If this isn't possible, you may need to look into custom alignment methods.
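As a rough illustration of what "a more even ratio" could mean in practice, here is a minimal Python sketch. It measures the per-sentence token ratio of a parallel corpus and then applies a deliberately naive resegmentation that glues every few source tokens into one unit. The function names (`length_ratios`, `merge_tokens`), the fixed group size, and the `_` joiner are assumptions made for this example only; they are not part of Moses or GIZA++, and a linguistically motivated segmentation will almost certainly work better than fixed-size grouping.

```python
from statistics import mean


def length_ratios(src_lines, tgt_lines):
    """Return the source/target token-count ratio for each sentence pair.

    Tokens are whitespace-separated. Pairs with an empty side are skipped.
    """
    ratios = []
    for src, tgt in zip(src_lines, tgt_lines):
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src == 0 or n_tgt == 0:
            continue
        ratios.append(n_src / n_tgt)
    return ratios


def merge_tokens(line, group_size, joiner="_"):
    """Naively glue every `group_size` consecutive tokens into one token.

    Hypothetical helper: only a placeholder for a real resegmentation step.
    """
    tokens = line.split()
    groups = (tokens[i:i + group_size] for i in range(0, len(tokens), group_size))
    return " ".join(joiner.join(group) for group in groups)


if __name__ == "__main__":
    # Toy data standing in for a real parallel corpus.
    source = ["a b c d e f g h i", "j k l m n o p q r"]
    target = ["x", "y"]

    print("mean ratio before:", mean(length_ratios(source, target)))
    merged = [merge_tokens(line, 3) for line in source]
    print("mean ratio after: ", mean(length_ratios(merged, target)))
```

Running the ratio check before and after a resegmentation attempt gives a quick sanity check on whether the change moved the corpus in the right direction before spending time on a full alignment run.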
Chris

On Thu, Jun 18, 2009 at 1:34 PM, Catharine Oertel <[email protected]> wrote:
>
> Begin forwarded message:
>
> From: Catharine Oertel <[email protected]>
> Date: June 18, 2009 7:13:24 PM GMT+02:00
> To: John Burger <[email protected]>
> Subject: Re: [Moses-support] alignment problem
>
> On Jun 18, 2009, at 7:05 PM, John Burger wrote:
>
> Catharine Oertel wrote:
>
> > I have a huge problem aligning my source and target language and I
> > would appreciate your advice very much.
> > The sentence length ratio of my source and target language is on
> > average about 9:1. So I have many more words in my source language
> > than I have in my target language. I found that the intersect
> > alignment method is working much better for me than
> > grow-diag-final. However, I do not get satisfactory results, which I
> > assume also has to do with the occurrence of ERROR 2.
>
> That is a fairly large ratio - if you tell us your language pair, we might
> have suggestions for different ways to cast the problem.
>
> By "ERROR 2", do you mean type II errors, that is, false negatives?
>
> - John D. Burger
>   MITRE
>
> I am not sure whether it is a type II error. In the log file it says:
> "ERROR2: nan nan nanN: "
>
> Catharine
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
