Hmm... but the other weights were optimised all together, without the new features, which I'm now giving zero weight...

On Wed, 25 Jul 2012, Miles Osborne wrote:

> if you have non-zero feature values at training time, but they become
> zero at test time then you may have a problem.
>
> the reason for this is that all weights are optimised together. you
> can think of this as the system trying to work out how best to
> translate, using everything. if some are zero, then you are forcing
> the rest to do the work that they were not optimised for.
>
> Miles
>
> On 25 July 2012 17:51, Cristina <[email protected]> wrote:
> >
> > Thanks for the quick answer!
> >
> > I think that the problem here cannot be in the development step, it
> > must be more related to decoding.
> >
> > Regardless of the way weights are estimated, translation changes when I add
> > new features with zero weight (not in development but in test). They
> > shouldn't contribute to scoring the final translation, right?
> >
> > Cristina
> >
> >
> > On Wed, 25 Jul 2012, Miles Osborne wrote:
> >
> >> this is a fairly typical result for MERT. i notice you are using
> >> MIRA, which is claimed to be more reliable. see
> >>
> >> http://www.aclweb.org/anthology/N/N09/N09-1025.pdf
> >>
> >> note that getting MIRA to work takes a lot of tweaking, so read the
> >> fine print carefully
> >>
> >> Miles
> >>
> >> On 25 July 2012 17:24, Cristina <[email protected]> wrote:
> >> >
> >> > Dear all,
> >> >
> >> > We are doing some experiments by adding new features at phrase level in
> >> > the translation table. We have done a first experiment to see the effects
> >> > and they are quite weird:
> >> >
> >> > * We build a translation table with 9 features and a similar translation
> >> > table with 18 features (the same 9 features + 9 new features)
> >> >
> >> > * We run MERT (or MIRA) on a dev set using the first translation table
> >> > (9 features)
> >> >
> >> > * We translate a test set with 2 configurations:
> >> > - MERT on 9 features using the translation table with 9 features
> >> > - MERT on 9 features using the translation table with 18 features (9 + 9)
> >> > where the weight for the 9 extra features is set to 0
> >> >
> >> > We lose more than 3 points of BLEU with the second configuration with
> >> > respect to the first one. (Using MERT on the 18 features gives similar
> >> > results to the second configuration)
> >> >
> >> > Does anyone know if there is some penalty when adding more features? Or
> >> > has anyone encountered the same problem?
> >> > Thanks in advance!
> >> >
> >> > Best,
> >> >
> >> > Cristina
> >> > _______________________________________________
> >> > Moses-support mailing list
> >> > [email protected]
> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >>
> >>
> >>
> >> --
> >> The University of Edinburgh is a charitable body, registered in
> >> Scotland, with registration number SC005336.
> >>
> >
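
For reference, the expectation behind the question in this thread (features with zero weight should not affect the final score) can be checked with a small sketch of the log-linear model score, i.e. the weighted sum of feature values. This is only an illustration under stated assumptions: the weights and feature values below are made up, and the helper names `loglinear_score` and `best_candidate` are hypothetical, not part of Moses. It models the final score only, not anything else the decoder does with the translation table.

```python
from typing import Sequence


def loglinear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of feature values (log-linear model score)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def best_candidate(candidates: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> int:
    """Return the index of the highest-scoring candidate."""
    scores = [loglinear_score(c, weights) for c in candidates]
    return scores.index(max(scores))


# Hypothetical weights for the 9 original features (made-up numbers).
weights_9 = [0.20, 0.10, 0.05, 0.15, 0.10, 0.05, 0.10, 0.15, 0.10]
# Same weights, with the 9 extra features given zero weight.
weights_18 = weights_9 + [0.0] * 9

# Two hypothetical candidates, each with 9 original feature values.
cands_9 = [
    [-1.2, -0.7, -2.1, -0.3, -1.5, -0.9, -1.1, -0.4, -2.0],
    [-0.8, -1.9, -1.0, -0.6, -1.3, -2.2, -0.5, -1.7, -0.9],
]
# Arbitrary values for the 9 extra features of each candidate.
extra = [
    [-3.0, -0.1, -4.2, -1.1, -0.5, -2.7, -0.9, -1.8, -0.2],
    [-0.3, -2.5, -0.7, -3.3, -1.4, -0.6, -2.9, -0.8, -1.6],
]
cands_18 = [c + e for c, e in zip(cands_9, extra)]

for c9, c18 in zip(cands_9, cands_18):
    # Zero-weighted terms add nothing, so both scores are identical.
    print(loglinear_score(c9, weights_9), loglinear_score(c18, weights_18))

# The ranking of the candidates is therefore unchanged as well.
print(best_candidate(cands_9, weights_9), best_candidate(cands_18, weights_18))
```

In this sketch the two configurations give identical scores and the same best candidate, so a difference of several BLEU points would have to come from something the sketch does not model.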
