then something is wrong
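
To make the reasoning concrete, here is a minimal sketch, assuming the decoder scores a hypothesis as a plain weighted sum of feature values (the usual log-linear form). All numbers and the function name `loglinear_score` are made up for illustration; none of them come from the experiments in this thread. Under that assumption, 9 extra features with zero weight leave the score unchanged:

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values; assumes exactly one weight per feature."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return sum(w * h for w, h in zip(weights, features))


# Hypothetical log-domain feature values for a single phrase pair.
base_features = [-1.2, -0.7, -2.1, -0.3, -1.5, -0.9, -0.4, -1.1, -0.6]
extra_features = [-0.8, -1.9, -0.2, -1.4, -0.5, -2.3, -0.1, -1.0, -0.7]

# Hypothetical weights tuned on the 9 original features only.
tuned_weights = [0.2, 0.1, 0.3, 0.05, 0.1, 0.05, 0.1, 0.05, 0.05]

# Configuration 1: 9 features, 9 tuned weights.
score_9 = loglinear_score(base_features, tuned_weights)

# Configuration 2: 18 features, the same 9 weights plus 9 zeros.
score_18 = loglinear_score(base_features + extra_features,
                           tuned_weights + [0.0] * 9)

print(score_9, score_18)
```

If the two configurations still produce different translations, the difference has to come from somewhere other than this weighted sum.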

Miles

On 25 July 2012 19:42, Cristina <[email protected]> wrote:
> mmm... but the others were optimised all together, without the new ones,
> which I'm giving a weight of zero...
>
> On Wed, 25 Jul 2012, Miles Osborne wrote:
>
>> if you have non-zero feature values at training time, but they become
>> zero at test time then you may have a problem.
>>
>> the reason for this is that all weights are optimised together. you
>> can think of this as the system trying to work out how best to
>> translate, using everything. if some are zero, then you are forcing
>> the rest to do the work that they were not optimised for.
>>
>> Miles
>>
>> On 25 July 2012 17:51, Cristina <[email protected]> wrote:
>> >
>> > Thanks for the quick answer!
>> >
>> > I think that the problem here cannot be in the development step, it
>> > must be more related to decoding.
>> >
>> > Regardless of how the weights are estimated, the translation changes
>> > when I add new features with zero weight (not in development but in
>> > test). They shouldn't contribute to the score of the final
>> > translation, right?
>> >
>> > Cristina
>> >
>> >
>> > On Wed, 25 Jul 2012, Miles Osborne wrote:
>> >
>> >> this is a fairly typical result for MERT.  i notice you are using
>> >> MIRA, which is claimed to be more reliable.  see
>> >>
>> >> http://www.aclweb.org/anthology/N/N09/N09-1025.pdf
>> >>
>> >> note that getting MIRA to work takes a lot of tweaking, so read the
>> >> fine print carefully
>> >>
>> >> Miles
>> >>
>> >> On 25 July 2012 17:24, Cristina <[email protected]> wrote:
>> >> >
>> >> > Dear all,
>> >> >
>> >> > We are doing some experiments by adding new features at phrase level in
>> >> > the translation table. We have done a first experiment to see the
>> >> > effects, and they are quite weird:
>> >> >
>> >> >  * We build a translation table with 9 features and a similar
>> >> >    translation table with 18 features (the same 9 features + 9 new
>> >> >    features)
>> >> >
>> >> >  * We run MERT (or MIRA) on a dev set using the first translation
>> >> >    table (9 features)
>> >> >
>> >> >  * We translate a test set with 2 configurations:
>> >> >   - MERT on 9 features using the translation table with 9 features
>> >> >   - MERT on 9 features using the translation table with 18 features (9 +
>> >> > 9) where the weight for the 9 extra features is set to 0
>> >> >
>> >> > We lose more than 3 points of BLEU with the second configuration with
>> >> > respect to the first one. (Using MERT on the 18 features gives similar
>> >> > results to the second configuration)
>> >> >
>> >> > Does anyone know if there is some penalty when adding more features? Or
>> >> > has anyone encountered the same problem?
>> >> > Thanks in advance!
>> >> >
>> >> > Best,
>> >> >
>> >> >  Cristina
>> >> > _______________________________________________
>> >> > Moses-support mailing list
>> >> > [email protected]
>> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >>
>> >>
>> >>
>> >> --
>> >> The University of Edinburgh is a charitable body, registered in
>> >> Scotland, with registration number SC005336.
>> >>
>>


