Hi Alex, I'm afraid it does not, but I could certainly hack something in.
I would be a little nervous about what this would do to MIRA. During MIRA training, the scale of the features can change dramatically - I always start by normalizing the weight vector to squared norm = 1, and by the time I'm done passing through the n-best lists 60 times, the squared norm may have grown much larger. If I keep a feature fixed, it may quickly fall out of scale and become irrelevant. Or maybe MIRA will mathemagically work to keep the other features in scale; it's not clear to me without checking the literature. I think Brian Roark held a single feature fixed in some of his perceptron work for speech recognition, so that would be a place to start.

Is there an alternative to holding specific weights constant? If there is a group of features to be fixed (say, the decoder's dense features), then I would suggest presenting their weighted sum to MIRA as a single feature, which MIRA can continue to scale appropriately using the meta-feature's single weight. After training, each "fixed" feature's weight would be the product of the single meta-weight and its original fixed weight, which can go back in the decoder.

I hope that makes sense! I'm willing to add the weight-fixing feature - it's easy enough to do - but I thought it would be worth having this conversation first.

-- Colin

On 2013-02-06, at 11:43 AM, Alexander Fraser wrote:

> Another batch MIRA question, perhaps for Colin this time: does kbmira
> support only optimizing some feature weights (i.e., holding the other
> weights constant)?
>
> Cheers, Alex
>
>
> On Mon, Feb 4, 2013 at 3:06 PM, Alexander Fraser
> <[email protected]> wrote:
>> That's great - thanks!
>>
>> On Mon, Feb 4, 2013 at 2:29 PM, Barry Haddow <[email protected]>
>> wrote:
>>> Hi Alex
>>>
>>> Yes, you can use batch mira for training sparse features; it works the same
>>> way as PRO does in Moses.
>>>
>>> Unfortunately, documentation on sparse features is, well, sparse...
>>> But the n-best format is much the same as for dense features, i.e.
>>>
>>> name_1: value_1 name_2: value_2 ...
>>>
>>> Sparse features only get reported in the n-best list if they are named in the
>>> -report-sparse-features argument; otherwise their weighted sum will be
>>> reported.
>>>
>>> cheers - Barry
>>>
>>>
>>> On 04/02/13 13:13, Alexander Fraser wrote:
>>>>
>>>> Hi Folks,
>>>>
>>>> Can sparse features be used together with batch mira?
>>>>
>>>> Is there documentation for the n-best format of sparse features somewhere?
>>>>
>>>> Thanks!
>>>>
>>>> Cheers, Alex
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
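P.S. For anyone following along, here is a minimal sketch of the meta-feature trick described above. This is not Moses code - the function names (`collapse_fixed`, `expand_fixed`) and the meta-feature name `FixedMeta` are invented for illustration, and features are represented as plain dicts:

```python
def collapse_fixed(features, fixed_weights):
    """Replace the fixed features by one meta-feature whose value is their
    weighted sum; the tunable features pass through unchanged.  The tuner
    then sees a single weight for the whole fixed group."""
    meta = sum(fixed_weights[name] * features[name] for name in fixed_weights)
    collapsed = {k: v for k, v in features.items() if k not in fixed_weights}
    collapsed["FixedMeta"] = meta
    return collapsed

def expand_fixed(tuned_weights, fixed_weights):
    """After tuning, recover per-feature weights for the decoder: each fixed
    feature's final weight is the tuned meta-weight times its original weight."""
    final = dict(tuned_weights)
    meta_w = final.pop("FixedMeta")
    for name, w in fixed_weights.items():
        final[name] = meta_w * w
    return final
```

For example, with fixed weights {LM: 0.5, TM: 0.3}, a hypothesis with LM = 2.0 and TM = 1.0 presents FixedMeta = 1.3 to the tuner; if tuning returns a meta-weight of 2.0, the decoder gets back LM = 1.0 and TM = 0.6.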
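P.P.S. And a small sketch of reading the `name_1: value_1 name_2: value_2` feature field Barry describes. This is only an illustration of that layout, not a reference Moses parser - it assumes a name token ends with ':' and that any bare numbers belong to the most recent name (dense features can have several values):

```python
def parse_features(field):
    """Parse an n-best feature string into {name: [values]}."""
    scores = {}
    current = None
    for tok in field.split():
        if tok.endswith(":"):
            # A new feature name starts here.
            current = tok[:-1]
            scores.setdefault(current, [])
        else:
            # A bare number extends the current feature's value list.
            scores[current].append(float(tok))
    return scores
```

So `parse_features("tm: 0.5 -1.2 lm: -3.4")` groups the two `tm` values together and keeps `lm` separate.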
