Yes, passing --dense-init worked. However, it seems to ignore the feature 
names: as long as the file has as many lines as there are dense parameters, 
it works, and it always outputs the following:

    477/3000 updates, avg loss = 0.36341, BLEU = 0.356527
    F0 3.663
    F1 0.221152
    F2 0.186323
    F3 1.41851
    F4 2.38853
    F5 -0.162657
    F6 0.430753
    F7 3.93281

Does that sound correct?
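
For anyone who needs to recover feature names from the F0..FN output, here is a minimal C++ sketch of the positional re-mapping mentioned further down the thread. It assumes the header layout is `FEATURES_TXT_BEGIN_<n> <sentence> <entries> <feature count> <names...>` and that `F<i>` refers to the i-th dense feature, counting from 0. Neither assumption is confirmed in this thread. `parseHeaderNames` and `remapWeights` are hypothetical helpers, not part of Moses.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Read the dense feature names from a header line such as
//   FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 ... Distortion
// Assumed layout: tag, sentence index, entry count, feature count, names.
std::vector<std::string> parseHeaderNames(const std::string& header) {
    std::istringstream in(header);
    std::string tag;
    long sentence = 0;
    long entries = 0;
    std::size_t count = 0;
    if (!(in >> tag >> sentence >> entries >> count) ||
        tag.rfind("FEATURES_TXT_BEGIN", 0) != 0) {
        throw std::runtime_error("not a features header: " + header);
    }
    std::vector<std::string> names;
    std::string name;
    while (in >> name) names.push_back(name);
    if (names.size() != count) {
        throw std::runtime_error("header name count does not match declared count");
    }
    return names;
}

// Accept labels of the form F<digits> and extract the index.
static bool parseLabel(const std::string& label, std::size_t& index) {
    if (label.size() < 2 || label[0] != 'F') return false;
    for (std::size_t i = 1; i < label.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(label[i]))) return false;
    }
    index = std::stoul(label.substr(1));
    return true;
}

// Map "F<i> <weight>" lines to (name, weight) pairs by position.
// Lines that do not match (e.g. the "updates, avg loss" status line) are skipped.
// Assumption: F<i> is the i-th name in the header, 0-based.
std::vector<std::pair<std::string, double>> remapWeights(
        const std::vector<std::string>& names, const std::string& kbmiraOutput) {
    std::vector<std::pair<std::string, double>> result;
    std::istringstream lines(kbmiraOutput);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string label;
        double weight = 0.0;
        std::size_t index = 0;
        if (!(fields >> label >> weight) || !parseLabel(label, index)) continue;
        if (index >= names.size()) {
            throw std::out_of_range("no header name for " + label);
        }
        result.emplace_back(names[index], weight);
    }
    return result;
}
```

Calling `remapWeights(parseHeaderNames(headerLine), outputText)` yields one pair per weight line. An index beyond the header's name list raises an exception rather than being silently dropped, since that would indicate the positional assumption does not hold.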


> On Mar 5, 2015, at 10:34 AM, Barry Haddow <[email protected]> wrote:
> 
> Hi Matt
> 
> This was part of the changes to support hypergraph mira, since the 
> hypergraphs don't have the FEATURES_TXT_BEGIN_0 sections. In fact they don't 
> differentiate between sparse and dense features.
> 
> Does it work correctly when you use the --dense-init parameter?
> 
> cheers - Barry
> 
> On 05/03/15 15:18, Matt Post wrote:
>> Okay, the old kbmira works, so this must be part of the 3.0 changes.
>> 
>> It seems that the names of features in the header line 
>> (FEATURES_TXT_BEGIN_0) are ignored entirely. The 2.1 kbmira would output 
>> dense feature weights using names F1..FN, which I would then re-map back to 
>> the list in the header. In kbmira 3.0, it uses the file passed in, as Barry 
>> pointed out.
>> 
>> Thanks for your help!
>> 
>> matt
>> 
>> 
>>> On Feb 27, 2015, at 1:21 PM, Matt Post <[email protected]> wrote:
>>> 
>>> Although, those old successful runs might have been with the old Moses 
>>> kbmira. I'll look into this and report back.
>>> 
>>> matt
>>> 
>>> 
>>>> On Feb 27, 2015, at 12:19 PM, Matt Post <[email protected]> wrote:
>>>> 
>>>> Hi Barry — Thanks for the response. I don't think that's it, because I use 
>>>> the exact same approach for lots of other tuning runs. Isn't it the header 
>>>> line of the features file that lists dense features? I've been using this 
>>>> format, where dense features are listed in each header line, and then 
>>>> sparse features in the individual lines:
>>>> 
>>>> FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion
>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8
>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100
>>>> 
>>>> This works in lots of places (although it also raises a separate 
>>>> question: does kbmira actually distinguish between sparse and 
>>>> dense features? I seem to remember Colin once saying that there is a 
>>>> single group weight between the two groups, but I've never been able to 
>>>> find this in the code).
>>>> 
>>>> matt
>>>> 
>>>> 
>>>>> On Feb 26, 2015, at 5:35 PM, Barry Haddow <[email protected]> wrote:
>>>>> 
>>>>> Hi Matt
>>>>> 
>>>>> When mert-moses.pl runs kbmira, it always supplies a list of the dense 
>>>>> features (and their initial values) using the --dense-init parameter. I 
>>>>> think this is your problem. I've attached a typical file used for this 
>>>>> feature list.
>>>>> 
>>>>> Of course, kbmira should print a sensible error message rather than 
>>>>> segfaulting. This is probably my doing.
>>>>> 
>>>>> cheers - Barry
>>>>> 
>>>>> On 26/02/15 22:18, Matt Post wrote:
>>>>>> kbmira segfaults on the following command:
>>>>>> 
>>>>>> kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>>>>>> 
>>>>>> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be 
>>>>>> downloaded here:
>>>>>> 
>>>>>> https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
>>>>>> https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>>>>>> 
>>>>>> I tracked it down to this line of mert/FeatureStats.cpp.
>>>>>> 
>>>>>> std::string SparseVector::decode(std::size_t id)
>>>>>> {
>>>>>>     return m_id_to_name[id];
>>>>>> }
>>>>>> 
>>>>>> Any obvious ideas before I go down this rabbit hole? I verified there 
>>>>>> are no blank lines or anything else funny with the formatting, at least 
>>>>>> as far as I can tell (all dense features, plus one sparse feature, 
>>>>>> OOVPenalty=-100, showing up occasionally).
>>>>>> 
>>>>>> matt
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Moses-support mailing list
>>>>>> [email protected]
>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>> 
>>>>> <run1.dense>
>> 
> 
> 
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
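
On the segfault reported at the start of the thread, and the point that kbmira should give a sensible message instead, here is a minimal sketch of a checked lookup. `NameTable` is a hypothetical stand-in: the actual type of `m_id_to_name` in mert/FeatureStats.cpp is not shown in this thread, so the `std::unordered_map` used here is an assumption. The hint about --dense-init in the error text is also only a guess based on the diagnosis given above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the id-to-name table behind SparseVector::decode.
class NameTable {
public:
    void add(std::size_t id, const std::string& name) {
        m_id_to_name[id] = name;
    }

    // Look up a feature name, failing with a readable error for unknown ids
    // instead of reading from an entry that does not exist.
    const std::string& decode(std::size_t id) const {
        auto it = m_id_to_name.find(id);
        if (it == m_id_to_name.end()) {
            throw std::out_of_range(
                "unknown feature id " + std::to_string(id) +
                " (assumed cause: no dense feature list given via --dense-init)");
        }
        return it->second;
    }

private:
    std::unordered_map<std::size_t, std::string> m_id_to_name;
};
```

A caller can catch `std::out_of_range` at the top level and print the message before exiting, which turns the crash into a diagnosable error.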


