Yes, passing --dense-init worked. However, it seems to ignore the feature
names: as long as the file has as many lines as there are dense parameters,
it works, and it always outputs the following:
477/3000 updates, avg loss = 0.36341, BLEU = 0.356527
F0 3.663
F1 0.221152
F2 0.186323
F3 1.41851
F4 2.38853
F5 -0.162657
F6 0.430753
F7 3.93281
Does that sound correct?
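
For readers who want the original feature names back, the remapping mentioned further down the thread can be done as a post-processing step. The following is a minimal sketch, not code from Moses. The helper names DenseNamesFromHeader and RemapWeightLine are hypothetical. It assumes that the fourth field of the FEATURES_TXT_BEGIN_0 header is the number of dense feature names, and that the F<i> labels are zero-based and follow header order, as the output above suggests.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Moses): read the dense feature names from
// a header line such as
//   FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 ... Distortion
// Assumption: the fourth field is the number of dense feature names.
std::vector<std::string> DenseNamesFromHeader(const std::string& header) {
  std::istringstream in(header);
  std::string tag, second, third;
  std::size_t count = 0;
  std::vector<std::string> names;
  if (!(in >> tag >> second >> third >> count)) return names;
  std::string name;
  while (names.size() < count && in >> name) names.push_back(name);
  return names;
}

// Hypothetical helper: rewrite "F<i> <weight>" as "<name> <weight>".
// Assumption: <i> is a zero-based index into the header's name list.
// Any line that does not match (e.g. the status line) is returned unchanged.
std::string RemapWeightLine(const std::string& line,
                            const std::vector<std::string>& names) {
  std::istringstream in(line);
  std::string key, weight;
  if (!(in >> key >> weight)) return line;
  if (key.size() < 2 || key[0] != 'F') return line;
  std::size_t index = 0;
  for (std::size_t i = 1; i < key.size(); ++i) {
    if (key[i] < '0' || key[i] > '9') return line;  // not of the form F<digits>
    index = index * 10 + static_cast<std::size_t>(key[i] - '0');
  }
  if (index >= names.size()) return line;  // no name known for this index
  return names[index] + " " + weight;
}
```

With the nine-name header quoted later in this thread, the line "F1 0.221152" would come back as "lm_1 0.221152", while the "477/3000 updates" status line passes through untouched.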
> On Mar 5, 2015, at 10:34 AM, Barry Haddow <[email protected]> wrote:
>
> Hi Matt
>
> This was part of the changes to support hypergraph mira, since the
> hypergraphs don't have the FEATURES_TXT_BEGIN_0 sections. In fact they don't
> differentiate between sparse and dense features.
>
> Does it work correctly when you use the --dense-init parameter?
>
> cheers - Barry
>
> On 05/03/15 15:18, Matt Post wrote:
>> Okay, the old kbmira works, so this must be part of the 3.0 changes.
>>
>> It seems that the names of features in the header line
>> (FEATURES_TXT_BEGIN_0) are ignored entirely. The 2.1 kbmira would output
>> dense feature weights using names F1..FN, which I would then re-map back to
>> the list in the header. In kbmira 3.0, it uses the file passed in, as Barry
>> pointed out.
>>
>> Thanks for your help!
>>
>> matt
>>
>>
>>> On Feb 27, 2015, at 1:21 PM, Matt Post <[email protected]> wrote:
>>>
>>> Although, those old successful runs might have been with the old Moses
>>> kbmira. I'll look into this and report back.
>>>
>>> matt
>>>
>>>
>>>> On Feb 27, 2015, at 12:19 PM, Matt Post <[email protected]> wrote:
>>>>
>>>> Hi Barry — Thanks for the response. I don't think that's it, because I use
>>>> the exact same approach for lots of other tuning runs. Isn't it the header
>>>> line of the features file that lists dense features? I've been using this
>>>> format, where dense features are listed in each header line, and then
>>>> sparse features in the individual lines:
>>>>
>>>> FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion
>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8
>>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100
>>>>
>>>> This works in lots of places. It also raises a separate question: does
>>>> kbmira actually distinguish between sparse and dense features? I seem to
>>>> remember Colin once saying that there is a single group weight between
>>>> the two groups, but I've never been able to find this in the code.
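
To make the format described above concrete, here is a minimal sketch of splitting one feature line into its dense values and its sparse name=value pairs. It is not the Moses parser. FeatureLine, ToDouble and ParseFeatureLine are hypothetical names, and the rule used here (a token without '=' is a dense value, a token with '=' is a sparse feature) is an assumption based only on the example lines in this message.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one parsed line (not a Moses type).
struct FeatureLine {
  std::vector<double> dense;             // values in header order
  std::map<std::string, double> sparse;  // e.g. OOVPenalty -> -100
};

// Convert a whole token to double; reject partial parses such as "12abc".
bool ToDouble(const std::string& text, double* value) {
  try {
    std::size_t used = 0;
    *value = std::stod(text, &used);
    return used == text.size();
  } catch (const std::exception&) {
    return false;
  }
}

// Assumption: tokens without '=' are dense values and tokens of the form
// name=value are sparse features. Returns false for a malformed line or
// when the number of dense values differs from numDense.
bool ParseFeatureLine(const std::string& line, std::size_t numDense,
                      FeatureLine* out) {
  out->dense.clear();
  out->sparse.clear();
  std::istringstream in(line);
  std::string token;
  while (in >> token) {
    const std::size_t eq = token.find('=');
    double value = 0.0;
    if (eq == std::string::npos) {
      if (!ToDouble(token, &value)) return false;
      out->dense.push_back(value);
    } else {
      if (eq == 0 || !ToDouble(token.substr(eq + 1), &value)) return false;
      out->sparse[token.substr(0, eq)] = value;
    }
  }
  return out->dense.size() == numDense;
}
```

Checking the dense count against the number announced in the header is a cheap way to catch a mismatched line before it reaches the optimizer.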
>>>>
>>>> matt
>>>>
>>>>
>>>>> On Feb 26, 2015, at 5:35 PM, Barry Haddow <[email protected]> wrote:
>>>>>
>>>>> Hi Matt
>>>>>
>>>>> When mert-moses.pl runs kbmira, it always supplies a list of the dense
>>>>> features (and their initial values) using the --dense-init parameter. I
>>>>> think this is your problem. I've attached a typical file used for this
>>>>> feature list.
>>>>>
>>>>> Of course, kbmira should have a sensible message rather than a segfault.
>>>>> This is probably my doing.
>>>>>
>>>>> cheers - Barry
>>>>>
>>>>> On 26/02/15 22:18, Matt Post wrote:
>>>>>> kbmira segfaults on the following command:
>>>>>>
>>>>>> kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>>>>>>
>>>>>> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be
>>>>>> downloaded here:
>>>>>>
>>>>>> https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
>>>>>> https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>>>>>>
>>>>>> I tracked it down to this line of mert/FeatureStats.cpp.
>>>>>>
>>>>>> std::string SparseVector::decode(std::size_t id)
>>>>>> {
>>>>>>   return m_id_to_name[id];
>>>>>> }
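
The unchecked lookup above is one plausible place for a sensible error message instead of a segfault. The sketch below is a stand-alone illustration, not a patch to mert/FeatureStats.cpp. NameTable, Encode and Decode are hypothetical names. It assumes that m_id_to_name is an indexable container of strings and that the crash comes from an id beyond its end, neither of which is confirmed in this thread.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an id-to-name table (NOT the real Moses class).
class NameTable {
 public:
  // Register a name and return its id (its position in the table).
  std::size_t Encode(const std::string& name) {
    names_.push_back(name);
    return names_.size() - 1;
  }

  // Look up a name, throwing a descriptive error instead of reading past
  // the end of the table when the id is unknown.
  const std::string& Decode(std::size_t id) const {
    if (id >= names_.size()) {
      std::ostringstream message;
      message << "feature id " << id << " is out of range ("
              << names_.size() << " names registered)";
      throw std::out_of_range(message.str());
    }
    return names_[id];
  }

 private:
  std::vector<std::string> names_;
};
```

A caller could catch std::out_of_range at the top level and print the message before exiting, which would point the user at the missing feature names rather than at a crash.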
>>>>>>
>>>>>> Any obvious ideas before I go down this rabbit hole? I verified there
>>>>>> are no blank lines or anything else funny with the formatting, at least
>>>>>> as far as I can tell (all dense features, plus one sparse feature,
>>>>>> OOVPenalty=-100, showing up occasionally).
>>>>>>
>>>>>> matt
>>>>>>
>>>>>> _______________________________________________
>>>>>> Moses-support mailing list
>>>>>> [email protected]
>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>>>> <run1.dense>
>>>>
>>>
>>
>
>