Hi Matt

This was part of the changes to support hypergraph mira, since the 
hypergraphs don't have the FEATURES_TXT_BEGIN_0 sections. In fact they 
don't differentiate between sparse and dense features.

Does it work correctly when you use the --dense-init parameter?

cheers - Barry

On 05/03/15 15:18, Matt Post wrote:
> Okay, the old kbmira works, so this must be part of the 3.0 changes.
>
> It seems that the names of features in the header line 
> (FEATURES_TXT_BEGIN_0) are ignored entirely. The 2.1 kbmira would 
> output dense feature weights using names F1..FN, which I would then 
> re-map back to the list in the header. In kbmira 3.0, it uses the file 
> passed in, as Barry pointed out.
>
> Thanks for your help!
>
> matt
>
>
>> On Feb 27, 2015, at 1:21 PM, Matt Post wrote:
>>
>> Although, those old successful runs might have been with the old 
>> Moses kbmira. I'll look into this and report back.
>>
>> matt
>>
>>
>>> On Feb 27, 2015, at 12:19 PM, Matt Post wrote:
>>>
>>> Hi Barry — Thanks for the response. I don't think that's it, because 
>>> I use the exact same approach for lots of other tuning runs. Isn't 
>>> it the header line of the features file that lists dense features? 
>>> I've been using this format, where dense features are listed in each 
>>> header line, and then sparse features in the individual lines:
>>>
>>> FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion
>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8
>>> -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100
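
As a hedged illustration of the format quoted above, here is a minimal C++ sketch that splits a header line and a feature line into dense and sparse parts. It is not kbmira's actual parser: the Header and FeatureLine types and both function names are hypothetical, and the meaning of the numeric header fields (sentence index, entry count, dense feature count) is an assumption read off the example.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one FEATURES_TXT_BEGIN header line.
struct Header {
  int sentence = 0;                      // assumed: sentence index
  std::size_t entries = 0;               // assumed: number of feature lines
  std::vector<std::string> dense_names;  // dense feature names, in order
};

// Hypothetical container for one feature line.
struct FeatureLine {
  std::vector<double> dense;                           // bare numbers
  std::vector<std::pair<std::string, double>> sparse;  // name=value tokens
};

// Returns false if the line is not a well-formed header, including the
// case where the declared dense count does not match the names listed.
bool parse_header(const std::string& line, Header& out) {
  std::istringstream in(line);
  std::string tag;
  std::size_t num_dense = 0;
  if (!(in >> tag >> out.sentence >> out.entries >> num_dense)) return false;
  if (tag.rfind("FEATURES_TXT_BEGIN", 0) != 0) return false;
  out.dense_names.clear();
  std::string name;
  while (in >> name) out.dense_names.push_back(name);
  return out.dense_names.size() == num_dense;
}

// Tokens containing '=' are treated as sparse name=value pairs; all
// other tokens are read as dense values. std::stod throws on bad input.
FeatureLine parse_features(const std::string& line) {
  FeatureLine out;
  std::istringstream in(line);
  std::string tok;
  while (in >> tok) {
    const std::size_t eq = tok.find('=');
    if (eq == std::string::npos) {
      out.dense.push_back(std::stod(tok));
    } else {
      out.sparse.emplace_back(tok.substr(0, eq),
                              std::stod(tok.substr(eq + 1)));
    }
  }
  return out;
}
```

With the example above, parse_header would collect the nine names from lm_0 through Distortion, and parse_features would put OOVPenalty into the sparse list and the nine bare numbers into the dense list.
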
>>>
>>> This works in lots of places. It also raises a separate question: 
>>> does kbmira actually distinguish between sparse and dense 
>>> features? I seem to remember Colin once saying that there is a 
>>> single group weight between the two groups, but I've never been 
>>> able to find this in the code.
>>>
>>> matt
>>>
>>>
>>>> On Feb 26, 2015, at 5:35 PM, Barry Haddow wrote:
>>>>
>>>> Hi Matt
>>>>
>>>> When mert-moses.pl runs kbmira, it always supplies a list of the 
>>>> dense features (and their initial values) using the --dense-init 
>>>> parameter. I think this is your problem. I've attached a typical 
>>>> file used for this feature list.
>>>>
>>>> Of course, kbmira should have a sensible message rather than a 
>>>> segfault. This is probably my doing,
>>>>
>>>> cheers - Barry
>>>>
>>>> On 26/02/15 22:18, Matt Post wrote:
>>>>> kbmira segfaults on the following command:
>>>>>
>>>>> kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>>>>>
>>>>> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be 
>>>>> downloaded here:
>>>>>
>>>>> https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
>>>>> https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>>>>>
>>>>> I tracked it down to this line of mert/FeatureStats.cpp.
>>>>>
>>>>> std::string SparseVector::decode(std::size_t id)
>>>>> {
>>>>>   return m_id_to_name[id];
>>>>> }
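
For illustration only, a minimal sketch of a checked lookup that reports an unknown id instead of crashing. It assumes the name table is a std::vector<std::string> and that the crash comes from an out-of-range id, neither of which is confirmed in this thread; NameTable is a hypothetical stand-in, not the real SparseVector.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the id/name mapping. The real class keeps
// its own m_id_to_name, whose type is assumed here.
class NameTable {
public:
  // Registers a name and returns the id assigned to it.
  std::size_t encode(const std::string& name) {
    m_id_to_name.push_back(name);
    return m_id_to_name.size() - 1;
  }

  // Checked lookup: throws std::out_of_range with a readable message
  // rather than indexing past the end of the table.
  const std::string& decode(std::size_t id) const {
    if (id >= m_id_to_name.size()) {
      std::ostringstream msg;
      msg << "unknown feature id " << id << " ("
          << m_id_to_name.size() << " names registered)";
      throw std::out_of_range(msg.str());
    }
    return m_id_to_name[id];
  }

private:
  std::vector<std::string> m_id_to_name;
};
```

A caller could catch std::out_of_range at the top level and print the message before exiting, which would give the "sensible message" behaviour without changing how valid ids are handled.
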
>>>>>
>>>>> Any obvious ideas before I go down this rabbit hole? I verified 
>>>>> there are no blank lines or anything else funny with the 
>>>>> formatting, at least as far as I can tell (all dense features, 
>>>>> plus one sparse feature, OOVPenalty=-100, showing up occasionally).
>>>>>
>>>>> matt
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>> <run1.dense>
>>>
>>
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

