Hi Barry, thanks for the response. I don't think that's it, because I use the
exact same approach for lots of other tuning runs. Isn't it the header line of
the features file that lists the dense features? I've been using this format,
where the dense features are named in each header line and any sparse features
appear on the individual lines:

        FEATURES_TXT_BEGIN_0 0 300 9 lm_0 lm_1 tm_pt_1 tm_pt_3 tm_pt_0 tm_pt_2 WordPenalty PhrasePenalty Distortion
        -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8
        -82.183 -72.639 -79.162 -41.493 -60.118 -28.509 -10.857 19 -8 OOVPenalty=-100
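
To make that concrete, here is a rough sketch of how I read the format. It is
not kbmira's actual parser: FeatureHeader, FeatureLine, parse_header and
parse_line are made-up names, and I'm assuming the three numbers after the tag
are the sentence index, the number of entries, and the dense feature count.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one parsed header line.
struct FeatureHeader {
  std::size_t sentence_index = 0;
  std::size_t num_entries = 0;
  std::vector<std::string> dense_names;
};

// Hypothetical container for one parsed data line.
struct FeatureLine {
  std::vector<double> dense;
  std::vector<std::pair<std::string, double>> sparse;
};

// Parse "FEATURES_TXT_BEGIN_0 <index> <entries> <count> <name>...".
// Assumed field meanings; returns false if the line does not fit that shape.
bool parse_header(const std::string& line, FeatureHeader& out) {
  std::istringstream in(line);
  std::string tag;
  std::size_t count = 0;
  if (!(in >> tag >> out.sentence_index >> out.num_entries >> count)) return false;
  if (tag != "FEATURES_TXT_BEGIN_0") return false;
  out.dense_names.clear();
  std::string name;
  while (in >> name) out.dense_names.push_back(name);
  // The declared count must match the number of names actually listed.
  return out.dense_names.size() == count;
}

// Split a data line: bare numbers are dense values, "name=value" tokens are
// sparse features. Dense values are only accepted before the first sparse one.
bool parse_line(const std::string& line, std::size_t num_dense, FeatureLine& out) {
  std::istringstream in(line);
  std::string tok;
  out.dense.clear();
  out.sparse.clear();
  while (in >> tok) {
    const std::size_t eq = tok.find('=');
    try {
      if (eq == std::string::npos) {
        if (!out.sparse.empty()) return false;
        out.dense.push_back(std::stod(tok));
      } else {
        out.sparse.emplace_back(tok.substr(0, eq), std::stod(tok.substr(eq + 1)));
      }
    } catch (const std::exception&) {
      return false;  // token was not a number
    }
  }
  return out.dense.size() == num_dense;
}
```

Under that reading, the header above declares 9 dense names, and both data
lines carry 9 dense values, with OOVPenalty as the only sparse entry.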

This works in lots of places. (It also raises a separate question: does kbmira
actually distinguish between sparse and dense features? I seem to remember
Colin once saying that there is a single group weight between the two groups,
but I've never been able to find this in the code.)
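
On the segfault itself: the decode function quoted below indexes the table
without checking the id. A minimal sketch of a checked lookup follows, assuming
the id-to-name table is a plain vector (I haven't confirmed that against the
Moses source). NameTable and add are hypothetical stand-ins, not the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the id-to-name table; assumed to be a vector.
class NameTable {
public:
  // Register a name and return the id assigned to it.
  std::size_t add(const std::string& name) {
    names_.push_back(name);
    return names_.size() - 1;
  }

  // Checked lookup: throws with a readable message instead of reading
  // past the end of the vector, which is undefined behaviour.
  const std::string& decode(std::size_t id) const {
    if (id >= names_.size()) {
      throw std::out_of_range("unknown feature id " + std::to_string(id) +
                              " (table has " + std::to_string(names_.size()) +
                              " names)");
    }
    return names_[id];
  }

private:
  std::vector<std::string> names_;
};
```

With something like this, an id that was never registered would produce an
error message rather than a crash.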

matt


> On Feb 26, 2015, at 5:35 PM, Barry Haddow <[email protected]> wrote:
> 
> Hi Matt
> 
> When mert-moses.pl runs kbmira, it always supplies a list of the dense 
> features (and their initial values) using the --dense-init parameter. I think 
> this is your problem. I've attached a typical file used for this feature list.
> 
> Of course, kbmira should have a sensible message rather than a segfault. This 
> is probably my doing,
> 
> cheers - Barry
> 
> On 26/02/15 22:18, Matt Post wrote:
>> kbmira segfaults on the following command:
>> 
>> 
>>         kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>> 
>> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be 
>> downloaded here:
>> 
>> 
>>         https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
>> 
>>         https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>> 
>> I tracked it down to this line of mert/FeatureStats.cpp.
>> 
>> std::string SparseVector::decode(std::size_t id)
>> {
>>   return m_id_to_name[id];
>> }
>> 
>> Any obvious ideas before I go down this rabbit hole? I verified there are no 
>> blank lines or anything else funny with the formatting, at least as far as I 
>> can tell (all dense features, plus one sparse feature, OOVPenalty=-100, 
>> showing up occasionally).
>> 
>> matt
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> Moses-support mailing list
>> [email protected] <mailto:[email protected]>
>> http://mailman.mit.edu/mailman/listinfo/moses-support
> 
> <run1.dense>
