I think something is wrong with how the sparse feature parameters are
printed. I can reproduce it on two computers, with an error in about one
out of ten batches. I checked the n-best output using the attached script.

2017-04-12 22:25, Dingyuan Wang:
> I can't find anything wrong with this sentence in the test set. The other
> candidates for this sentence in the same batch of output are fine. The
> problem occurs randomly (random sentence and candidate) during tuning.
> 
> 2017-04-12 21:48, Hieu Hoang:
>> It looks like there is a phrase that is length 0, hence ' = 1'.
>>
>> Check that your data has been cleaned and encoded correctly.
>>
>> * Looking for MT/NLP opportunities *
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 12 April 2017 at 13:36, Dingyuan Wang <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     Dear all,
>>
>>     I came across exactly the same problem a year ago (see the thread):
>>
>>     https://www.mail-archive.com/[email protected]/msg13673.html
>>     <https://www.mail-archive.com/[email protected]/msg13673.html>
>>
>>     Moses repeatedly, and at random, produces corrupted n-best (best100)
>>     output that crashes the subsequent kbmira tuning. For example:
>>
>>     45 ||| “ 愿 以 车 马 衣 裘 等 皆 与 朋 友 共 分 共 , 则 皆 敝 之 亦
>>     无 所 恨
>>     。 ”  ||| LexicalReordering0= -6.1176 0 0 -6.58298 0 0 Distortion0= 0
>>     LM0= -115.094 TWI_,= 0 SWD_OTHER= 2 WT_都~皆= 2 WT_OTHER~OTHER= 12
>>     WT_也~OTHER= 1 WT_把~以= 1 WT_,~,= 1 WT_等~OTHER= 1 WT_OTHER~所= 1
>>     WT_OTHER~无= 0 WT_OTHER~以= 1 WT_没有~无= 1 WT_了~之= 1 WT_”~”= 1
>>     WT_。~。= 1
>>     WT_也~而= 0 = 1 WT_OTHER~则= 1 WT_和~与= 1 WT_了~OTHER= 0 PL_t2= 5
>>     PL_s2= 4
>>     PL_1,2= 2 PL_3,4= 0 PL_s3= 1 WordPenalty0= -26 PhrasePenalty0= 21
>>     TranslationModel0= -66.0904 -70.4587 -24.5341 -28.4086 ||| -15.012
>>
>>     There is an error in "WT_也~而= 0 = 1". Then kbmira:
>>
>>     kbmira with c=0.01 decay=0.999 no_shuffle=0
>>     Initialising random seed from system clock
>>     terminate called after throwing an instance of
>>     'MosesTuning::FileFormatException'
>>       what():  Error in line "-6.1176 0 0 -6.58298 0 0 0 -115.094 1 -26 21
>>     -66.0904 -70.4587 -24.5341 -28.4086 SWD_OTHER=2 WT_,~,=1
>>     WT_OTHER~OTHER=12 PL_t2=5 PL_s3=1 PL_s2=4 PL_1,2=2 WT_”~”=1 WT_。~。=1
>>     WT_没有~无=1 WT_了~之=1 WT_OTHER~以=1 WT_都~皆=2 WT_OTHER~所=1
>>     WT_OTHER~则=1
>>     WT_把~以=1 WT_和~与=1 WT_等~OTHER=1 WT_也~OTHER=1 " of run1.features.dat
>>     Aborted (core dumped)
>>
>>     The system is Debian 9 (stretch/testing) with GCC 6.3.0, and Moses is
>>     the latest git checkout.
>>
>>     --
>>     Dingyuan Wang
>>     _______________________________________________
>>     Moses-support mailing list
>>     [email protected] <mailto:[email protected]>
>>     http://mailman.mit.edu/mailman/listinfo/moses-support
>>     <http://mailman.mit.edu/mailman/listinfo/moses-support>
>>
>>
> 

-- 
Dingyuan Wang
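#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Check the feature field of a Moses n-best list read from stdin.

import sys

for k, ln in enumerate(sys.stdin, 1):
    # Fields: sentence id ||| hypothesis ||| feature scores ||| total score
    parts = [s.strip() for s in ln.split(' ||| ')]
    for token in parts[2].split():
        if token[-1] == '=':
            # A feature name ends with '='; a bare '=' has an empty name.
            assert len(token) > 1, '[%d] %s' % (k, ln.strip())
        else:
            # Every other token must parse as a number.
            float(token)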
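
The script above stops at the first bad line. To see where in the feature
field the empty name turns up, a helper along the following lines might be
useful. It is only a sketch: the function name find_empty_feature_names is
made up for illustration, and it assumes the fields are separated by
' ||| ' as in the n-best output quoted in this thread.

```python
def find_empty_feature_names(nbest_line):
    """Return (position, previous_token, next_token) for every bare '='
    token in the feature field of one n-best line.

    Assumes the layout: id ||| hypothesis ||| features ||| score
    """
    fields = nbest_line.split(' ||| ')
    if len(fields) < 3:
        # No feature field on this line, so there is nothing to check.
        return []
    tokens = fields[2].split()
    hits = []
    for i, token in enumerate(tokens):
        if token == '=':
            # Record the neighbours so the corrupted spot can be located.
            prev_tok = tokens[i - 1] if i > 0 else None
            next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
            hits.append((i, prev_tok, next_tok))
    return hits
```

Calling it on each line of the n-best file and printing the line number
together with the returned neighbours would show which features surround
the empty name, for example the "0" and "1" on either side of the "=" in
"WT_也~而= 0 = 1".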
