Hi Barry,

The whole file is too large to attach. The error message is:

kbmira with c=0.01 decay=0.999 no_shuffle=0
Initialising random seed from system clock
Found 14118 initial sparse features
terminate called after throwing an instance of
'MosesTuning::FileFormatException'
  what():  Error in line "-5.44027 0 0 -5.34901 0 0 0 -224.872 1 1 1 -39
18 -26.2331 -40.6736 -44.3698 -82.5072 WT_,~,=3 WT_:~:=1 WT_“~“=1
WT_”~”=1 WT_曰~说=1 PL_s3=5 PL_3,2=2 PL_3,3=3 PL_2,3=4 PL_t3=7 PL_s1=5
PL_1,2=2 PL_1,1=3 PL_t1=4 PL_2,2=3 PL_t2=7 PL_s2=8 PL_2,1=1 WT_有~有=1
WT_!~!=1 WT_其~的=1 WT_其~他=1 WT_不~也=1 WT_不~没=1 WT_而~而=1
WT_而~却=1 WT_祖逖~逖=1 WT_祖逖~祖=1 WT_逖~祖=1 WT_逖~逖=1 WT_大~大江=1
WT_者~的=1 WT_者~人=1 WT_江~大江=1 WT_渡~渡过=1 WT_复~又=1 WT_余~有=1
WT_誓~发誓=1 WT_楫~木=1 WT_江~长江=1 WT_击~击=1 WT_将~带领=1 WT_济~成功=1
WT_中原~中原=1 WT_清~廓清=1 WT_如~像=1 WT_楫~戢=1 WT_能~能=1 WT_中~中流=1
WT_流~中流=1 WT_部曲~部下=1 " of run7.features.dat
Aborted
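
Since the full file is too large to attach, a short excerpt around the failing line may be enough to reproduce the parse error. The following is a minimal sketch, assuming the line can be located by a distinctive substring copied from the message above (for example `WT_祖逖~逖=1`). `excerpt_around` is a hypothetical helper, not part of Moses. It reads the file in binary mode so the excerpt keeps the exact bytes of the original.

```python
from collections import deque


def excerpt_around(path, needle, context=2):
    """Return (line_number, raw_bytes) pairs around the first line containing needle.

    Hypothetical helper: `needle` is a substring copied from the error message,
    and `context` is how many lines to keep before and after the match.
    """
    needle_bytes = needle.encode("utf-8")
    before = deque(maxlen=context)  # rolling window of lines preceding the match
    excerpt = []
    remaining = None  # stays None until the matching line has been seen

    with open(path, "rb") as handle:
        for number, line in enumerate(handle, start=1):
            if remaining is None:
                if needle_bytes in line:
                    excerpt = list(before) + [(number, line)]
                    remaining = context
                    if remaining == 0:
                        break
                else:
                    before.append((number, line))
            else:
                # Collect trailing context lines after the match.
                excerpt.append((number, line))
                remaining -= 1
                if remaining == 0:
                    break
    return excerpt
```

For example, `excerpt_around("run7.features.dat", "WT_祖逖~逖=1")` would return at most five numbered lines; joining the byte strings and writing them to a new file gives something small enough to attach.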

I don't think there are any weird characters in it. I always use UTF-8.
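
One way to check this is to scan the reported line for non-ASCII whitespace and for control or format characters, which are invisible in most editors even when the file is valid UTF-8. The following is a minimal sketch, assuming such characters are a possible cause (nothing in this thread confirms that). `suspicious_chars` is a hypothetical helper, and the allowed set (plain space and newline) is an assumption about the file format based on the error line above.

```python
import unicodedata

# Assumption: fields are separated by plain ASCII spaces and lines end in "\n".
ALLOWED = {" ", "\n"}
# Unicode general categories: separators (Zs, Zl, Zp), control (Cc), format (Cf).
FLAGGED_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}


def suspicious_chars(line):
    """Return (index, code point, name) for each flagged character in line."""
    found = []
    for index, char in enumerate(line):
        if char in ALLOWED:
            continue
        if unicodedata.category(char) in FLAGGED_CATEGORIES:
            # Control characters have no Unicode name, so supply a placeholder.
            name = unicodedata.name(char, "<unnamed>")
            found.append((index, f"U+{ord(char):04X}", name))
    return found
```

CJK characters and curly quotes are not flagged, since they are ordinary letters and punctuation. A tab, a carriage return, a no-break space, or a zero-width space would be.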


On 2016-01-18 16:43, Barry Haddow wrote:
> Hi Dingyuan
>
> Is it possible to attach the features.dat file that is causing the
> error? Almost certainly Moses is failing to parse the line because of
> the Asian characters in the feature names.
>
> cheers - Barry
>
> On 16/01/16 15:58, Dingyuan Wang wrote:
>> I ran
>>
>> ~/software/moses/bin/kbmira -J 75  --dense-init run7.dense --sparse-init
>> run7.sparse-weights  --ffile run1.features.dat --ffile run2.features.dat
>> --ffile run3.features.dat --ffile run4.features.dat --ffile
>> run5.features.dat --ffile run6.features.dat --ffile run7.features.dat
>> --scfile run1.scores.dat --scfile run2.scores.dat --scfile
>> run3.scores.dat --scfile run4.scores.dat --scfile run5.scores.dat
>> --scfile run6.scores.dat --scfile run7.scores.dat -o /tmp/mert.out
>>
>> in the tuning/tmp.1 directory, which will certainly replicate the error.
>>
>> On 2016-01-16 23:42, Hieu Hoang wrote:
>>> The mert script prints out every command it runs. You should be able to
>>> replicate the error by running the last command.
>>>
>>> On 16 Jan 2016 14:18, "Dingyuan Wang" <abcdoyle...@gmail.com
>>> <mailto:abcdoyle...@gmail.com>> wrote:
>>>
>>>      Sorry, but I can't reliably replicate the same problem when running
>>>      TUNING_tune.1 alone. There is no character '_' in the test set or
>>>      top50 list.
>>>
>>>      I'm using sparse-features = "target-word-insertion top 50,
>>>      source-word-deletion top 50, word-translation top 50 50,
>>>      phrase-length"
>>>
>>>      I've attached some related files from EMS and the EMS config.
>>>
>>>     
>>> https://mega.nz/#!xs0SFKxL!M_RTBp1JGX24-b4xlYYLP-bLXKiC_Sl-p96x55avAB4
>>>
>>>      On 2016-01-16 02:45, Hieu Hoang wrote:
>>>      > Could you make your model files available for download so I can
>>>      > replicate this problem?
>>>      >
>>>      > It seems like you're using a feature function with sparse scores. I
>>>      > think the character '_' must be escaped.
>>>      >
>>>      >
>>>      > On 12/01/16 04:00, Dingyuan Wang wrote:
>>>      >> Hi all,
>>>      >>
>>>      >> I'm using EMS to run experiments. Every time, kbmira dies with
>>>      >> SIGABRT when tuning in one direction, while tuning in the opposite
>>>      >> direction (same config and test set) succeeds.
>>>      >>
>>>      >> The mert.log (stderr) shows the following:
>>>      >>
>>>      >>
>>>      >> kbmira with c=0.01 decay=0.999 no_shuffle=0
>>>      >> Initialising random seed from system clock
>>>      >> Found 15323 initial sparse features
>>>      >> ....terminate called after throwing an instance of
>>>      >> 'MosesTuning::FileFormatException'
>>>      >>    what():  Error in line "-4.51933 0 0 -6.09733 0 0 0 -121.556 2 -20 12
>>>      >> -31.6201 -38.5211 -26.5112 -60.6166 WT_,~,=2 WT_?~?=1 PL_s1=4
>>>      >> PL_s3=1 PL_3,3=1 PL_2,2=3 PL_1,2=1 PL_2,1=3 PL_t1=6 PL_t2=4 PL_t3=2
>>>      >> PL_2,3=1 PL_s2=7 PL_1,1=3 WT_未~没有=1 WT_何~怎么=1 WT_何~能=1
>>>      >> WT_方~正在=1 WT_又~还=1 WT_君~您=2 WT_趣~向=1 WT_趣~奔=1 WT_有~没有=1
>>>      >> WT_往~去=1 WT_官~官员=1 WT_假~借=1 WT_檄~檄文=1 WT_文~文告=1
>>>      >> WT_上~上级=1 WT_为~呢=1 WT_在~正在=1 " of run7.features.dat
>>>      >> Aborted
>>>      >>
>>>      >>
>>>      >> Since run7.scores.dat is generated by scripts, I don't think I'm
>>>      >> responsible for the bad format. The last time it died, I removed
>>>      >> the likely offending line from the test set, but this time another
>>>      >> line triggers the error.
>>>      >>
>>>      >> --
>>>      >> Dingyuan Wang
>>>      >> _______________________________________________
>>>      >> Moses-support mailing list
>>>      >> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>>      >> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>      >
>>>
>>>      --
>>>      Dingyuan Wang (gumblex)
>>>
>
>

-- 
Dingyuan Wang (gumblex)
