Hi Hieu Hoang,
     Thanks for your quick and professional answers. After double-checking,
I found this error in my training.out: "Error: unequal numbers of
non-terminals. Make sure the text does not contain words in square brackets
(like [xxx])."
      When I looked into why my 100,000-pair corpus produces final BLEU
scores but my 300,000-pair corpus does not, I found that the 300,000-pair
corpus contains strings like "[xxx]". It seems this corpus was not
tokenized correctly.
      I am re-running the training now that the corpus has been tokenized.
I hope it works this time.
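
      In case it helps anyone else, here is a minimal Python sketch for
finding the offending lines before training. It assumes the corpus is plain
text with one sentence per line; the function name and the sample lines are
only illustrative, and this is not part of the Moses scripts.

```python
import re

# A token wrapped in square brackets, e.g. "[rok]" or "[xxx]".
# The hierarchical rule tables use the same notation for non-terminals
# such as [X], which is why such tokens in the corpus cause trouble.
BRACKETED = re.compile(r"\[[^\[\]\s]*\]")


def find_bracketed_lines(lines):
    """Return (line_number, tokens) for each line with bracketed tokens."""
    hits = []
    for number, line in enumerate(lines, start=1):
        tokens = BRACKETED.findall(line)
        if tokens:
            hits.append((number, tokens))
    return hits


# Small in-memory example; the sentences are made up for illustration.
sample = [
    "and the republic of korea [rok]",
    "a clean sentence",
    "some [xxx] text with [yyy] inside",
]
for number, tokens in find_bracketed_lines(sample):
    print(number, " ".join(tokens))
```

To scan a real file, pass an open file object instead of the list, for
example find_bracketed_lines(open(path, encoding="utf-8")), where path is
whichever side of the parallel corpus you want to check.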
      Thanks for your kind help.

Merry Christmas~!
-Lang Jun

2011/12/26 Hieu Hoang <[email protected]>

> hi lang
>
> this could arise for a number of reasons. To solve it:
>
> 1. make sure the phrase-table.half files were sorted, and make sure that
>    you set
>      LC_ALL=C
>   when you sorted the extract files AND the phrase-table.half files.
>
> 2. double check that you don't have control characters in your corpus, and
> that you haven't run out of disk space.
>
> 3. check for any odd characters in the lines of the phrase-table.half
> files preceding the error line 1167912.
>
> On Mon, Dec 26, 2011 at 7:49 AM, Bill_Lang(Gmail)
> <[email protected]> wrote:
>
>> Hi Moses friends,
>>       I have run the Moses hierarchical training scripts several times.
>> With a small corpus (100,000 sentence pairs), everything was fine and I
>> got final BLEU scores. But with a bigger corpus (about 300,000 sentence
>> pairs), I get the same error every time:
>> ------------------------------------------------------
>> (6.6) consolidating the two halves @ Sun Dec 25 19:39:38 SGT 2011
>> Executing: mosesdecoder/scripts/training/phrase-extract/consolidate
>> training/model/rule-table.half.f2e
>> training/model/rule-table.half.e2f.sorted training/model/rule-table
>> --Hierarchical
>> Consolidate v2.0 written by Philipp Koehn
>> consolidating direct and indirect rule tables
>> processing hierarchical rules
>> ...........ERROR: source phrase does not match in line 1167912: '[X][X]
>> and the republic of korea [rok] [X]' != '[X][X] and the rok [X]'
>> Exit code: 1
>> ERROR: Consolidating the two phrase table halves failed at
>> mosesdecoder/scripts/training/train-model.perl line 1473.
>> ------------------------------------------------------
>>
>> I tried both 8 and 5 for -max-phrase-length when training the
>> hierarchical model, but the error was still the same.
>>
>> Has anyone encountered the same problem? Thanks in advance.
>>
>> Merry Christmas.
>>
>> --
>> Best Regards,
>> Lang Jun
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>


-- 
Best Regards,
Lang Jun http://www1.i2r.a-star.edu.sg/~jlang/
Research Scientist I, Ph.D.
Human Language Technology Department (HLT)
Institute for Infocomm Research (I2R)
Agency for Science, Technology and Research (ASTAR), Singapore