Thanks Philipp,
I have already cleaned my data set; I'm using EMS.
Hi Matthias,
Thank you for your reply.
I only have an alignment in one direction. Why?
There are many warning messages in the GIZA log file. Are they related to the
problem? (The maximum sentence length is only 80.)
490000
WARNING: Model2 viterbi alignment has zero score.
Here are the different elements that made this alignment probability zero
500000
WARNING: already 41 iterations in hillclimb: 1.95402 1 17 64
WARNING: already 42 iterations in hillclimb: 1.80881 2 34 8
WARNING: already 43 iterations in hillclimb: 2.19253 2 66 8
WARNING: already 44 iterations in hillclimb: 2.61934 1 35 66
WARNING: already 45 iterations in hillclimb: 1.00471 1 62 64
WARNING: already 46 iterations in hillclimb: 1.00001 0 62 64
WARNING: already 41 iterations in hillclimb: 1.12453 2 55 2
WARNING: already 42 iterations in hillclimb: 1.11522 2 26 2
WARNING: already 43 iterations in hillclimb: 5.19799 2 30 2
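For reference, the line-count comparison that Matthias suggests below can be wrapped in a small shell function. This is only a minimal sketch: the function names count_lines and compare_alignments are made up for illustration and are not part of Moses or GIZA++, and gzip -dc is used in place of zcat.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Moses or GIZA++).

# Print the number of lines in a gzipped file.
count_lines() {
    gzip -dc "$1" | wc -l | tr -d ' '
}

# Print the line counts of two gzipped files and report whether they match.
# Returns 0 if the counts are equal, 1 otherwise.
compare_alignments() {
    a=$(count_lines "$1")
    b=$(count_lines "$2")
    echo "$1: $a"
    echo "$2: $b"
    if [ "$a" -eq "$b" ]; then
        echo "OK: line counts match"
    else
        echo "MISMATCH: line counts differ"
        return 1
    fi
}
```

With the paths from Matthias's mail it would be called as: compare_alignments training/giza.1/de-en.A3.final.gz training/giza-inverse.1/en-de.A3.final.gz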
On Thursday, October 9, 2014 4:08 PM, Philipp Koehn <[email protected]> wrote:
Hi,
This may also be caused by too-long, empty, or length-mismatched sentences
when running GIZA. Make sure to run the clean-corpus-n.perl script first.
-phi
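
To see whether a corpus still contains the kinds of sentence pairs described above, a quick diagnostic along the following lines may help. It is a rough sketch and not a replacement for clean-corpus-n.perl: the function name check_pairs is hypothetical, the default limit of 80 tokens is taken from the sentence length mentioned in this thread, tokens are assumed to be whitespace-separated, and the corpus files are assumed to contain no tab characters.

```shell
#!/bin/sh
# Hypothetical diagnostic (not part of Moses): count suspicious sentence pairs.
# Usage: check_pairs SOURCE_FILE TARGET_FILE [MAX_TOKENS]
check_pairs() {
    max=${3:-80}
    # paste joins line i of both files with a tab; if one file is shorter,
    # the missing side is empty and the pair is counted as "empty".
    paste "$1" "$2" | awk -F'\t' -v max="$max" '
        {
            ns = split($1, s, " ")   # number of source tokens
            nt = split($2, t, " ")   # number of target tokens
            if (ns == 0 || nt == 0) empty++
            else if (ns > max || nt > max) toolong++
        }
        END { printf "pairs=%d empty=%d too_long=%d\n", NR, empty, toolong }
    '
}
```

If empty or too_long is non-zero for the files that are actually passed to GIZA, the cleaning step probably did not run on them.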
On Thu, Oct 9, 2014 at 10:49 AM, Matthias Huck <[email protected]> wrote:
Hi Arefeh,
>
>Have you been able to resolve that issue? Maybe one of your GIZA
>alignments is flawed, for instance because the GIZA process was
>terminated before it finished. Did you check that both the standard and
>the inverse alignment files have the same number of lines?
>
>Check it like this:
>
>$ zcat training/giza.1/de-en.A3.final.gz | wc -l; \
>    zcat training/giza-inverse.1/en-de.A3.final.gz | wc -l
>900000
>501713
>
>In that case there would be a problem and you'd have to rerun GIZA in
>the inverse direction. If you get the same number of lines and it
>matches what you expect to get from your corpus, then it's a different
>issue and you have to investigate further.
>
>Cheers,
>Matthias
>
>
>On Mon, 2014-10-06 at 03:13 -0700, Arefeh Kazemi wrote:
>> Hi
>> I have re-installed Moses on my system, but I have a problem with the
>> GIZA symmetrization step.
>> I get errors of this type:
>> Sentence mismatch error! Line #501714
>> Sentence mismatch error! Line #501715
>> .
>> .
>> .
>> Sentence mismatch error! Line #900000
>>
>>
>> All of my data files are in UTF-8 format, and I have run Moses
>> successfully on these files before.
>>
>>
>> Any suggestions to fix the problem would be appreciated.
>>
>>
>> Regards
>> Arefeh
>>
>>
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>--
>The University of Edinburgh is a charitable body, registered in
>Scotland, with registration number SC005336.