Model 3/4 uses the HMM/Model 2 alignment to bootstrap: first a Viterbi alignment is computed with Model 2 or the HMM model, and then a hill-climbing algorithm searches for the best alignment under Model 3/4. So even if the initial alignment has some problems, you _may_ still get a good alignment in the end. That is why I am usually not too worried about the zero-probability Viterbi alignment warning.
As for the cause of the problem: it is usually numerical underflow. Unfortunately, GIZA++ works with true probabilities rather than log probabilities, so the product can underflow if a sentence is too long; that is why the sentence length limit is set to 100. In our in-house experiments, changing float and double to long double reduced the chance of underflow and supported much longer sentences (we tried 500 before). However, the cost you pay is speed.

--Q

On Tue, Jul 3, 2012 at 5:46 AM, Patricia Helmich <[email protected]> wrote:

> Hi,
>
> I am trying to train phrase models for several language pairs. Before
> training the phrase models, I cleaned the corpora with the Moses clean
> script, so sentences with a length >60 were filtered out. This worked
> for several corpora. For a few corpora, I got "WARNING: Model2 viterbi
> alignment has zero score."
> I found that another person solved the problem by reducing the length
> of the sentences, so I reduced the sentence length to 50 for these
> corpora. This worked for the problematic corpora except for one corpus
> pair. For that pair, I had to reduce the sentence length to 30 before
> it finally worked. By reducing the length to 30, I am losing a large
> number of sentences from my corpora. That is why I was wondering what
> the reason for this warning is, and why it works with longer sentences
> for some language pairs but not for others.
> I also checked the 9:1 length ratio.
> Can you imagine any reason for this warning? And, since it is marked
> as a warning rather than an error, is it necessary to remove it?
> It would be very kind if you could give me some information about this
> problem.
>
> Thank you,
> Patricia
>
>
> Extract from the logfile:
>
> THTo3: Iteration 1
> Reading more sentence pairs into memory ...
> WARNING: Model2 viterbi alignment has zero score.
> Here are the different elements that made this alignment probability zero
> Source length 4 target length 35
> best: fs[1] 1 : es[3] 3 , a: 0.13803 t: 0.870283 score 0.120125 product : 0.120125 ss 0
> best: fs[2] 2 : es[1] 1 , a: 0.350718 t: 0.221544 score 0.0776995 product : 0.00933363 ss 0
> best: fs[3] 3 : es[1] 1 , a: 0.150805 t: 0.324392 score 0.0489198 product : 0.000456599 ss 0
> best: fs[4] 4 : es[1] 1 , a: 0.0606276 t: 0.324392 score 0.0196671 product : 8.97998e-06 ss 0
> best: fs[5] 5 : es[1] 1 , a: 0.037479 t: 0.324392 score 0.0121579 product : 1.09178e-07 ss 0
> best: fs[6] 6 : es[1] 1 , a: 0.021535 t: 0.324392 score 0.0069858 product : 7.62692e-10 ss 0
> best: fs[7] 7 : es[1] 1 , a: 0.041835 t: 0.324392 score 0.0135709 product : 1.03505e-11 ss 0
> best: fs[8] 8 : es[1] 1 , a: 0.12501 t: 0.324392 score 0.0405522 product : 4.19734e-13 ss 0
> best: fs[9] 9 : es[1] 1 , a: 0.333332 t: 0.324392 score 0.10813 product : 4.5386e-14 ss 0
> best: fs[10] 10 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.47228e-14 ss 0
> best: fs[11] 11 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 4.77594e-15 ss 0
> best: fs[12] 12 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.54927e-15 ss 0
> best: fs[13] 13 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 5.0257e-16 ss 0
> best: fs[14] 14 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.63029e-16 ss 0
> best: fs[15] 15 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 5.28852e-17 ss 0
> best: fs[16] 16 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.71555e-17 ss 0
> best: fs[17] 17 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 5.56508e-18 ss 0
> best: fs[18] 18 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.80526e-18 ss 0
> best: fs[19] 19 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 5.85611e-19 ss 0
> best: fs[20] 20 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.89967e-19 ss 0
> best: fs[21] 21 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 6.16235e-20 ss 0
> best: fs[22] 22 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 1.99901e-20 ss 0
> best: fs[23] 23 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 6.48461e-21 ss 0
> best: fs[24] 24 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.10355e-21 ss 0
> best: fs[25] 25 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 6.82372e-22 ss 0
> best: fs[26] 26 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.21355e-22 ss 0
> best: fs[27] 27 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 7.18057e-23 ss 0
> best: fs[28] 28 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.32931e-23 ss 0
> best: fs[29] 29 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 7.55608e-24 ss 0
> best: fs[30] 30 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.45112e-24 ss 0
> best: fs[31] 31 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 7.95122e-25 ss 0
> best: fs[32] 32 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.5793e-25 ss 0
> best: fs[33] 33 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 8.36703e-26 ss 0
> best: fs[34] 34 : es[1] 1 , a: 0.999996 t: 0.324392 score 0.324391 product : 2.71419e-26 ss 0
> best: fs[35] 35 : es[1] 1 , a: 0.99992 t: 0.0101365 score 0.0101357 product : 2.75101e-28 ss 0
> Fert[0] selected 9
> Fert[1] selected 9
> Fert[2] selected 0
> Fert[3] selected 9
> Fert[4] selected 8
> 10000
> 20000
> 30000
> 40000
> 50000
> Reading more sentence pairs into memory ...
> Reading more sentence pairs into memory ...
> #centers(pre/hillclimbed/real): 1 1 1 #al: 1075.58 #alsophisticatedcountcollection: 0 #hcsteps: 0
> #peggingImprovements: 0
> A/D table contains 104118 parameters.
> A/D table contains 104094 parameters.
> NTable contains 397690 parameter.
> p0_count is 1.09339e+06 and p1 is 113340; p0 is 0.999 p1: 0.001
> THTo3: TRAIN CROSS-ENTROPY 4.26144 PERPLEXITY 19.1788
> THTo3: (1) TRAIN VITERBI CROSS-ENTROPY 4.34002 PERPLEXITY 20.2523
>
> THTo3 Viterbi Iteration : 1 took: 44 seconds
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
