Hi - I recently ran GIZA twice on exactly the same input data, on the same machine, with very different results. In one case it finished normally; in the other, I got hillclimbing warnings:

WARNING: already 41 iterations in hillclimb: 1.10041 2 33 26
WARNING: already 42 iterations in hillclimb: 1.00001 0 33 26

and then the dreaded NaNs:

THTo3: Iteration 1
Reading more sentence pairs into memory ...
#centers(pre/hillclimbed/real): nan nan nan  #al: nan #alsophisticatedcountcollection: nan #hcsteps: nan
#peggingImprovements: nan
A/D table contains 0 parameters.
A/D table contains 0 parameters.
p0_count is 0 and p1 is 0; p0 is 0.999 p1: 0.001
THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan

As far as I can tell, these runs should be =exactly= the same - same GIZA executable, same input data, same config file. And, indeed, the cross-entropy and perplexity figures that GIZA prints out after each model iteration match exactly ... until they don't, on the THTo3 line above.

Has anyone else experienced this? Is there a random component to the Model 3 code that might explain this? Sorry if this is something everybody already knows about.

Thanks for any info you can provide.

- John D. Burger
  MITRE
