There's been a recent release of GIZA (July 8) that fixes some potential sources of non-determinism, specifically relating to how distortion models (model 2 or the HMM) get initialized.
When did you download it from http://code.google.com/p/giza-pp/ ?

--Chris

On Wed, Jul 16, 2008 at 6:35 PM, John D. Burger <[EMAIL PROTECTED]> wrote:
> Hi -
>
> I have recently run GIZA twice on the exact same input data, on the
> same machine, with very different results. In the one case, it
> finished normally, in the other, I got hillclimbing warnings:
>
> WARNING: already 41 iterations in hillclimb: 1.10041 2 33 26
> WARNING: already 42 iterations in hillclimb: 1.00001 0 33 26
>
> and then the dreaded NaNs:
>
> THTo3: Iteration 1
> Reading more sentence pairs into memory ...
> #centers(pre/hillclimbed/real): nan nan nan #al: nan
> #alsophisticatedcountcollection: nan #hcsteps: nan
> #peggingImprovements: nan
> A/D table contains 0 parameters.
> A/D table contains 0 parameters.
> p0_count is 0 and p1 is 0; p0 is 0.999 p1: 0.001
> THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan
>
> As far as I can tell, these runs should be =exactly= the same - same
> GIZA executable, same input data, same config file. And, indeed, the
> cross-entropy and perplexity figures that GIZA prints out after each
> model iteration match exactly ... until they don't, on the THTo3 line
> above.
>
> Has anyone else experienced this? Is there a random component to the
> model 3 code that might explain this?
>
> Sorry if this is something everybody already knows about. Thanks for
> any info you can provide.
>
> - John D. Burger
> MITRE
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
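
For anyone trying to pin down where two supposedly identical runs part ways, here is a minimal Python sketch that compares the cross-entropy/perplexity lines of two saved logs and reports the first one that differs. It is not part of GIZA. The function names are hypothetical, and the line format is assumed from the single "TRAIN CROSS-ENTROPY ... PERPLEXITY ..." line quoted above, so the pattern may need adjusting for other output.

```python
import re
from itertools import zip_longest

# Assumption: summary lines look like the one quoted in this thread, e.g.
#   THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan
SUMMARY = re.compile(r"CROSS-ENTROPY\s+\S+\s+PERPLEXITY\s+\S+")


def summary_lines(log_text):
    """Return the stripped log lines that carry cross-entropy/perplexity figures."""
    return [line.strip() for line in log_text.splitlines() if SUMMARY.search(line)]


def first_divergence(log_a, log_b):
    """Return (index, line_a, line_b) for the first differing summary line.

    Returns None when both logs have identical summary lines. If one log
    has fewer summary lines, the missing side is reported as None.
    """
    pairs = zip_longest(summary_lines(log_a), summary_lines(log_b))
    for index, (line_a, line_b) in enumerate(pairs):
        if line_a != line_b:
            return index, line_a, line_b
    return None
```

To use it, read each run's captured output into a string (for example with `open(path).read()`, where the paths are wherever you saved the logs) and pass both strings to `first_divergence`. The comparison is on the exact text of the lines, which matches the "should be exactly the same" expectation. It only tells you where the runs diverge, not why.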
