[Moses-support] Non-deterministic GIZA?

John D. Burger Wed, 16 Jul 2008 11:36:38 -0700

Hi -

I recently ran GIZA twice on exactly the same input data, on the
same machine, with very different results.  In one case it
finished normally; in the other, I got hillclimbing warnings:


   WARNING: already 41 iterations in hillclimb: 1.10041 2 33 26
   WARNING: already 42 iterations in hillclimb: 1.00001 0 33 26

and then the dreaded NaNs:

   THTo3: Iteration 1
   Reading more sentence pairs into memory ...
   #centers(pre/hillclimbed/real): nan nan nan  #al: nan #alsophisticatedcountcollection: nan #hcsteps: nan
   #peggingImprovements: nan
   A/D table contains 0 parameters.
   A/D table contains 0 parameters.
   p0_count is 0 and p1 is 0; p0 is 0.999 p1: 0.001
   THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan

As far as I can tell, these runs should be =exactly= the same - same  
GIZA executable, same input data, same config file.  And, indeed, the  
cross-entropy and perplexity figures that GIZA prints out after each  
model iteration match exactly ... until they don't, on the THTo3 line  
above.
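
A comparison like the one described above can be automated.  The
following is a minimal sketch, assuming the output of each run was
saved to a text file; the file names and helper names are hypothetical
and are not part of GIZA.  It keeps only the lines that report
cross-entropy and perplexity and returns the first position at which
the two runs disagree.

```python
from itertools import zip_longest


def metric_lines(lines):
    """Keep only the lines that report cross-entropy and perplexity."""
    return [
        line.strip()
        for line in lines
        if "CROSS-ENTROPY" in line and "PERPLEXITY" in line
    ]


def first_divergence(lines_a, lines_b):
    """Return (position, a, b) for the first differing pair, or None.

    Positions are 1-based.  If one sequence is shorter, the missing
    entry is reported as None.
    """
    pairs = zip_longest(lines_a, lines_b)
    for position, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return position, a, b
    return None


# Hypothetical usage, assuming the logs were saved as run1.log and run2.log:
#
#     with open("run1.log") as f1, open("run2.log") as f2:
#         print(first_divergence(metric_lines(f1), metric_lines(f2)))
```

Comparing the filtered lines rather than the whole log avoids false
alarms from lines that may legitimately differ between runs, such as
timestamps, if the logs contain any.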

Has anyone else experienced this?  Is there a random component to the
Model 3 code that might explain this?

Sorry if this is something everybody already knows about.  Thanks for  
any info you can provide.

- John D. Burger
   MITRE

