There's been a recent release of GIZA (July 8) that fixes some
potential sources of non-determinism, specifically relating to how
distortion models (model 2 or the HMM) get initialized.

When did you download it from http://code.google.com/p/giza-pp/ ?
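
If it helps to pin down exactly where the two runs part ways, the
per-iteration cross-entropy/perplexity lines can be compared
mechanically. What follows is only a rough sketch, not part of GIZA:
it assumes each run's output was saved to a text file (the names
run1.log and run2.log are made up), and that the summary lines look
like the THTo3 line quoted below. The helper names are hypothetical.

```python
import math
import re
import sys
from itertools import zip_longest

# Matches summary lines such as
#   THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan
# (format assumed from the log excerpt quoted in this thread).
SUMMARY = re.compile(r"CROSS-ENTROPY\s+(\S+)\s+PERPLEXITY\s+(\S+)")


def summary_lines(log_text):
    """Return the stripped lines that report cross-entropy and perplexity."""
    return [line.strip() for line in log_text.splitlines()
            if SUMMARY.search(line)]


def first_divergence(log_a, log_b):
    """Return (index, line_a, line_b) for the first differing summary
    line, or None if both logs report identical figures throughout."""
    pairs = zip_longest(summary_lines(log_a), summary_lines(log_b))
    for index, (a, b) in enumerate(pairs):
        if a != b:
            return index, a, b
    return None


def has_nan(line):
    """True if either figure on a summary line parses as NaN."""
    match = SUMMARY.search(line)
    if match is None:
        return False
    for value in match.groups():
        try:
            if math.isnan(float(value)):
                return True
        except ValueError:
            # Not a number we recognise; ignore rather than guess.
            pass
    return False


if __name__ == "__main__":
    # Hypothetical usage: python compare_logs.py run1.log run2.log
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        result = first_divergence(fa.read(), fb.read())
    if result is None:
        print("No divergence in the reported figures.")
    else:
        index, a, b = result
        print("First difference at summary line", index)
        print("  run A:", a, "(NaN)" if a and has_nan(a) else "")
        print("  run B:", b, "(NaN)" if b and has_nan(b) else "")
```

Comparing only the summary lines keeps timestamps and other noise out
of the diff; if the first difference shows up at the start of model 2
or the HMM, that would be consistent with the initialization issue
mentioned above.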

--Chris

On Wed, Jul 16, 2008 at 6:35 PM, John D. Burger <[EMAIL PROTECTED]> wrote:
> Hi -
>
> I have recently run GIZA twice on the exact same input data, on the
> same machine, with very different results.  In one case, it
> finished normally; in the other, I got hillclimbing warnings:
>
>   WARNING: already 41 iterations in hillclimb: 1.10041 2 33 26
>   WARNING: already 42 iterations in hillclimb: 1.00001 0 33 26
>
> and then the dreaded NaNs:
>
>   THTo3: Iteration 1
>   Reading more sentence pairs into memory ...
>   #centers(pre/hillclimbed/real): nan nan nan  #al: nan #alsophisticatedcountcollection: nan #hcsteps: nan
>   #peggingImprovements: nan
>   A/D table contains 0 parameters.
>   A/D table contains 0 parameters.
>   p0_count is 0 and p1 is 0; p0 is 0.999 p1: 0.001
>   THTo3: TRAIN CROSS-ENTROPY nan PERPLEXITY nan
>
> As far as I can tell, these runs should be =exactly= the same - same
> GIZA executable, same input data, same config file.  And, indeed, the
> cross-entropy and perplexity figures that GIZA prints out after each
> model iteration match exactly ... until they don't, on the THTo3 line
> above.
>
> Has anyone else experienced this?  Is there a random component to the
> model 3 code that might explain this?
>
> Sorry if this is something everybody already knows about.  Thanks for
> any info you can provide.
>
> - John D. Burger
>   MITRE
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>