Hi,
this seems to be pretty much what I implemented. What exactly do you
mean by these three lines?
\STATE count($e|f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
\STATE total($f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
\STATE $t(e|f)$ = $\frac{\text{count}(e|f)}{\text{total}(f)}$
What do you mean by \frac? The pseudocode I was using shows these
lines as a simple division, which is what my code does, i.e.
t(e|f) = count(e|f) / total(f)
In C, something like:
for ( int f = 0; f < size_source; f++ )
{
    for ( int e = 0; e < size_target; e++ )
    {
        /* estimate probabilities: t(e|f) = count(e|f) / total(f) */
        t[f][e] = count[f][e] / total[f];
    }
}
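
And for the two count lines, my reading of your pseudocode as C would
be roughly the following, for one sentence pair (the word-id arrays
src/trg and the s_total array are just my guess at a layout, not
something taken from your pseudocode):

/* src[0..src_len-1] and trg[0..trg_len-1] hold the word ids of the
   source and target sentence; s_total has one entry per target word */

/* compute normalization: s-total(e) = sum over f of t(e|f) */
for ( int j = 0; j < trg_len; j++ )
{
    s_total[trg[j]] = 0.0;
    for ( int i = 0; i < src_len; i++ )
    {
        s_total[trg[j]] += t[src[i]][trg[j]];
    }
}

/* collect counts: count(e|f) += t(e|f) / s-total(e), and the same
   amount goes into total(f) */
for ( int j = 0; j < trg_len; j++ )
{
    for ( int i = 0; i < src_len; i++ )
    {
        count[src[i]][trg[j]] += t[src[i]][trg[j]] / s_total[trg[j]];
        total[src[i]] += t[src[i]][trg[j]] / s_total[trg[j]];
    }
}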
Is this the kind of thing you mean?
Thanks
James
Quoting Philipp Koehn <[email protected]>:
> Hi,
>
> I think there was a flaw in some versions of the pseudocode.
> The probabilities certainly need to add up to one. There are
> two normalizations going on in the algorithm: one on the sentence
> level (so the probability of all alignments add up to one) and
> one on the word level.
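>
> To make the word-level bookkeeping explicit (this is implied by, but
> not stated in, the pseudocode below): for a given sentence pair, the
> fraction being collected,
> $$\frac{t(e|f)}{\text{s-total}(e)} = \frac{t(e|f)}{\sum_{f' \in \text{\bf f}} t(e|f')},$$
> is the posterior probability that $e$ is aligned to $f$, so the
> collected counts are expected counts.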
>
> Here the most recent version:
>
> \REQUIRE set of sentence pairs $(\text{\bf e},\text{\bf f})$
> \ENSURE translation prob. $t(e|f)$
> \STATE initialize $t(e|f)$ uniformly
> \WHILE{not converged}
> \STATE \COMMENT{initialize}
> \STATE count($e|f$) = 0 {\bf for all} $e,f$
> \STATE total($f$) = 0 {\bf for all} $f$
> \FORALL{sentence pairs ({\bf e},{\bf f})}
> \STATE \COMMENT{compute normalization}
> \FORALL{words $e$ in {\bf e}}
> \STATE s-total($e$) = 0
> \FORALL{words $f$ in {\bf f}}
> \STATE s-total($e$) += $t(e|f)$
> \ENDFOR
> \ENDFOR
> \STATE \COMMENT{collect counts}
> \FORALL{words $e$ in {\bf e}}
> \FORALL{words $f$ in {\bf f}}
> \STATE count($e|f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
> \STATE total($f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
> \ENDFOR
> \ENDFOR
> \ENDFOR
> \STATE \COMMENT{estimate probabilities}
> \FORALL{foreign words $f$}
> \FORALL{English words $e$}
> \STATE $t(e|f)$ = $\frac{\text{count}(e|f)}{\text{total}(f)}$
> \ENDFOR
> \ENDFOR
> \ENDWHILE
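>
> A quick sanity check for an implementation (not part of the
> pseudocode; the dense array t[f][e] indexed by source and target
> word ids is just one possible layout): after the estimation step,
> the probabilities for each $f$ must sum to one.
>
> /* sanity check: sum over e of t(e|f) must be 1 for every f
>    (needs <assert.h> and <math.h>) */
> for ( int f = 0; f < size_source; f++ )
> {
>     double sum = 0.0;
>     for ( int e = 0; e < size_target; e++ )
>     {
>         sum += t[f][e];
>     }
>     assert( fabs(sum - 1.0) < 1e-6 );
> }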
>
> -phi
>
> On Sun, Jul 26, 2009 at 5:24 PM, James Read <[email protected]> wrote:
>
>> Hi,
>>
>> I have implemented the EM Model 1 algorithm as outlined in Koehn's
>> lecture notes. I was surprised to find that the raw output of the
>> algorithm gives a translation table in which, for any particular
>> source word, the probabilities of the possible target words sum to
>> far more than 1.
>>
>> Is this normal?
>>
>> Thanks
>> James
>>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support