Hi,
this is LaTeX {algorithmic} code; \frac{a}{b} just typesets the fraction a/b, so
count($e|f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
means
count(e|f) += t(e|f) / s-total(e)
So, you got that right.
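
In C, in the same style as your snippet below, the count collection for
one sentence pair would look something like the sketch that follows. Note
that these loops run over the words of the current sentence pair, not over
the whole vocabulary; src[]/tgt[] (the word ids of the pair), len_src,
len_tgt and the s_total scratch array are names I am making up for
illustration:

/* sentence-level normalization: s-total(e) = sum over f of t(e|f) */
for ( j = 0; j < len_tgt; j++ )
{
    s_total[j] = 0.0;
    for ( i = 0; i < len_src; i++ )
    {
        s_total[j] += t[src[i]][tgt[j]];
    }
}

/* collect fractional counts for this sentence pair */
for ( j = 0; j < len_tgt; j++ )
{
    for ( i = 0; i < len_src; i++ )
    {
        count[src[i]][tgt[j]] += t[src[i]][tgt[j]] / s_total[j];
        total[src[i]]         += t[src[i]][tgt[j]] / s_total[j];
    }
}

The final t(e|f) = count(e|f) / total(f) step then renormalizes on the
word level, which is what makes the probabilities for each f sum to one.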
-phi
On Mon, Jul 27, 2009 at 10:18 PM, James Read<[email protected]> wrote:
> Hi,
>
> this seems to be pretty much what I implemented. What exactly do you mean by
> these three lines?
>
> \STATE count($e|f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
> \STATE total($f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
> \STATE $t(e|f)$ = $\frac{\text{count}(e|f)}{\text{total}(f)}$
>
> What do you mean by $\frac$? The pseudocode I was using shows these lines as
> a simple division, and this is what my code does, i.e.
>
> t(e|f) = count(e|f) / total(f)
>
> In C code something like:
>
> for ( f = 0; f < size_source; f++ )
> {
>     for ( e = 0; e < size_target; e++ )
>     {
>         t[f][e] = count[f][e] / total[f];
>     }
> }
>
>
> Is this the kind of thing you mean?
>
> Thanks
> James
>
> Quoting Philipp Koehn <[email protected]>:
>
>> Hi,
>>
>> I think there was a flaw in some versions of the pseudocode.
>> The probabilities certainly need to add up to one. There are
>> two normalizations going on in the algorithm: one on the sentence
>> level (so the probabilities of all alignments add up to one) and
>> one on the word level.
>>
>> Here is the most recent version:
>>
>> \REQUIRE set of sentence pairs $(\text{\bf e},\text{\bf f})$
>> \ENSURE translation prob. $t(e|f)$
>> \STATE initialize $t(e|f)$ uniformly
>> \WHILE{not converged}
>>   \STATE \COMMENT{initialize}
>>   \STATE count($e|f$) = 0 {\bf for all} $e,f$
>>   \STATE total($f$) = 0 {\bf for all} $f$
>>   \FORALL{sentence pairs ({\bf e},{\bf f})}
>>     \STATE \COMMENT{compute normalization}
>>     \FORALL{words $e$ in {\bf e}}
>>       \STATE s-total($e$) = 0
>>       \FORALL{words $f$ in {\bf f}}
>>         \STATE s-total($e$) += $t(e|f)$
>>       \ENDFOR
>>     \ENDFOR
>>     \STATE \COMMENT{collect counts}
>>     \FORALL{words $e$ in {\bf e}}
>>       \FORALL{words $f$ in {\bf f}}
>>         \STATE count($e|f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
>>         \STATE total($f$) += $\frac{t(e|f)}{\text{s-total}(e)}$
>>       \ENDFOR
>>     \ENDFOR
>>   \ENDFOR
>>   \STATE \COMMENT{estimate probabilities}
>>   \FORALL{foreign words $f$}
>>     \FORALL{English words $e$}
>>       \STATE $t(e|f)$ = $\frac{\text{count}(e|f)}{\text{total}(f)}$
>>     \ENDFOR
>>   \ENDFOR
>> \ENDWHILE
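>>
>> In C terms, assuming dense double arrays t[f][e], count[f][e] and
>> total[f] indexed by vocabulary ids (an assumption for illustration, not
>> part of the pseudocode), the steps around the per-sentence loop are just:
>>
>> for ( f = 0; f < size_source; f++ )
>> {
>>     /* reset the accumulators at the start of each iteration */
>>     total[f] = 0.0;
>>     for ( e = 0; e < size_target; e++ )
>>         count[f][e] = 0.0;
>> }
>>
>> /* ... loop over sentence pairs: normalization and count collection ... */
>>
>> /* word-level normalization; assumes every f occurs, so total[f] > 0 */
>> for ( f = 0; f < size_source; f++ )
>>     for ( e = 0; e < size_target; e++ )
>>         t[f][e] = count[f][e] / total[f];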
>>
>> -phi
>>
>>
>>
>> On Sun, Jul 26, 2009 at 5:24 PM, James Read <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have implemented the EM Model 1 algorithm as outlined in Koehn's
>>> lecture notes. I was surprised to find that the raw output of the
>>> algorithm gives a translation table in which, for any particular source
>>> word, the sum of the probabilities over all possible target words is far
>>> greater than 1.
>>>
>>> Is this normal?
>>>
>>> Thanks
>>> James
>>>
>>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support