The counts are written in the 5th column of the phrase table.
http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
The counts are there for debugging purposes only; they don't influence
decoding in any way.
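
As a rough illustration of reading that column, here is a minimal Python
sketch. It assumes the phrase table fields are separated by "|||" and that
the 5th field holds whitespace-separated numbers. The function name
parse_counts and the sample line are made up for illustration, and the
sketch does not try to say which number in the field means what; check
the page linked above for that.

```python
def parse_counts(line):
    """Return (source, target, counts) for one phrase-table line.

    Assumes fields are separated by '|||' and that the 5th field
    (index 4) is a whitespace-separated list of numbers.
    """
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) < 5:
        raise ValueError("line has fewer than 5 fields, so no counts column")
    counts = [float(x) for x in fields[4].split()]
    return fields[0], fields[1], counts


# Made-up line for illustration only; a real phrase table will differ.
sample = "das haus ||| the house ||| 0.5 0.4 0.6 0.3 ||| 0-0 1-1 ||| 4 5 3"
source, target, counts = parse_counts(sample)
print(source, "->", target, counts)
```

For a gzipped phrase table, the same function could be applied line by
line to a file opened with gzip.open(path, "rt", encoding="utf-8").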
If you want to know more about how it works: the counts are stored in
the files extract.*.sorted.gz and extract.*.inv.sorted.gz. The score
program sums the counts and calculates the probability. The source code
for the score program is in
phrase-extract/score-main.cpp
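
The idea of summing counts and turning them into a probability can be
sketched in Python as plain relative-frequency estimation. This is only
an illustration of the principle, not the score program itself: the
function name relative_frequencies and the toy phrase pairs are
hypothetical, and the real program works on the sorted extract files and
computes more than this.

```python
from collections import Counter


def relative_frequencies(pairs):
    """Estimate phi(e|f) = count(f, e) / count(f) from (f, e) phrase pairs."""
    joint = Counter(pairs)                       # count(f, e)
    source_totals = Counter(f for f, _ in pairs)  # count(f)
    return {(f, e): c / source_totals[f] for (f, e), c in joint.items()}


# Toy extracted phrase pairs (source f, target e); illustration only.
pairs = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the building"),
    ("haus", "house"),
]
probs = relative_frequencies(pairs)
for (f, e), p in sorted(probs.items()):
    print(f"phi({e} | {f}) = {p:.3f}")
```

The intermediate Counter objects are exactly the "counts" in question:
joint holds how often each phrase pair was extracted, and source_totals
holds how often each source phrase was seen.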
On 08/07/2015 18:05, Harshit Gupta wrote:
Hi, I am currently working on the Moses platform, and in the phrase
tables I am interested in the counts of phrases rather than the phrase
translation probabilities. How can I get these counts?
The Moses manual, in the part of the training process that covers
calculating phrase scores, says:
"To estimate the phrase translation probability φ(e|f) we proceed as
follows: First, the extract file is sorted. This ensures that all
English phrase translations for an foreign phrase are next to each
other in the file. Thus, we can process the file, one foreign phrase
at a time, *collect counts* and compute φ(e|f) for that foreign phrase f."
Where are these counts collected? Where can I get these counts?
Regards
Harshit
--
Harshit Gupta
Third Year Undergraduate
Electrical Engineering
IIT Madras
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu