The counts are written in the 5th column in the phrase table.
   http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
They are there for debugging purposes only; they don't influence decoding in any way.
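
As a rough illustration of reading that column, here is a minimal Python sketch. It assumes the fields of each phrase-table line are separated by `|||` and that the 5th field holds whitespace-separated numbers; it does not try to interpret what each number means (check the page linked above for that). The function names and the `phrase-table.gz` path are hypothetical, not part of Moses.

```python
import gzip


def parse_counts(line):
    """Return (source, target, counts) for one phrase-table line.

    Assumption: fields are separated by '|||' and the counts sit in
    the 5th field as whitespace-separated numbers.
    """
    fields = [field.strip() for field in line.split("|||")]
    if len(fields) < 5:
        raise ValueError("line has no 5th (counts) column")
    source, target = fields[0], fields[1]
    counts = [float(value) for value in fields[4].split()]
    return source, target, counts


def iter_phrase_table_counts(path):
    """Yield (source, target, counts) for every line of a gzipped table."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield parse_counts(line)


if __name__ == "__main__":
    # Made-up line for illustration only, not real Moses output.
    example = "das haus ||| the house ||| 0.8 0.5 0.7 0.4 ||| 0-0 1-1 ||| 10 8 7"
    print(parse_counts(example))
```

To scan a whole table you could loop over `iter_phrase_table_counts("phrase-table.gz")`, replacing the hypothetical path with your own.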

If you want to know more about how it works: the counts are stored in the files extract.*.sorted.gz and extract.*.inv.sorted.gz. The score program sums the counts and calculates the probabilities. The source code for the score program is in
   phrase-extract/score-main.cpp
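
The idea of summing counts over a sorted extract file can be sketched in a few lines of Python. This is a simplified illustration, not the actual logic of score-main.cpp: it assumes each extract line has the form `foreign ||| english ||| alignment`, that every line stands for one extracted occurrence, and that the input is already sorted by foreign phrase. The function name is hypothetical.

```python
from collections import Counter
from itertools import groupby


def relative_frequencies(extract_lines):
    """Yield (foreign, english, count, phi) from sorted extract lines.

    phi is the relative frequency count(f, e) / count(f). Assumes the
    lines are sorted so that all entries for one foreign phrase are
    adjacent, and that each line is 'foreign ||| english ||| ...'.
    """
    def phrase_pair(line):
        fields = [field.strip() for field in line.split("|||")]
        return fields[0], fields[1]

    pairs = (phrase_pair(line) for line in extract_lines if line.strip())
    # groupby only merges adjacent items, which is why sorting matters.
    for foreign, group in groupby(pairs, key=lambda pair: pair[0]):
        english_counts = Counter(english for _, english in group)
        total = sum(english_counts.values())
        for english, count in english_counts.items():
            yield foreign, english, count, count / total


if __name__ == "__main__":
    # Made-up extract lines for illustration only.
    lines = [
        "das haus ||| house ||| 1-0",
        "das haus ||| the house ||| 0-0 1-1",
        "das haus ||| the house ||| 0-0 1-1",
        "ein ||| a ||| 0-0",
    ]
    for row in relative_frequencies(lines):
        print(row)
```

Processing one foreign phrase at a time keeps memory use small, since only the counts for the current phrase are held at any moment.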


On 08/07/2015 18:05, Harshit Gupta wrote:
Hi, I am currently working with the Moses platform, and in the phrase tables I am interested in the counts of phrases rather than the phrase translation probabilities. Is there a way to get these counts? The Moses manual says, in the part of the training process that calculates phrase scores: "To estimate the phrase translation probability φ(e|f) we proceed as follows: First, the extract file is sorted. This ensures that all English phrase translations for an foreign phrase are next to each other in the file. Thus, we can process the file, one foreign phrase at a time, *collect counts* and compute φ(e|f) for that foreign phrase f."

Where are these counts collected? Where can I get them?

Regards
Harshit

--
Harshit Gupta
Third Year Undergraduate
Electrical Engineering
IIT Madras


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
