Re: [Moses-support] analysis.perl output files

Philipp Koehn Wed, 28 Sep 2011 06:29:45 -0700

Hi,

On Mon, Sep 26, 2011 at 5:40 PM, marco turchi <[email protected]> wrote:
> corpus-coverage-summary and ttable-coverage-summary:
> what does each column represent?


- n-gram order
- number of occurrences in the corpus/t-table
- number of distinct phrases in the test set with this number of
occurrences ("type")
- total number of phrases in the test set with this number of
occurrences ("token")

For the low occurrence counts, this is reported at the top of the web page.
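
To make the type/token distinction concrete, here is a minimal sketch of how rows with these four columns could be tallied. It is not the code in analysis.perl; the function name coverage_summary, the tuple representation of phrases, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def coverage_summary(test_phrases, corpus_counts):
    """Tally test-set phrases by (n-gram order, occurrences in corpus/t-table).

    test_phrases: list of phrases (tuples of words), repeats included.
    corpus_counts: dict mapping a phrase to its count in the corpus/t-table.
    Both structures are hypothetical stand-ins, not the real file formats.
    """
    test_counter = Counter(test_phrases)
    types = defaultdict(int)   # distinct phrases per (order, occurrences)
    tokens = defaultdict(int)  # total phrase instances per (order, occurrences)
    for phrase, freq in test_counter.items():
        key = (len(phrase), corpus_counts.get(phrase, 0))
        types[key] += 1
        tokens[key] += freq
    # One row per key: order, occurrences, type count, token count.
    return [(n, occ, types[(n, occ)], tokens[(n, occ)])
            for n, occ in sorted(types)]

# Toy data: "the" appears twice in the test set and 5 times in the corpus.
test_phrases = [("the",), ("the",), ("cat",), ("the", "cat")]
corpus_counts = {("the",): 5, ("the", "cat"): 1}
rows = coverage_summary(test_phrases, corpus_counts)
for row in rows:
    print(*row)
```

In this toy case the unigram "the" contributes 1 to the type column and 2 to the token column of the row for 5 occurrences.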

> ttable-coverage-by-phrase:
> I suppose that the second column is the number of source phrases in the
> ttable where that particular phrase appears, but what is the third column?
> Is it the translation entropy?

Yes, translation entropy based on normalized forward phrase
translation probability.
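
As a rough sketch of what such an entropy looks like: take the forward translation probabilities of all translations of one source phrase, renormalize them so they sum to 1, and compute the entropy. The function name and the choice of log base 2 are assumptions here; the base actually used by analysis.perl is not stated in this thread.

```python
import math

def translation_entropy(forward_probs):
    """Entropy of the renormalized forward translation probabilities.

    forward_probs: forward probabilities of each translation option of one
    source phrase. They are renormalized to sum to 1 before the entropy is
    computed. Log base 2 is an assumption of this sketch.
    """
    total = sum(forward_probs)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for p in forward_probs:
        if p > 0:
            q = p / total          # normalized probability
            entropy -= q * math.log2(q)
    return entropy

# Two equally likely translations give 1 bit; a single translation gives 0.
print(translation_entropy([0.2, 0.2]))
print(translation_entropy([1.0]))
```

A phrase with one dominant translation gets a low value, and a phrase whose probability mass is spread over many translations gets a high one.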

> input-annotation:
> which information is reported after each sentence?

For each span over the input sentence:
- span range
- count in corpus
- count in ttable (number of distinct translations)
- translation table entropy

This is the basis of the colorful visualization over the input
sentence on the web page.
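
The per-span information can be pictured with the sketch below. It only illustrates the four fields listed above and is not the actual file format or the code in analysis.perl; the function names, the inclusive (start, end) span convention, the max_len limit, the dictionary inputs, and the base-2 entropy are all assumptions.

```python
import math

def entropy(probs):
    """Base-2 entropy of probabilities after renormalization (assumed base)."""
    total = sum(probs)
    result = 0.0
    if total > 0:
        for p in probs:
            if p > 0:
                q = p / total
                result -= q * math.log2(q)
    return result

def annotate_spans(tokens, corpus_counts, ttable, max_len=3):
    """Return one tuple per input span:
    (start, end, count in corpus, count in ttable, ttable entropy).

    corpus_counts: hypothetical dict phrase -> corpus count.
    ttable: hypothetical dict phrase -> list of forward probabilities,
            one per distinct translation.
    Spans are word positions with an inclusive end (an assumption).
    """
    annotations = []
    for start in range(len(tokens)):
        for end in range(start, min(start + max_len, len(tokens))):
            phrase = tuple(tokens[start:end + 1])
            probs = ttable.get(phrase, [])
            annotations.append((start, end,
                                corpus_counts.get(phrase, 0),
                                len(probs),
                                entropy(probs)))
    return annotations

tokens = ["the", "cat"]
corpus_counts = {("the",): 5, ("the", "cat"): 1}
ttable = {("the",): [0.5, 0.5]}
annotations = annotate_spans(tokens, corpus_counts, ttable)
for a in annotations:
    print(*a)
```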

-phi
