Hi all,

I'm running experiments with Moses and different limits on the size of
the phrases extracted (parameter 'max-phrase-length' during training).
As expected, I get larger phrase tables as I increase the maximum size
allowed for the phrases.

When I filter these phrase tables for decoding a given test set
(using the script 'filter-model-given-input'), I would expect the
filtered phrase tables to have more entries for larger maximum phrase
lengths, but this is not what happens. For example, given three phrase
tables where the limits on phrase length are 2, 3, and 4, I get the
following numbers of entries in the filtered versions (all with the
same training, dev, and test sets):

--max-phrase-length = 2 --> 3,348,416 entries
--max-phrase-length = 3 --> 2,549,971 entries
--max-phrase-length = 4 --> 3,176,313 entries

As far as I know, the filtering script simply checks all the possible
contiguous n-grams in the input sentences (with maximum n = 10) and
extracts from the phrase table only the phrases matching these n-grams.
Does it do anything else?
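
To make my understanding concrete, here is a minimal Python sketch of
the behaviour I am assuming. It is not the actual script (which is
written in Perl); the function names are hypothetical, and I assume the
usual ' ||| '-separated phrase table lines with the source phrase in
the first field.

```python
def source_ngrams(sentences, max_n=10):
    """Collect every contiguous n-gram (1 <= n <= max_n) of the input sentences."""
    ngrams = set()
    for sentence in sentences:
        tokens = sentence.split()
        for start in range(len(tokens)):
            # The end index is exclusive and capped at max_n tokens past start.
            for end in range(start + 1, min(start + max_n, len(tokens)) + 1):
                ngrams.add(" ".join(tokens[start:end]))
    return ngrams


def filter_phrase_table(entries, sentences, max_n=10):
    """Keep only the entries whose source phrase occurs in the input sentences.

    Assumption: each entry looks like 'source ||| target ||| scores ...'.
    """
    wanted = source_ngrams(sentences, max_n)
    return [line for line in entries if line.split(" ||| ")[0].strip() in wanted]
```

If this is all the script does, every entry kept from the table built
with a limit of 2 should also be kept from the tables built with limits
of 3 and 4, as long as those tables contain the same short phrases.
That is why the counts above puzzle me.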

Thanks, Lucia
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support