Hi Rico,
cmph itself is not probabilistic, but the check for phrases outside the
set of training phrases is, and that check can let false positives (FPs)
through. For encodings other than None, the compression also serves as
a test, so the FP rate is much smaller then. You can also lower the FP
probability by increasing the number of fingerprint bits, for instance
to 20 (-fingerprint 20); the default is 16. The probability of an FP
slipping through per query is 1/2^b, where b is the number of
fingerprint bits. Using b=16 and -encoding PREnc (the default), I have
never seen a false positive, so just stick to the defaults :)
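
To put numbers on the 1/2^b figure, here is a small Python sketch. The
function name fp_probability is made up for this example and is not part
of Moses or cmph; it only models the fingerprint check and ignores the
extra protection you get from encodings other than None.

```python
def fp_probability(bits: int) -> float:
    """Per-query chance that an unseen phrase passes the fingerprint check.

    Hypothetical helper for illustration only: it just evaluates 1/2^b.
    """
    return 1.0 / (2 ** bits)


# Compare the default (16 bits) with the suggested 20 bits.
for b in (16, 20):
    print(f"b={b}: {fp_probability(b):.2e} (about 1 in {2 ** b:,})")
```

So going from 16 to 20 bits takes you from roughly 1 in 65,536 to 1 in
1,048,576 per query, i.e. 16 times fewer false positives.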

On 19.11.2012 12:01, Rico Sennrich wrote:
> A quick followup: with the current commit (da39cff), the problem
> persists, but only if using "-encoding None". Since you fixed the
> alignment problem, I'll use the default encoding and hope for the
> best... :)


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
