Hi Rico,

cmph itself is not probabilistic, but the check that rejects phrases outside the set of training phrases is, so false positives (FPs) are possible. For encodings other than None, the compression also serves as a test, so the FP rate is much smaller then. You can also lower the FP probability by increasing the number of fingerprint bits, for instance to 20 (-fingerprint 20); the default is 16. The probability of an FP slipping through per query is 1/2^b, where b is the number of fingerprint bits. Using b=16 and -encoding PREnc (the default) I have never seen a false positive, so just stick to the defaults :)
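To put numbers on the 1/2^b figure, here is a small Python sketch. It is only back-of-the-envelope arithmetic, not anything from the Moses code base: the function names and the one million unseen queries are made up for illustration, and it models the fingerprint check alone (roughly the -encoding None case), so with PREnc the real rate should be lower still.

```python
def false_positive_rate(fingerprint_bits: int) -> float:
    """Per-query chance that an unseen phrase passes the fingerprint check: 1 / 2^b."""
    return 1.0 / (2 ** fingerprint_bits)


def expected_false_positives(fingerprint_bits: int, unseen_queries: int) -> float:
    """Expected number of false positives over a number of out-of-table queries."""
    return unseen_queries * false_positive_rate(fingerprint_bits)


if __name__ == "__main__":
    # Hypothetical workload: one million lookups of phrases not in the table.
    unseen_queries = 1_000_000
    for bits in (16, 20):
        rate = false_positive_rate(bits)
        expected = expected_false_positives(bits, unseen_queries)
        print(f"b={bits}: rate per query = {rate:.3e}, "
              f"expected FPs in {unseen_queries} queries = {expected:.2f}")
```

Each extra fingerprint bit halves the rate, so going from 16 to 20 bits cuts it by a factor of 16, at the cost of 4 more bits stored per phrase.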
On 19.11.2012 12:01, Rico Sennrich wrote:
> A quick followup: with the current commit (da39cff), the problem
> persists, but only if using "-encoding None". Since you fixed the
> alignment problem, I'll use the default encoding and hope for the
> best... :)
