On Thu, 2015-01-15 at 13:54 +0800, HOANG Cong Duy Vu wrote:
> - tune & test
> (based on source)
> size of overlap set = 624
> (based on target)
> size of overlap set = 386
>
> (tune & test have high overlapping parts based on source sentences,
> but half of them have different target sentences)

Does this mean that there are hundreds of sentences in your original tuning and test sets that are equal on the source side but have different references? That sounds a bit odd. Maybe it indicates that something about your data is generally problematic.
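
One way to check this directly is to list the source sentences that occur in both sets and compare their references. The following is only a rough sketch, not part of Moses: it assumes each set is available as a list of (source, target) string pairs, and the function names and toy data are made up for illustration.

```python
from collections import defaultdict


def overlap_sizes(tune_pairs, test_pairs):
    """Count distinct sentences shared by both sets, by source and by target."""
    tune_src = {src.strip() for src, _ in tune_pairs}
    test_src = {src.strip() for src, _ in test_pairs}
    tune_tgt = {tgt.strip() for _, tgt in tune_pairs}
    test_tgt = {tgt.strip() for _, tgt in test_pairs}
    return len(tune_src & test_src), len(tune_tgt & test_tgt)


def conflicting_references(tune_pairs, test_pairs):
    """Map each shared source sentence to its references when they differ."""
    tune_refs = defaultdict(set)
    test_refs = defaultdict(set)
    for src, tgt in tune_pairs:
        tune_refs[src.strip()].add(tgt.strip())
    for src, tgt in test_pairs:
        test_refs[src.strip()].add(tgt.strip())

    shared = set(tune_refs) & set(test_refs)
    return {
        src: (tune_refs[src], test_refs[src])
        for src in shared
        if tune_refs[src] != test_refs[src]
    }


# Toy data (hypothetical), standing in for the real tuning and test sets.
tune = [("a b", "x y"), ("c d", "z"), ("e", "w")]
test = [("a b", "x y"), ("c d", "q"), ("f", "v")]

print(overlap_sizes(tune, test))
for src, (refs_in_tune, refs_in_test) in conflicting_references(tune, test).items():
    print(src, sorted(refs_in_tune), sorted(refs_in_test))
```

Looking at a handful of the conflicting pairs by eye should show whether the references are legitimate alternative translations or whether the source and target files are misaligned.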
