On Thu, 2015-01-15 at 13:54 +0800, HOANG Cong Duy Vu wrote:

> - tune & test
> (based on source)
> size of overlap set = 624
> (based on target)
> size of overlap set = 386

> 
> (tune & test have high overlapping parts based on source sentences,
> but half of them have different target sentences)



Does this mean that there are hundreds of sentences in your original
tuning and test sets that are equal on the source side but have
different references? That sounds a bit odd. Maybe it indicates that
something about your data is generally problematic.
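
One way to check this directly is to group the references by source sentence and list every source that appears in both sets with more than one distinct target. The snippet below is only a rough sketch: the function name and the toy sentence pairs are made up for illustration and are not part of Moses, and it assumes exact string matching after stripping surrounding whitespace, which may differ from how the overlap counts above were computed.

```python
from collections import defaultdict


def conflicting_references(tune_pairs, test_pairs):
    """Return {source: set_of_targets} for sources that occur in both
    sets and have more than one distinct target across them.

    Each argument is an iterable of (source, target) string pairs.
    Matching is exact after stripping whitespace (an assumption).
    """
    tune_refs = defaultdict(set)
    for src, tgt in tune_pairs:
        tune_refs[src.strip()].add(tgt.strip())

    test_refs = defaultdict(set)
    for src, tgt in test_pairs:
        test_refs[src.strip()].add(tgt.strip())

    conflicts = {}
    # Only sources present in both the tuning and the test set matter here.
    for src in tune_refs.keys() & test_refs.keys():
        targets = tune_refs[src] | test_refs[src]
        if len(targets) > 1:
            conflicts[src] = targets
    return conflicts


# Toy data (hypothetical): "a b" is shared but translated differently.
tune = [("a b", "x y"), ("c d", "z")]
test = [("a b", "x w"), ("c d", "z"), ("e", "q")]

for src, targets in conflicting_references(tune, test).items():
    print(src, "->", sorted(targets))
```

With real data, the pairs could be built by reading the source and reference files line by line in parallel (for example with `zip` over two open files). Inspecting a handful of the reported sources should show whether the differing references are legitimate alternative translations or a sign of misaligned data.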




