We typically try to enlarge the tuning set in order to obtain more reliable sparse feature weights. In your case, though, it is rather the test set that seems a bit small for trusting the BLEU scores.
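
One rough way to check how far BLEU can move on a test set of only 641 sentences is bootstrap resampling: redraw the test sentences with replacement many times and look at the spread of the resulting corpus scores. The sketch below is only an illustration under simplifying assumptions (one reference per sentence, whitespace tokenization, no smoothing). The function names are made up for this example and are not part of Moses, and the numbers will not match those from the usual scoring scripts exactly.

```python
import math
import random
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_stats(hyp, ref, max_n=4):
    """Sufficient statistics for one sentence pair:
    [hyp_len, ref_len, matched_1, total_1, ..., matched_N, total_N]."""
    h, r = hyp.split(), ref.split()
    stats = [len(h), len(r)]
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(h, n), ngrams(r, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        stats += [matched, max(len(h) - n + 1, 0)]
    return stats


def corpus_bleu(stats_list, max_n=4):
    """Unsmoothed corpus-level BLEU from per-sentence statistics."""
    if not stats_list:
        return 0.0
    totals = [sum(col) for col in zip(*stats_list)]
    hyp_len, ref_len = totals[0], totals[1]
    log_prec = 0.0
    for n in range(max_n):
        matched, total = totals[2 + 2 * n], totals[3 + 2 * n]
        if matched == 0 or total == 0:
            return 0.0
        log_prec += math.log(matched / total)
    # Brevity penalty for hypotheses shorter than the references.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    return bp * math.exp(log_prec / max_n)


def bootstrap_interval(stats_list, samples=1000, alpha=0.05, seed=0):
    """Percentile interval of BLEU over resampled test sets."""
    rng = random.Random(seed)
    n = len(stats_list)
    scores = sorted(
        corpus_bleu([stats_list[rng.randrange(n)] for _ in range(n)])
        for _ in range(samples)
    )
    low = scores[int(samples * alpha / 2)]
    high = scores[int(samples * (1 - alpha / 2)) - 1]
    return low, high


if __name__ == "__main__":
    # Toy data only; replace with your own system output and references.
    pairs = [
        ("the cat sat on the mat", "the cat sat on the mat"),
        ("a dog is in the garden now", "the dog is in the garden now"),
        ("he reads a book", "he is reading a book"),
    ]
    stats = [sentence_stats(h, r) for h, r in pairs]
    print("BLEU:", corpus_bleu(stats))
    print("95% interval:", bootstrap_interval(stats))
```

If the intervals for the baseline and the sparse-feature system overlap heavily, the difference you see on the test set may just be noise.
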
Do the sparse features give you any large improvement on the tuning set?

On Thu, 2015-01-15 at 13:54 +0800, HOANG Cong Duy Vu wrote:
> I used sparse features such as: TargetWordInsertionFeature,
> SourceWordDeletionFeature, WordTranslationFeature,
> PhraseLengthFeature.
> Sparse features are used only for top source and target words (100,
> 150, 200, 250, ....).
>
> My parallel data include: train(201K); tune(6214); test(641).
>
> Is there any way to prevent over-fitting when applying the sparse
> features? Or in this case, sparse features will not generalize well
> over "unseen" data?
